# Pydantic AI > GenAI Agent Framework, the Pydantic way Pydantic AI is a Python agent framework designed to make it less painful to build production grade applications with Generative AI. # Introduction # Pydantic AI *GenAI Agent Framework, the Pydantic way* Pydantic AI is a Python agent framework designed to help you quickly, confidently, and painlessly build production grade applications and workflows with Generative AI. FastAPI revolutionized web development by offering an innovative and ergonomic design, built on the foundation of [Pydantic Validation](https://docs.pydantic.dev) and modern Python features like type hints. Yet despite virtually every Python agent framework and LLM library using Pydantic Validation, when we began to use LLMs in [Pydantic Logfire](https://pydantic.dev/logfire), we couldn't find anything that gave us the same feeling. We built Pydantic AI with one simple aim: to bring that FastAPI feeling to GenAI app and agent development. ## Why use Pydantic AI 1. **Built by the Pydantic Team**: [Pydantic Validation](https://docs.pydantic.dev/latest/) is the validation layer of the OpenAI SDK, the Google ADK, the Anthropic SDK, LangChain, LlamaIndex, AutoGPT, Transformers, CrewAI, Instructor and many more. *Why use the derivative when you can go straight to the source?* 1. **Model-agnostic**: Supports virtually every [model](https://ai.pydantic.dev/models/overview/index.md) and provider: OpenAI, Anthropic, Gemini, DeepSeek, Grok, Cohere, Mistral, and Perplexity; Azure AI Foundry, Amazon Bedrock, Google Vertex AI, Ollama, LiteLLM, Groq, OpenRouter, Together AI, Fireworks AI, Cerebras, Hugging Face, GitHub, Heroku, Vercel, Nebius, OVHcloud, Alibaba Cloud, SambaNova, and Outlines. If your favorite model or provider is not listed, you can easily implement a [custom model](https://ai.pydantic.dev/models/overview/#custom-models). 1. **Seamless Observability**: Tightly [integrates](https://ai.pydantic.dev/logfire/index.md) with [Pydantic Logfire](https://pydantic.dev/logfire), our general-purpose OpenTelemetry observability platform, for real-time debugging, evals-based performance monitoring, and behavior, tracing, and cost tracking. If you already have an observability platform that supports OTel, you can [use that too](https://ai.pydantic.dev/logfire/#alternative-observability-backends). 1. **Fully Type-safe**: Designed to give your IDE or AI coding agent as much context as possible for auto-completion and [type checking](https://ai.pydantic.dev/agent/#static-type-checking), moving entire classes of errors from runtime to write-time for a bit of that Rust "if it compiles, it works" feel. 1. **Powerful Evals**: Enables you to systematically test and [evaluate](https://ai.pydantic.dev/evals/index.md) the performance and accuracy of the agentic systems you build, and monitor the performance over time in Pydantic Logfire. 1. **MCP, A2A, and UI**: Integrates the [Model Context Protocol](https://ai.pydantic.dev/mcp/overview/index.md), [Agent2Agent](https://ai.pydantic.dev/a2a/index.md), and various [UI event stream](https://ai.pydantic.dev/ui/overview/index.md) standards to give your agent access to external tools and data, let it interoperate with other agents, and build interactive applications with streaming event-based communication. 1. 
**Human-in-the-Loop Tool Approval**: Easily lets you flag that certain tool calls [require approval](https://ai.pydantic.dev/deferred-tools/#human-in-the-loop-tool-approval) before they can proceed, possibly depending on tool call arguments, conversation history, or user preferences. 1. **Durable Execution**: Enables you to build [durable agents](https://ai.pydantic.dev/durable_execution/overview/index.md) that can preserve their progress across transient API failures and application errors or restarts, and handle long-running, asynchronous, and human-in-the-loop workflows with production-grade reliability. 1. **Streamed Outputs**: Provides the ability to [stream](https://ai.pydantic.dev/output/#streamed-results) structured output continuously, with immediate validation, ensuring real time access to generated data. 1. **Graph Support**: Provides a powerful way to define [graphs](https://ai.pydantic.dev/graph/index.md) using type hints, for use in complex applications where standard control flow can degrade to spaghetti code. Realistically though, no list is going to be as convincing as [giving it a try](#next-steps) and seeing how it makes you feel! **Sign up for our newsletter, *The Pydantic Stack*, with updates & tutorials on Pydantic AI, Logfire, and Pydantic:** Subscribe ## Hello World Example Here's a minimal example of Pydantic AI: [Learn about Gateway](https://ai.pydantic.dev/gateway) hello_world.py ```python from pydantic_ai import Agent agent = Agent( # (1)! 'gateway/anthropic:claude-sonnet-4-5', instructions='Be concise, reply with one sentence.', # (2)! ) result = agent.run_sync('Where does "hello world" come from?') # (3)! print(result.output) """ The first known use of "hello, world" was in a 1974 textbook about the C programming language. """ ``` 1. We configure the agent to use [Anthropic's Claude Sonnet 4.5](https://ai.pydantic.dev/api/models/anthropic/index.md) model, but you can also set the model when running the agent. 1. Register static [instructions](https://ai.pydantic.dev/agent/#instructions) using a keyword argument to the agent. 1. [Run the agent](https://ai.pydantic.dev/agent/#running-agents) synchronously, starting a conversation with the LLM. hello_world.py ```python from pydantic_ai import Agent agent = Agent( # (1)! 'anthropic:claude-sonnet-4-5', instructions='Be concise, reply with one sentence.', # (2)! ) result = agent.run_sync('Where does "hello world" come from?') # (3)! print(result.output) """ The first known use of "hello, world" was in a 1974 textbook about the C programming language. """ ``` 1. We configure the agent to use [Anthropic's Claude Sonnet 4.5](https://ai.pydantic.dev/api/models/anthropic/index.md) model, but you can also set the model when running the agent. 1. Register static [instructions](https://ai.pydantic.dev/agent/#instructions) using a keyword argument to the agent. 1. [Run the agent](https://ai.pydantic.dev/agent/#running-agents) synchronously, starting a conversation with the LLM. *(This example is complete, it can be run "as is", assuming you've [installed the `pydantic_ai` package](https://ai.pydantic.dev/install/index.md))* The exchange will be very short: Pydantic AI will send the instructions and the user prompt to the LLM, and the model will return a text response. 
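As the first annotation notes, the model can also be chosen at run time rather than when the agent is constructed. A minimal sketch of that variant, reusing the model name from the example above (illustrative, not part of the original example):

```python
from pydantic_ai import Agent

# No model is set at construction time; it's supplied per run instead.
agent = Agent(instructions='Be concise, reply with one sentence.')

result = agent.run_sync(
    'Where does "hello world" come from?',
    model='anthropic:claude-sonnet-4-5',  # model chosen at run time
)
print(result.output)
```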
Not very interesting yet, but we can easily add [tools](https://ai.pydantic.dev/tools/index.md), [dynamic instructions](https://ai.pydantic.dev/agent/#instructions), and [structured outputs](https://ai.pydantic.dev/output/index.md) to build more powerful agents. ## Tools & Dependency Injection Example Here is a concise example using Pydantic AI to build a support agent for a bank: [Learn about Gateway](https://ai.pydantic.dev/gateway) bank_support.py ```python from dataclasses import dataclass from pydantic import BaseModel, Field from pydantic_ai import Agent, RunContext from bank_database import DatabaseConn @dataclass class SupportDependencies: # (3)! customer_id: int db: DatabaseConn # (12)! class SupportOutput(BaseModel): # (13)! support_advice: str = Field(description='Advice returned to the customer') block_card: bool = Field(description="Whether to block the customer's card") risk: int = Field(description='Risk level of query', ge=0, le=10) support_agent = Agent( # (1)! 'gateway/openai:gpt-5.2', # (2)! deps_type=SupportDependencies, output_type=SupportOutput, # (9)! instructions=( # (4)! 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query.' ), ) @support_agent.instructions # (5)! async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str: customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id) return f"The customer's name is {customer_name!r}" @support_agent.tool # (6)! async def customer_balance( ctx: RunContext[SupportDependencies], include_pending: bool ) -> float: """Returns the customer's current account balance.""" # (7)! return await ctx.deps.db.customer_balance( id=ctx.deps.customer_id, include_pending=include_pending, ) ... # (11)! async def main(): deps = SupportDependencies(customer_id=123, db=DatabaseConn()) result = await support_agent.run('What is my balance?', deps=deps) # (8)! print(result.output) # (10)! """ support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1 """ result = await support_agent.run('I just lost my card!', deps=deps) print(result.output) """ support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8 """ ``` 1. This [agent](https://ai.pydantic.dev/agent/index.md) will act as first-tier support in a bank. Agents are generic in the type of dependencies they accept and the type of output they return. In this case, the support agent has type `Agent[SupportDependencies, SupportOutput]`. 1. Here we configure the agent to use [OpenAI's GPT-5 model](https://ai.pydantic.dev/api/models/openai/index.md), you can also set the model when running the agent. 1. The `SupportDependencies` dataclass is used to pass data, connections, and logic into the model that will be needed when running [instructions](https://ai.pydantic.dev/agent/#instructions) and [tool](https://ai.pydantic.dev/tools/index.md) functions. Pydantic AI's system of dependency injection provides a [type-safe](https://ai.pydantic.dev/agent/#static-type-checking) way to customise the behavior of your agents, and can be especially useful when running [unit tests](https://ai.pydantic.dev/testing/index.md) and evals. 1. Static [instructions](https://ai.pydantic.dev/agent/#instructions) can be registered with the instructions keyword argument to the agent. 1. 
Dynamic [instructions](https://ai.pydantic.dev/agent/#instructions) can be registered with the @agent.instructions decorator, and can make use of dependency injection. Dependencies are carried via the RunContext argument, which is parameterized with the `deps_type` from above. If the type annotation here is wrong, static type checkers will catch it. 1. The [`@agent.tool`](https://ai.pydantic.dev/tools/index.md) decorator lets you register functions which the LLM may call while responding to a user. Again, dependencies are carried via RunContext; any other arguments become the tool schema passed to the LLM. Pydantic is used to validate these arguments, and errors are passed back to the LLM so it can retry. 1. The docstring of a tool is also passed to the LLM as the description of the tool. Parameter descriptions are [extracted](https://ai.pydantic.dev/tools/#function-tools-and-schema) from the docstring and added to the parameter schema sent to the LLM. 1. [Run the agent](https://ai.pydantic.dev/agent/#running-agents) asynchronously, conducting a conversation with the LLM until a final response is reached. Even in this fairly simple case, the agent will exchange multiple messages with the LLM as tools are called to retrieve an output. 1. The response from the agent will be guaranteed to be a `SupportOutput`. If validation fails, the agent is prompted to try again via [reflection](https://ai.pydantic.dev/agent/#reflection-and-self-correction). 1. The output will be validated with Pydantic to guarantee it is a `SupportOutput`; since the agent is generic, it'll also be typed as a `SupportOutput` to aid with static type checking. 1. In a real use case, you'd add more tools and longer instructions to the agent to extend the context it's equipped with and the support it can provide. 1. This is a simple sketch of a database connection, used to keep the example short and readable. In reality, you'd be connecting to an external database (e.g. PostgreSQL) to get information about customers. 1. This [Pydantic](https://docs.pydantic.dev) model is used to constrain the structured data returned by the agent. From this simple definition, Pydantic builds the JSON Schema that tells the LLM how to return the data, and performs validation to guarantee the data is correct at the end of the run. bank_support.py ```python from dataclasses import dataclass from pydantic import BaseModel, Field from pydantic_ai import Agent, RunContext from bank_database import DatabaseConn @dataclass class SupportDependencies: # (3)! customer_id: int db: DatabaseConn # (12)! class SupportOutput(BaseModel): # (13)! support_advice: str = Field(description='Advice returned to the customer') block_card: bool = Field(description="Whether to block the customer's card") risk: int = Field(description='Risk level of query', ge=0, le=10) support_agent = Agent( # (1)! 'openai:gpt-5.2', # (2)! deps_type=SupportDependencies, output_type=SupportOutput, # (9)! instructions=( # (4)! 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query.' ), ) @support_agent.instructions # (5)! async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str: customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id) return f"The customer's name is {customer_name!r}" @support_agent.tool # (6)! async def customer_balance( ctx: RunContext[SupportDependencies], include_pending: bool ) -> float: """Returns the customer's current account balance.""" # (7)!
return await ctx.deps.db.customer_balance( id=ctx.deps.customer_id, include_pending=include_pending, ) ... # (11)! async def main(): deps = SupportDependencies(customer_id=123, db=DatabaseConn()) result = await support_agent.run('What is my balance?', deps=deps) # (8)! print(result.output) # (10)! """ support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1 """ result = await support_agent.run('I just lost my card!', deps=deps) print(result.output) """ support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8 """ ``` 1. This [agent](https://ai.pydantic.dev/agent/index.md) will act as first-tier support in a bank. Agents are generic in the type of dependencies they accept and the type of output they return. In this case, the support agent has type `Agent[SupportDependencies, SupportOutput]`. 1. Here we configure the agent to use [OpenAI's GPT-5 model](https://ai.pydantic.dev/api/models/openai/index.md), you can also set the model when running the agent. 1. The `SupportDependencies` dataclass is used to pass data, connections, and logic into the model that will be needed when running [instructions](https://ai.pydantic.dev/agent/#instructions) and [tool](https://ai.pydantic.dev/tools/index.md) functions. Pydantic AI's system of dependency injection provides a [type-safe](https://ai.pydantic.dev/agent/#static-type-checking) way to customise the behavior of your agents, and can be especially useful when running [unit tests](https://ai.pydantic.dev/testing/index.md) and evals. 1. Static [instructions](https://ai.pydantic.dev/agent/#instructions) can be registered with the instructions keyword argument to the agent. 1. Dynamic [instructions](https://ai.pydantic.dev/agent/#instructions) can be registered with the @agent.instructions decorator, and can make use of dependency injection. Dependencies are carried via the RunContext argument, which is parameterized with the `deps_type` from above. If the type annotation here is wrong, static type checkers will catch it. 1. The [`@agent.tool`](https://ai.pydantic.dev/tools/index.md) decorator lets you register functions which the LLM may call while responding to a user. Again, dependencies are carried via RunContext; any other arguments become the tool schema passed to the LLM. Pydantic is used to validate these arguments, and errors are passed back to the LLM so it can retry. 1. The docstring of a tool is also passed to the LLM as the description of the tool. Parameter descriptions are [extracted](https://ai.pydantic.dev/tools/#function-tools-and-schema) from the docstring and added to the parameter schema sent to the LLM. 1. [Run the agent](https://ai.pydantic.dev/agent/#running-agents) asynchronously, conducting a conversation with the LLM until a final response is reached. Even in this fairly simple case, the agent will exchange multiple messages with the LLM as tools are called to retrieve an output. 1. The response from the agent will be guaranteed to be a `SupportOutput`. If validation fails, the agent is prompted to try again via [reflection](https://ai.pydantic.dev/agent/#reflection-and-self-correction). 1. The output will be validated with Pydantic to guarantee it is a `SupportOutput`; since the agent is generic, it'll also be typed as a `SupportOutput` to aid with static type checking. 1.
In a real use case, you'd add more tools and longer instructions to the agent to extend the context it's equipped with and the support it can provide. 1. This is a simple sketch of a database connection, used to keep the example short and readable. In reality, you'd be connecting to an external database (e.g. PostgreSQL) to get information about customers. 1. This [Pydantic](https://docs.pydantic.dev) model is used to constrain the structured data returned by the agent. From this simple definition, Pydantic builds the JSON Schema that tells the LLM how to return the data, and performs validation to guarantee the data is correct at the end of the run. Complete `bank_support.py` example The code included here is incomplete for the sake of brevity (the definition of `DatabaseConn` is missing); you can find the complete `bank_support.py` example [here](https://ai.pydantic.dev/examples/bank-support/index.md). ## Instrumentation with Pydantic Logfire Even a simple agent with just a handful of tools can result in a lot of back-and-forth with the LLM, making it nearly impossible to be confident of what's going on just from reading the code. To understand the flow of the above runs, we can watch the agent in action using Pydantic Logfire. To do this, we need to [set up Logfire](https://ai.pydantic.dev/logfire/#using-logfire), and add the following to our code: [Learn about Gateway](https://ai.pydantic.dev/gateway) bank_support_with_logfire.py ```python ... from pydantic_ai import Agent, RunContext from bank_database import DatabaseConn import logfire logfire.configure() # (1)! logfire.instrument_pydantic_ai() # (2)! logfire.instrument_sqlite3() # (3)! ... support_agent = Agent( 'gateway/openai:gpt-5.2', deps_type=SupportDependencies, output_type=SupportOutput, instructions=( 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query.' ), ) ``` 1. Configure the Logfire SDK; this will fail if a project is not set up. 1. This will instrument all Pydantic AI agents used from here on out. If you want to instrument only a specific agent, you can pass the instrument=True keyword argument to the agent. 1. In our demo, `DatabaseConn` uses sqlite3, so [`logfire.instrument_sqlite3()`](https://logfire.pydantic.dev/docs/integrations/databases/sqlite3/) is used to log the database queries. bank_support_with_logfire.py ```python ... from pydantic_ai import Agent, RunContext from bank_database import DatabaseConn import logfire logfire.configure() # (1)! logfire.instrument_pydantic_ai() # (2)! logfire.instrument_sqlite3() # (3)! ... support_agent = Agent( 'openai:gpt-5.2', deps_type=SupportDependencies, output_type=SupportOutput, instructions=( 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query.' ), ) ``` 1. Configure the Logfire SDK; this will fail if a project is not set up. 1. This will instrument all Pydantic AI agents used from here on out. If you want to instrument only a specific agent, you can pass the instrument=True keyword argument to the agent. 1. In our demo, `DatabaseConn` uses sqlite3, so [`logfire.instrument_sqlite3()`](https://logfire.pydantic.dev/docs/integrations/databases/sqlite3/) is used to log the database queries.
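As annotation 2 above notes, instrumentation can also be scoped to a single agent instead of being enabled globally. A minimal sketch of that variant, assuming the same Logfire setup as above:

```python
import logfire

from pydantic_ai import Agent

logfire.configure()

# Instrument only this agent, instead of calling logfire.instrument_pydantic_ai() globally.
support_agent = Agent(
    'openai:gpt-5.2',
    instrument=True,
)
```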
That's enough to get the following view of your agent in action: Logfire instrumentation for the bank agent — [View in Logfire](https://logfire-eu.pydantic.dev/public-trace/a2957caa-b7b7-4883-a529-777742649004?spanId=31aade41ab896144) See [Monitoring and Performance](https://ai.pydantic.dev/logfire/index.md) to learn more. ## `llms.txt` The Pydantic AI documentation is available in the [llms.txt](https://llmstxt.org/) format. This Markdown-based format is well suited to LLMs, AI coding assistants, and agents. Two formats are available: - [`llms.txt`](https://ai.pydantic.dev/llms.txt): a file containing a brief description of the project, along with links to the different sections of the documentation. The structure of this file is described in detail [here](https://llmstxt.org/#format). - [`llms-full.txt`](https://ai.pydantic.dev/llms-full.txt): Similar to the `llms.txt` file, but with the content of every linked page included. Note that this file may be too large for some LLMs. As of today, these files are not automatically leveraged by IDEs or coding agents, but they will use them if you provide a link or the full text. ## Next Steps To try Pydantic AI for yourself, [install it](https://ai.pydantic.dev/install/index.md) and follow the instructions [in the examples](https://ai.pydantic.dev/examples/setup/index.md). Read the [docs](https://ai.pydantic.dev/agent/index.md) to learn more about building applications with Pydantic AI. Read the [API Reference](https://ai.pydantic.dev/api/agent/index.md) to understand Pydantic AI's interface. Join [:simple-slack: Slack](https://logfire.pydantic.dev/docs/join-slack/) or file an issue on [GitHub](https://github.com/pydantic/pydantic-ai/issues) if you have any questions. # Installation Pydantic AI is available on PyPI as [`pydantic-ai`](https://pypi.org/project/pydantic-ai/), so installation is as simple as: ```bash pip install pydantic-ai ``` ```bash uv add pydantic-ai ``` (Requires Python 3.10+) This installs the `pydantic_ai` package, core dependencies, and libraries required to use all the models included in Pydantic AI. If you want to install only those dependencies required to use a specific model, you can install the ["slim"](#slim-install) version of Pydantic AI. ## Use with Pydantic Logfire Pydantic AI has an excellent (but completely optional) integration with [Pydantic Logfire](https://pydantic.dev/logfire) to help you view and understand agent runs. Logfire comes included with `pydantic-ai` (but not the ["slim" version](#slim-install)), so you can typically start using it immediately by following the [Logfire setup docs](https://ai.pydantic.dev/logfire/#using-logfire). ## Running Examples We distribute the [`pydantic_ai_examples`](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples) directory as a separate PyPI package ([`pydantic-ai-examples`](https://pypi.org/project/pydantic-ai-examples/)) to make examples extremely easy to customize and run. To install examples, use the `examples` optional group: ```bash pip install "pydantic-ai[examples]" ``` ```bash uv add "pydantic-ai[examples]" ``` To run the examples, follow the instructions in the [examples docs](https://ai.pydantic.dev/examples/setup/index.md). ## Slim Install If you know which model you're going to use and want to avoid installing superfluous packages, you can use the [`pydantic-ai-slim`](https://pypi.org/project/pydantic-ai-slim/) package.
For example, if you're using just OpenAIChatModel, you would run: ```bash pip install "pydantic-ai-slim[openai]" ``` ```bash uv add "pydantic-ai-slim[openai]" ``` `pydantic-ai-slim` has the following optional groups: - `logfire` — installs [Pydantic Logfire](https://ai.pydantic.dev/logfire/index.md) dependency `logfire` [PyPI ↗](https://pypi.org/project/logfire) - `evals` — installs [Pydantic Evals](https://ai.pydantic.dev/evals/index.md) dependency `pydantic-evals` [PyPI ↗](https://pypi.org/project/pydantic-evals) - `openai` — installs [OpenAI Model](https://ai.pydantic.dev/models/openai/index.md) dependency `openai` [PyPI ↗](https://pypi.org/project/openai) - `vertexai` — installs `GoogleVertexProvider` dependencies `google-auth` [PyPI ↗](https://pypi.org/project/google-auth) and `requests` [PyPI ↗](https://pypi.org/project/requests) - `google` — installs [Google Model](https://ai.pydantic.dev/models/google/index.md) dependency `google-genai` [PyPI ↗](https://pypi.org/project/google-genai) - `anthropic` — installs [Anthropic Model](https://ai.pydantic.dev/models/anthropic/index.md) dependency `anthropic` [PyPI ↗](https://pypi.org/project/anthropic) - `groq` — installs [Groq Model](https://ai.pydantic.dev/models/groq/index.md) dependency `groq` [PyPI ↗](https://pypi.org/project/groq) - `mistral` — installs [Mistral Model](https://ai.pydantic.dev/models/mistral/index.md) dependency `mistralai` [PyPI ↗](https://pypi.org/project/mistralai) - `cohere` - installs [Cohere Model](https://ai.pydantic.dev/models/cohere/index.md) dependency `cohere` [PyPI ↗](https://pypi.org/project/cohere) - `bedrock` - installs [Bedrock Model](https://ai.pydantic.dev/models/bedrock/index.md) dependency `boto3` [PyPI ↗](https://pypi.org/project/boto3) - `huggingface` - installs [Hugging Face Model](https://ai.pydantic.dev/models/huggingface/index.md) dependency `huggingface-hub[inference]` [PyPI ↗](https://pypi.org/project/huggingface-hub) - `outlines-transformers` - installs [Outlines Model](https://ai.pydantic.dev/models/outlines/index.md) dependency `outlines[transformers]` [PyPI ↗](https://pypi.org/project/outlines) - `outlines-llamacpp` - installs [Outlines Model](https://ai.pydantic.dev/models/outlines/index.md) dependency `outlines[llamacpp]` [PyPI ↗](https://pypi.org/project/outlines) - `outlines-mlxlm` - installs [Outlines Model](https://ai.pydantic.dev/models/outlines/index.md) dependency `outlines[mlxlm]` [PyPI ↗](https://pypi.org/project/outlines) - `outlines-sglang` - installs [Outlines Model](https://ai.pydantic.dev/models/outlines/index.md) dependency `outlines[sglang]` [PyPI ↗](https://pypi.org/project/outlines) - `outlines-vllm-offline` - installs [Outlines Model](https://ai.pydantic.dev/models/outlines/index.md) dependencies `outlines` [PyPI ↗](https://pypi.org/project/outlines) and `vllm` [PyPI ↗](https://pypi.org/project/vllm) - `duckduckgo` - installs [DuckDuckGo Search Tool](https://ai.pydantic.dev/common-tools/#duckduckgo-search-tool) dependency `ddgs` [PyPI ↗](https://pypi.org/project/ddgs) - `tavily` - installs [Tavily Search Tool](https://ai.pydantic.dev/common-tools/#tavily-search-tool) dependency `tavily-python` [PyPI ↗](https://pypi.org/project/tavily-python) - `exa` - installs [Exa Search Tool](https://ai.pydantic.dev/common-tools/#exa-search-tool) dependency `exa-py` [PyPI ↗](https://pypi.org/project/exa-py) - `cli` - installs [CLI](https://ai.pydantic.dev/cli/index.md) dependencies `rich` [PyPI ↗](https://pypi.org/project/rich), `prompt-toolkit` [PyPI 
↗](https://pypi.org/project/prompt-toolkit), and `argcomplete` [PyPI ↗](https://pypi.org/project/argcomplete) - `mcp` - installs [MCP](https://ai.pydantic.dev/mcp/client/index.md) dependency `mcp` [PyPI ↗](https://pypi.org/project/mcp) - `fastmcp` - installs [FastMCP](https://ai.pydantic.dev/mcp/fastmcp-client/index.md) dependency `fastmcp` [PyPI ↗](https://pypi.org/project/fastmcp) - `a2a` - installs [A2A](https://ai.pydantic.dev/a2a/index.md) dependency `fasta2a` [PyPI ↗](https://pypi.org/project/fasta2a) - `ui` - installs [UI Event Streams](https://ai.pydantic.dev/ui/overview/index.md) dependency `starlette` [PyPI ↗](https://pypi.org/project/starlette) - `ag-ui` - installs [AG-UI Event Stream Protocol](https://ai.pydantic.dev/ui/ag-ui/index.md) dependencies `ag-ui-protocol` [PyPI ↗](https://pypi.org/project/ag-ui-protocol) and `starlette` [PyPI ↗](https://pypi.org/project/starlette) - `dbos` - installs [DBOS Durable Execution](https://ai.pydantic.dev/durable_execution/dbos/index.md) dependency `dbos` [PyPI ↗](https://pypi.org/project/dbos) - `prefect` - installs [Prefect Durable Execution](https://ai.pydantic.dev/durable_execution/prefect/index.md) dependency `prefect` [PyPI ↗](https://pypi.org/project/prefect) You can also install dependencies for multiple models and use cases, for example: ```bash pip install "pydantic-ai-slim[openai,google,logfire]" ``` ```bash uv add "pydantic-ai-slim[openai,google,logfire]" ``` # Getting Help If you need help getting started with Pydantic AI or with advanced usage, the following sources may be useful. ## :simple-slack: Slack Join the `#pydantic-ai` channel in the [Pydantic Slack](https://logfire.pydantic.dev/docs/join-slack/) to ask questions, get help, and chat about Pydantic AI. There's also channels for Pydantic, Logfire, and FastUI. If you're on a [Logfire](https://pydantic.dev/logfire) Pro plan, you can also get a dedicated private slack collab channel with us. ## GitHub Issues The [Pydantic AI GitHub Issues](https://github.com/pydantic/pydantic-ai/issues) are a great place to ask questions and give us feedback. # Troubleshooting Below are suggestions on how to fix some common errors you might encounter while using Pydantic AI. If the issue you're experiencing is not listed below or addressed in the documentation, please feel free to ask in the [Pydantic Slack](https://ai.pydantic.dev/help/index.md) or create an issue on [GitHub](https://github.com/pydantic/pydantic-ai/issues). ## Jupyter Notebook Errors ### `RuntimeError: This event loop is already running` **Modern Jupyter/IPython (7.0+)**: This environment supports top-level `await` natively. 
You can use Agent.run() directly in notebook cells without additional setup: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2') result = await agent.run('Who let the dogs out?') ``` ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result = await agent.run('Who let the dogs out?') ``` **Legacy environments or specific integrations**: If you encounter event loop conflicts, use [`nest-asyncio`](https://pypi.org/project/nest-asyncio/): [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python import nest_asyncio from pydantic_ai import Agent nest_asyncio.apply() agent = Agent('gateway/openai:gpt-5.2') result = agent.run_sync('Who let the dogs out?') ``` ```python import nest_asyncio from pydantic_ai import Agent nest_asyncio.apply() agent = Agent('openai:gpt-5.2') result = agent.run_sync('Who let the dogs out?') ``` **Note**: This also applies to Google Colab and [Marimo](https://github.com/marimo-team/marimo) environments. ## API Key Configuration ### `UserError: API key must be provided or set in the [MODEL]_API_KEY environment variable` If you're running into issues with setting the API key for your model, visit the [Models](https://ai.pydantic.dev/models/overview/index.md) page to learn more about how to set an environment variable and/or pass in an `api_key` argument. ## Monitoring HTTPX Requests You can use custom `httpx` clients in your models to access specific requests, responses, and headers at runtime. It's particularly helpful to use `logfire`'s [HTTPX integration](https://ai.pydantic.dev/logfire/#monitoring-http-requests) to monitor these requests. # Concepts documentation # Agent2Agent (A2A) Protocol The [Agent2Agent (A2A) Protocol](https://google.github.io/A2A/) is an open standard introduced by Google that enables communication and interoperability between AI agents, regardless of the framework or vendor they are built on. At Pydantic, we built the [FastA2A](#fasta2a) library to make it easier to implement the A2A protocol in Python. We also built a convenience method that exposes Pydantic AI agents as A2A servers - let's have a quick look at how to use it: [Learn about Gateway](https://ai.pydantic.dev/gateway) agent_to_a2a.py ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2', instructions='Be fun!') app = agent.to_a2a() ``` agent_to_a2a.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='Be fun!') app = agent.to_a2a() ``` *You can run the example with `uvicorn agent_to_a2a:app --host 0.0.0.0 --port 8000`* This will expose the agent as an A2A server, and you can start sending requests to it. See more about [exposing Pydantic AI agents as A2A servers](#pydantic-ai-agent-to-a2a-server). ## FastA2A **FastA2A** is a framework-agnostic implementation of the A2A protocol in Python. The library is designed to be used with any agentic framework, and is **not exclusive to Pydantic AI**. ### Design **FastA2A** is built on top of [Starlette](https://www.starlette.io), which means it's fully compatible with any ASGI server.
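Since FastA2A apps (including those returned by `to_a2a()`) are standard ASGI applications, they can also be served programmatically rather than via the `uvicorn` command line shown earlier. A minimal sketch (uvicorn is just one example; any ASGI server works):

```python
import uvicorn

from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2', instructions='Be fun!')
app = agent.to_a2a()  # ASGI app implementing the A2A protocol

if __name__ == '__main__':
    # Serve the A2A server with an ASGI server of your choice.
    uvicorn.run(app, host='0.0.0.0', port=8000)
```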
Given the nature of the A2A protocol, it's important to understand the design before using it; as a developer, you'll need to provide some components: - Storage: to save and load tasks, as well as store context for conversations - Broker: to schedule tasks - Worker: to execute tasks Let's have a look at how those components fit together: ``` flowchart TB Server["HTTP Server"] <--> |Sends Requests/<br/>Receives Results| TM subgraph CC[Core Components] direction RL TM["TaskManager<br/>(coordinates)"] --> |Schedules Tasks| Broker TM <--> Storage Broker["Broker<br/>(queues & schedules)"] <--> Storage["Storage<br/>(persistence)"] Broker --> |Delegates Execution| Worker end Worker["Worker<br/>
(implementation)"] ``` FastA2A allows you to bring your own Storage, Broker and Worker. #### Understanding Tasks and Context In the A2A protocol: - **Task**: Represents one complete execution of an agent. When a client sends a message to the agent, a new task is created. The agent runs until completion (or failure), and this entire execution is considered one task. The final output is stored as a task artifact. - **Context**: Represents a conversation thread that can span multiple tasks. The A2A protocol uses a `context_id` to maintain conversation continuity: - When a new message is sent without a `context_id`, the server generates a new one - Subsequent messages can include the same `context_id` to continue the conversation - All tasks sharing the same `context_id` have access to the complete message history #### Storage Architecture The Storage component serves two purposes: 1. **Task Storage**: Stores tasks in A2A protocol format, including their status, artifacts, and message history 1. **Context Storage**: Stores conversation context in a format optimized for the specific agent implementation This design allows for agents to store rich internal state (e.g., tool calls, reasoning traces) as well as store task-specific A2A-formatted messages and artifacts. For example, a Pydantic AI agent might store its complete internal message format (including tool calls and responses) in the context storage, while storing only the A2A-compliant messages in the task history. ### Installation FastA2A is available on PyPI as [`fasta2a`](https://pypi.org/project/fasta2a/) so installation is as simple as: ```bash pip install fasta2a ``` ```bash uv add fasta2a ``` The only dependencies are: - [starlette](https://www.starlette.io): to expose the A2A server as an [ASGI application](https://asgi.readthedocs.io/en/latest/) - [pydantic](https://pydantic.dev): to validate the request/response messages - [opentelemetry-api](https://opentelemetry-python.readthedocs.io/en/latest): to provide tracing capabilities You can install Pydantic AI with the `a2a` extra to include **FastA2A**: ```bash pip install 'pydantic-ai-slim[a2a]' ``` ```bash uv add 'pydantic-ai-slim[a2a]' ``` ### Pydantic AI Agent to A2A Server To expose a Pydantic AI agent as an A2A server, you can use the `to_a2a` method: [Learn about Gateway](https://ai.pydantic.dev/gateway) agent_to_a2a.py ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2', instructions='Be fun!') app = agent.to_a2a() ``` agent_to_a2a.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='Be fun!') app = agent.to_a2a() ``` Since `app` is an ASGI application, it can be used with any ASGI server. ```bash uvicorn agent_to_a2a:app --host 0.0.0.0 --port 8000 ``` Since the goal of `to_a2a` is to be a convenience method, it accepts the same arguments as the FastA2A constructor. When using `to_a2a()`, Pydantic AI automatically: - Stores the complete conversation history (including tool calls and responses) in the context storage - Ensures that subsequent messages with the same `context_id` have access to the full conversation history - Persists agent results as A2A artifacts: - String results become `TextPart` artifacts and also appear in the message history - Structured data (Pydantic models, dataclasses, tuples, etc.) 
become `DataPart` artifacts with the data wrapped as `{"result": }` - Artifacts include metadata with type information and JSON schema when available # Built-in Tools Built-in tools are native tools provided by LLM providers that can be used to enhance your agent's capabilities. Unlike [common tools](https://ai.pydantic.dev/common-tools/index.md), which are custom implementations that Pydantic AI executes, built-in tools are executed directly by the model provider. ## Overview Pydantic AI supports the following built-in tools: - **WebSearchTool**: Allows agents to search the web - **CodeExecutionTool**: Enables agents to execute code in a secure environment - **ImageGenerationTool**: Enables agents to generate images - **WebFetchTool**: Enables agents to fetch web pages - **MemoryTool**: Enables agents to use memory - **MCPServerTool**: Enables agents to use remote MCP servers with communication handled by the model provider - **FileSearchTool**: Enables agents to search through uploaded files using vector search (RAG) These tools are passed to the agent via the `builtin_tools` parameter and are executed by the model provider's infrastructure. Provider Support Not all model providers support built-in tools. If you use a built-in tool with an unsupported provider, Pydantic AI will raise a UserError when you try to run the agent. If a provider supports a built-in tool that is not currently supported by Pydantic AI, please file an issue. ## Dynamic Configuration Sometimes you need to configure a built-in tool dynamically based on the run context (e.g., user dependencies), or conditionally omit it. You can achieve this by passing a function to `builtin_tools` that takes RunContext as an argument and returns an AbstractBuiltinTool or `None`. This is particularly useful for tools like WebSearchTool where you might want to set the user's location based on the current request, or disable the tool if the user provides no location. [Learn about Gateway](https://ai.pydantic.dev/gateway) dynamic_builtin_tool.py ```python from pydantic_ai import Agent, RunContext, WebSearchTool async def prepared_web_search(ctx: RunContext[dict]) -> WebSearchTool | None: if not ctx.deps.get('location'): return None return WebSearchTool( user_location={'city': ctx.deps['location']}, ) agent = Agent( 'gateway/openai-responses:gpt-5.2', builtin_tools=[prepared_web_search], deps_type=dict, ) # Run with location result = agent.run_sync( 'What is the weather like?', deps={'location': 'London'}, ) print(result.output) #> It's currently raining in London. # Run without location (tool will be omitted) result = agent.run_sync( 'What is the capital of France?', deps={'location': None}, ) print(result.output) #> The capital of France is Paris. ``` dynamic_builtin_tool.py ```python from pydantic_ai import Agent, RunContext, WebSearchTool async def prepared_web_search(ctx: RunContext[dict]) -> WebSearchTool | None: if not ctx.deps.get('location'): return None return WebSearchTool( user_location={'city': ctx.deps['location']}, ) agent = Agent( 'openai-responses:gpt-5.2', builtin_tools=[prepared_web_search], deps_type=dict, ) # Run with location result = agent.run_sync( 'What is the weather like?', deps={'location': 'London'}, ) print(result.output) #> It's currently raining in London. # Run without location (tool will be omitted) result = agent.run_sync( 'What is the capital of France?', deps={'location': None}, ) print(result.output) #> The capital of France is Paris. 
``` ## Web Search Tool The WebSearchTool allows your agent to search the web, making it ideal for queries that require up-to-date data. ### Provider Support | Provider | Supported | Notes | | ----------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | OpenAI Responses | ✅ | Full feature support. To include search results on the BuiltinToolReturnPart that's available via ModelResponse.builtin_tool_calls, enable the OpenAIResponsesModelSettings.openai_include_web_search_sources [model setting](https://ai.pydantic.dev/agent/#model-run-settings). | | Anthropic | ✅ | Full feature support | | Google | ✅ | No parameter support. No BuiltinToolCallPart or BuiltinToolReturnPart is generated when streaming. Using built-in tools and function tools (including [output tools](https://ai.pydantic.dev/output/#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](https://ai.pydantic.dev/output/#prompted-output) instead. | | xAI | ✅ | Supports `blocked_domains` and `allowed_domains` parameters. | | Groq | ✅ | Limited parameter support. To use web search capabilities with Groq, you need to use the [compound models](https://console.groq.com/docs/compound). | | OpenAI Chat Completions | ❌ | Not supported | | Bedrock | ❌ | Not supported | | Mistral | ❌ | Not supported | | Cohere | ❌ | Not supported | | HuggingFace | ❌ | Not supported | | Outlines | ❌ | Not supported | ### Usage [Learn about Gateway](https://ai.pydantic.dev/gateway) web_search_anthropic.py ```python from pydantic_ai import Agent, WebSearchTool agent = Agent('gateway/anthropic:claude-sonnet-4-5', builtin_tools=[WebSearchTool()]) result = agent.run_sync('Give me a sentence with the biggest news in AI this week.') print(result.output) #> Scientists have developed a universal AI detector that can identify deepfake videos. ``` web_search_anthropic.py ```python from pydantic_ai import Agent, WebSearchTool agent = Agent('anthropic:claude-sonnet-4-5', builtin_tools=[WebSearchTool()]) result = agent.run_sync('Give me a sentence with the biggest news in AI this week.') print(result.output) #> Scientists have developed a universal AI detector that can identify deepfake videos. ``` *(This example is complete, it can be run "as is")* With OpenAI, you must use their Responses API to access the web search tool. [Learn about Gateway](https://ai.pydantic.dev/gateway) web_search_openai.py ```python from pydantic_ai import Agent, WebSearchTool agent = Agent('gateway/openai-responses:gpt-5.2', builtin_tools=[WebSearchTool()]) result = agent.run_sync('Give me a sentence with the biggest news in AI this week.') print(result.output) #> Scientists have developed a universal AI detector that can identify deepfake videos. ``` web_search_openai.py ```python from pydantic_ai import Agent, WebSearchTool agent = Agent('openai-responses:gpt-5.2', builtin_tools=[WebSearchTool()]) result = agent.run_sync('Give me a sentence with the biggest news in AI this week.') print(result.output) #> Scientists have developed a universal AI detector that can identify deepfake videos. 
``` *(This example is complete, it can be run "as is")* ### Configuration Options The `WebSearchTool` supports several configuration parameters: [Learn about Gateway](https://ai.pydantic.dev/gateway) web_search_configured.py ```python from pydantic_ai import Agent, WebSearchTool, WebSearchUserLocation agent = Agent( 'gateway/anthropic:claude-sonnet-4-5', builtin_tools=[ WebSearchTool( search_context_size='high', user_location=WebSearchUserLocation( city='San Francisco', country='US', region='CA', timezone='America/Los_Angeles', ), blocked_domains=['example.com', 'spam-site.net'], allowed_domains=None, # Cannot use both blocked_domains and allowed_domains with Anthropic max_uses=5, # Anthropic only: limit tool usage ) ], ) result = agent.run_sync('Use the web to get the current time.') print(result.output) #> In San Francisco, it's 8:21:41 pm PDT on Wednesday, August 6, 2025. ``` web_search_configured.py ```python from pydantic_ai import Agent, WebSearchTool, WebSearchUserLocation agent = Agent( 'anthropic:claude-sonnet-4-5', builtin_tools=[ WebSearchTool( search_context_size='high', user_location=WebSearchUserLocation( city='San Francisco', country='US', region='CA', timezone='America/Los_Angeles', ), blocked_domains=['example.com', 'spam-site.net'], allowed_domains=None, # Cannot use both blocked_domains and allowed_domains with Anthropic max_uses=5, # Anthropic only: limit tool usage ) ], ) result = agent.run_sync('Use the web to get the current time.') print(result.output) #> In San Francisco, it's 8:21:41 pm PDT on Wednesday, August 6, 2025. ``` *(This example is complete, it can be run "as is")* #### Provider Support | Parameter | OpenAI | Anthropic | xAI | Groq | | --------------------- | ------ | --------- | --- | ---- | | `search_context_size` | ✅ | ❌ | ❌ | ❌ | | `user_location` | ✅ | ✅ | ❌ | ❌ | | `blocked_domains` | ❌ | ✅ | ✅ | ✅ | | `allowed_domains` | ✅ | ✅ | ✅ | ✅ | | `max_uses` | ❌ | ✅ | ❌ | ❌ | Anthropic Domain Filtering With Anthropic, you can only use either `blocked_domains` or `allowed_domains`, not both. ## Code Execution Tool The CodeExecutionTool enables your agent to execute code in a secure environment, making it perfect for computational tasks, data analysis, and mathematical operations. ### Provider Support | Provider | Supported | Notes | | ----------- | --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | OpenAI | ✅ | To include code execution output on the BuiltinToolReturnPart that's available via ModelResponse.builtin_tool_calls, enable the OpenAIResponsesModelSettings.openai_include_code_execution_outputs [model setting](https://ai.pydantic.dev/agent/#model-run-settings). If the code execution generated images, like charts, they will be available on ModelResponse.images as BinaryImage objects. The generated image can also be used as [image output](https://ai.pydantic.dev/output/#image-output) for the agent run. 
| | Google | ✅ | Using built-in tools and function tools (including [output tools](https://ai.pydantic.dev/output/#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](https://ai.pydantic.dev/output/#prompted-output) instead. | | Anthropic | ✅ | | | xAI | ✅ | Full feature support. | | Groq | ❌ | | | Bedrock | ✅ | Only available for Nova 2.0 models. | | Mistral | ❌ | | | Cohere | ❌ | | | HuggingFace | ❌ | | | Outlines | ❌ | | ### Usage [Learn about Gateway](https://ai.pydantic.dev/gateway) code_execution_basic.py ```python from pydantic_ai import Agent, CodeExecutionTool agent = Agent('gateway/anthropic:claude-sonnet-4-5', builtin_tools=[CodeExecutionTool()]) result = agent.run_sync('Calculate the factorial of 15.') print(result.output) #> The factorial of 15 is **1,307,674,368,000**. print(result.response.builtin_tool_calls) """ [ ( BuiltinToolCallPart( tool_name='code_execution', args={ 'code': 'import math\n\n# Calculate factorial of 15\nresult = math.factorial(15)\nprint(f"15! = {result}")\n\n# Let\'s also show it in a more readable format with commas\nprint(f"15! = {result:,}")' }, tool_call_id='srvtoolu_017qRH1J3XrhnpjP2XtzPCmJ', provider_name='anthropic', ), BuiltinToolReturnPart( tool_name='code_execution', content={ 'content': [], 'return_code': 0, 'stderr': '', 'stdout': '15! = 1307674368000\n15! = 1,307,674,368,000', 'type': 'code_execution_result', }, tool_call_id='srvtoolu_017qRH1J3XrhnpjP2XtzPCmJ', timestamp=datetime.datetime(...), provider_name='anthropic', ), ) ] """ ``` code_execution_basic.py ```python from pydantic_ai import Agent, CodeExecutionTool agent = Agent('anthropic:claude-sonnet-4-5', builtin_tools=[CodeExecutionTool()]) result = agent.run_sync('Calculate the factorial of 15.') print(result.output) #> The factorial of 15 is **1,307,674,368,000**. print(result.response.builtin_tool_calls) """ [ ( BuiltinToolCallPart( tool_name='code_execution', args={ 'code': 'import math\n\n# Calculate factorial of 15\nresult = math.factorial(15)\nprint(f"15! = {result}")\n\n# Let\'s also show it in a more readable format with commas\nprint(f"15! = {result:,}")' }, tool_call_id='srvtoolu_017qRH1J3XrhnpjP2XtzPCmJ', provider_name='anthropic', ), BuiltinToolReturnPart( tool_name='code_execution', content={ 'content': [], 'return_code': 0, 'stderr': '', 'stdout': '15! = 1307674368000\n15! = 1,307,674,368,000', 'type': 'code_execution_result', }, tool_call_id='srvtoolu_017qRH1J3XrhnpjP2XtzPCmJ', timestamp=datetime.datetime(...), provider_name='anthropic', ), ) ] """ ``` *(This example is complete, it can be run "as is")* In addition to text output, code execution with OpenAI can generate images as part of their response. Accessing this image via ModelResponse.images or [image output](https://ai.pydantic.dev/output/#image-output) requires the OpenAIResponsesModelSettings.openai_include_code_execution_outputs [model setting](https://ai.pydantic.dev/agent/#model-run-settings) to be enabled. 
[Learn about Gateway](https://ai.pydantic.dev/gateway) code_execution_openai.py ```python from pydantic_ai import Agent, BinaryImage, CodeExecutionTool from pydantic_ai.models.openai import OpenAIResponsesModelSettings agent = Agent( 'gateway/openai-responses:gpt-5.2', builtin_tools=[CodeExecutionTool()], output_type=BinaryImage, model_settings=OpenAIResponsesModelSettings(openai_include_code_execution_outputs=True), ) result = agent.run_sync('Generate a chart of y=x^2 for x=-5 to 5.') assert isinstance(result.output, BinaryImage) ``` code_execution_openai.py ```python from pydantic_ai import Agent, BinaryImage, CodeExecutionTool from pydantic_ai.models.openai import OpenAIResponsesModelSettings agent = Agent( 'openai-responses:gpt-5.2', builtin_tools=[CodeExecutionTool()], output_type=BinaryImage, model_settings=OpenAIResponsesModelSettings(openai_include_code_execution_outputs=True), ) result = agent.run_sync('Generate a chart of y=x^2 for x=-5 to 5.') assert isinstance(result.output, BinaryImage) ``` *(This example is complete, it can be run "as is")* ## Image Generation Tool The ImageGenerationTool enables your agent to generate images. ### Provider Support | Provider | Supported | Notes | | ---------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | OpenAI Responses | ✅ | Full feature support. Only supported by models newer than `gpt-5.2`. Metadata about the generated image, like the [`revised_prompt`](https://platform.openai.com/docs/guides/tools-image-generation#revised-prompt) sent to the underlying image model, is available on the BuiltinToolReturnPart that's available via ModelResponse.builtin_tool_calls. | | Google | ✅ | Limited parameter support. Only supported by [image generation models](https://ai.google.dev/gemini-api/docs/image-generation) like `gemini-3-pro-image-preview` and `gemini-3-pro-image-preview`. These models do not support [function tools](https://ai.pydantic.dev/tools/index.md) and will always have the option of generating images, even if this built-in tool is not explicitly specified. | | Anthropic | ❌ | | | xAI | ❌ | | | Groq | ❌ | | | Bedrock | ❌ | | | Mistral | ❌ | | | Cohere | ❌ | | | HuggingFace | ❌ | | ### Usage Generated images are available on ModelResponse.images as BinaryImage objects: [Learn about Gateway](https://ai.pydantic.dev/gateway) image_generation_openai.py ```python from pydantic_ai import Agent, BinaryImage, ImageGenerationTool agent = Agent('gateway/openai-responses:gpt-5.2', builtin_tools=[ImageGenerationTool()]) result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.') print(result.output) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! 
""" assert isinstance(result.response.images[0], BinaryImage) ``` image_generation_openai.py ```python from pydantic_ai import Agent, BinaryImage, ImageGenerationTool agent = Agent('openai-responses:gpt-5.2', builtin_tools=[ImageGenerationTool()]) result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.') print(result.output) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ assert isinstance(result.response.images[0], BinaryImage) ``` *(This example is complete, it can be run "as is")* Image generation with Google [image generation models](https://ai.google.dev/gemini-api/docs/image-generation) does not require the `ImageGenerationTool` built-in tool to be explicitly specified: [Learn about Gateway](https://ai.pydantic.dev/gateway) image_generation_google.py ```python from pydantic_ai import Agent, BinaryImage agent = Agent('gateway/gemini:gemini-3-pro-image-preview') result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.') print(result.output) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ assert isinstance(result.response.images[0], BinaryImage) ``` image_generation_google.py ```python from pydantic_ai import Agent, BinaryImage agent = Agent('google-gla:gemini-3-pro-image-preview') result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.') print(result.output) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ assert isinstance(result.response.images[0], BinaryImage) ``` *(This example is complete, it can be run "as is")* The `ImageGenerationTool` can be used together with `output_type=BinaryImage` to get [image output](https://ai.pydantic.dev/output/#image-output). 
If the `ImageGenerationTool` built-in tool is not explicitly specified, it will be enabled automatically: [Learn about Gateway](https://ai.pydantic.dev/gateway) image_generation_output.py ```python from pydantic_ai import Agent, BinaryImage agent = Agent('gateway/openai-responses:gpt-5.2', output_type=BinaryImage) result = agent.run_sync('Generate an image of an axolotl.') assert isinstance(result.output, BinaryImage) ``` image_generation_output.py ```python from pydantic_ai import Agent, BinaryImage agent = Agent('openai-responses:gpt-5.2', output_type=BinaryImage) result = agent.run_sync('Generate an image of an axolotl.') assert isinstance(result.output, BinaryImage) ``` *(This example is complete, it can be run "as is")* ### Configuration Options The `ImageGenerationTool` supports several configuration parameters: [Learn about Gateway](https://ai.pydantic.dev/gateway) image_generation_configured.py ```python from pydantic_ai import Agent, BinaryImage, ImageGenerationTool agent = Agent( 'gateway/openai-responses:gpt-5.2', builtin_tools=[ ImageGenerationTool( background='transparent', input_fidelity='high', moderation='low', output_compression=100, output_format='png', partial_images=3, quality='high', size='1024x1024', ) ], output_type=BinaryImage, ) result = agent.run_sync('Generate an image of an axolotl.') assert isinstance(result.output, BinaryImage) ``` image_generation_configured.py ```python from pydantic_ai import Agent, BinaryImage, ImageGenerationTool agent = Agent( 'openai-responses:gpt-5.2', builtin_tools=[ ImageGenerationTool( background='transparent', input_fidelity='high', moderation='low', output_compression=100, output_format='png', partial_images=3, quality='high', size='1024x1024', ) ], output_type=BinaryImage, ) result = agent.run_sync('Generate an image of an axolotl.') assert isinstance(result.output, BinaryImage) ``` *(This example is complete, it can be run "as is")* OpenAI Responses models also respect the `aspect_ratio` parameter. Because the OpenAI API only exposes discrete image sizes, Pydantic AI maps `'1:1'` -> `1024x1024`, `'2:3'` -> `1024x1536`, and `'3:2'` -> `1536x1024`. Providing any other aspect ratio results in an error, and if you also set `size` it must match the computed value. 
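A minimal sketch of that mapping in practice, based on the description above (illustrative only):

```python
from pydantic_ai import Agent, BinaryImage, ImageGenerationTool

agent = Agent(
    'openai-responses:gpt-5.2',
    # With OpenAI, '3:2' is mapped to the discrete size 1536x1024, as described above.
    builtin_tools=[ImageGenerationTool(aspect_ratio='3:2')],
    output_type=BinaryImage,
)
result = agent.run_sync('Generate a wide illustration of an axolotl.')
assert isinstance(result.output, BinaryImage)
```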
To control the aspect ratio when using Gemini image models, include the `ImageGenerationTool` explicitly: [Learn about Gateway](https://ai.pydantic.dev/gateway) image_generation_google_aspect_ratio.py ```python from pydantic_ai import Agent, BinaryImage, ImageGenerationTool agent = Agent( 'gateway/gemini:gemini-3-pro-image-preview', builtin_tools=[ImageGenerationTool(aspect_ratio='16:9')], output_type=BinaryImage, ) result = agent.run_sync('Generate a wide illustration of an axolotl city skyline.') assert isinstance(result.output, BinaryImage) ``` image_generation_google_aspect_ratio.py ```python from pydantic_ai import Agent, BinaryImage, ImageGenerationTool agent = Agent( 'google-gla:gemini-3-pro-image-preview', builtin_tools=[ImageGenerationTool(aspect_ratio='16:9')], output_type=BinaryImage, ) result = agent.run_sync('Generate a wide illustration of an axolotl city skyline.') assert isinstance(result.output, BinaryImage) ``` *(This example is complete, it can be run "as is")* To control the image resolution with Google image generation models (starting with Gemini 3 Pro Image), use the `size` parameter: [Learn about Gateway](https://ai.pydantic.dev/gateway) image_generation_google_resolution.py ```python from pydantic_ai import Agent, BinaryImage, ImageGenerationTool agent = Agent( 'gateway/gemini:gemini-3-pro-image-preview', builtin_tools=[ImageGenerationTool(aspect_ratio='16:9', size='4K')], output_type=BinaryImage, ) result = agent.run_sync('Generate a high-resolution wide landscape illustration of an axolotl.') assert isinstance(result.output, BinaryImage) ``` image_generation_google_resolution.py ```python from pydantic_ai import Agent, BinaryImage, ImageGenerationTool agent = Agent( 'google-gla:gemini-3-pro-image-preview', builtin_tools=[ImageGenerationTool(aspect_ratio='16:9', size='4K')], output_type=BinaryImage, ) result = agent.run_sync('Generate a high-resolution wide landscape illustration of an axolotl.') assert isinstance(result.output, BinaryImage) ``` *(This example is complete, it can be run "as is")* For more details, check the API documentation. #### Provider Support | Parameter | OpenAI | Google | | -------------------- | ---------------------------------------------------- | -------------------------------------------------------- | | `background` | ✅ | ❌ | | `input_fidelity` | ✅ | ❌ | | `moderation` | ✅ | ❌ | | `output_compression` | ✅ (100 (default), jpeg or webp only) | ✅ (75 (default), jpeg only, Vertex AI only) | | `output_format` | ✅ | ✅ (Vertex AI only) | | `partial_images` | ✅ | ❌ | | `quality` | ✅ | ❌ | | `size` | ✅ (auto (default), 1024x1024, 1024x1536, 1536x1024) | ✅ (1K (default), 2K, 4K) | | `aspect_ratio` | ✅ (1:1, 2:3, 3:2) | ✅ (1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9) | Notes - **OpenAI**: `auto` lets the model select the value. - **Google (Vertex AI)**: Setting `output_compression` will default `output_format` to `jpeg` if not specified. ## Web Fetch Tool The WebFetchTool enables your agent to pull URL contents into its context, allowing it to pull up-to-date information from the web. 
### Provider Support | Provider | Supported | Notes | | ----------- | --------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Anthropic | ✅ | Full feature support. Uses Anthropic's [Web Fetch Tool](https://docs.claude.com/en/docs/agents-and-tools/tool-use/web-fetch-tool) internally to retrieve URL contents. | | Google | ✅ | No parameter support. The limits are fixed at 20 URLs per request with a maximum of 34MB per URL. Using built-in tools and function tools (including [output tools](https://ai.pydantic.dev/output/#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](https://ai.pydantic.dev/output/#prompted-output) instead. | | xAI | ❌ | Web browsing is implemented as part of [`WebSearchTool`](#web-search-tool) with xAI. | | OpenAI | ❌ | | | Groq | ❌ | | | Bedrock | ❌ | | | Mistral | ❌ | | | Cohere | ❌ | | | HuggingFace | ❌ | | | Outlines | ❌ | | ### Usage [Learn about Gateway](https://ai.pydantic.dev/gateway) web_fetch_basic.py ```python from pydantic_ai import Agent, WebFetchTool agent = Agent('gateway/gemini:gemini-3-flash-preview', builtin_tools=[WebFetchTool()]) result = agent.run_sync('What is this? https://ai.pydantic.dev') print(result.output) #> A Python agent framework for building Generative AI applications. ``` web_fetch_basic.py ```python from pydantic_ai import Agent, WebFetchTool agent = Agent('google-gla:gemini-3-flash-preview', builtin_tools=[WebFetchTool()]) result = agent.run_sync('What is this? https://ai.pydantic.dev') print(result.output) #> A Python agent framework for building Generative AI applications. ``` *(This example is complete, it can be run "as is")* ### Configuration Options The `WebFetchTool` supports several configuration parameters: [Learn about Gateway](https://ai.pydantic.dev/gateway) web_fetch_configured.py ```python from pydantic_ai import Agent, WebFetchTool agent = Agent( 'gateway/anthropic:claude-sonnet-4-5', builtin_tools=[ WebFetchTool( allowed_domains=['ai.pydantic.dev', 'docs.pydantic.dev'], max_uses=10, enable_citations=True, max_content_tokens=50000, ) ], ) result = agent.run_sync( 'Compare the documentation at https://ai.pydantic.dev and https://docs.pydantic.dev' ) print(result.output) """ Both sites provide comprehensive documentation for Pydantic projects. ai.pydantic.dev focuses on PydanticAI, a framework for building AI agents, while docs.pydantic.dev covers Pydantic, the data validation library. They share similar documentation styles and both emphasize type safety and developer experience. """ ``` web_fetch_configured.py ```python from pydantic_ai import Agent, WebFetchTool agent = Agent( 'anthropic:claude-sonnet-4-5', builtin_tools=[ WebFetchTool( allowed_domains=['ai.pydantic.dev', 'docs.pydantic.dev'], max_uses=10, enable_citations=True, max_content_tokens=50000, ) ], ) result = agent.run_sync( 'Compare the documentation at https://ai.pydantic.dev and https://docs.pydantic.dev' ) print(result.output) """ Both sites provide comprehensive documentation for Pydantic projects. ai.pydantic.dev focuses on PydanticAI, a framework for building AI agents, while docs.pydantic.dev covers Pydantic, the data validation library. 
They share similar documentation styles and both emphasize type safety and developer experience. """ ``` *(This example is complete, it can be run "as is")* #### Provider Support | Parameter | Anthropic | Google | | -------------------- | --------- | ------ | | `max_uses` | ✅ | ❌ | | `allowed_domains` | ✅ | ❌ | | `blocked_domains` | ✅ | ❌ | | `enable_citations` | ✅ | ❌ | | `max_content_tokens` | ✅ | ❌ | Anthropic Domain Filtering With Anthropic, you can only use either `blocked_domains` or `allowed_domains`, not both. ## Memory Tool The MemoryTool enables your agent to use memory. ### Provider Support | Provider | Supported | Notes | | ----------- | --------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Anthropic | ✅ | Requires a tool named `memory` to be defined that implements [specific sub-commands](https://docs.claude.com/en/docs/agents-and-tools/tool-use/memory-tool#tool-commands). You can use a subclass of [`anthropic.lib.tools.BetaAbstractMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/tools/_beta_builtin_memory_tool.py) as documented below. | | Google | ❌ | | | OpenAI | ❌ | | | Groq | ❌ | | | Bedrock | ❌ | | | Mistral | ❌ | | | Cohere | ❌ | | | HuggingFace | ❌ | | ### Usage The Anthropic SDK provides an abstract [`BetaAbstractMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/tools/_beta_builtin_memory_tool.py) class that you can subclass to create your own memory storage solution (e.g., database, cloud storage, encrypted files, etc.). Their [`LocalFilesystemMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/examples/memory/basic.py) example can serve as a starting point. The following example uses a subclass that hard-codes a specific memory. The bits specific to Pydantic AI are the `MemoryTool` built-in tool and the `memory` tool definition that forwards commands to the `call` method of the `BetaAbstractMemoryTool` subclass. [Learn about Gateway](https://ai.pydantic.dev/gateway) anthropic_memory.py ```python from typing import Any from anthropic.lib.tools import BetaAbstractMemoryTool from anthropic.types.beta import ( BetaMemoryTool20250818CreateCommand, BetaMemoryTool20250818DeleteCommand, BetaMemoryTool20250818InsertCommand, BetaMemoryTool20250818RenameCommand, BetaMemoryTool20250818StrReplaceCommand, BetaMemoryTool20250818ViewCommand, ) from pydantic_ai import Agent, MemoryTool class FakeMemoryTool(BetaAbstractMemoryTool): def view(self, command: BetaMemoryTool20250818ViewCommand) -> str: return 'The user lives in Mexico City.' 
def create(self, command: BetaMemoryTool20250818CreateCommand) -> str: return f'File created successfully at {command.path}' def str_replace(self, command: BetaMemoryTool20250818StrReplaceCommand) -> str: return f'File {command.path} has been edited' def insert(self, command: BetaMemoryTool20250818InsertCommand) -> str: return f'Text inserted at line {command.insert_line} in {command.path}' def delete(self, command: BetaMemoryTool20250818DeleteCommand) -> str: return f'File deleted: {command.path}' def rename(self, command: BetaMemoryTool20250818RenameCommand) -> str: return f'Renamed {command.old_path} to {command.new_path}' def clear_all_memory(self) -> str: return 'All memory cleared' fake_memory = FakeMemoryTool() agent = Agent('gateway/anthropic:claude-sonnet-4-5', builtin_tools=[MemoryTool()]) @agent.tool_plain def memory(**command: Any) -> Any: return fake_memory.call(command) result = agent.run_sync('Remember that I live in Mexico City') print(result.output) """ Got it! I've recorded that you live in Mexico City. I'll remember this for future reference. """ result = agent.run_sync('Where do I live?') print(result.output) #> You live in Mexico City. ``` anthropic_memory.py ```python from typing import Any from anthropic.lib.tools import BetaAbstractMemoryTool from anthropic.types.beta import ( BetaMemoryTool20250818CreateCommand, BetaMemoryTool20250818DeleteCommand, BetaMemoryTool20250818InsertCommand, BetaMemoryTool20250818RenameCommand, BetaMemoryTool20250818StrReplaceCommand, BetaMemoryTool20250818ViewCommand, ) from pydantic_ai import Agent, MemoryTool class FakeMemoryTool(BetaAbstractMemoryTool): def view(self, command: BetaMemoryTool20250818ViewCommand) -> str: return 'The user lives in Mexico City.' def create(self, command: BetaMemoryTool20250818CreateCommand) -> str: return f'File created successfully at {command.path}' def str_replace(self, command: BetaMemoryTool20250818StrReplaceCommand) -> str: return f'File {command.path} has been edited' def insert(self, command: BetaMemoryTool20250818InsertCommand) -> str: return f'Text inserted at line {command.insert_line} in {command.path}' def delete(self, command: BetaMemoryTool20250818DeleteCommand) -> str: return f'File deleted: {command.path}' def rename(self, command: BetaMemoryTool20250818RenameCommand) -> str: return f'Renamed {command.old_path} to {command.new_path}' def clear_all_memory(self) -> str: return 'All memory cleared' fake_memory = FakeMemoryTool() agent = Agent('anthropic:claude-sonnet-4-5', builtin_tools=[MemoryTool()]) @agent.tool_plain def memory(**command: Any) -> Any: return fake_memory.call(command) result = agent.run_sync('Remember that I live in Mexico City') print(result.output) """ Got it! I've recorded that you live in Mexico City. I'll remember this for future reference. """ result = agent.run_sync('Where do I live?') print(result.output) #> You live in Mexico City. ``` *(This example is complete, it can be run "as is")* ## MCP Server Tool The MCPServerTool allows your agent to use remote MCP servers with communication handled by the model provider. This requires the MCP server to live at a public URL the provider can reach and does not support many of the advanced features of Pydantic AI's agent-side [MCP support](https://ai.pydantic.dev/mcp/client/index.md), but can result in optimized context use and caching, and faster performance due to the lack of a round-trip back to Pydantic AI. 
### Provider Support | Provider | Supported | Notes | | ----------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | OpenAI Responses | ✅ | Full feature support. [Connectors](https://platform.openai.com/docs/guides/tools-connectors-mcp#connectors) can be used by specifying a special `x-openai-connector:` URL. | | Anthropic | ✅ | Full feature support | | xAI | ✅ | Full feature support | | Google | ❌ | Not supported | | Groq | ❌ | Not supported | | OpenAI Chat Completions | ❌ | Not supported | | Bedrock | ❌ | Not supported | | Mistral | ❌ | Not supported | | Cohere | ❌ | Not supported | | HuggingFace | ❌ | Not supported | ### Usage [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_server_anthropic.py ```python from pydantic_ai import Agent, MCPServerTool agent = Agent( 'gateway/anthropic:claude-sonnet-4-5', builtin_tools=[ MCPServerTool( id='deepwiki', url='https://mcp.deepwiki.com/mcp', # (1) ) ] ) result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.') print(result.output) """ The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications. """ ``` 1. The [DeepWiki MCP server](https://docs.devin.ai/work-with-devin/deepwiki-mcp) does not require authorization. mcp_server_anthropic.py ```python from pydantic_ai import Agent, MCPServerTool agent = Agent( 'anthropic:claude-sonnet-4-5', builtin_tools=[ MCPServerTool( id='deepwiki', url='https://mcp.deepwiki.com/mcp', # (1) ) ] ) result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.') print(result.output) """ The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications. """ ``` 1. The [DeepWiki MCP server](https://docs.devin.ai/work-with-devin/deepwiki-mcp) does not require authorization. *(This example is complete, it can be run "as is")* With OpenAI, you must use their Responses API to access the MCP server tool: [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_server_openai.py ```python from pydantic_ai import Agent, MCPServerTool agent = Agent( 'gateway/openai-responses:gpt-5.2', builtin_tools=[ MCPServerTool( id='deepwiki', url='https://mcp.deepwiki.com/mcp', # (1) ) ] ) result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.') print(result.output) """ The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications. """ ``` 1. The [DeepWiki MCP server](https://docs.devin.ai/work-with-devin/deepwiki-mcp) does not require authorization. mcp_server_openai.py ```python from pydantic_ai import Agent, MCPServerTool agent = Agent( 'openai-responses:gpt-5.2', builtin_tools=[ MCPServerTool( id='deepwiki', url='https://mcp.deepwiki.com/mcp', # (1) ) ] ) result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.') print(result.output) """ The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications. """ ``` 1. The [DeepWiki MCP server](https://docs.devin.ai/work-with-devin/deepwiki-mcp) does not require authorization. 
*(This example is complete, it can be run "as is")* ### Configuration Options The `MCPServerTool` supports several configuration parameters for custom MCP servers: [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_server_configured_url.py ```python import os from pydantic_ai import Agent, MCPServerTool agent = Agent( 'gateway/openai-responses:gpt-5.2', builtin_tools=[ MCPServerTool( id='github', url='https://api.githubcopilot.com/mcp/', authorization_token=os.getenv('GITHUB_ACCESS_TOKEN', 'mock-access-token'), # (1) allowed_tools=['search_repositories', 'list_commits'], description='GitHub MCP server', headers={'X-Custom-Header': 'custom-value'}, ) ] ) result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.') print(result.output) """ The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications. """ ``` 1. The [GitHub MCP server](https://github.com/github/github-mcp-server) requires an authorization token. mcp_server_configured_url.py ```python import os from pydantic_ai import Agent, MCPServerTool agent = Agent( 'openai-responses:gpt-5.2', builtin_tools=[ MCPServerTool( id='github', url='https://api.githubcopilot.com/mcp/', authorization_token=os.getenv('GITHUB_ACCESS_TOKEN', 'mock-access-token'), # (1) allowed_tools=['search_repositories', 'list_commits'], description='GitHub MCP server', headers={'X-Custom-Header': 'custom-value'}, ) ] ) result = agent.run_sync('Tell me about the pydantic/pydantic-ai repo.') print(result.output) """ The pydantic/pydantic-ai repo is a Python agent framework for building Generative AI applications. """ ``` 1. The [GitHub MCP server](https://github.com/github/github-mcp-server) requires an authorization token. For OpenAI Responses, you can use a [connector](https://platform.openai.com/docs/guides/tools-connectors-mcp#connectors) by specifying a special `x-openai-connector:` URL: *(This example is complete, it can be run "as is")* [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_server_configured_connector_id.py ```python import os from pydantic_ai import Agent, MCPServerTool agent = Agent( 'gateway/openai-responses:gpt-5.2', builtin_tools=[ MCPServerTool( id='google-calendar', url='x-openai-connector:connector_googlecalendar', authorization_token=os.getenv('GOOGLE_API_KEY', 'mock-api-key'), # (1) ) ] ) result = agent.run_sync('What do I have on my calendar today?') print(result.output) #> You're going to spend all day playing with Pydantic AI. ``` 1. OpenAI's Google Calendar connector requires an [authorization token](https://platform.openai.com/docs/guides/tools-connectors-mcp#authorizing-a-connector). mcp_server_configured_connector_id.py ```python import os from pydantic_ai import Agent, MCPServerTool agent = Agent( 'openai-responses:gpt-5.2', builtin_tools=[ MCPServerTool( id='google-calendar', url='x-openai-connector:connector_googlecalendar', authorization_token=os.getenv('GOOGLE_API_KEY', 'mock-api-key'), # (1) ) ] ) result = agent.run_sync('What do I have on my calendar today?') print(result.output) #> You're going to spend all day playing with Pydantic AI. ``` 1. OpenAI's Google Calendar connector requires an [authorization token](https://platform.openai.com/docs/guides/tools-connectors-mcp#authorizing-a-connector). 
*(This example is complete, it can be run "as is")* #### Provider Support | Parameter | OpenAI | Anthropic | xAI | | --------------------- | ------ | --------- | --- | | `authorization_token` | ✅ | ✅ | ✅ | | `allowed_tools` | ✅ | ✅ | ✅ | | `description` | ✅ | ❌ | ✅ | | `headers` | ✅ | ❌ | ✅ | ## File Search Tool The FileSearchTool enables your agent to search through uploaded files using vector search, providing a fully managed Retrieval-Augmented Generation (RAG) system. This tool handles file storage, chunking, embedding generation, and context injection into prompts. ### Provider Support | Provider | Supported | Notes | | ----------------------- | ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | OpenAI Responses | ✅ | Full feature support. Requires files to be uploaded to vector stores via the [OpenAI Files API](https://platform.openai.com/docs/api-reference/files). To include search results on the BuiltinToolReturnPart available via ModelResponse.builtin_tool_calls, enable the OpenAIResponsesModelSettings.openai_include_file_search_results [model setting](https://ai.pydantic.dev/agent/#model-run-settings). | | Google (Gemini) | ✅ | Requires files to be uploaded via the [Gemini Files API](https://ai.google.dev/gemini-api/docs/files). Files are automatically deleted after 48 hours. Supports up to 2 GB per file and 20 GB per project. Using built-in tools and function tools (including [output tools](https://ai.pydantic.dev/output/#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](https://ai.pydantic.dev/output/#prompted-output) instead. | | | Google (Vertex AI) | ❌ | | Anthropic | ❌ | Not supported | | Groq | ❌ | Not supported | | OpenAI Chat Completions | ❌ | Not supported | | Bedrock | ❌ | Not supported | | Mistral | ❌ | Not supported | | Cohere | ❌ | Not supported | | HuggingFace | ❌ | Not supported | | Outlines | ❌ | Not supported | ### Usage #### OpenAI Responses With OpenAI, you need to first [upload files to a vector store](https://platform.openai.com/docs/assistants/tools/file-search), then reference the vector store IDs when using the `FileSearchTool`. file_search_openai_upload.py ```python import asyncio from pydantic_ai import Agent, FileSearchTool from pydantic_ai.models.openai import OpenAIResponsesModel async def main(): model = OpenAIResponsesModel('gpt-5.2') with open('my_document.txt', 'rb') as f: file = await model.client.files.create(file=f, purpose='assistants') vector_store = await model.client.vector_stores.create(name='my-docs') await model.client.vector_stores.files.create( vector_store_id=vector_store.id, file_id=file.id ) agent = Agent( model, builtin_tools=[FileSearchTool(file_store_ids=[vector_store.id])] ) result = await agent.run('What information is in my documents about pydantic?') print(result.output) #> Based on your documents, Pydantic is a data validation library for Python... 
asyncio.run(main()) ``` #### Google (Gemini) With Gemini, you need to first [create a file search store via the Files API](https://ai.google.dev/gemini-api/docs/files), then reference the file search store names. file_search_google_upload.py ```python import asyncio from pydantic_ai import Agent, FileSearchTool from pydantic_ai.models.google import GoogleModel async def main(): model = GoogleModel('gemini-3-flash-preview') store = await model.client.aio.file_search_stores.create( config={'display_name': 'my-docs'} ) with open('my_document.txt', 'rb') as f: await model.client.aio.file_search_stores.upload_to_file_search_store( file_search_store_name=store.name, file=f, config={'mime_type': 'text/plain'} ) agent = Agent( model, builtin_tools=[FileSearchTool(file_store_ids=[store.name])] ) result = await agent.run('Summarize the key points from my uploaded documents.') print(result.output) #> The documents discuss the following key points: ... asyncio.run(main()) ``` ## API Reference For complete API documentation, see the [API Reference](https://ai.pydantic.dev/api/builtin_tools/index.md). # Dependencies Pydantic AI uses a dependency injection system to provide data and services to your agent's [system prompts](https://ai.pydantic.dev/agent/#system-prompts), [tools](https://ai.pydantic.dev/tools/index.md) and [output validators](https://ai.pydantic.dev/output/#output-validator-functions). Matching Pydantic AI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic"; this should make dependencies type-safe, understandable, easier to test, and ultimately easier to deploy in production. ## Defining Dependencies Dependencies can be any Python type. While in simple cases you might be able to pass a single object as a dependency (e.g. an HTTP connection), dataclasses are generally a convenient container when your dependencies include multiple objects. Here's an example of defining an agent that requires dependencies. (**Note:** dependencies aren't actually used in this example, see [Accessing Dependencies](#accessing-dependencies) below) [Learn about Gateway](https://ai.pydantic.dev/gateway) unused_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent @dataclass class MyDeps: # (1)! api_key: str http_client: httpx.AsyncClient agent = Agent( 'gateway/openai:gpt-5.2', deps_type=MyDeps, # (2)! ) async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run( 'Tell me a joke.', deps=deps, # (3)! ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. Define a dataclass to hold dependencies. 1. Pass the dataclass type to the `deps_type` argument of the Agent constructor. **Note**: we're passing the type here, NOT an instance, this parameter is not actually used at runtime, it's here so we can get full type checking of the agent. 1. When running the agent, pass an instance of the dataclass to the `deps` parameter. unused_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent @dataclass class MyDeps: # (1)! api_key: str http_client: httpx.AsyncClient agent = Agent( 'openai:gpt-5.2', deps_type=MyDeps, # (2)! ) async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run( 'Tell me a joke.', deps=deps, # (3)! ) print(result.output) #> Did you hear about the toothpaste scandal?
They called it Colgate. ``` 1. Define a dataclass to hold dependencies. 1. Pass the dataclass type to the `deps_type` argument of the Agent constructor. **Note**: we're passing the type here, NOT an instance, this parameter is not actually used at runtime, it's here so we can get full type checking of the agent. 1. When running the agent, pass an instance of the dataclass to the `deps` parameter. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Accessing Dependencies Dependencies are accessed through the RunContext type, which should be the first parameter of system prompt functions, tools, and output validators. [Learn about Gateway](https://ai.pydantic.dev/gateway) system_prompt_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient agent = Agent( 'gateway/openai:gpt-5.2', deps_type=MyDeps, ) @agent.system_prompt # (1)! async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)! response = await ctx.deps.http_client.get( # (3)! 'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, # (4)! ) response.raise_for_status() return f'Prompt: {response.text}' async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. RunContext may optionally be passed to a system_prompt function as the only argument. 1. RunContext is parameterized with the type of the dependencies, if this type is incorrect, static type checkers will raise an error. 1. Access dependencies through the .deps attribute. 1. Access dependencies through the .deps attribute. system_prompt_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient agent = Agent( 'openai:gpt-5.2', deps_type=MyDeps, ) @agent.system_prompt # (1)! async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)! response = await ctx.deps.http_client.get( # (3)! 'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, # (4)! ) response.raise_for_status() return f'Prompt: {response.text}' async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. RunContext may optionally be passed to a system_prompt function as the only argument. 1. RunContext is parameterized with the type of the dependencies, if this type is incorrect, static type checkers will raise an error. 1. Access dependencies through the .deps attribute. 1. Access dependencies through the .deps attribute. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### Asynchronous vs. Synchronous dependencies [System prompt functions](https://ai.pydantic.dev/agent/#system-prompts), [function tools](https://ai.pydantic.dev/tools/index.md) and [output validators](https://ai.pydantic.dev/output/#output-validator-functions) are all run in the async context of an agent run. If these functions are not coroutines (i.e. not defined with `async def`), they are called with run_in_executor in a thread pool.
It's therefore marginally preferable to use `async` methods where dependencies perform IO, although synchronous dependencies should work fine too. `run` vs. `run_sync` and Asynchronous vs. Synchronous dependencies Whether you use synchronous or asynchronous dependencies is completely independent of whether you use `run` or `run_sync` — `run_sync` is just a wrapper around `run` and agents are always run in an async context. Here's the same example as above, but with a synchronous dependency: [Learn about Gateway](https://ai.pydantic.dev/gateway) sync_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.Client # (1)! agent = Agent( 'gateway/openai:gpt-5.2', deps_type=MyDeps, ) @agent.system_prompt def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)! response = ctx.deps.http_client.get( 'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'} ) response.raise_for_status() return f'Prompt: {response.text}' async def main(): deps = MyDeps('foobar', httpx.Client()) result = await agent.run( 'Tell me a joke.', deps=deps, ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. Here we use a synchronous `httpx.Client` instead of an asynchronous `httpx.AsyncClient`. 1. To match the synchronous dependency, the system prompt function is now a plain function, not a coroutine. sync_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.Client # (1)! agent = Agent( 'openai:gpt-5.2', deps_type=MyDeps, ) @agent.system_prompt def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)! response = ctx.deps.http_client.get( 'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'} ) response.raise_for_status() return f'Prompt: {response.text}' async def main(): deps = MyDeps('foobar', httpx.Client()) result = await agent.run( 'Tell me a joke.', deps=deps, ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. Here we use a synchronous `httpx.Client` instead of an asynchronous `httpx.AsyncClient`. 1. To match the synchronous dependency, the system prompt function is now a plain function, not a coroutine. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Full Example As well as system prompts, dependencies can be used in [tools](https://ai.pydantic.dev/tools/index.md) and [output validators](https://ai.pydantic.dev/output/#output-validator-functions). [Learn about Gateway](https://ai.pydantic.dev/gateway) full_example.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, ModelRetry, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient agent = Agent( 'gateway/openai:gpt-5.2', deps_type=MyDeps, ) @agent.system_prompt async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: response = await ctx.deps.http_client.get('https://example.com') response.raise_for_status() return f'Prompt: {response.text}' @agent.tool # (1)! 
async def get_joke_material(ctx: RunContext[MyDeps], subject: str) -> str: response = await ctx.deps.http_client.get( 'https://example.com#jokes', params={'subject': subject}, headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, ) response.raise_for_status() return response.text @agent.output_validator # (2)! async def validate_output(ctx: RunContext[MyDeps], output: str) -> str: response = await ctx.deps.http_client.post( 'https://example.com#validate', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, params={'query': output}, ) if response.status_code == 400: raise ModelRetry(f'invalid response: {response.text}') response.raise_for_status() return output async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. To pass `RunContext` to a tool, use the tool decorator. 1. `RunContext` may optionally be passed to a output_validator function as the first argument. full_example.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, ModelRetry, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient agent = Agent( 'openai:gpt-5.2', deps_type=MyDeps, ) @agent.system_prompt async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: response = await ctx.deps.http_client.get('https://example.com') response.raise_for_status() return f'Prompt: {response.text}' @agent.tool # (1)! async def get_joke_material(ctx: RunContext[MyDeps], subject: str) -> str: response = await ctx.deps.http_client.get( 'https://example.com#jokes', params={'subject': subject}, headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, ) response.raise_for_status() return response.text @agent.output_validator # (2)! async def validate_output(ctx: RunContext[MyDeps], output: str) -> str: response = await ctx.deps.http_client.post( 'https://example.com#validate', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, params={'query': output}, ) if response.status_code == 400: raise ModelRetry(f'invalid response: {response.text}') response.raise_for_status() return output async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. To pass `RunContext` to a tool, use the tool decorator. 1. `RunContext` may optionally be passed to a output_validator function as the first argument. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Overriding Dependencies When testing agents, it's useful to be able to customise dependencies. While this can sometimes be done by calling the agent directly within unit tests, we can also override dependencies while calling application code which in turn calls the agent. This is done via the override method on the agent. [Learn about Gateway](https://ai.pydantic.dev/gateway) joke_app.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient async def system_prompt_factory(self) -> str: # (1)! 
response = await self.http_client.get('https://example.com') response.raise_for_status() return f'Prompt: {response.text}' joke_agent = Agent('gateway/openai:gpt-5.2', deps_type=MyDeps) @joke_agent.system_prompt async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: return await ctx.deps.system_prompt_factory() # (2)! async def application_code(prompt: str) -> str: # (3)! ... ... # now deep within application code we call our agent async with httpx.AsyncClient() as client: app_deps = MyDeps('foobar', client) result = await joke_agent.run(prompt, deps=app_deps) # (4)! return result.output ``` 1. Define a method on the dependency to make the system prompt easier to customise. 1. Call the system prompt factory from within the system prompt function. 1. Application code that calls the agent, in a real application this might be an API endpoint. 1. Call the agent from within the application code, in a real application this call might be deep within a call stack. Note `app_deps` here will NOT be used when deps are overridden. joke_app.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient async def system_prompt_factory(self) -> str: # (1)! response = await self.http_client.get('https://example.com') response.raise_for_status() return f'Prompt: {response.text}' joke_agent = Agent('openai:gpt-5.2', deps_type=MyDeps) @joke_agent.system_prompt async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: return await ctx.deps.system_prompt_factory() # (2)! async def application_code(prompt: str) -> str: # (3)! ... ... # now deep within application code we call our agent async with httpx.AsyncClient() as client: app_deps = MyDeps('foobar', client) result = await joke_agent.run(prompt, deps=app_deps) # (4)! return result.output ``` 1. Define a method on the dependency to make the system prompt easier to customise. 1. Call the system prompt factory from within the system prompt function. 1. Application code that calls the agent, in a real application this might be an API endpoint. 1. Call the agent from within the application code, in a real application this call might be deep within a call stack. Note `app_deps` here will NOT be used when deps are overridden. *(This example is complete, it can be run "as is")* test_joke_app.py ```python from joke_app import MyDeps, application_code, joke_agent class TestMyDeps(MyDeps): # (1)! async def system_prompt_factory(self) -> str: return 'test prompt' async def test_application_code(): test_deps = TestMyDeps('test_key', None) # (2)! with joke_agent.override(deps=test_deps): # (3)! joke = await application_code('Tell me a joke.') # (4)! assert joke.startswith('Did you hear about the toothpaste scandal?') ``` 1. Define a subclass of `MyDeps` in tests to customise the system prompt factory. 1. Create an instance of the test dependency, we don't need to pass an `http_client` here as it's not used. 1. Override the dependencies of the agent for the duration of the `with` block, `test_deps` will be used when the agent is run. 1. Now we can safely call our application code, the agent will use the overridden dependencies. 
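Since `override` is a regular context manager, you can also wrap it in a pytest fixture so every test in a module runs against the test dependencies. Here's a minimal sketch, assuming the same `joke_app` module as above (the fixture name and test filename are illustrative, and an async-capable pytest plugin such as `anyio` or `pytest-asyncio` is assumed): test_joke_app_fixture.py

```python
import pytest

from joke_app import MyDeps, application_code, joke_agent


class TestMyDeps(MyDeps):
    async def system_prompt_factory(self) -> str:
        return 'test prompt'


@pytest.fixture
def test_deps():
    deps = TestMyDeps('test_key', None)
    # The override stays active for the duration of each test that uses this fixture.
    with joke_agent.override(deps=deps):
        yield deps


async def test_application_code(test_deps):
    joke = await application_code('Tell me a joke.')
    assert joke.startswith('Did you hear about the toothpaste scandal?')
```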
## Examples The following examples demonstrate how to use dependencies in Pydantic AI: - [Weather Agent](https://ai.pydantic.dev/examples/weather-agent/index.md) - [SQL Generation](https://ai.pydantic.dev/examples/sql-gen/index.md) - [RAG](https://ai.pydantic.dev/examples/rag/index.md) # Deferred Tools There are a few scenarios where the model should be able to call a tool that should not or cannot be executed during the same agent run inside the same Python process: - it may need to be approved by the user first - it may depend on an upstream service, frontend, or user to provide the result - the result could take longer to generate than it's reasonable to keep the agent process running To support these use cases, Pydantic AI provides the concept of deferred tools, which come in two flavors documented below: - tools that [require approval](#human-in-the-loop-tool-approval) - tools that are [executed externally](#external-tool-execution) When the model calls a deferred tool, the agent run will end with a DeferredToolRequests output object containing information about the deferred tool calls. Once the approvals and/or results are ready, a new agent run can then be started with the original run's [message history](https://ai.pydantic.dev/message-history/index.md) plus a DeferredToolResults object holding results for each tool call in `DeferredToolRequests`, which will continue the original run where it left off. Note that handling deferred tool calls requires `DeferredToolRequests` to be in the `Agent`'s [`output_type`](https://ai.pydantic.dev/output/#structured-output) so that the possible types of the agent run output are correctly inferred. If your agent can also be used in a context where no deferred tools are available and you don't want to deal with that type everywhere you use the agent, you can instead pass the `output_type` argument when you run the agent using agent.run(), agent.run_sync(), agent.run_stream(), or agent.iter(). Note that the run-time `output_type` overrides the one specified at construction time (for type inference reasons), so you'll need to include the original output type explicitly. ## Human-in-the-Loop Tool Approval If a tool function always requires approval, you can pass the `requires_approval=True` argument to the @agent.tool decorator, @agent.tool_plain decorator, Tool class, FunctionToolset.tool decorator, or FunctionToolset.add_function() method. Inside the function, you can then assume that the tool call has been approved. If whether a tool function requires approval depends on the tool call arguments or the agent run context (e.g. [dependencies](https://ai.pydantic.dev/dependencies/index.md) or message history), you can raise the ApprovalRequired exception from the tool function. The RunContext.tool_call_approved property will be `True` if the tool call has already been approved. To require approval for calls to tools provided by a [toolset](https://ai.pydantic.dev/toolsets/index.md) (like an [MCP server](https://ai.pydantic.dev/mcp/client/index.md)), see the [`ApprovalRequiredToolset` documentation](https://ai.pydantic.dev/toolsets/#requiring-tool-approval). When the model calls a tool that requires approval, the agent run will end with a DeferredToolRequests output object with an `approvals` list holding ToolCallParts containing the tool name, validated arguments, and a unique tool call ID. 
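As noted in the introduction above, if you'd rather not include `DeferredToolRequests` in the agent's construction-time `output_type`, you can pass it just for the runs where deferred tools are in play, remembering to include the original output type explicitly. Here's a minimal sketch (the model name and prompt are illustrative): run_time_output_type.py

```python
from pydantic_ai import Agent, DeferredToolRequests

agent = Agent('openai:gpt-5.2')  # construction-time output type defaults to str


@agent.tool_plain(requires_approval=True)
def delete_file(path: str) -> str:
    return f'File {path!r} deleted'


result = agent.run_sync(
    'Delete `__init__.py`',
    # The run-time output_type replaces the construction-time one,
    # so the original `str` output type must be included explicitly.
    output_type=[str, DeferredToolRequests],
)
assert isinstance(result.output, (str, DeferredToolRequests))
```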
Once you've gathered the user's approvals or denials, you can build a DeferredToolResults object with an `approvals` dictionary that maps each tool call ID to a boolean, a ToolApproved object (with optional `override_args`), or a ToolDenied object (with an optional custom `message` to provide to the model). You can also provide a `metadata` dictionary on `DeferredToolResults` that maps each tool call ID to a dictionary of metadata that will be available in the tool's RunContext.tool_call_metadata attribute. This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](https://ai.pydantic.dev/message-history/index.md). Here's an example that shows how to require approval for all file deletions, and for updates of specific protected files: [Learn about Gateway](https://ai.pydantic.dev/gateway) tool_requires_approval.py ```python from pydantic_ai import ( Agent, ApprovalRequired, DeferredToolRequests, DeferredToolResults, RunContext, ToolDenied, ) agent = Agent('gateway/openai:gpt-5.2', output_type=[str, DeferredToolRequests]) PROTECTED_FILES = {'.env'} @agent.tool def update_file(ctx: RunContext, path: str, content: str) -> str: if path in PROTECTED_FILES and not ctx.tool_call_approved: raise ApprovalRequired(metadata={'reason': 'protected'}) # (1)! return f'File {path!r} updated: {content!r}' @agent.tool_plain(requires_approval=True) def delete_file(path: str) -> str: return f'File {path!r} deleted' result = agent.run_sync('Delete `__init__.py`, write `Hello, world!` to `README.md`, and clear `.env`') messages = result.all_messages() assert isinstance(result.output, DeferredToolRequests) requests = result.output print(requests) """ DeferredToolRequests( calls=[], approvals=[ ToolCallPart( tool_name='update_file', args={'path': '.env', 'content': ''}, tool_call_id='update_file_dotenv', ), ToolCallPart( tool_name='delete_file', args={'path': '__init__.py'}, tool_call_id='delete_file', ), ], metadata={'update_file_dotenv': {'reason': 'protected'}}, ) """ results = DeferredToolResults() for call in requests.approvals: result = False if call.tool_name == 'update_file': # Approve all updates result = True elif call.tool_name == 'delete_file': # deny all deletes result = ToolDenied('Deleting files is not allowed') results.approvals[call.tool_call_id] = result result = agent.run_sync( 'Now create a backup of README.md', # (2)! message_history=messages, deferred_tool_results=results, ) print(result.output) """ Here's what I've done: - Attempted to delete __init__.py, but deletion is not allowed. - Updated README.md with: Hello, world! - Cleared .env (set to empty). - Created a backup at README.md.bak containing: Hello, world! If you want a different backup name or format (e.g., timestamped like README_2025-11-24.bak), let me know. 
""" print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Delete `__init__.py`, write `Hello, world!` to `README.md`, and clear `.env`', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='delete_file', args={'path': '__init__.py'}, tool_call_id='delete_file', ), ToolCallPart( tool_name='update_file', args={'path': 'README.md', 'content': 'Hello, world!'}, tool_call_id='update_file_readme', ), ToolCallPart( tool_name='update_file', args={'path': '.env', 'content': ''}, tool_call_id='update_file_dotenv', ), ], usage=RequestUsage(input_tokens=63, output_tokens=21), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='update_file', content="File 'README.md' updated: 'Hello, world!'", tool_call_id='update_file_readme', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='update_file', content="File '.env' updated: ''", tool_call_id='update_file_dotenv', timestamp=datetime.datetime(...), ), ToolReturnPart( tool_name='delete_file', content='Deleting files is not allowed', tool_call_id='delete_file', timestamp=datetime.datetime(...), ), UserPromptPart( content='Now create a backup of README.md', timestamp=datetime.datetime(...), ), ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='update_file', args={'path': 'README.md.bak', 'content': 'Hello, world!'}, tool_call_id='update_file_backup', ) ], usage=RequestUsage(input_tokens=86, output_tokens=31), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='update_file', content="File 'README.md.bak' updated: 'Hello, world!'", tool_call_id='update_file_backup', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ TextPart( content="Here's what I've done:\n- Attempted to delete __init__.py, but deletion is not allowed.\n- Updated README.md with: Hello, world!\n- Cleared .env (set to empty).\n- Created a backup at README.md.bak containing: Hello, world!\n\nIf you want a different backup name or format (e.g., timestamped like README_2025-11-24.bak), let me know." ) ], usage=RequestUsage(input_tokens=93, output_tokens=89), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` 1. The optional `metadata` parameter can attach arbitrary context to deferred tool calls, accessible in `DeferredToolRequests.metadata` keyed by `tool_call_id`. 1. This second agent run continues from where the first run left off, providing the tool approval results and optionally a new `user_prompt` to give the model additional instructions alongside the deferred results. tool_requires_approval.py ```python from pydantic_ai import ( Agent, ApprovalRequired, DeferredToolRequests, DeferredToolResults, RunContext, ToolDenied, ) agent = Agent('openai:gpt-5.2', output_type=[str, DeferredToolRequests]) PROTECTED_FILES = {'.env'} @agent.tool def update_file(ctx: RunContext, path: str, content: str) -> str: if path in PROTECTED_FILES and not ctx.tool_call_approved: raise ApprovalRequired(metadata={'reason': 'protected'}) # (1)! 
return f'File {path!r} updated: {content!r}' @agent.tool_plain(requires_approval=True) def delete_file(path: str) -> str: return f'File {path!r} deleted' result = agent.run_sync('Delete `__init__.py`, write `Hello, world!` to `README.md`, and clear `.env`') messages = result.all_messages() assert isinstance(result.output, DeferredToolRequests) requests = result.output print(requests) """ DeferredToolRequests( calls=[], approvals=[ ToolCallPart( tool_name='update_file', args={'path': '.env', 'content': ''}, tool_call_id='update_file_dotenv', ), ToolCallPart( tool_name='delete_file', args={'path': '__init__.py'}, tool_call_id='delete_file', ), ], metadata={'update_file_dotenv': {'reason': 'protected'}}, ) """ results = DeferredToolResults() for call in requests.approvals: result = False if call.tool_name == 'update_file': # Approve all updates result = True elif call.tool_name == 'delete_file': # deny all deletes result = ToolDenied('Deleting files is not allowed') results.approvals[call.tool_call_id] = result result = agent.run_sync( 'Now create a backup of README.md', # (2)! message_history=messages, deferred_tool_results=results, ) print(result.output) """ Here's what I've done: - Attempted to delete __init__.py, but deletion is not allowed. - Updated README.md with: Hello, world! - Cleared .env (set to empty). - Created a backup at README.md.bak containing: Hello, world! If you want a different backup name or format (e.g., timestamped like README_2025-11-24.bak), let me know. """ print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Delete `__init__.py`, write `Hello, world!` to `README.md`, and clear `.env`', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='delete_file', args={'path': '__init__.py'}, tool_call_id='delete_file', ), ToolCallPart( tool_name='update_file', args={'path': 'README.md', 'content': 'Hello, world!'}, tool_call_id='update_file_readme', ), ToolCallPart( tool_name='update_file', args={'path': '.env', 'content': ''}, tool_call_id='update_file_dotenv', ), ], usage=RequestUsage(input_tokens=63, output_tokens=21), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='update_file', content="File 'README.md' updated: 'Hello, world!'", tool_call_id='update_file_readme', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='update_file', content="File '.env' updated: ''", tool_call_id='update_file_dotenv', timestamp=datetime.datetime(...), ), ToolReturnPart( tool_name='delete_file', content='Deleting files is not allowed', tool_call_id='delete_file', timestamp=datetime.datetime(...), ), UserPromptPart( content='Now create a backup of README.md', timestamp=datetime.datetime(...), ), ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='update_file', args={'path': 'README.md.bak', 'content': 'Hello, world!'}, tool_call_id='update_file_backup', ) ], usage=RequestUsage(input_tokens=86, output_tokens=31), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='update_file', content="File 'README.md.bak' updated: 'Hello, world!'", tool_call_id='update_file_backup', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ TextPart( content="Here's 
what I've done:\n- Attempted to delete __init__.py, but deletion is not allowed.\n- Updated README.md with: Hello, world!\n- Cleared .env (set to empty).\n- Created a backup at README.md.bak containing: Hello, world!\n\nIf you want a different backup name or format (e.g., timestamped like README_2025-11-24.bak), let me know." ) ], usage=RequestUsage(input_tokens=93, output_tokens=89), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` 1. The optional `metadata` parameter can attach arbitrary context to deferred tool calls, accessible in `DeferredToolRequests.metadata` keyed by `tool_call_id`. 1. This second agent run continues from where the first run left off, providing the tool approval results and optionally a new `user_prompt` to give the model additional instructions alongside the deferred results. *(This example is complete, it can be run "as is")* ## External Tool Execution When the result of a tool call cannot be generated inside the same agent run in which it was called, the tool is considered to be external. Examples of external tools are client-side tools implemented by a web or app frontend, and slow tasks that are passed off to a background worker or external service instead of keeping the agent process running. If whether a tool call should be executed externally depends on the tool call arguments, the agent run context (e.g. [dependencies](https://ai.pydantic.dev/dependencies/index.md) or message history), or how long the task is expected to take, you can define a tool function and conditionally raise the CallDeferred exception. Before raising the exception, the tool function would typically schedule some background task and pass along the RunContext.tool_call_id so that the result can be matched to the deferred tool call later. If a tool is always executed externally and its definition is provided to your code along with a JSON schema for its arguments, you can use an [`ExternalToolset`](https://ai.pydantic.dev/toolsets/#external-toolset). If the external tools are known up front and you don't have the arguments JSON schema handy, you can also define a tool function with the appropriate signature that does nothing but raise the CallDeferred exception. When the model calls an external tool, the agent run will end with a DeferredToolRequests output object with a `calls` list holding ToolCallParts containing the tool name, validated arguments, and a unique tool call ID. Once the tool call results are ready, you can build a DeferredToolResults object with a `calls` dictionary that maps each tool call ID to an arbitrary value to be returned to the model, a [`ToolReturn`](https://ai.pydantic.dev/tools-advanced/#advanced-tool-returns) object, or a ModelRetry exception in case the tool call failed and the model should [try again](https://ai.pydantic.dev/tools-advanced/#tool-retries). This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](https://ai.pydantic.dev/message-history/index.md). 
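For the always-external case mentioned above, where the tool is known up front but is executed by a frontend or another service, the tool function can simply declare its signature (so the model gets an argument schema) and raise `CallDeferred` unconditionally. Here's a minimal sketch, with an illustrative tool name and parameters: external_stub_tool.py

```python
from pydantic_ai import Agent, CallDeferred, DeferredToolRequests

agent = Agent('openai:gpt-5.2', output_type=[str, DeferredToolRequests])


@agent.tool_plain
def send_push_notification(device_id: str, message: str) -> str:
    """Send a push notification to the user's device."""
    # This tool is always executed externally (e.g. by the frontend),
    # so the run ends with a DeferredToolRequests output containing this call.
    raise CallDeferred()
```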
Here's an example that shows how to move a task that takes a while to complete to the background and return the result to the model once the task is complete: [Learn about Gateway](https://ai.pydantic.dev/gateway) external_tool.py ```python import asyncio from dataclasses import dataclass from typing import Any from pydantic_ai import ( Agent, CallDeferred, DeferredToolRequests, DeferredToolResults, ModelRetry, RunContext, ) @dataclass class TaskResult: task_id: str result: Any async def calculate_answer_task(task_id: str, question: str) -> TaskResult: await asyncio.sleep(1) return TaskResult(task_id=task_id, result=42) agent = Agent('gateway/openai:gpt-5.2', output_type=[str, DeferredToolRequests]) tasks: list[asyncio.Task[TaskResult]] = [] @agent.tool async def calculate_answer(ctx: RunContext, question: str) -> str: task_id = f'task_{len(tasks)}' # (1)! task = asyncio.create_task(calculate_answer_task(task_id, question)) tasks.append(task) raise CallDeferred(metadata={'task_id': task_id}) # (2)! async def main(): result = await agent.run('Calculate the answer to the ultimate question of life, the universe, and everything') messages = result.all_messages() assert isinstance(result.output, DeferredToolRequests) requests = result.output print(requests) """ DeferredToolRequests( calls=[ ToolCallPart( tool_name='calculate_answer', args={ 'question': 'the ultimate question of life, the universe, and everything' }, tool_call_id='pyd_ai_tool_call_id', ) ], approvals=[], metadata={'pyd_ai_tool_call_id': {'task_id': 'task_0'}}, ) """ done, _ = await asyncio.wait(tasks) # (3)! task_results = [task.result() for task in done] task_results_by_task_id = {result.task_id: result.result for result in task_results} results = DeferredToolResults() for call in requests.calls: try: task_id = requests.metadata[call.tool_call_id]['task_id'] result = task_results_by_task_id[task_id] except KeyError: result = ModelRetry('No result for this tool call was found.') results.calls[call.tool_call_id] = result result = await agent.run(message_history=messages, deferred_tool_results=results) print(result.output) #> The answer to the ultimate question of life, the universe, and everything is 42. print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Calculate the answer to the ultimate question of life, the universe, and everything', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='calculate_answer', args={ 'question': 'the ultimate question of life, the universe, and everything' }, tool_call_id='pyd_ai_tool_call_id', ) ], usage=RequestUsage(input_tokens=63, output_tokens=13), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='calculate_answer', content=42, tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ TextPart( content='The answer to the ultimate question of life, the universe, and everything is 42.' ) ], usage=RequestUsage(input_tokens=64, output_tokens=28), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` 1. Generate a task ID that can be tracked independently of the tool call ID. 1. The optional `metadata` parameter passes the `task_id` so it can be matched with results later, accessible in `DeferredToolRequests.metadata` keyed by `tool_call_id`. 1. 
In reality, this would typically happen in a separate process that polls for the task status or is notified when all pending tasks are complete. external_tool.py ```python import asyncio from dataclasses import dataclass from typing import Any from pydantic_ai import ( Agent, CallDeferred, DeferredToolRequests, DeferredToolResults, ModelRetry, RunContext, ) @dataclass class TaskResult: task_id: str result: Any async def calculate_answer_task(task_id: str, question: str) -> TaskResult: await asyncio.sleep(1) return TaskResult(task_id=task_id, result=42) agent = Agent('openai:gpt-5.2', output_type=[str, DeferredToolRequests]) tasks: list[asyncio.Task[TaskResult]] = [] @agent.tool async def calculate_answer(ctx: RunContext, question: str) -> str: task_id = f'task_{len(tasks)}' # (1)! task = asyncio.create_task(calculate_answer_task(task_id, question)) tasks.append(task) raise CallDeferred(metadata={'task_id': task_id}) # (2)! async def main(): result = await agent.run('Calculate the answer to the ultimate question of life, the universe, and everything') messages = result.all_messages() assert isinstance(result.output, DeferredToolRequests) requests = result.output print(requests) """ DeferredToolRequests( calls=[ ToolCallPart( tool_name='calculate_answer', args={ 'question': 'the ultimate question of life, the universe, and everything' }, tool_call_id='pyd_ai_tool_call_id', ) ], approvals=[], metadata={'pyd_ai_tool_call_id': {'task_id': 'task_0'}}, ) """ done, _ = await asyncio.wait(tasks) # (3)! task_results = [task.result() for task in done] task_results_by_task_id = {result.task_id: result.result for result in task_results} results = DeferredToolResults() for call in requests.calls: try: task_id = requests.metadata[call.tool_call_id]['task_id'] result = task_results_by_task_id[task_id] except KeyError: result = ModelRetry('No result for this tool call was found.') results.calls[call.tool_call_id] = result result = await agent.run(message_history=messages, deferred_tool_results=results) print(result.output) #> The answer to the ultimate question of life, the universe, and everything is 42. print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Calculate the answer to the ultimate question of life, the universe, and everything', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='calculate_answer', args={ 'question': 'the ultimate question of life, the universe, and everything' }, tool_call_id='pyd_ai_tool_call_id', ) ], usage=RequestUsage(input_tokens=63, output_tokens=13), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='calculate_answer', content=42, tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ), ModelResponse( parts=[ TextPart( content='The answer to the ultimate question of life, the universe, and everything is 42.' ) ], usage=RequestUsage(input_tokens=64, output_tokens=28), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` 1. Generate a task ID that can be tracked independently of the tool call ID. 1. The optional `metadata` parameter passes the `task_id` so it can be matched with results later, accessible in `DeferredToolRequests.metadata` keyed by `tool_call_id`. 1. 
In reality, this would typically happen in a separate process that polls for the task status or is notified when all pending tasks are complete. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## See Also - [Function Tools](https://ai.pydantic.dev/tools/index.md) - Basic tool concepts and registration - [Advanced Tool Features](https://ai.pydantic.dev/tools-advanced/index.md) - Custom schemas, dynamic tools, and execution details - [Toolsets](https://ai.pydantic.dev/toolsets/index.md) - Managing collections of tools, including `ExternalToolset` for external tools - [Message History](https://ai.pydantic.dev/message-history/index.md) - Understanding how to work with message history for deferred tools # Direct Model Requests The `direct` module provides low-level methods for making imperative requests to LLMs where the only abstraction is input and output schema translation, enabling you to use all models with the same API. These methods are thin wrappers around the Model implementations, offering a simpler interface when you don't need the full functionality of an Agent. The following functions are available: - model_request: Make a non-streamed async request to a model - model_request_sync: Make a non-streamed synchronous request to a model - model_request_stream: Make a streamed async request to a model - model_request_stream_sync: Make a streamed sync request to a model ## Basic Example Here's a simple example demonstrating how to use the direct API to make a basic request: direct_basic.py ```python from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync # Make a synchronous request to the model model_response = model_request_sync( 'anthropic:claude-haiku-4-5', [ModelRequest.user_text_prompt('What is the capital of France?')] ) print(model_response.parts[0].content) #> The capital of France is Paris. print(model_response.usage) #> RequestUsage(input_tokens=56, output_tokens=7) ``` *(This example is complete, it can be run "as is")* ## Advanced Example with Tool Calling You can also use the direct API to work with function/tool calling. Even here we can use Pydantic to generate the JSON schema for the tool: ```python from typing import Literal from pydantic import BaseModel from pydantic_ai import ModelRequest, ToolDefinition from pydantic_ai.direct import model_request from pydantic_ai.models import ModelRequestParameters class Divide(BaseModel): """Divide two numbers.""" numerator: float denominator: float on_inf: Literal['error', 'infinity'] = 'infinity' async def main(): # Make a request to the model with tool access model_response = await model_request( 'openai:gpt-5-nano', [ModelRequest.user_text_prompt('What is 123 / 456?')], model_request_parameters=ModelRequestParameters( function_tools=[ ToolDefinition( name=Divide.__name__.lower(), description=Divide.__doc__, parameters_json_schema=Divide.model_json_schema(), ) ], allow_text_output=True, # Allow model to either use tools or respond directly ), ) print(model_response) """ ModelResponse( parts=[ ToolCallPart( tool_name='divide', args={'numerator': '123', 'denominator': '456'}, tool_call_id='pyd_ai_2e0e396768a14fe482df90a29a78dc7b', ) ], usage=RequestUsage(input_tokens=55, output_tokens=7), model_name='gpt-5-nano', timestamp=datetime.datetime(...), ) """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## When to Use the direct API vs Agent The direct API is ideal when: 1. 
You need more direct control over model interactions 1. You want to implement custom behavior around model requests 1. You're building your own abstractions on top of model interactions For most application use cases, the higher-level Agent API provides a more convenient interface with additional features such as built-in tool execution, retrying, structured output parsing, and more. ## OpenTelemetry or Logfire Instrumentation As with agents, you can enable OpenTelemetry/Logfire instrumentation with just a few extra lines direct_instrumented.py ```python import logfire from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync logfire.configure() logfire.instrument_pydantic_ai() # Make a synchronous request to the model model_response = model_request_sync( 'anthropic:claude-haiku-4-5', [ModelRequest.user_text_prompt('What is the capital of France?')], ) print(model_response.parts[0].content) #> The capital of France is Paris. ``` *(This example is complete, it can be run "as is")* You can also enable OpenTelemetry on a per call basis: direct_instrumented.py ```python import logfire from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync logfire.configure() # Make a synchronous request to the model model_response = model_request_sync( 'anthropic:claude-haiku-4-5', [ModelRequest.user_text_prompt('What is the capital of France?')], instrument=True ) print(model_response.parts[0].content) #> The capital of France is Paris. ``` See [Debugging and Monitoring](https://ai.pydantic.dev/logfire/index.md) for more details, including how to instrument with plain OpenTelemetry without Logfire. # Embeddings Embeddings are vector representations of text that capture semantic meaning. They're essential for building: - **Semantic search** — Find documents based on meaning, not just keyword matching - **RAG (Retrieval-Augmented Generation)** — Retrieve relevant context for your AI agents - **Similarity detection** — Find similar documents, detect duplicates, or cluster content - **Classification** — Use embeddings as features for downstream ML models Pydantic AI provides a unified interface for generating embeddings across multiple providers. ## Quick Start The Embedder class is the high-level interface for generating embeddings: embeddings_quickstart.py ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): # Embed a search query result = await embedder.embed_query('What is machine learning?') print(f'Embedding dimensions: {len(result.embeddings[0])}') #> Embedding dimensions: 1536 # Embed multiple documents at once docs = [ 'Machine learning is a subset of AI.', 'Deep learning uses neural networks.', 'Python is a programming language.', ] result = await embedder.embed_documents(docs) print(f'Embedded {len(result.embeddings)} documents') #> Embedded 3 documents ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* Queries vs Documents Some embedding models optimize differently for queries and documents. Use embed_query() for search queries and embed_documents() for content you're indexing. ## Embedding Result All embed methods return an EmbeddingResult containing the embeddings along with useful metadata. For convenience, you can access embeddings either by index (`result[0]`) or by the original input text (`result['Hello world']`). 
embedding_result.py ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('Hello world') # Access embeddings - each is a sequence of floats embedding = result.embeddings[0] # By index via .embeddings embedding = result[0] # Or directly via __getitem__ embedding = result['Hello world'] # Or by original input text print(f'Dimensions: {len(embedding)}') #> Dimensions: 1536 # Check usage print(f'Tokens used: {result.usage.input_tokens}') #> Tokens used: 2 # Calculate cost (requires `genai-prices` to have pricing data for the model) cost = result.cost() print(f'Cost: ${cost.total_price:.6f}') #> Cost: $0.000000 ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Providers ### OpenAI OpenAIEmbeddingModel works with OpenAI's embeddings API and any [OpenAI-compatible provider](https://ai.pydantic.dev/models/openai/#openai-compatible-models). #### Install To use OpenAI embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openai` optional group: ```bash pip install "pydantic-ai-slim[openai]" ``` ```bash uv add "pydantic-ai-slim[openai]" ``` #### Configuration To use `OpenAIEmbeddingModel` with the OpenAI API, go to [platform.openai.com](https://platform.openai.com/) and follow your nose until you find the place to generate an API key. Once you have the API key, you can set it as an environment variable: ```bash export OPENAI_API_KEY='your-api-key' ``` You can then use the model: openai_embeddings.py ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 1536 ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* See [OpenAI's embedding models](https://platform.openai.com/docs/guides/embeddings) for available models. 
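To tie the embedder configured above back to the use cases listed at the top of this page, here's a small sketch of semantic search over a toy corpus. It is our own example rather than one from the upstream docs: the `cosine_similarity` helper and the sample texts are illustrative, and it only relies on `embed_documents()`, `embed_query()`, and the `embeddings` attribute shown earlier:

```python
import math
from collections.abc import Sequence

from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


async def main():
    docs = [
        'Machine learning is a subset of AI.',
        'Python is a programming language.',
    ]
    doc_vectors = (await embedder.embed_documents(docs)).embeddings
    query_vector = (await embedder.embed_query('What is machine learning?')).embeddings[0]

    # Rank the documents by similarity to the query and pick the best match
    ranked = sorted(
        zip(docs, doc_vectors),
        key=lambda pair: cosine_similarity(query_vector, pair[1]),
        reverse=True,
    )
    print(ranked[0][0])
```

You'll need to add `asyncio.run(main())` and set `OPENAI_API_KEY` to run this; it should print the machine learning sentence as the closest match, though the exact scores depend on the live API response.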
#### Dimension Control OpenAI's `text-embedding-3-*` models support dimension reduction via the `dimensions` setting: openai_dimensions.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import EmbeddingSettings embedder = Embedder( 'openai:text-embedding-3-small', settings=EmbeddingSettings(dimensions=256), ) async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 256 ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* #### OpenAI-Compatible Providers Since OpenAIEmbeddingModel uses the same provider system as OpenAIChatModel, you can use it with any [OpenAI-compatible provider](https://ai.pydantic.dev/models/openai/#openai-compatible-models): openai_compatible_embeddings.py ```python # Using Azure OpenAI from openai import AsyncAzureOpenAI from pydantic_ai import Embedder from pydantic_ai.embeddings.openai import OpenAIEmbeddingModel from pydantic_ai.providers.openai import OpenAIProvider azure_client = AsyncAzureOpenAI( azure_endpoint='https://your-resource.openai.azure.com', api_version='2024-02-01', api_key='your-azure-key', ) model = OpenAIEmbeddingModel( 'text-embedding-3-small', provider=OpenAIProvider(openai_client=azure_client), ) embedder = Embedder(model) # Using any OpenAI-compatible API model = OpenAIEmbeddingModel( 'your-model-name', provider=OpenAIProvider( base_url='https://your-provider.com/v1', api_key='your-api-key', ), ) embedder = Embedder(model) ``` For providers with dedicated provider classes (like OllamaProvider or AzureProvider), you can use the shorthand syntax: ```python from pydantic_ai import Embedder embedder = Embedder('azure:text-embedding-3-small') embedder = Embedder('ollama:nomic-embed-text') ``` See [OpenAI-compatible Models](https://ai.pydantic.dev/models/openai/#openai-compatible-models) for the full list of supported providers. ### Google GoogleEmbeddingModel works with Google's embedding models via the Gemini API (Google AI Studio) or Vertex AI. #### Install To use Google embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `google` optional group: ```bash pip install "pydantic-ai-slim[google]" ``` ```bash uv add "pydantic-ai-slim[google]" ``` #### Configuration To use `GoogleEmbeddingModel` with the Gemini API, go to [aistudio.google.com](https://aistudio.google.com/) and generate an API key. Once you have the API key, you can set it as an environment variable: ```bash export GOOGLE_API_KEY='your-api-key' ``` You can then use the model: google_embeddings.py ```python from pydantic_ai import Embedder embedder = Embedder('google-gla:gemini-embedding-001') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 3072 ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* See the [Google Embeddings documentation](https://ai.google.dev/gemini-api/docs/embeddings) for available models. 
##### Vertex AI To use Google's embedding models via Vertex AI instead of the Gemini API, use the `google-vertex` provider prefix: google_vertex_embeddings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.google import GoogleEmbeddingModel from pydantic_ai.providers.google import GoogleProvider # Using provider prefix embedder = Embedder('google-vertex:gemini-embedding-001') # Or with explicit provider configuration model = GoogleEmbeddingModel( 'gemini-embedding-001', provider=GoogleProvider(vertexai=True, project='my-project', location='us-central1'), ) embedder = Embedder(model) ``` See the [Google provider documentation](https://ai.pydantic.dev/models/google/#vertex-ai-enterprisecloud) for more details on Vertex AI authentication options, including application default credentials, service accounts, and API keys. #### Dimension Control Google's embedding models support dimension reduction via the `dimensions` setting: google_dimensions.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import EmbeddingSettings embedder = Embedder( 'google-gla:gemini-embedding-001', settings=EmbeddingSettings(dimensions=768), ) async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 768 ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* #### Google-Specific Settings Google models support additional settings via GoogleEmbeddingSettings: google_settings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.google import GoogleEmbeddingSettings embedder = Embedder( 'google-gla:gemini-embedding-001', settings=GoogleEmbeddingSettings( dimensions=768, google_task_type='SEMANTIC_SIMILARITY', # Optimize for similarity comparison ), ) ``` See [Google's task type documentation](https://ai.google.dev/gemini-api/docs/embeddings#task-types) for available task types. By default, `embed_query()` uses `RETRIEVAL_QUERY` and `embed_documents()` uses `RETRIEVAL_DOCUMENT`. ### Cohere CohereEmbeddingModel provides access to Cohere's embedding models, which offer multilingual support and various model sizes. #### Install To use Cohere embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cohere` optional group: ```bash pip install "pydantic-ai-slim[cohere]" ``` ```bash uv add "pydantic-ai-slim[cohere]" ``` #### Configuration To use `CohereEmbeddingModel`, go to [dashboard.cohere.com/api-keys](https://dashboard.cohere.com/api-keys) and follow your nose until you find the place to generate an API key. Once you have the API key, you can set it as an environment variable: ```bash export CO_API_KEY='your-api-key' ``` You can then use the model: cohere_embeddings.py ```python from pydantic_ai import Embedder embedder = Embedder('cohere:embed-v4.0') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 1024 ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* See the [Cohere Embed documentation](https://docs.cohere.com/docs/cohere-embed) for available models. 
#### Cohere-Specific Settings Cohere models support additional settings via CohereEmbeddingSettings: cohere_settings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.cohere import CohereEmbeddingSettings embedder = Embedder( 'cohere:embed-v4.0', settings=CohereEmbeddingSettings( dimensions=512, cohere_truncate='END', # Truncate long inputs instead of erroring cohere_max_tokens=256, # Limit tokens per input ), ) ``` ### VoyageAI VoyageAIEmbeddingModel provides access to VoyageAI's embedding models, which are optimized for retrieval with specialized models for code, finance, and legal domains. #### Install To use VoyageAI embedding models, you need to install `pydantic-ai-slim` with the `voyageai` optional group: ```bash pip install "pydantic-ai-slim[voyageai]" ``` ```bash uv add "pydantic-ai-slim[voyageai]" ``` #### Configuration To use `VoyageAIEmbeddingModel`, go to [dash.voyageai.com](https://dash.voyageai.com/) to generate an API key. Once you have the API key, you can set it as an environment variable: ```bash export VOYAGE_API_KEY='your-api-key' ``` You can then use the model: voyageai_embeddings.py ```python from pydantic_ai import Embedder embedder = Embedder('voyageai:voyage-3.5') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 1024 ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* See the [VoyageAI Embeddings documentation](https://docs.voyageai.com/docs/embeddings) for available models. #### VoyageAI-Specific Settings VoyageAI models support additional settings via VoyageAIEmbeddingSettings: voyageai_settings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingSettings embedder = Embedder( 'voyageai:voyage-3.5', settings=VoyageAIEmbeddingSettings( dimensions=512, # Reduce output dimensions voyageai_input_type='document', # Override input type for all requests ), ) ``` ### Bedrock BedrockEmbeddingModel provides access to embedding models through AWS Bedrock, including Amazon Titan, Cohere, and Amazon Nova models. #### Install To use Bedrock embedding models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `bedrock` optional group: ```bash pip install "pydantic-ai-slim[bedrock]" ``` ```bash uv add "pydantic-ai-slim[bedrock]" ``` #### Configuration Authentication with AWS Bedrock uses standard AWS credentials. See the [Bedrock provider documentation](https://ai.pydantic.dev/models/bedrock/#environment-variables) for details on configuring credentials via environment variables, AWS credentials file, or IAM roles. Ensure your AWS account has access to the Bedrock embedding models you want to use. See [AWS Bedrock model access](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html) for details. #### Basic Usage bedrock_embeddings.py ```python from pydantic_ai import Embedder # Using Amazon Titan embedder = Embedder('bedrock:amazon.titan-embed-text-v2:0') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 1024 ``` *(This example requires AWS credentials configured)* #### Supported Models Bedrock supports three families of embedding models. See the [AWS Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for the full list of available models. 
**Amazon Titan:** - `amazon.titan-embed-text-v1` — 1536 dimensions (fixed), 8K tokens - `amazon.titan-embed-text-v2:0` — 256/384/1024 dimensions (configurable, default: 1024), 8K tokens **Cohere Embed:** - `cohere.embed-english-v3` — English-only, 1024 dimensions (fixed), 512 tokens - `cohere.embed-multilingual-v3` — Multilingual, 1024 dimensions (fixed), 512 tokens - `cohere.embed-v4:0` — 256/512/1024/1536 dimensions (configurable, default: 1536), 128K tokens **Amazon Nova:** - `amazon.nova-2-multimodal-embeddings-v1:0` — 256/384/1024/3072 dimensions (configurable, default: 3072), 8K tokens #### Titan-Specific Settings Titan v2 supports vector normalization for direct similarity calculations via `bedrock_titan_normalize` (default: `True`). Titan v1 does not support this setting. bedrock_titan.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings embedder = Embedder( 'bedrock:amazon.titan-embed-text-v2:0', settings=BedrockEmbeddingSettings( dimensions=512, bedrock_titan_normalize=True, ), ) ``` Note Titan models do not support the `truncate` setting. The `dimensions` setting is only supported by Titan v2. #### Cohere-Specific Settings Cohere models on Bedrock support additional settings via BedrockEmbeddingSettings: - `bedrock_cohere_input_type` — By default, `embed_query()` uses `'search_query'` and `embed_documents()` uses `'search_document'`. Also accepts `'classification'` or `'clustering'`. - `bedrock_cohere_truncate` — Fine-grained truncation control: `'NONE'` (default, error on overflow), `'START'`, or `'END'`. Overrides the base `truncate` setting. - `bedrock_cohere_max_tokens` — Limits tokens per input (default: 128000). Only supported by Cohere v4. bedrock_cohere.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings embedder = Embedder( 'bedrock:cohere.embed-v4:0', settings=BedrockEmbeddingSettings( dimensions=512, bedrock_cohere_max_tokens=1000, bedrock_cohere_truncate='END', ), ) ``` Note The `dimensions` and `bedrock_cohere_max_tokens` settings are only supported by Cohere v4. Cohere v3 models have fixed 1024 dimensions. #### Nova-Specific Settings Nova models on Bedrock support additional settings via BedrockEmbeddingSettings: - `bedrock_nova_truncate` — Fine-grained truncation control: `'NONE'` (default, error on overflow), `'START'`, or `'END'`. Overrides the base `truncate` setting. - `bedrock_nova_embedding_purpose` — By default, `embed_query()` uses `'GENERIC_RETRIEVAL'` and `embed_documents()` uses `'GENERIC_INDEX'`. Also accepts `'TEXT_RETRIEVAL'`, `'CLASSIFICATION'`, or `'CLUSTERING'`. bedrock_nova.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings embedder = Embedder( 'bedrock:amazon.nova-2-multimodal-embeddings-v1:0', settings=BedrockEmbeddingSettings( dimensions=1024, bedrock_nova_embedding_purpose='TEXT_RETRIEVAL', truncate=True, ), ) ``` #### Concurrency Settings Models that don't support batch embedding (Titan and Nova) make individual API requests for each input text. By default, these requests run concurrently with a maximum of 5 parallel requests. 
You can adjust this with the `bedrock_max_concurrency` setting: bedrock_concurrency.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings # Increase concurrency for faster throughput embedder = Embedder( 'bedrock:amazon.titan-embed-text-v2:0', settings=BedrockEmbeddingSettings(bedrock_max_concurrency=10), ) # Or reduce concurrency to avoid rate limits embedder = Embedder( 'bedrock:amazon.nova-2-multimodal-embeddings-v1:0', settings=BedrockEmbeddingSettings(bedrock_max_concurrency=2), ) ``` #### Regional Prefixes (Cross-Region Inference) Bedrock supports cross-region inference using geographic prefixes like `us.`, `eu.`, or `apac.`: bedrock_regional.py ```python from pydantic_ai import Embedder embedder = Embedder('bedrock:us.amazon.titan-embed-text-v2:0') ``` #### Using a Custom Provider For advanced configuration like explicit credentials or a custom boto3 client, you can create a BedrockProvider directly. See the [Bedrock provider documentation](https://ai.pydantic.dev/models/bedrock/#provider-argument) for more details. bedrock_provider.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.bedrock import BedrockEmbeddingModel from pydantic_ai.providers.bedrock import BedrockProvider provider = BedrockProvider( region_name='us-west-2', aws_access_key_id='your-access-key', aws_secret_access_key='your-secret-key', ) model = BedrockEmbeddingModel('amazon.titan-embed-text-v2:0', provider=provider) embedder = Embedder(model) ``` Token Counting Bedrock embedding models do not support the `count_tokens()` method because AWS Bedrock's token counting API only works with text generation models (Claude, Llama, etc.), not embedding models. Calling `count_tokens()` will raise `NotImplementedError`. ### Sentence Transformers (Local) SentenceTransformerEmbeddingModel runs embeddings locally using the [sentence-transformers](https://www.sbert.net/) library. This is ideal for: - **Privacy** — Data never leaves your infrastructure - **Cost** — No API charges for high-volume workloads - **Offline use** — No internet connection required after model download #### Install To use Sentence Transformers embedding models, you need to install `pydantic-ai-slim` with the `sentence-transformers` optional group: ```bash pip install "pydantic-ai-slim[sentence-transformers]" ``` ```bash uv add "pydantic-ai-slim[sentence-transformers]" ``` #### Usage sentence_transformers_embeddings.py ```python from pydantic_ai import Embedder # Model is downloaded from Hugging Face on first use embedder = Embedder('sentence-transformers:all-MiniLM-L6-v2') async def main(): result = await embedder.embed_query('Hello world') print(len(result.embeddings[0])) #> 384 ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* See the [Sentence-Transformers pretrained models](https://www.sbert.net/docs/sentence_transformer/pretrained_models.html) documentation for available models. 
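Because the model runs locally, it's also a convenient way to produce features for the "classification" use case mentioned at the top of this page. The following is a rough sketch of that idea rather than an official recipe: it additionally assumes you have scikit-learn installed, and the toy texts and labels are made up:

```python
from sklearn.linear_model import LogisticRegression

from pydantic_ai import Embedder

embedder = Embedder('sentence-transformers:all-MiniLM-L6-v2')


async def main():
    # Toy sentiment dataset: 1 = positive, 0 = negative (illustrative only)
    texts = [
        'Great product, would buy again.',
        'Absolutely love it, works perfectly.',
        'Terrible, it broke after a day.',
        'Waste of money, very disappointed.',
    ]
    labels = [1, 1, 0, 0]

    # Embed the texts locally and use the vectors as classifier features
    result = await embedder.embed_documents(texts)
    features = [list(embedding) for embedding in result.embeddings]
    classifier = LogisticRegression().fit(features, labels)

    # Classify a new, unseen text
    new_vector = (await embedder.embed_documents(['Really happy with this purchase.'])).embeddings[0]
    print(classifier.predict([list(new_vector)]))
```

You'll need to add `asyncio.run(main())` to run this. For real workloads you'd want far more labeled examples, but the shape of the pipeline stays the same: embed once, then hand the vectors to whatever downstream model you like.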
#### Device Selection Control which device to use for inference: sentence_transformers_device.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings.sentence_transformers import ( SentenceTransformersEmbeddingSettings, ) embedder = Embedder( 'sentence-transformers:all-MiniLM-L6-v2', settings=SentenceTransformersEmbeddingSettings( sentence_transformers_device='cuda', # Use GPU sentence_transformers_normalize_embeddings=True, # L2 normalize ), ) ``` #### Using an Existing Model Instance If you need more control over model initialization: sentence_transformers_instance.py ```python from sentence_transformers import SentenceTransformer from pydantic_ai import Embedder from pydantic_ai.embeddings.sentence_transformers import ( SentenceTransformerEmbeddingModel, ) # Create and configure the model yourself st_model = SentenceTransformer('all-MiniLM-L6-v2', device='cpu') # Wrap it for use with Pydantic AI model = SentenceTransformerEmbeddingModel(st_model) embedder = Embedder(model) ``` ## Settings EmbeddingSettings provides common configuration options that work across providers: - `dimensions`: Reduce the output embedding dimensions (supported by OpenAI, Google, Cohere, Bedrock, VoyageAI) - `truncate`: When `True`, truncate input text that exceeds the model's context length instead of raising an error (supported by Cohere, Bedrock, VoyageAI) Settings can be specified at the embedder level (applied to all calls) or per-call: embedding_settings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import EmbeddingSettings # Default settings for all calls embedder = Embedder( 'openai:text-embedding-3-small', settings=EmbeddingSettings(dimensions=512), ) async def main(): # Override for a specific call result = await embedder.embed_query( 'Hello world', settings=EmbeddingSettings(dimensions=256), ) print(len(result.embeddings[0])) #> 256 ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Token Counting You can check token counts before embedding to avoid exceeding model limits: token_counting.py ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): text = 'Hello world, this is a test.' 
# Count tokens in text token_count = await embedder.count_tokens(text) print(f'Tokens: {token_count}') #> Tokens: 7 # Check model's maximum input tokens (returns None if unknown) max_tokens = await embedder.max_input_tokens() print(f'Max tokens: {max_tokens}') #> Max tokens: 1024 ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Testing Use TestEmbeddingModel for testing without making API calls: testing_embeddings.py ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import TestEmbeddingModel async def test_my_rag_system(): embedder = Embedder('openai:text-embedding-3-small') test_model = TestEmbeddingModel() with embedder.override(model=test_model): result = await embedder.embed_query('test query') # TestEmbeddingModel returns deterministic embeddings assert result.embeddings[0] == [1.0] * 8 # Check what settings were used assert test_model.last_settings is not None ``` ## Instrumentation Enable OpenTelemetry instrumentation for debugging and monitoring: instrumented_embeddings.py ```python import logfire from pydantic_ai import Embedder logfire.configure() # Instrument a specific embedder embedder = Embedder('openai:text-embedding-3-small', instrument=True) # Or instrument all embedders globally Embedder.instrument_all() ``` See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/index.md) for more details on using Logfire with Pydantic AI. ## Building Custom Embedding Models To integrate a custom embedding provider, subclass EmbeddingModel: custom_embedding_model.py ```python from collections.abc import Sequence from pydantic_ai.embeddings import EmbeddingModel, EmbeddingResult, EmbeddingSettings from pydantic_ai.embeddings.result import EmbedInputType class MyCustomEmbeddingModel(EmbeddingModel): @property def model_name(self) -> str: return 'my-custom-model' @property def system(self) -> str: return 'my-provider' async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None, ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) # Call your embedding API here embeddings = [[0.1, 0.2, 0.3] for _ in inputs] # Placeholder return EmbeddingResult( embeddings=embeddings, inputs=inputs, input_type=input_type, model_name=self.model_name, provider_name=self.system, ) ``` Use WrapperEmbeddingModel if you want to wrap an existing model to add custom behavior like caching or logging. # Image, Audio, Video & Document Input Some LLMs are now capable of understanding audio, video, image and document content. ## Image Input Info Some models do not support image input. Please check the model's documentation to confirm whether it supports image input. If you have a direct URL for the image, you can use ImageUrl: [Learn about Gateway](https://ai.pydantic.dev/gateway) image_input.py ```python from pydantic_ai import Agent, ImageUrl agent = Agent(model='gateway/openai:gpt-5.2') result = agent.run_sync( [ 'What company is this logo from?', ImageUrl(url='https://iili.io/3Hs4FMg.png'), ] ) print(result.output) #> This is the logo for Pydantic, a data validation and settings management library in Python. 
``` image_input.py ```python from pydantic_ai import Agent, ImageUrl agent = Agent(model='openai:gpt-5.2') result = agent.run_sync( [ 'What company is this logo from?', ImageUrl(url='https://iili.io/3Hs4FMg.png'), ] ) print(result.output) #> This is the logo for Pydantic, a data validation and settings management library in Python. ``` If you have the image locally, you can also use BinaryContent: [Learn about Gateway](https://ai.pydantic.dev/gateway) local_image_input.py ```python import httpx from pydantic_ai import Agent, BinaryContent image_response = httpx.get('https://iili.io/3Hs4FMg.png') # Pydantic logo agent = Agent(model='gateway/openai:gpt-5.2') result = agent.run_sync( [ 'What company is this logo from?', BinaryContent(data=image_response.content, media_type='image/png'), # (1)! ] ) print(result.output) #> This is the logo for Pydantic, a data validation and settings management library in Python. ``` 1. To ensure the example is runnable we download this image from the web, but you can also use `Path().read_bytes()` to read a local file's contents. local_image_input.py ```python import httpx from pydantic_ai import Agent, BinaryContent image_response = httpx.get('https://iili.io/3Hs4FMg.png') # Pydantic logo agent = Agent(model='openai:gpt-5.2') result = agent.run_sync( [ 'What company is this logo from?', BinaryContent(data=image_response.content, media_type='image/png'), # (1)! ] ) print(result.output) #> This is the logo for Pydantic, a data validation and settings management library in Python. ``` 1. To ensure the example is runnable we download this image from the web, but you can also use `Path().read_bytes()` to read a local file's contents. ## Audio Input Info Some models do not support audio input. Please check the model's documentation to confirm whether it supports audio input. You can provide audio input using either AudioUrl or BinaryContent. The process is analogous to the examples above. ## Video Input Info Some models do not support video input. Please check the model's documentation to confirm whether it supports video input. You can provide video input using either VideoUrl or BinaryContent. The process is analogous to the examples above. ## Document Input Info Some models do not support document input. Please check the model's documentation to confirm whether it supports document input. You can provide document input using either DocumentUrl or BinaryContent. The process is similar to the examples above. If you have a direct URL for the document, you can use DocumentUrl: [Learn about Gateway](https://ai.pydantic.dev/gateway) document_input.py ```python from pydantic_ai import Agent, DocumentUrl agent = Agent(model='gateway/anthropic:claude-sonnet-4-5') result = agent.run_sync( [ 'What is the main content of this document?', DocumentUrl(url='https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf'), ] ) print(result.output) #> This document is the technical report introducing Gemini 1.5, Google's latest large language model... ``` document_input.py ```python from pydantic_ai import Agent, DocumentUrl agent = Agent(model='anthropic:claude-sonnet-4-5') result = agent.run_sync( [ 'What is the main content of this document?', DocumentUrl(url='https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf'), ] ) print(result.output) #> This document is the technical report introducing Gemini 1.5, Google's latest large language model... ``` The supported document formats vary by model. 
You can also use BinaryContent to pass document data directly: [Learn about Gateway](https://ai.pydantic.dev/gateway) binary_content_input.py ```python from pathlib import Path from pydantic_ai import Agent, BinaryContent pdf_path = Path('document.pdf') agent = Agent(model='gateway/anthropic:claude-sonnet-4-5') result = agent.run_sync( [ 'What is the main content of this document?', BinaryContent(data=pdf_path.read_bytes(), media_type='application/pdf'), ] ) print(result.output) #> The document discusses... ``` binary_content_input.py ```python from pathlib import Path from pydantic_ai import Agent, BinaryContent pdf_path = Path('document.pdf') agent = Agent(model='anthropic:claude-sonnet-4-5') result = agent.run_sync( [ 'What is the main content of this document?', BinaryContent(data=pdf_path.read_bytes(), media_type='application/pdf'), ] ) print(result.output) #> The document discusses... ``` ## User-side download vs. direct file URL When using one of `ImageUrl`, `AudioUrl`, `VideoUrl` or `DocumentUrl`, Pydantic AI will default to sending the URL to the model provider, so the file is downloaded on their side. Support for file URLs varies depending on type and provider: | Model | Send URL directly | Download and send bytes | Unsupported | | -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------- | ----------------------------------------------- | | OpenAIChatModel | `ImageUrl` | `AudioUrl`, `DocumentUrl` | `VideoUrl` | | OpenAIResponsesModel | `ImageUrl`, `AudioUrl`, `DocumentUrl` | — | `VideoUrl` | | AnthropicModel | `ImageUrl`, `DocumentUrl` (PDF) | `DocumentUrl` (`text/plain`) | `AudioUrl`, `VideoUrl` | | GoogleModel (Vertex) | All URL types | — | — | | GoogleModel (GLA) | [YouTube](https://ai.pydantic.dev/models/google/#document-image-audio-and-video-input), [Files API](https://ai.pydantic.dev/models/google/#document-image-audio-and-video-input) | All other URLs | — | | XaiModel | `ImageUrl` | `DocumentUrl` | `AudioUrl`, `VideoUrl` | | MistralModel | `ImageUrl`, `DocumentUrl` (PDF) | — | `AudioUrl`, `VideoUrl`, `DocumentUrl` (non-PDF) | | BedrockConverseModel | S3 URLs (`s3://`) | `ImageUrl`, `DocumentUrl`, `VideoUrl` | `AudioUrl` | | OpenRouterModel | `ImageUrl`, `DocumentUrl` | `AudioUrl` | `VideoUrl` | A model API may be unable to download a file (e.g., because of crawling or access restrictions) even if it supports file URLs. For example, GoogleModel on Vertex AI limits YouTube video URLs to one URL per request. In such cases, you can instruct Pydantic AI to download the file content locally and send that instead of the URL by setting `force_download` on the URL object: force_download.py ```python from pydantic_ai import ImageUrl, AudioUrl, VideoUrl, DocumentUrl ImageUrl(url='https://example.com/image.png', force_download=True) AudioUrl(url='https://example.com/audio.mp3', force_download=True) VideoUrl(url='https://example.com/video.mp4', force_download=True) DocumentUrl(url='https://example.com/doc.pdf', force_download=True) ``` ## Uploaded Files Some model providers support passing URLs to files hosted on their platform: - GoogleModel supports the [Files API](https://ai.pydantic.dev/models/google/#document-image-audio-and-video-input) for uploading and referencing files. - BedrockConverseModel supports `s3:///` URIs, provided that the assumed role has the `s3:GetObject` permission. 
An optional `bucketOwner` query parameter must be specified if the bucket is not owned by the account making the request. For example: `s3://my-bucket/my-file.png?bucketOwner=123456789012`. # Function Tools Function tools provide a mechanism for models to perform actions and retrieve extra information to help them generate a response. They're useful when you want to enable the model to take some action and use the result, when it is impractical or impossible to put all the context an agent might need into the instructions, or when you want to make agents' behavior more deterministic or reliable by deferring some of the logic required to generate a response to another (not necessarily AI-powered) tool. If you want a model to be able to call a function as its final action, without the result being sent back to the model, you can use an [output function](https://ai.pydantic.dev/output/#output-functions) instead. There are a number of ways to register tools with an agent: - via the @agent.tool decorator — for tools that need access to the agent context - via the @agent.tool_plain decorator — for tools that do not need access to the agent context - via the tools keyword argument to `Agent` which can take either plain functions, or instances of Tool For more advanced use cases, the [toolsets](https://ai.pydantic.dev/toolsets/index.md) feature lets you manage collections of tools (built by you or provided by an [MCP server](https://ai.pydantic.dev/mcp/client/index.md) or other [third party](https://ai.pydantic.dev/third-party-tools/#third-party-tools)) and register them with an agent in one go via the toolsets keyword argument to `Agent`. Internally, all `tools` and `toolsets` are gathered into a single [combined toolset](https://ai.pydantic.dev/toolsets/#combining-toolsets) that's made available to the model. Function tools vs. RAG Function tools are basically the "R" of RAG (Retrieval-Augmented Generation) — they augment what the model can do by letting it request extra information. The main semantic difference between Pydantic AI Tools and RAG is RAG is synonymous with vector search, while Pydantic AI tools are more general-purpose. For vector search, you can use our [embeddings](https://ai.pydantic.dev/embeddings/index.md) support to generate embeddings across multiple providers. Function Tools vs. Structured Outputs As the name suggests, function tools use the model's "tools" or "functions" API to let the model know what is available to call. Tools or functions are also used to define the schema(s) for [structured output](https://ai.pydantic.dev/output/index.md) when using the default [tool output mode](https://ai.pydantic.dev/output/#tool-output), thus a model might have access to many tools, some of which call function tools while others end the run and produce a final output. ## Registering via Decorator `@agent.tool` is considered the default decorator since in the majority of cases tools will need access to the agent context. Here's an example using both: [Learn about Gateway](https://ai.pydantic.dev/gateway) dice_game.py ```python import random from pydantic_ai import Agent, RunContext agent = Agent( 'gateway/gemini:gemini-3-flash-preview', # (1)! deps_type=str, # (2)! instructions=( "You're a dice game, you should roll the die and see if the number " "you get back matches the user's guess. If so, tell them they're a winner. " "Use the player's name in the response." ), ) @agent.tool_plain # (3)! 
def roll_dice() -> str: """Roll a six-sided die and return the result.""" return str(random.randint(1, 6)) @agent.tool # (4)! def get_player_name(ctx: RunContext[str]) -> str: """Get the player's name.""" return ctx.deps dice_result = agent.run_sync('My guess is 4', deps='Anne') # (5)! print(dice_result.output) #> Congratulations Anne, you guessed correctly! You're a winner! ``` 1. This is a pretty simple task, so we can use the fast and cheap Gemini flash model. 1. We pass the user's name as the dependency, to keep things simple we use just the name as a string as the dependency. 1. This tool doesn't need any context, it just returns a random number. You could probably use dynamic instructions in this case. 1. This tool needs the player's name, so it uses `RunContext` to access dependencies which are just the player's name in this case. 1. Run the agent, passing the player's name as the dependency. dice_game.py ```python import random from pydantic_ai import Agent, RunContext agent = Agent( 'google-gla:gemini-3-flash-preview', # (1)! deps_type=str, # (2)! instructions=( "You're a dice game, you should roll the die and see if the number " "you get back matches the user's guess. If so, tell them they're a winner. " "Use the player's name in the response." ), ) @agent.tool_plain # (3)! def roll_dice() -> str: """Roll a six-sided die and return the result.""" return str(random.randint(1, 6)) @agent.tool # (4)! def get_player_name(ctx: RunContext[str]) -> str: """Get the player's name.""" return ctx.deps dice_result = agent.run_sync('My guess is 4', deps='Anne') # (5)! print(dice_result.output) #> Congratulations Anne, you guessed correctly! You're a winner! ``` 1. This is a pretty simple task, so we can use the fast and cheap Gemini flash model. 1. We pass the user's name as the dependency, to keep things simple we use just the name as a string as the dependency. 1. This tool doesn't need any context, it just returns a random number. You could probably use dynamic instructions in this case. 1. This tool needs the player's name, so it uses `RunContext` to access dependencies which are just the player's name in this case. 1. Run the agent, passing the player's name as the dependency. *(This example is complete, it can be run "as is")* Let's print the messages from that game to see what happened: dice_game_messages.py ```python from dice_game import dice_result print(dice_result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='My guess is 4', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.", run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='roll_dice', args={}, tool_call_id='pyd_ai_tool_call_id' ) ], usage=RequestUsage(input_tokens=54, output_tokens=2), model_name='gemini-3-flash-preview', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='roll_dice', content='4', tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. 
Use the player's name in the response.", run_id='...', ), ModelResponse( parts=[ ToolCallPart( tool_name='get_player_name', args={}, tool_call_id='pyd_ai_tool_call_id' ) ], usage=RequestUsage(input_tokens=55, output_tokens=4), model_name='gemini-3-flash-preview', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ ToolReturnPart( tool_name='get_player_name', content='Anne', tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response.", run_id='...', ), ModelResponse( parts=[ TextPart( content="Congratulations Anne, you guessed correctly! You're a winner!" ) ], usage=RequestUsage(input_tokens=56, output_tokens=12), model_name='gemini-3-flash-preview', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` We can represent this with a diagram: ``` sequenceDiagram participant Agent participant LLM Note over Agent: Send prompts Agent ->> LLM: System: "You're a dice game..."
User: "My guess is 4" activate LLM Note over LLM: LLM decides to use
a tool LLM ->> Agent: Call tool
roll_dice() deactivate LLM activate Agent Note over Agent: Rolls a six-sided die Agent -->> LLM: ToolReturn
"4" deactivate Agent activate LLM Note over LLM: LLM decides to use
another tool LLM ->> Agent: Call tool
get_player_name() deactivate LLM activate Agent Note over Agent: Retrieves player name Agent -->> LLM: ToolReturn
"Anne" deactivate Agent activate LLM Note over LLM: LLM constructs final response LLM ->> Agent: ModelResponse
"Congratulations Anne, ..." deactivate LLM Note over Agent: Game session complete ``` ## Registering via Agent Argument As well as using the decorators, we can register tools via the `tools` argument to the Agent constructor. This is useful when you want to reuse tools, and can also give more fine-grained control over the tools. [Learn about Gateway](https://ai.pydantic.dev/gateway) dice_game_tool_kwarg.py ```python import random from pydantic_ai import Agent, RunContext, Tool instructions = """\ You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response. """ def roll_dice() -> str: """Roll a six-sided die and return the result.""" return str(random.randint(1, 6)) def get_player_name(ctx: RunContext[str]) -> str: """Get the player's name.""" return ctx.deps agent_a = Agent( 'gateway/gemini:gemini-3-flash-preview', deps_type=str, tools=[roll_dice, get_player_name], # (1)! instructions=instructions, ) agent_b = Agent( 'gateway/gemini:gemini-3-flash-preview', deps_type=str, tools=[ # (2)! Tool(roll_dice, takes_ctx=False), Tool(get_player_name, takes_ctx=True), ], instructions=instructions, ) dice_result = {} dice_result['a'] = agent_a.run_sync('My guess is 6', deps='Yashar') dice_result['b'] = agent_b.run_sync('My guess is 4', deps='Anne') print(dice_result['a'].output) #> Tough luck, Yashar, you rolled a 4. Better luck next time. print(dice_result['b'].output) #> Congratulations Anne, you guessed correctly! You're a winner! ``` 1. The simplest way to register tools via the `Agent` constructor is to pass a list of functions, the function signature is inspected to determine if the tool takes RunContext. 1. `agent_a` and `agent_b` are identical — but we can use Tool to reuse tool definitions and give more fine-grained control over how tools are defined, e.g. setting their name or description, or using a custom [`prepare`](https://ai.pydantic.dev/tools-advanced/#tool-prepare) method. dice_game_tool_kwarg.py ```python import random from pydantic_ai import Agent, RunContext, Tool instructions = """\ You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response. """ def roll_dice() -> str: """Roll a six-sided die and return the result.""" return str(random.randint(1, 6)) def get_player_name(ctx: RunContext[str]) -> str: """Get the player's name.""" return ctx.deps agent_a = Agent( 'google-gla:gemini-3-flash-preview', deps_type=str, tools=[roll_dice, get_player_name], # (1)! instructions=instructions, ) agent_b = Agent( 'google-gla:gemini-3-flash-preview', deps_type=str, tools=[ # (2)! Tool(roll_dice, takes_ctx=False), Tool(get_player_name, takes_ctx=True), ], instructions=instructions, ) dice_result = {} dice_result['a'] = agent_a.run_sync('My guess is 6', deps='Yashar') dice_result['b'] = agent_b.run_sync('My guess is 4', deps='Anne') print(dice_result['a'].output) #> Tough luck, Yashar, you rolled a 4. Better luck next time. print(dice_result['b'].output) #> Congratulations Anne, you guessed correctly! You're a winner! ``` 1. The simplest way to register tools via the `Agent` constructor is to pass a list of functions, the function signature is inspected to determine if the tool takes RunContext. 1. `agent_a` and `agent_b` are identical — but we can use Tool to reuse tool definitions and give more fine-grained control over how tools are defined, e.g. 
setting their name or description, or using a custom [`prepare`](https://ai.pydantic.dev/tools-advanced/#tool-prepare) method. *(This example is complete, it can be run "as is")* ## Tool Output Tools can return anything that Pydantic can serialize to JSON. For advanced output options including multi-modal content and metadata, see [Advanced Tool Features](https://ai.pydantic.dev/tools-advanced/#function-tool-output). ## Tool Schema Function parameters are extracted from the function signature, and all parameters except `RunContext` are used to build the schema for that tool call. Even better, Pydantic AI extracts the docstring from functions and (thanks to [griffe](https://mkdocstrings.github.io/griffe/)) extracts parameter descriptions from the docstring and adds them to the schema. [Griffe supports](https://mkdocstrings.github.io/griffe/reference/docstrings/#docstrings) extracting parameter descriptions from `google`, `numpy`, and `sphinx` style docstrings. Pydantic AI will infer the format to use based on the docstring, but you can explicitly set it using docstring_format. You can also enforce parameter requirements by setting `require_parameter_descriptions=True`. This will raise a UserError if a parameter description is missing. To demonstrate a tool's schema, here we use FunctionModel to print the schema a model would receive: tool_schema.py ```python from pydantic_ai import Agent, ModelMessage, ModelResponse, TextPart from pydantic_ai.models.function import AgentInfo, FunctionModel agent = Agent() @agent.tool_plain(docstring_format='google', require_parameter_descriptions=True) def foobar(a: int, b: str, c: dict[str, list[float]]) -> str: """Get me foobar. Args: a: apple pie b: banana cake c: carrot smoothie """ return f'{a} {b} {c}' def print_schema(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse: tool = info.function_tools[0] print(tool.description) #> Get me foobar. print(tool.parameters_json_schema) """ { 'additionalProperties': False, 'properties': { 'a': {'description': 'apple pie', 'type': 'integer'}, 'b': {'description': 'banana cake', 'type': 'string'}, 'c': { 'additionalProperties': {'items': {'type': 'number'}, 'type': 'array'}, 'description': 'carrot smoothie', 'type': 'object', }, }, 'required': ['a', 'b', 'c'], 'type': 'object', } """ return ModelResponse(parts=[TextPart('foobar')]) agent.run_sync('hello', model=FunctionModel(print_schema)) ``` *(This example is complete, it can be run "as is")* If a tool has a single parameter that can be represented as an object in JSON schema (e.g. dataclass, TypedDict, pydantic model), the schema for the tool is simplified to be just that object. Here's an example where we use TestModel.last_model_request_parameters to inspect the tool schema that would be passed to the model. 
single_parameter_tool.py ```python from pydantic import BaseModel from pydantic_ai import Agent from pydantic_ai.models.test import TestModel agent = Agent() class Foobar(BaseModel): """This is a Foobar""" x: int y: str z: float = 3.14 @agent.tool_plain def foobar(f: Foobar) -> str: return str(f) test_model = TestModel() result = agent.run_sync('hello', model=test_model) print(result.output) #> {"foobar":"x=0 y='a' z=3.14"} print(test_model.last_model_request_parameters.function_tools) """ [ ToolDefinition( name='foobar', parameters_json_schema={ 'properties': { 'x': {'type': 'integer'}, 'y': {'type': 'string'}, 'z': {'default': 3.14, 'type': 'number'}, }, 'required': ['x', 'y'], 'title': 'Foobar', 'type': 'object', }, description='This is a Foobar', ) ] """ ``` *(This example is complete, it can be run "as is")* Debugging Tool Calls Understanding tool behavior is crucial for agent development. By instrumenting your agent with [Logfire](https://ai.pydantic.dev/logfire/index.md), you can see: - What arguments were passed to each tool - What each tool returned - How long each tool took to execute - Any errors that occurred This visibility helps you understand why an agent made specific decisions and identify issues in tool implementations. ## See Also For more tool features and integrations, see: - [Advanced Tool Features](https://ai.pydantic.dev/tools-advanced/index.md) - Custom schemas, dynamic tools, tool execution and retries - [Toolsets](https://ai.pydantic.dev/toolsets/index.md) - Managing collections of tools - [Builtin Tools](https://ai.pydantic.dev/builtin-tools/index.md) - Native tools provided by LLM providers - [Common Tools](https://ai.pydantic.dev/common-tools/index.md) - Ready-to-use tool implementations - [Third-Party Tools](https://ai.pydantic.dev/third-party-tools/index.md) - Integrations with MCP, LangChain, ACI.dev and other tool libraries - [Deferred Tools](https://ai.pydantic.dev/deferred-tools/index.md) - Tools requiring approval or external execution # Common Tools Pydantic AI ships with native tools that can be used to enhance your agent's capabilities. ## DuckDuckGo Search Tool The DuckDuckGo search tool allows you to search the web for information. It is built on top of the [DuckDuckGo API](https://github.com/deedy5/ddgs). ### Installation To use duckduckgo_search_tool, you need to install [`pydantic-ai-slim`](https://ai.pydantic.dev/install/#slim-install) with the `duckduckgo` optional group: ```bash pip install "pydantic-ai-slim[duckduckgo]" ``` ```bash uv add "pydantic-ai-slim[duckduckgo]" ``` ### Usage Here's an example of how you can use the DuckDuckGo search tool with an agent: [Learn about Gateway](https://ai.pydantic.dev/gateway) duckduckgo_search.py ```python from pydantic_ai import Agent from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool agent = Agent( 'gateway/openai:gpt-5.2', tools=[duckduckgo_search_tool()], instructions='Search DuckDuckGo for the given query and return the results.', ) result = agent.run_sync( 'Can you list the top five highest-grossing animated films of 2025?' ) print(result.output) """ I looked into several sources on animated box‐office performance in 2025, and while detailed rankings can shift as more money is tallied, multiple independent reports have already highlighted a couple of record‐breaking shows. 
For example: • Ne Zha 2 – News outlets (Variety, Wikipedia's "List of animated feature films of 2025", and others) have reported that this Chinese title not only became the highest‑grossing animated film of 2025 but also broke records as the highest‑grossing non‑English animated film ever. One article noted its run exceeded US$1.7 billion. • Inside Out 2 – According to data shared on Statista and in industry news, this Pixar sequel has been on pace to set new records (with some sources even noting it as the highest‑grossing animated film ever, as of January 2025). Beyond those two, some entertainment trade sites (for example, a Just Jared article titled "Top 10 Highest-Earning Animated Films at the Box Office Revealed") have begun listing a broader top‑10. Although full consolidated figures can sometimes differ by source and are updated daily during a box‑office run, many of the industry trackers have begun to single out five films as the biggest earners so far in 2025. Unfortunately, although multiple articles discuss the "top animated films" of 2025, there isn't yet a single, universally accepted list with final numbers that names the complete top five. (Box‑office rankings, especially mid‑year, can be fluid as films continue to add to their totals.) Based on what several sources note so far, the two undisputed leaders are: 1. Ne Zha 2 2. Inside Out 2 The remaining top spots (3–5) are reported by some outlets in their "Top‑10 Animated Films" lists for 2025 but the titles and order can vary depending on the source and the exact cut‑off date of the data. For the most up‑to‑date and detailed ranking (including the 3rd, 4th, and 5th highest‑grossing films), I recommend checking resources like: • Wikipedia's "List of animated feature films of 2025" page • Box‑office tracking sites (such as Box Office Mojo or The Numbers) • Trade articles like the one on Just Jared To summarize with what is clear from the current reporting: 1. Ne Zha 2 2. Inside Out 2 3–5. Other animated films (yet to be definitively finalized across all reporting outlets) If you're looking for a final, consensus list of the top five, it may be best to wait until the 2025 year‑end box‑office tallies are in or to consult a regularly updated entertainment industry source. Would you like help finding a current source or additional details on where to look for the complete updated list? """ ``` duckduckgo_search.py ```python from pydantic_ai import Agent from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool agent = Agent( 'openai:gpt-5.2', tools=[duckduckgo_search_tool()], instructions='Search DuckDuckGo for the given query and return the results.', ) result = agent.run_sync( 'Can you list the top five highest-grossing animated films of 2025?' ) print(result.output) """ I looked into several sources on animated box‐office performance in 2025, and while detailed rankings can shift as more money is tallied, multiple independent reports have already highlighted a couple of record‐breaking shows. For example: • Ne Zha 2 – News outlets (Variety, Wikipedia's "List of animated feature films of 2025", and others) have reported that this Chinese title not only became the highest‑grossing animated film of 2025 but also broke records as the highest‑grossing non‑English animated film ever. One article noted its run exceeded US$1.7 billion. 
• Inside Out 2 – According to data shared on Statista and in industry news, this Pixar sequel has been on pace to set new records (with some sources even noting it as the highest‑grossing animated film ever, as of January 2025). Beyond those two, some entertainment trade sites (for example, a Just Jared article titled "Top 10 Highest-Earning Animated Films at the Box Office Revealed") have begun listing a broader top‑10. Although full consolidated figures can sometimes differ by source and are updated daily during a box‑office run, many of the industry trackers have begun to single out five films as the biggest earners so far in 2025. Unfortunately, although multiple articles discuss the "top animated films" of 2025, there isn't yet a single, universally accepted list with final numbers that names the complete top five. (Box‑office rankings, especially mid‑year, can be fluid as films continue to add to their totals.) Based on what several sources note so far, the two undisputed leaders are: 1. Ne Zha 2 2. Inside Out 2 The remaining top spots (3–5) are reported by some outlets in their "Top‑10 Animated Films" lists for 2025 but the titles and order can vary depending on the source and the exact cut‑off date of the data. For the most up‑to‑date and detailed ranking (including the 3rd, 4th, and 5th highest‑grossing films), I recommend checking resources like: • Wikipedia's "List of animated feature films of 2025" page • Box‑office tracking sites (such as Box Office Mojo or The Numbers) • Trade articles like the one on Just Jared To summarize with what is clear from the current reporting: 1. Ne Zha 2 2. Inside Out 2 3–5. Other animated films (yet to be definitively finalized across all reporting outlets) If you're looking for a final, consensus list of the top five, it may be best to wait until the 2025 year‑end box‑office tallies are in or to consult a regularly updated entertainment industry source. Would you like help finding a current source or additional details on where to look for the complete updated list? """ ``` ## Tavily Search Tool Info Tavily is a paid service, but they have free credits to explore their product. You need to [sign up for an account](https://app.tavily.com/home) and get an API key to use the Tavily search tool. The Tavily search tool allows you to search the web for information. It is built on top of the [Tavily API](https://tavily.com/). ### Installation To use tavily_search_tool, you need to install [`pydantic-ai-slim`](https://ai.pydantic.dev/install/#slim-install) with the `tavily` optional group: ```bash pip install "pydantic-ai-slim[tavily]" ``` ```bash uv add "pydantic-ai-slim[tavily]" ``` ### Usage Here's an example of how you can use the Tavily search tool with an agent: [Learn about Gateway](https://ai.pydantic.dev/gateway) tavily_search.py ```python import os from pydantic_ai import Agent from pydantic_ai.common_tools.tavily import tavily_search_tool api_key = os.getenv('TAVILY_API_KEY') assert api_key is not None agent = Agent( 'gateway/openai:gpt-5.2', tools=[tavily_search_tool(api_key)], instructions='Search Tavily for the given query and return the results.', ) result = agent.run_sync('Tell me the top news in the GenAI world, give me links.') print(result.output) """ Here are some of the top recent news articles related to GenAI: 1. 
How CLEAR users can improve risk analysis with GenAI – Thomson Reuters Read more: https://legal.thomsonreuters.com/blog/how-clear-users-can-improve-risk-analysis-with-genai/ (This article discusses how CLEAR's new GenAI-powered tool streamlines risk analysis by quickly summarizing key information from various public data sources.) 2. TELUS Digital Survey Reveals Enterprise Employees Are Entering Sensitive Data Into AI Assistants More Than You Think – FT.com Read more: https://markets.ft.com/data/announce/detail?dockey=600-202502260645BIZWIRE_USPRX____20250226_BW490609-1 (This news piece highlights findings from a TELUS Digital survey showing that many enterprise employees use public GenAI tools and sometimes even enter sensitive data.) 3. The Essential Guide to Generative AI – Virtualization Review Read more: https://virtualizationreview.com/Whitepapers/2025/02/SNOWFLAKE-The-Essential-Guide-to-Generative-AI.aspx (This guide provides insights into how GenAI is revolutionizing enterprise strategies and productivity, with input from industry leaders.) Feel free to click on the links to dive deeper into each story! """ ``` tavily_search.py ```python import os from pydantic_ai import Agent from pydantic_ai.common_tools.tavily import tavily_search_tool api_key = os.getenv('TAVILY_API_KEY') assert api_key is not None agent = Agent( 'openai:gpt-5.2', tools=[tavily_search_tool(api_key)], instructions='Search Tavily for the given query and return the results.', ) result = agent.run_sync('Tell me the top news in the GenAI world, give me links.') print(result.output) """ Here are some of the top recent news articles related to GenAI: 1. How CLEAR users can improve risk analysis with GenAI – Thomson Reuters Read more: https://legal.thomsonreuters.com/blog/how-clear-users-can-improve-risk-analysis-with-genai/ (This article discusses how CLEAR's new GenAI-powered tool streamlines risk analysis by quickly summarizing key information from various public data sources.) 2. TELUS Digital Survey Reveals Enterprise Employees Are Entering Sensitive Data Into AI Assistants More Than You Think – FT.com Read more: https://markets.ft.com/data/announce/detail?dockey=600-202502260645BIZWIRE_USPRX____20250226_BW490609-1 (This news piece highlights findings from a TELUS Digital survey showing that many enterprise employees use public GenAI tools and sometimes even enter sensitive data.) 3. The Essential Guide to Generative AI – Virtualization Review Read more: https://virtualizationreview.com/Whitepapers/2025/02/SNOWFLAKE-The-Essential-Guide-to-Generative-AI.aspx (This guide provides insights into how GenAI is revolutionizing enterprise strategies and productivity, with input from industry leaders.) Feel free to click on the links to dive deeper into each story! """ ``` ## Exa Search Tool Info Exa is a paid service with free credits to explore their product. You need to [sign up for an account](https://dashboard.exa.ai) and get an API key to use the Exa tools. Exa is a neural search engine that finds high-quality, relevant results across billions of web pages. It provides several tools including web search, finding similar pages, content retrieval, and AI-powered answers. ### Installation To use Exa tools, you need to install [`pydantic-ai-slim`](https://ai.pydantic.dev/install/#slim-install) with the `exa` optional group: ```bash pip install "pydantic-ai-slim[exa]" ``` ```bash uv add "pydantic-ai-slim[exa]" ``` ### Usage You can use Exa tools individually or as a toolset. 
The following tools are available: - exa_search_tool: Search the web with various search types (auto, keyword, neural, fast, deep) - exa_find_similar_tool: Find pages similar to a given URL - exa_get_contents_tool: Get full text content from URLs - exa_answer_tool: Get AI-powered answers with citations #### Using Individual Tools [Learn about Gateway](https://ai.pydantic.dev/gateway) exa_search.py ```python import os from pydantic_ai import Agent from pydantic_ai.common_tools.exa import exa_search_tool api_key = os.getenv('EXA_API_KEY') assert api_key is not None agent = Agent( 'gateway/openai:gpt-5.2', tools=[exa_search_tool(api_key, num_results=5, max_characters=1000)], system_prompt='Search the web for information using Exa.', ) result = agent.run_sync('What are the latest developments in quantum computing?') print(result.output) ``` exa_search.py ```python import os from pydantic_ai import Agent from pydantic_ai.common_tools.exa import exa_search_tool api_key = os.getenv('EXA_API_KEY') assert api_key is not None agent = Agent( 'openai:gpt-5.2', tools=[exa_search_tool(api_key, num_results=5, max_characters=1000)], system_prompt='Search the web for information using Exa.', ) result = agent.run_sync('What are the latest developments in quantum computing?') print(result.output) ``` #### Using ExaToolset For better efficiency when using multiple Exa tools, use ExaToolset which shares a single API client across all tools. You can configure which tools to include: [Learn about Gateway](https://ai.pydantic.dev/gateway) exa_toolset.py ```python import os from pydantic_ai import Agent from pydantic_ai.common_tools.exa import ExaToolset api_key = os.getenv('EXA_API_KEY') assert api_key is not None toolset = ExaToolset( api_key, num_results=5, max_characters=1000, # Limit text content to control token usage include_search=True, # Include the search tool (default: True) include_find_similar=True, # Include the find_similar tool (default: True) include_get_contents=False, # Exclude the get_contents tool include_answer=True, # Include the answer tool (default: True) ) agent = Agent( 'gateway/openai:gpt-5.2', toolsets=[toolset], system_prompt='You have access to Exa search tools to find information on the web.', ) result = agent.run_sync('Find recent AI research papers and summarize the key findings.') print(result.output) ``` exa_toolset.py ```python import os from pydantic_ai import Agent from pydantic_ai.common_tools.exa import ExaToolset api_key = os.getenv('EXA_API_KEY') assert api_key is not None toolset = ExaToolset( api_key, num_results=5, max_characters=1000, # Limit text content to control token usage include_search=True, # Include the search tool (default: True) include_find_similar=True, # Include the find_similar tool (default: True) include_get_contents=False, # Exclude the get_contents tool include_answer=True, # Include the answer tool (default: True) ) agent = Agent( 'openai:gpt-5.2', toolsets=[toolset], system_prompt='You have access to Exa search tools to find information on the web.', ) result = agent.run_sync('Find recent AI research papers and summarize the key findings.') print(result.output) ``` "Output" refers to the final value returned from [running an agent](https://ai.pydantic.dev/agent/#running-agents). This can be either plain text, [structured data](#structured-output), an [image](#image-output), or the result of a [function](#output-functions) called with arguments provided by the model. 
The output is wrapped in AgentRunResult or StreamedRunResult so that you can access other data, like usage of the run and [message history](https://ai.pydantic.dev/message-history/#accessing-messages-from-results). Both `AgentRunResult` and `StreamedRunResult` are generic in the data they wrap, so typing information about the data returned by the agent is preserved. A run ends when the model responds with one of the output types, or, if no output type is specified or `str` is one of the allowed options, when a plain text response is received. A run can also be cancelled if usage limits are exceeded, see [Usage Limits](https://ai.pydantic.dev/agent/#usage-limits). Here's an example using a Pydantic model as the `output_type`, forcing the model to respond with data matching our specification: [Learn about Gateway](https://ai.pydantic.dev/gateway) olympics.py ```python from pydantic import BaseModel from pydantic_ai import Agent class CityLocation(BaseModel): city: str country: str agent = Agent('gateway/gemini:gemini-3-flash-preview', output_type=CityLocation) result = agent.run_sync('Where were the olympics held in 2012?') print(result.output) #> city='London' country='United Kingdom' print(result.usage()) #> RunUsage(input_tokens=57, output_tokens=8, requests=1) ``` olympics.py ```python from pydantic import BaseModel from pydantic_ai import Agent class CityLocation(BaseModel): city: str country: str agent = Agent('google-gla:gemini-3-flash-preview', output_type=CityLocation) result = agent.run_sync('Where were the olympics held in 2012?') print(result.output) #> city='London' country='United Kingdom' print(result.usage()) #> RunUsage(input_tokens=57, output_tokens=8, requests=1) ``` *(This example is complete, it can be run "as is")* ## Structured output data The Agent class constructor takes an `output_type` argument that takes one or more types or [output functions](#output-functions). It supports simple scalar types, list and dict types (including `TypedDict`s and [`StructuredDict`s](#structured-dict)), dataclasses and Pydantic models, as well as type unions -- generally everything supported as type hints in a Pydantic model. You can also pass a list of multiple choices. By default, Pydantic AI leverages the model's tool calling capability to make it return structured data. When multiple output types are specified (in a union or list), each member is registered with the model as a separate output tool in order to reduce the complexity of the schema and maximise the chances a model will respond correctly. This has been shown to work well across a wide range of models. If you'd like to change the names of the output tools, use a model's native structured output feature, or pass the output schema to the model in its [instructions](https://ai.pydantic.dev/agent/#instructions), you can use an [output mode](#output-modes) marker class. When no output type is specified, or when `str` is among the output types, any plain text response from the model will be used as the output data. If `str` is not among the output types, the model is forced to return structured data or call an output function. If the output type schema is not of type `"object"` (e.g. it's `int` or `list[int]`), the output type is wrapped in a single element object, so the schema of all tools registered with the model are object schemas. Structured outputs (like tools) use Pydantic to build the JSON schema used for the tool, and to validate the data returned by the model. 
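As a minimal sketch of this wrapping behaviour (the model name here is an arbitrary choice and the printed value is only illustrative), a scalar `output_type` still gives you a plain Python value:

```python
from pydantic_ai import Agent

# `int` is not an object schema, so Pydantic AI wraps it in a single-element
# object behind the scenes; `result.output` is still a plain `int`.
agent = Agent('openai:gpt-5-mini', output_type=int)

result = agent.run_sync('What is 19 + 23?')
print(result.output)
#> 42
```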
Type checking considerations The Agent class is generic in its output type, and this type is carried through to `AgentRunResult.output` and `StreamedRunResult.output` so that your IDE or static type checker can warn you when your code doesn't properly take into account all the possible values those outputs could have. Static type checkers like pyright and mypy will do their best to infer the agent's output type from the `output_type` you've specified, but they're not always able to do so correctly when you provide functions or multiple types in a union or list, even though Pydantic AI will behave correctly. When this happens, your type checker will complain even when you're confident you've passed a valid `output_type`, and you'll need to help the type checker by explicitly specifying the generic parameters on the `Agent` constructor. This is shown in the second example below and the output functions example further down. Specifically, there are three valid uses of `output_type` where you'll need to do this: 1. When using a union of types, e.g. `output_type=Foo | Bar`. Until [PEP-747](https://peps.python.org/pep-0747/) "Annotating Type Forms" lands in Python 3.15, type checkers do not consider these a valid value for `output_type`. In addition to the generic parameters on the `Agent` constructor, you'll need to add `# type: ignore` to the line that passes the union to `output_type`. Alternatively, you can use a list: `output_type=[Foo, Bar]`. 1. With mypy: When using a list, as a functionally equivalent alternative to a union, or because you're passing in [output functions](#output-functions). Pyright does handle this correctly, and we've filed [an issue](https://github.com/python/mypy/issues/19142) with mypy to try and get this fixed. 1. With mypy: when using an async output function. Pyright does handle this correctly, and we've filed [an issue](https://github.com/python/mypy/issues/19143) with mypy to try and get this fixed. Here's an example of returning either text or structured data: [Learn about Gateway](https://ai.pydantic.dev/gateway) box_or_error.py ```python from pydantic import BaseModel from pydantic_ai import Agent class Box(BaseModel): width: int height: int depth: int units: str agent = Agent( 'gateway/openai:gpt-5-mini', output_type=[Box, str], # (1)! instructions=( "Extract me the dimensions of a box, " "if you can't extract all data, ask the user to try again." ), ) result = agent.run_sync('The box is 10x20x30') print(result.output) #> Please provide the units for the dimensions (e.g., cm, in, m). result = agent.run_sync('The box is 10x20x30 cm') print(result.output) #> width=10 height=20 depth=30 units='cm' ``` 1. This could also have been a union: `output_type=Box | str`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. box_or_error.py ```python from pydantic import BaseModel from pydantic_ai import Agent class Box(BaseModel): width: int height: int depth: int units: str agent = Agent( 'openai:gpt-5-mini', output_type=[Box, str], # (1)! instructions=( "Extract me the dimensions of a box, " "if you can't extract all data, ask the user to try again." ), ) result = agent.run_sync('The box is 10x20x30') print(result.output) #> Please provide the units for the dimensions (e.g., cm, in, m). 
result = agent.run_sync('The box is 10x20x30 cm') print(result.output) #> width=10 height=20 depth=30 units='cm' ``` 1. This could also have been a union: `output_type=Box | str`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. *(This example is complete, it can be run "as is")* Here's an example of using a union return type, which will register multiple output tools and wrap non-object schemas in an object: colors_or_sizes.py ```python from pydantic_ai import Agent agent = Agent[None, list[str] | list[int]]( 'openai:gpt-5-mini', output_type=list[str] | list[int], # type: ignore # (1)! instructions='Extract either colors or sizes from the shapes provided.', ) result = agent.run_sync('red square, blue circle, green triangle') print(result.output) #> ['red', 'blue', 'green'] result = agent.run_sync('square size 10, circle size 20, triangle size 30') print(result.output) #> [10, 20, 30] ``` 1. As explained in the "Type checking considerations" section above, using a union rather than a list requires explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. *(This example is complete, it can be run "as is")* ### Output functions Instead of plain text or structured data, you may want the output of your agent run to be the result of a function called with arguments provided by the model, for example to further process or validate the data provided through the arguments (with the option to tell the model to try again), or to hand off to another agent. Output functions are similar to [function tools](https://ai.pydantic.dev/tools/index.md), but the model is forced to call one of them, the call ends the agent run, and the result is not passed back to the model. As with tool functions, output function arguments provided by the model are validated using Pydantic (with optional [validation context](#validation-context)), can optionally take RunContext as the first argument, and can raise ModelRetry to ask the model to try again with modified arguments (or with a different output type). To specify output functions, you set the agent's `output_type` to either a single function (or bound instance method), or a list of functions. The list can also contain other output types like simple scalars or entire Pydantic models. You typically do not want to also register your output function as a tool (using the `@agent.tool` decorator or `tools` argument), as this could confuse the model about which it should be calling. Here's an example of all of these features in action: output_functions.py ```python import re from pydantic import BaseModel from pydantic_ai import Agent, ModelRetry, RunContext, UnexpectedModelBehavior class Row(BaseModel): name: str country: str tables = { 'capital_cities': [ Row(name='Amsterdam', country='Netherlands'), Row(name='Mexico City', country='Mexico'), ] } class SQLFailure(BaseModel): """An unrecoverable failure. 
Only use this when you can't change the query to make it work.""" explanation: str def run_sql_query(query: str) -> list[Row]: """Run a SQL query on the database.""" select_table = re.match(r'SELECT (.+) FROM (\w+)', query) if select_table: column_names = select_table.group(1) if column_names != '*': raise ModelRetry("Only 'SELECT *' is supported, you'll have to do column filtering manually.") table_name = select_table.group(2) if table_name not in tables: raise ModelRetry( f"Unknown table '{table_name}' in query '{query}'. Available tables: {', '.join(tables.keys())}." ) return tables[table_name] raise ModelRetry(f"Unsupported query: '{query}'.") sql_agent = Agent[None, list[Row] | SQLFailure]( 'openai:gpt-5.2', output_type=[run_sql_query, SQLFailure], instructions='You are a SQL agent that can run SQL queries on a database.', ) async def hand_off_to_sql_agent(ctx: RunContext, query: str) -> list[Row]: """I take natural language queries, turn them into SQL, and run them on a database.""" # Drop the final message with the output tool call, as it shouldn't be passed on to the SQL agent messages = ctx.messages[:-1] try: result = await sql_agent.run(query, message_history=messages) output = result.output if isinstance(output, SQLFailure): raise ModelRetry(f'SQL agent failed: {output.explanation}') return output except UnexpectedModelBehavior as e: # Bubble up potentially retryable errors to the router agent if (cause := e.__cause__) and isinstance(cause, ModelRetry): raise ModelRetry(f'SQL agent failed: {cause.message}') from e else: raise class RouterFailure(BaseModel): """Use me when no appropriate agent is found or the used agent failed.""" explanation: str router_agent = Agent[None, list[Row] | RouterFailure]( 'openai:gpt-5.2', output_type=[hand_off_to_sql_agent, RouterFailure], instructions='You are a router to other agents. Never try to solve a problem yourself, just pass it on.', ) result = router_agent.run_sync('Select the names and countries of all capitals') print(result.output) """ [ Row(name='Amsterdam', country='Netherlands'), Row(name='Mexico City', country='Mexico'), ] """ result = router_agent.run_sync('Select all pets') print(repr(result.output)) """ RouterFailure(explanation="The requested table 'pets' does not exist in the database. The only available table is 'capital_cities', which does not contain data about pets.") """ result = router_agent.run_sync('How do I fly from Amsterdam to Mexico City?') print(repr(result.output)) """ RouterFailure(explanation='I am not equipped to provide travel information, such as flights from Amsterdam to Mexico City.') """ ``` #### Text output If you provide an output function that takes a string, Pydantic AI will by default create an output tool like for any other output function. If instead you'd like the model to provide the string using plain text output, you can wrap the function in the TextOutput marker class. If desired, this marker class can be used alongside one or more [`ToolOutput`](#tool-output) marker classes (or unmarked types or functions) in a list provided to `output_type`. Like other output functions, text output functions can optionally take RunContext as the first argument, and can raise ModelRetry to ask the model to try again with modified arguments (or with a different output type). 
[Learn about Gateway](https://ai.pydantic.dev/gateway) text_output_function.py ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'gateway/openai:gpt-5.2', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` text_output_function.py ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'openai:gpt-5.2', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` *(This example is complete, it can be run "as is")* #### Handling partial output in output functions When streaming with `run_stream()` or `run_stream_sync()`, output functions are called **multiple times** — once for each partial output received from the model, and once for the final complete output. You should check the RunContext.partial_output flag when your output function has **side effects** (e.g., sending notifications, logging, database updates) that should only execute on the final output. When streaming, `partial_output` is `True` for each partial output and `False` for the final complete output. For all [other run methods](https://ai.pydantic.dev/agent/#running-agents), `partial_output` is always `False` as the function is only called once with the complete output. [Learn about Gateway](https://ai.pydantic.dev/gateway) output_function_with_side_effects.py ```python from pydantic import BaseModel from pydantic_ai import Agent, RunContext class DatabaseRecord(BaseModel): name: str value: int | None = None # Make optional to allow partial output def save_to_database(ctx: RunContext, record: DatabaseRecord) -> DatabaseRecord: """Output function with side effect - only save final output to database.""" if ctx.partial_output: # Skip side effects for partial outputs return record # Only execute side effect for the final output print(f'Saving to database: {record.name} = {record.value}') #> Saving to database: test = 42 return record agent = Agent('gateway/openai:gpt-5.2', output_type=save_to_database) async def main(): async with agent.run_stream('Create a record with name "test" and value 42') as result: async for output in result.stream_output(debounce_by=None): print(output) #> name='test' value=None #> name='test' value=42 ``` output_function_with_side_effects.py ```python from pydantic import BaseModel from pydantic_ai import Agent, RunContext class DatabaseRecord(BaseModel): name: str value: int | None = None # Make optional to allow partial output def save_to_database(ctx: RunContext, record: DatabaseRecord) -> DatabaseRecord: """Output function with side effect - only save final output to database.""" if ctx.partial_output: # Skip side effects for partial outputs return record # Only execute side effect for the final output print(f'Saving to database: {record.name} = {record.value}') #> Saving to database: test = 42 return record agent = Agent('openai:gpt-5.2', output_type=save_to_database) async def main(): async with agent.run_stream('Create a record with name "test" and value 42') as result: async for output in result.stream_output(debounce_by=None): print(output) #> name='test' value=None #> name='test' value=42 ``` *(This example is complete, it can be run "as is" — 
you'll need to add `asyncio.run(main())` to run `main`)* ### Output modes Pydantic AI implements three different methods to get a model to output structured data: 1. [Tool Output](#tool-output), where tool calls are used to produce the output. 1. [Native Output](#native-output), where the model is required to produce text content compliant with a provided JSON schema. 1. [Prompted Output](#prompted-output), where a prompt is injected into the model instructions including the desired JSON schema, and we attempt to parse the model's plain-text response as appropriate. #### Tool Output In the default Tool Output mode, the output JSON schema of each output type (or function) is provided to the model as the parameters schema of a special output tool. This is the default as it's supported by virtually all models and has been shown to work very well. If you'd like to change the name of the output tool, pass a custom description to aid the model, or turn on or off strict mode, you can wrap the type(s) in the ToolOutput marker class and provide the appropriate arguments. Note that by default, the description is taken from the docstring specified on a Pydantic model or output function, so specifying it using the marker class is typically not necessary. To dynamically modify or filter the available output tools during an agent run, you can define an agent-wide `prepare_output_tools` function that will be called ahead of each step of a run. This function should be of type ToolsPrepareFunc, which takes the RunContext and a list of ToolDefinition, and returns a new list of tool definitions (or `None` to disable all tools for that step). This is analogous to the [`prepare_tools` function](https://ai.pydantic.dev/tools-advanced/#prepare-tools) for non-output tools. [Learn about Gateway](https://ai.pydantic.dev/gateway) tool_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'gateway/openai:gpt-5.2', output_type=[ # (1)! ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` 1. If we were passing just `Fruit` and `Vehicle` without custom tool names, we could have used a union: `output_type=Fruit | Vehicle`. However, as `ToolOutput` is an object rather than a type, we have to use a list. tool_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'openai:gpt-5.2', output_type=[ # (1)! ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` 1. If we were passing just `Fruit` and `Vehicle` without custom tool names, we could have used a union: `output_type=Fruit | Vehicle`. However, as `ToolOutput` is an object rather than a type, we have to use a list. *(This example is complete, it can be run "as is")* ##### Parallel Output Tool Calls When the model calls other tools in parallel with an output tool, you can control how tool calls are executed by setting the agent's end_strategy: - `'early'` (default): Output tools are executed first. 
Once a valid final result is found, remaining function and output tool calls are skipped - `'exhaustive'`: Output tools are executed first, then all function tools are executed. The first valid output tool result becomes the final output The `'exhaustive'` strategy is useful when tools have important side effects (like logging, sending notifications, or updating metrics) that should always execute. Priority of output and deferred tools in streaming methods The run_stream() and run_stream_sync() methods will consider the first output that matches the [output type](https://ai.pydantic.dev/output/#structured-output) (which could be text, an [output tool](https://ai.pydantic.dev/output/#tool-output) call, or a [deferred](https://ai.pydantic.dev/deferred-tools/index.md) tool call) to be the final output of the agent run, even when the model generates (additional) tool calls after this "final" output. This means that if the model calls deferred tools before output tools when using these methods, the deferred tool calls determine the agent run's final output, while the other [run methods](https://ai.pydantic.dev/agent/#running-agents) would have prioritized the tool output. #### Native Output Native Output mode uses a model's native "Structured Outputs" feature (aka "JSON Schema response format"), where the model is forced to only output text matching the provided JSON schema. Note that this is not supported by all models, and sometimes comes with restrictions. For example, Gemini cannot use tools at the same time as structured output, and attempting to do so will result in an error. To use this mode, you can wrap the output type(s) in the NativeOutput marker class that also lets you specify a `name` and `description` if the name and docstring of the type or function are not sufficient. [Learn about Gateway](https://ai.pydantic.dev/gateway) native_output.py ```python from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'gateway/openai:gpt-5.2', output_type=NativeOutput( [Fruit, Vehicle], # (1)! name='Fruit_or_vehicle', description='Return a fruit or vehicle.' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` 1. This could also have been a union: `output_type=Fruit | Vehicle`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. native_output.py ```python from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'openai:gpt-5.2', output_type=NativeOutput( [Fruit, Vehicle], # (1)! name='Fruit_or_vehicle', description='Return a fruit or vehicle.' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` 1. This could also have been a union: `output_type=Fruit | Vehicle`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. 
*(This example is complete, it can be run "as is")* #### Prompted Output In this mode, the model is prompted to output text matching the provided JSON schema through its [instructions](https://ai.pydantic.dev/agent/#instructions) and it's up to the model to interpret those instructions correctly. This is usable with all models, but is often the least reliable approach as the model is not forced to match the schema. While we would generally suggest starting with tool or native output, in some cases this mode may result in higher quality outputs, and for models without native tool calling or structured output support it is the only option for producing structured outputs. If the model API supports the "JSON Mode" feature (aka "JSON Object response format") to force the model to output valid JSON, this is enabled, but it's still up to the model to abide by the schema. Pydantic AI will validate the returned structured data and tell the model to try again if validation fails, but if the model is not intelligent enough this may not be sufficient. To use this mode, you can wrap the output type(s) in the PromptedOutput marker class that also lets you specify a `name` and `description` if the name and docstring of the type or function are not sufficient. Additionally, it supports a `template` argument that lets you specify a custom instructions template to be used instead of the default. [Learn about Gateway](https://ai.pydantic.dev/gateway) prompted_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'gateway/openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], # (1)! name='Vehicle or device', description='Return a vehicle or device.' ), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'gateway/openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` 1. This could also have been a union: `output_type=Vehicle | Device`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. prompted_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], # (1)! name='Vehicle or device', description='Return a vehicle or device.' ), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` 1. This could also have been a union: `output_type=Vehicle | Device`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly.
*(This example is complete, it can be run "as is")* ### Custom JSON schema If it's not feasible to define your desired structured output object using a Pydantic `BaseModel`, dataclass, or `TypedDict`, for example when you get a JSON schema from an external source or generate it dynamically, you can use the StructuredDict() helper function to generate a `dict[str, Any]` subclass with a JSON schema attached that Pydantic AI will pass to the model. Note that Pydantic AI will not perform any validation of the received JSON object and it's up to the model to correctly interpret the schema and any constraints expressed in it, like required fields or integer value ranges. The output type will be a `dict[str, Any]` and it's up to your code to defensively read from it in case the model made a mistake. You can use an [output validator](#output-validator-functions) to reflect validation errors back to the model and get it to try again. Along with the JSON schema, you can optionally pass `name` and `description` arguments to provide additional context to the model: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent, StructuredDict HumanDict = StructuredDict( { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] }, name='Human', description='A human with a name and age', ) agent = Agent('gateway/openai:gpt-5.2', output_type=HumanDict) result = agent.run_sync('Create a person') print(result.output) #> {'name': 'John Doe', 'age': 30} ``` ```python from pydantic_ai import Agent, StructuredDict HumanDict = StructuredDict( { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] }, name='Human', description='A human with a name and age', ) agent = Agent('openai:gpt-5.2', output_type=HumanDict) result = agent.run_sync('Create a person') print(result.output) #> {'name': 'John Doe', 'age': 30} ``` ### Validation context Some validation relies on an extra Pydantic [context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) object. You can pass such an object to an `Agent` at definition-time via its validation_context parameter. It will be used in the validation of both structured outputs and [tool arguments](https://ai.pydantic.dev/tools-advanced/#tool-retries). This validation context can be either: - the context object itself (`Any`), used as-is to validate outputs, or - a function that takes the RunContext and returns a context object (`Any`). This function will be called automatically before each validation, allowing you to build a dynamic validation context. Don't confuse this *validation* context with the *LLM* context This Pydantic validation context object is only used internally by Pydantic AI for tool arg and output validation. In particular, it is **not** included in the prompts or messages sent to the language model.
[Learn about Gateway](https://ai.pydantic.dev/gateway) validation_context.py ```python from dataclasses import dataclass from pydantic import BaseModel, ValidationInfo, field_validator from pydantic_ai import Agent class Value(BaseModel): x: int @field_validator('x') def increment_value(cls, value: int, info: ValidationInfo): return value + (info.context or 0) agent = Agent( 'gateway/gemini:gemini-3-flash-preview', output_type=Value, validation_context=10, ) result = agent.run_sync('Give me a value of 5.') print(repr(result.output)) # 5 from the model + 10 from the validation context #> Value(x=15) @dataclass class Deps: increment: int agent = Agent( 'gateway/gemini:gemini-3-flash-preview', output_type=Value, deps_type=Deps, validation_context=lambda ctx: ctx.deps.increment, ) result = agent.run_sync('Give me a value of 5.', deps=Deps(increment=10)) print(repr(result.output)) # 5 from the model + 10 from the validation context #> Value(x=15) ``` validation_context.py ```python from dataclasses import dataclass from pydantic import BaseModel, ValidationInfo, field_validator from pydantic_ai import Agent class Value(BaseModel): x: int @field_validator('x') def increment_value(cls, value: int, info: ValidationInfo): return value + (info.context or 0) agent = Agent( 'google-gla:gemini-3-flash-preview', output_type=Value, validation_context=10, ) result = agent.run_sync('Give me a value of 5.') print(repr(result.output)) # 5 from the model + 10 from the validation context #> Value(x=15) @dataclass class Deps: increment: int agent = Agent( 'google-gla:gemini-3-flash-preview', output_type=Value, deps_type=Deps, validation_context=lambda ctx: ctx.deps.increment, ) result = agent.run_sync('Give me a value of 5.', deps=Deps(increment=10)) print(repr(result.output)) # 5 from the model + 10 from the validation context #> Value(x=15) ``` *(This example is complete, it can be run "as is")* ### Output validators Some validation is inconvenient or impossible to do in Pydantic validators, in particular when the validation requires IO and is asynchronous. Pydantic AI provides a way to add validation functions via the agent.output_validator decorator. If you want to implement separate validation logic for different output types, it's recommended to use [output functions](#output-functions) instead, to save you from having to do `isinstance` checks inside the output validator. If you want the model to output plain text, do your own processing or validation, and then have the agent's final output be the result of your function, it's recommended to use an [output function](#output-functions) with the [`TextOutput` marker class](#text-output). 
Here's a simplified variant of the [SQL Generation example](https://ai.pydantic.dev/examples/sql-gen/index.md): sql_gen.py ```python from fake_database import DatabaseConn, QueryError from pydantic import BaseModel from pydantic_ai import Agent, RunContext, ModelRetry class Success(BaseModel): sql_query: str class InvalidRequest(BaseModel): error_message: str Output = Success | InvalidRequest agent = Agent[DatabaseConn, Output]( 'google-gla:gemini-3-flash-preview', output_type=Output, # type: ignore deps_type=DatabaseConn, instructions='Generate PostgreSQL flavored SQL queries based on user input.', ) @agent.output_validator async def validate_sql(ctx: RunContext[DatabaseConn], output: Output) -> Output: if isinstance(output, InvalidRequest): return output try: await ctx.deps.execute(f'EXPLAIN {output.sql_query}') except QueryError as e: raise ModelRetry(f'Invalid query: {e}') from e else: return output result = agent.run_sync( 'get me users who were last active yesterday.', deps=DatabaseConn() ) print(result.output) #> sql_query='SELECT * FROM users WHERE last_active::date = today() - interval 1 day' ``` *(This example is complete, it can be run "as is")* #### Handling partial output in output validators When streaming with `run_stream()` or `run_stream_sync()`, output validators are called **multiple times** — once for each partial output received from the model, and once for the final complete output. You should check the RunContext.partial_output flag when you want to **validate only the complete result**, not intermediate partial values. When streaming, `partial_output` is `True` for each partial output and `False` for the final complete output. For all [other run methods](https://ai.pydantic.dev/agent/#running-agents), `partial_output` is always `False` as the validator is only called once with the complete output. [Learn about Gateway](https://ai.pydantic.dev/gateway) partial_validation_streaming.py ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('gateway/openai:gpt-5.2') @agent.output_validator def validate_output(ctx: RunContext, output: str) -> str: if ctx.partial_output: return output if len(output) < 50: raise ModelRetry('Output is too short.') return output async def main(): async with agent.run_stream('Write a long story about a cat') as result: async for message in result.stream_text(): print(message) #> Once upon a #> Once upon a time, there was #> Once upon a time, there was a curious cat #> Once upon a time, there was a curious cat named Whiskers who #> Once upon a time, there was a curious cat named Whiskers who loved to explore #> Once upon a time, there was a curious cat named Whiskers who loved to explore the world around #> Once upon a time, there was a curious cat named Whiskers who loved to explore the world around him... 
``` partial_validation_streaming.py ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('openai:gpt-5.2') @agent.output_validator def validate_output(ctx: RunContext, output: str) -> str: if ctx.partial_output: return output if len(output) < 50: raise ModelRetry('Output is too short.') return output async def main(): async with agent.run_stream('Write a long story about a cat') as result: async for message in result.stream_text(): print(message) #> Once upon a #> Once upon a time, there was #> Once upon a time, there was a curious cat #> Once upon a time, there was a curious cat named Whiskers who #> Once upon a time, there was a curious cat named Whiskers who loved to explore #> Once upon a time, there was a curious cat named Whiskers who loved to explore the world around #> Once upon a time, there was a curious cat named Whiskers who loved to explore the world around him... ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Image output Some models can generate images as part of their response, for example those that support the [Image Generation built-in tool](https://ai.pydantic.dev/builtin-tools/#image-generation-tool) and OpenAI models using the [Code Execution built-in tool](https://ai.pydantic.dev/builtin-tools/#code-execution-tool) when told to generate a chart. To use the generated image as the output of the agent run, you can set `output_type` to BinaryImage. If no image-generating built-in tool is explicitly specified, the ImageGenerationTool will be enabled automatically. [Learn about Gateway](https://ai.pydantic.dev/gateway) image_output.py ```python from pydantic_ai import Agent, BinaryImage agent = Agent('gateway/openai-responses:gpt-5.2', output_type=BinaryImage) result = agent.run_sync('Generate an image of an axolotl.') assert isinstance(result.output, BinaryImage) ``` image_output.py ```python from pydantic_ai import Agent, BinaryImage agent = Agent('openai-responses:gpt-5.2', output_type=BinaryImage) result = agent.run_sync('Generate an image of an axolotl.') assert isinstance(result.output, BinaryImage) ``` *(This example is complete, it can be run "as is")* If an agent does not need to always generate an image, you can use a union of `BinaryImage` and `str`. If the model generates both, the image will take precedence as output and the text will be available on ModelResponse.text: [Learn about Gateway](https://ai.pydantic.dev/gateway) image_output_union.py ```python from pydantic_ai import Agent, BinaryImage agent = Agent('gateway/openai-responses:gpt-5.2', output_type=BinaryImage | str) result = agent.run_sync('Tell me a two-sentence story about an axolotl, no image please.') print(result.output) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.') assert isinstance(result.output, BinaryImage) print(result.response.text) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! 
""" ``` image_output_union.py ```python from pydantic_ai import Agent, BinaryImage agent = Agent('openai-responses:gpt-5.2', output_type=BinaryImage | str) result = agent.run_sync('Tell me a two-sentence story about an axolotl, no image please.') print(result.output) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.') assert isinstance(result.output, BinaryImage) print(result.response.text) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ ``` ## Streamed Results There two main challenges with streamed results: 1. Validating structured responses before they're complete, this is achieved by "partial validation" which was recently added to Pydantic in [pydantic/pydantic#10748](https://github.com/pydantic/pydantic/pull/10748). 1. When receiving a response, we don't know if it's the final response without starting to stream it and peeking at the content. Pydantic AI streams just enough of the response to sniff out if it's a tool call or an output, then streams the whole thing and calls tools, or returns the stream as a StreamedRunResult. Note As the `run_stream()` method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream all events from the model's streaming response and the agent's execution of tools, use agent.run_stream_events() ([docs](https://ai.pydantic.dev/agent/#streaming-all-events)) or agent.iter() ([docs](https://ai.pydantic.dev/agent/#streaming-all-events-and-output)) instead. ### Streaming Text Example of streamed text output: [Learn about Gateway](https://ai.pydantic.dev/gateway) streamed_hello_world.py ```python from pydantic_ai import Agent agent = Agent('gateway/gemini:gemini-3-flash-preview') # (1)! async def main(): async with agent.run_stream('Where does "hello world" come from?') as result: # (2)! async for message in result.stream_text(): # (3)! print(message) #> The first known #> The first known use of "hello, #> The first known use of "hello, world" was in #> The first known use of "hello, world" was in a 1974 textbook #> The first known use of "hello, world" was in a 1974 textbook about the C #> The first known use of "hello, world" was in a 1974 textbook about the C programming language. ``` 1. Streaming works with the standard Agent class, and doesn't require any special setup, just a model that supports streaming (currently all models support streaming). 1. The Agent.run_stream() method is used to start a streamed run, this method returns a context manager so the connection can be closed when the stream completes. 1. Each item yield by StreamedRunResult.stream_text() is the complete text response, extended as new data is received. streamed_hello_world.py ```python from pydantic_ai import Agent agent = Agent('google-gla:gemini-3-flash-preview') # (1)! async def main(): async with agent.run_stream('Where does "hello world" come from?') as result: # (2)! async for message in result.stream_text(): # (3)! 
print(message) #> The first known #> The first known use of "hello, #> The first known use of "hello, world" was in #> The first known use of "hello, world" was in a 1974 textbook #> The first known use of "hello, world" was in a 1974 textbook about the C #> The first known use of "hello, world" was in a 1974 textbook about the C programming language. ``` 1. Streaming works with the standard Agent class, and doesn't require any special setup, just a model that supports streaming (currently all models support streaming). 1. The Agent.run_stream() method is used to start a streamed run; it returns a context manager so the connection can be closed when the stream completes. 1. Each item yielded by StreamedRunResult.stream_text() is the complete text response, extended as new data is received. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* We can also stream text as deltas rather than the entire text in each item: [Learn about Gateway](https://ai.pydantic.dev/gateway) streamed_delta_hello_world.py ```python from pydantic_ai import Agent agent = Agent('gateway/gemini:gemini-3-flash-preview') async def main(): async with agent.run_stream('Where does "hello world" come from?') as result: async for message in result.stream_text(delta=True): # (1)! print(message) #> The first known #> use of "hello, #> world" was in #> a 1974 textbook #> about the C #> programming language. ``` 1. stream_text will error if the response is not text. streamed_delta_hello_world.py ```python from pydantic_ai import Agent agent = Agent('google-gla:gemini-3-flash-preview') async def main(): async with agent.run_stream('Where does "hello world" come from?') as result: async for message in result.stream_text(delta=True): # (1)! print(message) #> The first known #> use of "hello, #> world" was in #> a 1974 textbook #> about the C #> programming language. ``` 1. stream_text will error if the response is not text. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* Output message not included in `messages` The final output message will **NOT** be added to result messages if you use `.stream_text(delta=True)`; see [Messages and chat history](https://ai.pydantic.dev/message-history/index.md) for more information. ### Streaming Structured Output Here's an example of streaming a user profile as it's built: [Learn about Gateway](https://ai.pydantic.dev/gateway) streamed_user_profile.py ```python from datetime import date from typing_extensions import NotRequired, TypedDict from pydantic_ai import Agent class UserProfile(TypedDict): name: str dob: NotRequired[date] bio: NotRequired[str] agent = Agent( 'gateway/openai:gpt-5.2', output_type=UserProfile, instructions='Extract a user profile from the input', ) async def main(): user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.'
async with agent.run_stream(user_input) as result: async for profile in result.stream_output(): print(profile) #> {'name': 'Ben'} #> {'name': 'Ben'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} ``` streamed_user_profile.py ```python from datetime import date from typing_extensions import NotRequired, TypedDict from pydantic_ai import Agent class UserProfile(TypedDict): name: str dob: NotRequired[date] bio: NotRequired[str] agent = Agent( 'openai:gpt-5.2', output_type=UserProfile, instructions='Extract a user profile from the input', ) async def main(): user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.' async with agent.run_stream(user_input) as result: async for profile in result.stream_output(): print(profile) #> {'name': 'Ben'} #> {'name': 'Ben'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* As setting an `output_type` uses the [Tool Output](#tool-output) mode by default, this will only work if the model supports streaming tool arguments. For models that don't, like Gemini, try [Native Output](#native-output) or [Prompted Output](#prompted-output) instead. ### Streaming Model Responses If you want fine-grained control of validation, you can use the following pattern to get the entire partial ModelResponse: [Learn about Gateway](https://ai.pydantic.dev/gateway) streamed_user_profile.py ```python from datetime import date from pydantic import ValidationError from typing_extensions import TypedDict from pydantic_ai import Agent class UserProfile(TypedDict, total=False): name: str dob: date bio: str agent = Agent('gateway/openai:gpt-5.2', output_type=UserProfile) async def main(): user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.' async with agent.run_stream(user_input) as result: async for message, last in result.stream_responses(debounce_by=0.01): # (1)! try: profile = await result.validate_response_output( # (2)! message, allow_partial=not last, ) except ValidationError: continue print(profile) #> {'name': 'Ben'} #> {'name': 'Ben'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} ``` 1. 
stream_responses streams the data as ModelResponse objects, so iteration can't fail with a `ValidationError`. 1. validate_response_output validates the data; `allow_partial=True` enables pydantic's experimental_allow_partial flag on TypeAdapter. streamed_user_profile.py ```python from datetime import date from pydantic import ValidationError from typing_extensions import TypedDict from pydantic_ai import Agent class UserProfile(TypedDict, total=False): name: str dob: date bio: str agent = Agent('openai:gpt-5.2', output_type=UserProfile) async def main(): user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.' async with agent.run_stream(user_input) as result: async for message, last in result.stream_responses(debounce_by=0.01): # (1)! try: profile = await result.validate_response_output( # (2)! message, allow_partial=not last, ) except ValidationError: continue print(profile) #> {'name': 'Ben'} #> {'name': 'Ben'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} ``` 1. stream_responses streams the data as ModelResponse objects, so iteration can't fail with a `ValidationError`. 1. validate_response_output validates the data; `allow_partial=True` enables pydantic's experimental_allow_partial flag on TypeAdapter. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Examples The following examples demonstrate how to use streamed responses in Pydantic AI: - [Stream markdown](https://ai.pydantic.dev/examples/stream-markdown/index.md) - [Stream Whales](https://ai.pydantic.dev/examples/stream-whales/index.md) # HTTP Request Retries Pydantic AI provides retry functionality for HTTP requests made by model providers through custom HTTP transports. This is particularly useful for handling transient failures like rate limits, network timeouts, or temporary server errors. ## Overview The retry functionality is built on top of the [tenacity](https://github.com/jd/tenacity) library and integrates seamlessly with httpx clients. You can configure retry behavior for any provider that accepts a custom HTTP client.
## Installation To use the retry transports, you need to install `tenacity`, which you can do via the `retries` dependency group: ```bash pip install 'pydantic-ai-slim[retries]' ``` ```bash uv add 'pydantic-ai-slim[retries]' ``` ## Usage Example Here's an example of adding retry functionality with smart retry handling: smart_retry_example.py ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after def create_retrying_client(): """Create a client with smart retry handling for multiple error types.""" def should_retry_status(response): """Raise exceptions for retryable HTTP status codes.""" if response.status_code in (429, 502, 503, 504): response.raise_for_status() # This will raise HTTPStatusError transport = AsyncTenacityTransport( config=RetryConfig( # Retry on HTTP errors and connection issues retry=retry_if_exception_type((HTTPStatusError, ConnectionError)), # Smart waiting: respects Retry-After headers, falls back to exponential backoff wait=wait_retry_after( fallback_strategy=wait_exponential(multiplier=1, max=60), max_wait=300 ), # Stop after 5 attempts stop=stop_after_attempt(5), # Re-raise the last exception if all retries fail reraise=True ), validate_response=should_retry_status ) return AsyncClient(transport=transport) # Use the retrying client with a model client = create_retrying_client() model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(http_client=client)) agent = Agent(model) ``` ## Wait Strategies ### wait_retry_after The `wait_retry_after` function is a smart wait strategy that automatically respects HTTP `Retry-After` headers: wait_strategy_example.py ```python from tenacity import wait_exponential from pydantic_ai.retries import wait_retry_after # Basic usage - respects Retry-After headers, falls back to exponential backoff wait_strategy_1 = wait_retry_after() # Custom configuration wait_strategy_2 = wait_retry_after( fallback_strategy=wait_exponential(multiplier=2, max=120), max_wait=600 # Never wait more than 10 minutes ) ``` This wait strategy: - Automatically parses `Retry-After` headers from HTTP 429 responses - Supports both seconds format (`"30"`) and HTTP date format (`"Wed, 21 Oct 2015 07:28:00 GMT"`) - Falls back to your chosen strategy when no header is present - Respects the `max_wait` limit to prevent excessive delays ## Transport Classes ### AsyncTenacityTransport For asynchronous HTTP clients (recommended for most use cases): async_transport_example.py ```python from httpx import AsyncClient from tenacity import stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig def validator(response): """Treat responses with HTTP status 4xx/5xx as failures that need to be retried. Without a response validator, only network errors and timeouts will result in a retry. 
""" response.raise_for_status() # Create the transport transport = AsyncTenacityTransport( config=RetryConfig(stop=stop_after_attempt(3), reraise=True), validate_response=validator ) # Create a client using the transport: client = AsyncClient(transport=transport) ``` ### TenacityTransport For synchronous HTTP clients: sync_transport_example.py ```python from httpx import Client from tenacity import stop_after_attempt from pydantic_ai.retries import RetryConfig, TenacityTransport def validator(response): """Treat responses with HTTP status 4xx/5xx as failures that need to be retried. Without a response validator, only network errors and timeouts will result in a retry. """ response.raise_for_status() # Create the transport transport = TenacityTransport( config=RetryConfig(stop=stop_after_attempt(3), reraise=True), validate_response=validator ) # Create a client using the transport client = Client(transport=transport) ``` ## Common Retry Patterns ### Rate Limit Handling with Retry-After Support rate_limit_handling.py ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after def create_rate_limit_client(): """Create a client that respects Retry-After headers from rate limiting responses.""" transport = AsyncTenacityTransport( config=RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after( fallback_strategy=wait_exponential(multiplier=1, max=60), max_wait=300 # Don't wait more than 5 minutes ), stop=stop_after_attempt(10), reraise=True ), validate_response=lambda r: r.raise_for_status() # Raises HTTPStatusError for 4xx/5xx ) return AsyncClient(transport=transport) # Example usage client = create_rate_limit_client() # Client is now ready to use with any HTTP requests and will respect Retry-After headers ``` The `wait_retry_after` function automatically detects `Retry-After` headers in 429 (rate limit) responses and waits for the specified time. If no header is present, it falls back to exponential backoff. 
### Network Error Handling network_error_handling.py ```python import httpx from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig def create_network_resilient_client(): """Create a client that handles network errors with retries.""" transport = AsyncTenacityTransport( config=RetryConfig( retry=retry_if_exception_type(( httpx.TimeoutException, httpx.ConnectError, httpx.ReadError )), wait=wait_exponential(multiplier=1, max=10), stop=stop_after_attempt(3), reraise=True ) ) return httpx.AsyncClient(transport=transport) # Example usage client = create_network_resilient_client() # Client will now retry on timeout, connection, and read errors ``` ### Custom Retry Logic custom_retry_logic.py ```python import httpx from tenacity import retry_if_exception, stop_after_attempt, wait_exponential from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after def create_custom_retry_client(): """Create a client with custom retry logic.""" def custom_retry_condition(exception): """Custom logic to determine if we should retry.""" if isinstance(exception, httpx.HTTPStatusError): # Retry on server errors but not client errors return 500 <= exception.response.status_code < 600 return isinstance(exception, httpx.TimeoutException | httpx.ConnectError) transport = AsyncTenacityTransport( config=RetryConfig( retry=retry_if_exception(custom_retry_condition), # Use wait_retry_after for smart waiting on rate limits, # with custom exponential backoff as fallback wait=wait_retry_after( fallback_strategy=wait_exponential(multiplier=2, max=30), max_wait=120 ), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) return httpx.AsyncClient(transport=transport) client = create_custom_retry_client() # Client will retry server errors (5xx) and network errors, but not client errors (4xx) ``` ## Using with Different Providers The retry transports work with any provider that accepts a custom HTTP client: ### OpenAI openai_with_retries.py ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider from smart_retry_example import create_retrying_client client = create_retrying_client() model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(http_client=client)) agent = Agent(model) ``` ### Anthropic anthropic_with_retries.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider from smart_retry_example import create_retrying_client client = create_retrying_client() model = AnthropicModel('claude-sonnet-4-5-20250929', provider=AnthropicProvider(http_client=client)) agent = Agent(model) ``` ### Any OpenAI-Compatible Provider openai_compatible_with_retries.py ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider from smart_retry_example import create_retrying_client client = create_retrying_client() model = OpenAIChatModel( 'your-model-name', # Replace with actual model name provider=OpenAIProvider( base_url='https://api.example.com/v1', # Replace with actual API URL api_key='your-api-key', # Replace with actual API key http_client=client ) ) agent = Agent(model) ``` ## Best Practices 1. **Start Conservative**: Begin with a small number of retries (3-5) and reasonable wait times. 1. 
**Use Exponential Backoff**: This helps avoid overwhelming servers during outages. 1. **Set Maximum Wait Times**: Prevent indefinite delays with reasonable maximum wait times. 1. **Handle Rate Limits Properly**: Respect `Retry-After` headers when possible. 1. **Log Retry Attempts**: Add logging to monitor retry behavior in production. (This will be picked up by Logfire automatically if you instrument httpx.) 1. **Consider Circuit Breakers**: For high-traffic applications, consider implementing circuit breaker patterns. Monitoring Retries in Production Excessive retries can indicate underlying issues and increase costs. [Logfire](https://ai.pydantic.dev/logfire/index.md) helps you track retry patterns: - See which requests triggered retries - Understand retry causes (rate limits, server errors, timeouts) - Monitor retry frequency over time - Identify opportunities to reduce retries With [HTTPX instrumentation](https://ai.pydantic.dev/logfire/#monitoring-http-requests) enabled, retry attempts are automatically captured in your traces. ## Error Handling The retry transports will re-raise the last exception if all retry attempts fail. Make sure to handle these appropriately in your application: error_handling_example.py ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider from smart_retry_example import create_retrying_client client = create_retrying_client() model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(http_client=client)) agent = Agent(model) ``` ## Performance Considerations - Retries add latency to requests, especially with exponential backoff - Consider the total timeout for your application when configuring retry behavior - Monitor retry rates to detect systemic issues - Use async transports for better concurrency when handling multiple requests For more advanced retry configurations, refer to the [tenacity documentation](https://tenacity.readthedocs.io/). ## Provider-Specific Retry Behavior ### AWS Bedrock The AWS Bedrock provider uses boto3's built-in retry mechanisms instead of httpx. To configure retries for Bedrock, use boto3's `Config`: ```python from botocore.config import Config config = Config(retries={'max_attempts': 5, 'mode': 'adaptive'}) ``` See [Bedrock: Configuring Retries](https://ai.pydantic.dev/models/bedrock/#configuring-retries) for complete examples. # Messages and chat history Pydantic AI provides access to messages exchanged during an agent run. These messages can be used both to continue a coherent conversation, and to understand how an agent performed. ### Accessing Messages from Results After running an agent, you can access the messages exchanged during that run from the `result` object. Both RunResult (returned by Agent.run, Agent.run_sync) and StreamedRunResult (returned by Agent.run_stream) have the following methods: - all_messages(): returns all messages, including messages from prior runs. There's also a variant that returns JSON bytes, all_messages_json(). - new_messages(): returns only the messages from the current run. There's also a variant that returns JSON bytes, new_messages_json(). StreamedRunResult and complete messages On StreamedRunResult, the messages returned from these methods will only include the final result message once the stream has finished. E.g. 
you've awaited one of the following coroutines: - StreamedRunResult.stream_output() - StreamedRunResult.stream_text() - StreamedRunResult.stream_responses() - StreamedRunResult.get_output() **Note:** The final result message will NOT be added to result messages if you use .stream_text(delta=True) since in this case the result content is never built as one string. Example of accessing methods on a RunResult : [Learn about Gateway](https://ai.pydantic.dev/gateway) run_result_messages.py ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2', instructions='Be a helpful assistant.') result = agent.run_sync('Tell me a joke.') print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. # all messages from the run print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=55, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` run_result_messages.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.') result = agent.run_sync('Tell me a joke.') print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. # all messages from the run print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=55, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` *(This example is complete, it can be run "as is")* Example of accessing methods on a StreamedRunResult : [Learn about Gateway](https://ai.pydantic.dev/gateway) streamed_run_result_messages.py ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2', instructions='Be a helpful assistant.') async def main(): async with agent.run_stream('Tell me a joke.') as result: # incomplete messages before the stream finishes print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ) ] """ async for text in result.stream_text(): print(text) #> Did you hear #> Did you hear about the toothpaste #> Did you hear about the toothpaste scandal? They called #> Did you hear about the toothpaste scandal? They called it Colgate. # complete messages once the stream finishes print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' 
) ], usage=RequestUsage(input_tokens=50, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` streamed_run_result_messages.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.') async def main(): async with agent.run_stream('Tell me a joke.') as result: # incomplete messages before the stream finishes print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ) ] """ async for text in result.stream_text(): print(text) #> Did you hear #> Did you hear about the toothpaste #> Did you hear about the toothpaste scandal? They called #> Did you hear about the toothpaste scandal? They called it Colgate. # complete messages once the stream finishes print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=50, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### Using Messages as Input for Further Agent Runs The primary use of message histories in Pydantic AI is to maintain context across multiple agent runs. To use existing messages in a run, pass them to the `message_history` parameter of Agent.run, Agent.run_sync or Agent.run_stream. If `message_history` is set and not empty, a new system prompt is not generated — we assume the existing message history includes a system prompt. [Learn about Gateway](https://ai.pydantic.dev/gateway) Reusing messages in a conversation ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2', instructions='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. result2 = agent.run_sync('Explain?', message_history=result1.new_messages()) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. print(result2.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=55, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ UserPromptPart( content='Explain?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.' 
) ], usage=RequestUsage(input_tokens=56, output_tokens=26), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` Reusing messages in a conversation ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. result2 = agent.run_sync('Explain?', message_history=result1.new_messages()) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. print(result2.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=55, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ UserPromptPart( content='Explain?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.' ) ], usage=RequestUsage(input_tokens=56, output_tokens=26), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` *(This example is complete, it can be run "as is")* ## Storing and loading messages (to JSON) While maintaining conversation state in memory is enough for many applications, you may often want to store the message history of an agent run on disk or in a database. This might be for evals, for sharing data between Python and JavaScript/TypeScript, or any number of other use cases. The intended way to do this is to use a `TypeAdapter`. We export ModelMessagesTypeAdapter that can be used for this, or you can create your own. Here's an example showing how: [Learn about Gateway](https://ai.pydantic.dev/gateway) serialize messages to json ```python from pydantic_core import to_jsonable_python from pydantic_ai import ( Agent, ModelMessagesTypeAdapter, # (1)! ) agent = Agent('gateway/openai:gpt-5.2', instructions='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') history_step_1 = result1.all_messages() as_python_objects = to_jsonable_python(history_step_1) # (2)! same_history_as_step_1 = ModelMessagesTypeAdapter.validate_python(as_python_objects) result2 = agent.run_sync( # (3)! 'Tell me a different joke.', message_history=same_history_as_step_1 ) ``` 1. Alternatively, you can create a `TypeAdapter` from scratch: ```python from pydantic import TypeAdapter from pydantic_ai import ModelMessage ModelMessagesTypeAdapter = TypeAdapter(list[ModelMessage]) ``` 1. Alternatively, you can serialize to/from JSON directly: ```python from pydantic_core import to_json ... as_json_objects = to_json(history_step_1) same_history_as_step_1 = ModelMessagesTypeAdapter.validate_json(as_json_objects) ``` 1. You can now continue the conversation with history `same_history_as_step_1` despite creating a new agent run. serialize messages to json ```python from pydantic_core import to_jsonable_python from pydantic_ai import ( Agent, ModelMessagesTypeAdapter, # (1)!
) agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') history_step_1 = result1.all_messages() as_python_objects = to_jsonable_python(history_step_1) # (2)! same_history_as_step_1 = ModelMessagesTypeAdapter.validate_python(as_python_objects) result2 = agent.run_sync( # (3)! 'Tell me a different joke.', message_history=same_history_as_step_1 ) ``` 1. Alternatively, you can create a `TypeAdapter` from scratch: ```python from pydantic import TypeAdapter from pydantic_ai import ModelMessage ModelMessagesTypeAdapter = TypeAdapter(list[ModelMessage]) ``` 1. Alternatively, you can serialize to/from JSON directly: ```python from pydantic_core import to_json ... as_json_objects = to_json(history_step_1) same_history_as_step_1 = ModelMessagesTypeAdapter.validate_json(as_json_objects) ``` 1. You can now continue the conversation with history `same_history_as_step_1` despite creating a new agent run. *(This example is complete, it can be run "as is")* ## Other ways of using messages Since messages are defined by simple dataclasses, you can manually create and manipulate them, e.g. for testing. The message format is independent of the model used, so you can use messages in different agents, or the same agent with different models. In the example below, we reuse the messages from the first agent run, which uses the `openai:gpt-5.2` model, in a second agent run using the `google-gla:gemini-3-pro-preview` model. [Learn about Gateway](https://ai.pydantic.dev/gateway) Reusing messages with a different model ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2', instructions='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. result2 = agent.run_sync( 'Explain?', model='google-gla:gemini-3-pro-preview', message_history=result1.new_messages(), ) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. print(result2.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=55, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ UserPromptPart( content='Explain?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.' ) ], usage=RequestUsage(input_tokens=56, output_tokens=26), model_name='gemini-3-pro-preview', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` Reusing messages with a different model ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. result2 = agent.run_sync( 'Explain?', model='google-gla:gemini-3-pro-preview', message_history=result1.new_messages(), ) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation.
print(result2.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=55, output_tokens=12), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ), ModelRequest( parts=[ UserPromptPart( content='Explain?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), instructions='Be a helpful assistant.', run_id='...', ), ModelResponse( parts=[ TextPart( content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.' ) ], usage=RequestUsage(input_tokens=56, output_tokens=26), model_name='gemini-3-pro-preview', timestamp=datetime.datetime(...), run_id='...', ), ] """ ``` ## Processing Message History Sometimes you may want to modify the message history before it's sent to the model. This could be for privacy reasons (filtering out sensitive information), to save costs on tokens, to give less context to the LLM, or custom processing logic. Pydantic AI provides a `history_processors` parameter on `Agent` that allows you to intercept and modify the message history before each model request. History processors replace the message history History processors replace the message history in the state with the processed messages, including the new user prompt part. This means that if you want to keep the original message history, you need to make a copy of it. ### Usage The `history_processors` is a list of callables that take a list of ModelMessage and return a modified list of the same type. Each processor is applied in sequence, and processors can be either synchronous or asynchronous. 
[Learn about Gateway](https://ai.pydantic.dev/gateway) simple_history_processor.py ```python from pydantic_ai import ( Agent, ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: """Remove all ModelResponse messages, keeping only ModelRequest messages.""" return [msg for msg in messages if isinstance(msg, ModelRequest)] # Create agent with history processor agent = Agent('gateway/openai:gpt-5.2', history_processors=[filter_responses]) # Example: Create some conversation history message_history = [ ModelRequest(parts=[UserPromptPart(content='What is 2+2?')]), ModelResponse(parts=[TextPart(content='2+2 equals 4')]), # This will be filtered out ] # When you run the agent, the history processor will filter out ModelResponse messages # result = agent.run_sync('What about 3+3?', message_history=message_history) ``` simple_history_processor.py ```python from pydantic_ai import ( Agent, ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: """Remove all ModelResponse messages, keeping only ModelRequest messages.""" return [msg for msg in messages if isinstance(msg, ModelRequest)] # Create agent with history processor agent = Agent('openai:gpt-5.2', history_processors=[filter_responses]) # Example: Create some conversation history message_history = [ ModelRequest(parts=[UserPromptPart(content='What is 2+2?')]), ModelResponse(parts=[TextPart(content='2+2 equals 4')]), # This will be filtered out ] # When you run the agent, the history processor will filter out ModelResponse messages # result = agent.run_sync('What about 3+3?', message_history=message_history) ``` #### Keep Only Recent Messages You can use the `history_processor` to only keep the recent messages: [Learn about Gateway](https://ai.pydantic.dev/gateway) keep_recent_messages.py ```python from pydantic_ai import Agent, ModelMessage async def keep_recent_messages(messages: list[ModelMessage]) -> list[ModelMessage]: """Keep only the last 5 messages to manage token usage.""" return messages[-5:] if len(messages) > 5 else messages agent = Agent('gateway/openai:gpt-5.2', history_processors=[keep_recent_messages]) # Example: Even with a long conversation history, only the last 5 messages are sent to the model long_conversation_history: list[ModelMessage] = [] # Your long conversation history here # result = agent.run_sync('What did we discuss?', message_history=long_conversation_history) ``` keep_recent_messages.py ```python from pydantic_ai import Agent, ModelMessage async def keep_recent_messages(messages: list[ModelMessage]) -> list[ModelMessage]: """Keep only the last 5 messages to manage token usage.""" return messages[-5:] if len(messages) > 5 else messages agent = Agent('openai:gpt-5.2', history_processors=[keep_recent_messages]) # Example: Even with a long conversation history, only the last 5 messages are sent to the model long_conversation_history: list[ModelMessage] = [] # Your long conversation history here # result = agent.run_sync('What did we discuss?', message_history=long_conversation_history) ``` Be careful when slicing the message history When slicing the message history, you need to make sure that tool calls and returns are paired, otherwise the LLM may return an error. For more details, refer to [this GitHub issue](https://github.com/pydantic/pydantic-ai/issues/2050#issuecomment-3019976269). 
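One way to respect that constraint is to cut the history only at a user turn, so a tool return is never separated from the tool call that produced it. Here's a minimal sketch of that idea; the `keep_recent_turns` name and the six-message threshold are illustrative, not part of the library:

```python
from pydantic_ai import Agent, ModelMessage, ModelRequest, UserPromptPart


def keep_recent_turns(messages: list[ModelMessage]) -> list[ModelMessage]:
    """Trim old history, but only cut at a user turn so tool calls stay paired with their returns."""
    max_messages = 6  # illustrative threshold
    if len(messages) <= max_messages:
        return messages
    # Walk forward from the desired cut point to the next request that carries a
    # user prompt, so the trimmed history never starts with an orphaned tool return.
    for i in range(len(messages) - max_messages, len(messages)):
        msg = messages[i]
        if isinstance(msg, ModelRequest) and any(
            isinstance(part, UserPromptPart) for part in msg.parts
        ):
            return messages[i:]
    # No safe cut point found in the window: keep everything rather than break pairing.
    return messages


agent = Agent('openai:gpt-5.2', history_processors=[keep_recent_turns])
```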
#### `RunContext` parameter History processors can optionally accept a RunContext parameter to access additional information about the current run, such as dependencies, model information, and usage statistics: [Learn about Gateway](https://ai.pydantic.dev/gateway) context_aware_processor.py ```python from pydantic_ai import Agent, ModelMessage, RunContext def context_aware_processor( ctx: RunContext[None], messages: list[ModelMessage], ) -> list[ModelMessage]: # Access current usage current_tokens = ctx.usage.total_tokens # Filter messages based on context if current_tokens > 1000: return messages[-3:] # Keep only recent messages when token usage is high return messages agent = Agent('gateway/openai:gpt-5.2', history_processors=[context_aware_processor]) ``` context_aware_processor.py ```python from pydantic_ai import Agent, ModelMessage, RunContext def context_aware_processor( ctx: RunContext[None], messages: list[ModelMessage], ) -> list[ModelMessage]: # Access current usage current_tokens = ctx.usage.total_tokens # Filter messages based on context if current_tokens > 1000: return messages[-3:] # Keep only recent messages when token usage is high return messages agent = Agent('openai:gpt-5.2', history_processors=[context_aware_processor]) ``` This allows for more sophisticated message processing based on the current state of the agent run. #### Summarize Old Messages Use an LLM to summarize older messages to preserve context while reducing tokens. [Learn about Gateway](https://ai.pydantic.dev/gateway) summarize_old_messages.py ```python from pydantic_ai import Agent, ModelMessage # Use a cheaper model to summarize old messages. summarize_agent = Agent( 'gateway/openai:gpt-5-mini', instructions=""" Summarize this conversation, omitting small talk and unrelated topics. Focus on the technical discussion and next steps. """, ) async def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]: # Summarize the oldest 10 messages if len(messages) > 10: oldest_messages = messages[:10] summary = await summarize_agent.run(message_history=oldest_messages) # Return the last message and the summary return summary.new_messages() + messages[-1:] return messages agent = Agent('gateway/openai:gpt-5.2', history_processors=[summarize_old_messages]) ``` summarize_old_messages.py ```python from pydantic_ai import Agent, ModelMessage # Use a cheaper model to summarize old messages. summarize_agent = Agent( 'openai:gpt-5-mini', instructions=""" Summarize this conversation, omitting small talk and unrelated topics. Focus on the technical discussion and next steps. """, ) async def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]: # Summarize the oldest 10 messages if len(messages) > 10: oldest_messages = messages[:10] summary = await summarize_agent.run(message_history=oldest_messages) # Return the last message and the summary return summary.new_messages() + messages[-1:] return messages agent = Agent('openai:gpt-5.2', history_processors=[summarize_old_messages]) ``` Be careful when summarizing the message history When summarizing the message history, you need to make sure that tool calls and returns are paired, otherwise the LLM may return an error. For more details, refer to [this GitHub issue](https://github.com/pydantic/pydantic-ai/issues/2050#issuecomment-3019976269), where you can find examples of summarizing the message history. 
### Testing History Processors You can test what messages are actually sent to the model provider using FunctionModel: test_history_processor.py ```python import pytest from pydantic_ai import ( Agent, ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) from pydantic_ai.models.function import AgentInfo, FunctionModel @pytest.fixture def received_messages() -> list[ModelMessage]: return [] @pytest.fixture def function_model(received_messages: list[ModelMessage]) -> FunctionModel: def capture_model_function(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse: # Capture the messages that the provider actually receives received_messages.clear() received_messages.extend(messages) return ModelResponse(parts=[TextPart(content='Provider response')]) return FunctionModel(capture_model_function) def test_history_processor(function_model: FunctionModel, received_messages: list[ModelMessage]): def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: return [msg for msg in messages if isinstance(msg, ModelRequest)] agent = Agent(function_model, history_processors=[filter_responses]) message_history = [ ModelRequest(parts=[UserPromptPart(content='Question 1')]), ModelResponse(parts=[TextPart(content='Answer 1')]), ] agent.run_sync('Question 2', message_history=message_history) assert received_messages == [ ModelRequest(parts=[UserPromptPart(content='Question 1')]), ModelRequest(parts=[UserPromptPart(content='Question 2')]), ] ``` ### Multiple Processors You can also use multiple processors: [Learn about Gateway](https://ai.pydantic.dev/gateway) multiple_history_processors.py ```python from pydantic_ai import Agent, ModelMessage, ModelRequest def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: return [msg for msg in messages if isinstance(msg, ModelRequest)] def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]: return messages[-5:] agent = Agent('gateway/openai:gpt-5.2', history_processors=[filter_responses, summarize_old_messages]) ``` multiple_history_processors.py ```python from pydantic_ai import Agent, ModelMessage, ModelRequest def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: return [msg for msg in messages if isinstance(msg, ModelRequest)] def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]: return messages[-5:] agent = Agent('openai:gpt-5.2', history_processors=[filter_responses, summarize_old_messages]) ``` In this case, the `filter_responses` processor will be applied first, and the `summarize_old_messages` processor will be applied second. ## Examples For a more complete example of using messages in conversations, see the [chat app](https://ai.pydantic.dev/examples/chat-app/index.md) example. # Multi-agent Applications There are roughly five levels of complexity when building applications with Pydantic AI: 1. Single agent workflows — what most of the `pydantic_ai` documentation covers 1. [Agent delegation](#agent-delegation) — agents using another agent via tools 1. [Programmatic agent hand-off](#programmatic-agent-hand-off) — one agent runs, then application code calls another agent 1. [Graph based control flow](https://ai.pydantic.dev/graph/index.md) — for the most complex cases, a graph-based state machine can be used to control the execution of multiple agents 1. 
[Deep Agents](#deep-agents) — autonomous agents with planning, file operations, task delegation, and sandboxed code execution Of course, you can combine multiple strategies in a single application. ## Agent delegation "Agent delegation" refers to the scenario where an agent delegates work to another agent, then takes back control when the delegate agent (the agent called from within a tool) finishes. If you want to hand off control to another agent completely, without coming back to the first agent, you can use an [output function](https://ai.pydantic.dev/output/#output-functions). Since agents are stateless and designed to be global, you do not need to include the agent itself in agent [dependencies](https://ai.pydantic.dev/dependencies/index.md). You'll generally want to pass ctx.usage to the usage keyword argument of the delegate agent run so usage within that run counts towards the total usage of the parent agent run. Multiple models Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final result.usage() of the run will not be possible, but you can still use UsageLimits — including `request_limit`, `total_tokens_limit`, and `tool_calls_limit` — to avoid unexpected costs or runaway tool loops. [Learn about Gateway](https://ai.pydantic.dev/gateway) agent_delegation_simple.py ```python from pydantic_ai import Agent, RunContext, UsageLimits joke_selection_agent = Agent( # (1)! 'gateway/openai:gpt-5.2', instructions=( 'Use the `joke_factory` to generate some jokes, then choose the best. ' 'You must return just a single joke.' ), ) joke_generation_agent = Agent( # (2)! 'gateway/gemini:gemini-3-flash-preview', output_type=list[str] ) @joke_selection_agent.tool async def joke_factory(ctx: RunContext[None], count: int) -> list[str]: r = await joke_generation_agent.run( # (3)! f'Please generate {count} jokes.', usage=ctx.usage, # (4)! ) return r.output # (5)! result = joke_selection_agent.run_sync( 'Tell me a joke.', usage_limits=UsageLimits(request_limit=5, total_tokens_limit=500), ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. print(result.usage()) #> RunUsage(input_tokens=166, output_tokens=24, requests=3, tool_calls=1) ``` 1. The "parent" or controlling agent. 1. The "delegate" agent, which is called from within a tool of the parent agent. 1. Call the delegate agent from within a tool of the parent agent. 1. Pass the usage from the parent agent to the delegate agent so the final result.usage() includes the usage from both agents. 1. Since the function returns `list[str]`, and the `output_type` of `joke_generation_agent` is also `list[str]`, we can simply return `r.output` from the tool. agent_delegation_simple.py ```python from pydantic_ai import Agent, RunContext, UsageLimits joke_selection_agent = Agent( # (1)! 'openai:gpt-5.2', instructions=( 'Use the `joke_factory` to generate some jokes, then choose the best. ' 'You must return just a single joke.' ), ) joke_generation_agent = Agent( # (2)! 'google-gla:gemini-3-flash-preview', output_type=list[str] ) @joke_selection_agent.tool async def joke_factory(ctx: RunContext[None], count: int) -> list[str]: r = await joke_generation_agent.run( # (3)! f'Please generate {count} jokes.', usage=ctx.usage, # (4)! ) return r.output # (5)! 
result = joke_selection_agent.run_sync( 'Tell me a joke.', usage_limits=UsageLimits(request_limit=5, total_tokens_limit=500), ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. print(result.usage()) #> RunUsage(input_tokens=166, output_tokens=24, requests=3, tool_calls=1) ``` 1. The "parent" or controlling agent. 1. The "delegate" agent, which is called from within a tool of the parent agent. 1. Call the delegate agent from within a tool of the parent agent. 1. Pass the usage from the parent agent to the delegate agent so the final result.usage() includes the usage from both agents. 1. Since the function returns `list[str]`, and the `output_type` of `joke_generation_agent` is also `list[str]`, we can simply return `r.output` from the tool. *(This example is complete, it can be run "as is")* The control flow for this example is pretty simple and can be summarised as follows: ``` graph TD START --> joke_selection_agent joke_selection_agent --> joke_factory["joke_factory (tool)"] joke_factory --> joke_generation_agent joke_generation_agent --> joke_factory joke_factory --> joke_selection_agent joke_selection_agent --> END ``` ### Agent delegation and dependencies Generally, the delegate agent needs to either have the same [dependencies](https://ai.pydantic.dev/dependencies/index.md) as the calling agent, or dependencies which are a subset of the calling agent's dependencies. Initializing dependencies We say "generally" above since there's nothing to stop you initializing dependencies within a tool call and therefore using dependencies in a delegate agent that are not available on the parent; this should often be avoided, since it can be significantly slower than reusing connections etc. from the parent agent. [Learn about Gateway](https://ai.pydantic.dev/gateway) agent_delegation_deps.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class ClientAndKey: # (1)! http_client: httpx.AsyncClient api_key: str joke_selection_agent = Agent( 'gateway/openai:gpt-5.2', deps_type=ClientAndKey, # (2)! instructions=( 'Use the `joke_factory` tool to generate some jokes on the given subject, ' 'then choose the best. You must return just a single joke.' ), ) joke_generation_agent = Agent( 'gateway/gemini:gemini-3-flash-preview', deps_type=ClientAndKey, # (4)! output_type=list[str], instructions=( 'Use the "get_jokes" tool to get some jokes on the given subject, ' 'then extract each joke into a list.' ), ) @joke_selection_agent.tool async def joke_factory(ctx: RunContext[ClientAndKey], count: int) -> list[str]: r = await joke_generation_agent.run( f'Please generate {count} jokes.', deps=ctx.deps, # (3)! usage=ctx.usage, ) return r.output @joke_generation_agent.tool # (5)! async def get_jokes(ctx: RunContext[ClientAndKey], count: int) -> str: response = await ctx.deps.http_client.get( 'https://example.com', params={'count': count}, headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, ) response.raise_for_status() return response.text async def main(): async with httpx.AsyncClient() as client: deps = ClientAndKey(client, 'foobar') result = await joke_selection_agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. print(result.usage()) # (6)! #> RunUsage(input_tokens=221, output_tokens=32, requests=4, tool_calls=2) ``` 1. Define a dataclass to hold the client and API key dependencies. 1.
Set the `deps_type` of the calling agent — `joke_selection_agent` here. 1. Pass the dependencies to the delegate agent's run method within the tool call. 1. Also set the `deps_type` of the delegate agent — `joke_generation_agent` here. 1. Define a tool on the delegate agent that uses the dependencies to make an HTTP request. 1. Usage now includes 4 requests — 2 from the calling agent and 2 from the delegate agent. agent_delegation_deps.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class ClientAndKey: # (1)! http_client: httpx.AsyncClient api_key: str joke_selection_agent = Agent( 'openai:gpt-5.2', deps_type=ClientAndKey, # (2)! instructions=( 'Use the `joke_factory` tool to generate some jokes on the given subject, ' 'then choose the best. You must return just a single joke.' ), ) joke_generation_agent = Agent( 'google-gla:gemini-3-flash-preview', deps_type=ClientAndKey, # (4)! output_type=list[str], instructions=( 'Use the "get_jokes" tool to get some jokes on the given subject, ' 'then extract each joke into a list.' ), ) @joke_selection_agent.tool async def joke_factory(ctx: RunContext[ClientAndKey], count: int) -> list[str]: r = await joke_generation_agent.run( f'Please generate {count} jokes.', deps=ctx.deps, # (3)! usage=ctx.usage, ) return r.output @joke_generation_agent.tool # (5)! async def get_jokes(ctx: RunContext[ClientAndKey], count: int) -> str: response = await ctx.deps.http_client.get( 'https://example.com', params={'count': count}, headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, ) response.raise_for_status() return response.text async def main(): async with httpx.AsyncClient() as client: deps = ClientAndKey(client, 'foobar') result = await joke_selection_agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. print(result.usage()) # (6)! #> RunUsage(input_tokens=221, output_tokens=32, requests=4, tool_calls=2) ``` 1. Define a dataclass to hold the client and API key dependencies. 1. Set the `deps_type` of the calling agent — `joke_selection_agent` here. 1. Pass the dependencies to the delegate agent's run method within the tool call. 1. Also set the `deps_type` of the delegate agent — `joke_generation_agent` here. 1. Define a tool on the delegate agent that uses the dependencies to make an HTTP request. 1. Usage now includes 4 requests — 2 from the calling agent and 2 from the delegate agent. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* This example shows how even a fairly simple agent delegation can lead to a complex control flow: ``` graph TD START --> joke_selection_agent joke_selection_agent --> joke_factory["joke_factory (tool)"] joke_factory --> joke_generation_agent joke_generation_agent --> get_jokes["get_jokes (tool)"] get_jokes --> http_request["HTTP request"] http_request --> get_jokes get_jokes --> joke_generation_agent joke_generation_agent --> joke_factory joke_factory --> joke_selection_agent joke_selection_agent --> END ``` ## Programmatic agent hand-off "Programmatic agent hand-off" refers to the scenario where multiple agents are called in succession, with application code and/or a human in the loop responsible for deciding which agent to call next. Here agents don't need to use the same deps. Here we show two agents used in succession, the first to find a flight and the second to extract the user's seat preference. 
programmatic_handoff.py ```python from typing import Literal from pydantic import BaseModel, Field from rich.prompt import Prompt from pydantic_ai import Agent, ModelMessage, RunContext, RunUsage, UsageLimits class FlightDetails(BaseModel): flight_number: str class Failed(BaseModel): """Unable to find a satisfactory choice.""" flight_search_agent = Agent[None, FlightDetails | Failed]( # (1)! 'openai:gpt-5.2', output_type=FlightDetails | Failed, # type: ignore instructions=( 'Use the "flight_search" tool to find a flight ' 'from the given origin to the given destination.' ), ) @flight_search_agent.tool # (2)! async def flight_search( ctx: RunContext[None], origin: str, destination: str ) -> FlightDetails | None: # in reality, this would call a flight search API or # use a browser to scrape a flight search website return FlightDetails(flight_number='AK456') usage_limits = UsageLimits(request_limit=15) # (3)! async def find_flight(usage: RunUsage) -> FlightDetails | None: # (4)! message_history: list[ModelMessage] | None = None for _ in range(3): prompt = Prompt.ask( 'Where would you like to fly from and to?', ) result = await flight_search_agent.run( prompt, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, FlightDetails): return result.output else: message_history = result.all_messages( output_tool_return_content='Please try again.' ) class SeatPreference(BaseModel): row: int = Field(ge=1, le=30) seat: Literal['A', 'B', 'C', 'D', 'E', 'F'] # This agent is responsible for extracting the user's seat selection seat_preference_agent = Agent[None, SeatPreference | Failed]( # (5)! 'openai:gpt-5.2', output_type=SeatPreference | Failed, # type: ignore instructions=( "Extract the user's seat preference. " 'Seats A and F are window seats. ' 'Row 1 is the front row and has extra leg room. ' 'Rows 14, and 20 also have extra leg room. ' ), ) async def find_seat(usage: RunUsage) -> SeatPreference: # (6)! message_history: list[ModelMessage] | None = None while True: answer = Prompt.ask('What seat would you like?') result = await seat_preference_agent.run( answer, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, SeatPreference): return result.output else: print('Could not understand seat preference. Please try again.') message_history = result.all_messages() async def main(): # (7)! usage: RunUsage = RunUsage() opt_flight_details = await find_flight(usage) if opt_flight_details is not None: print(f'Flight found: {opt_flight_details.flight_number}') #> Flight found: AK456 seat_preference = await find_seat(usage) print(f'Seat preference: {seat_preference}') #> Seat preference: row=1 seat='A' ``` 1. Define the first agent, which finds a flight. We use an explicit type annotation until [PEP-747](https://peps.python.org/pep-0747/) lands, see [structured output](https://ai.pydantic.dev/output/#structured-output). We use a union as the output type so the model can communicate if it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool. 1. Define a tool on the agent to find a flight. In this simple case we could dispense with the tool and just define the agent to return structured data, then search for a flight, but in more complex scenarios the tool would be necessary. 1. Define usage limits for the entire app. 1. Define a function to find a flight, which asks the user for their preferences and then calls the agent to find a flight. 1. 
As with `flight_search_agent` above, we use an explicit type annotation to define the agent. 1. Define a function to find the user's seat preference, which asks the user for their seat preference and then calls the agent to extract the seat preference. 1. Now that we've put our logic for running each agent into separate functions, our main app becomes very simple. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* The control flow for this example can be summarised as follows: ``` graph TB START --> ask_user_flight["ask user for flight"] subgraph find_flight flight_search_agent --> ask_user_flight ask_user_flight --> flight_search_agent end flight_search_agent --> ask_user_seat["ask user for seat"] flight_search_agent --> END subgraph find_seat seat_preference_agent --> ask_user_seat ask_user_seat --> seat_preference_agent end seat_preference_agent --> END ``` ## Pydantic Graphs See the [graph](https://ai.pydantic.dev/graph/index.md) documentation on when and how to use graphs. ## Deep Agents Deep agents are autonomous agents that combine multiple architectural patterns and capabilities to handle complex, multi-step tasks reliably. These patterns can be implemented using Pydantic AI's built-in features and (third-party) toolsets: - **Planning and progress tracking** — agents break down complex tasks into steps and track their progress, giving users visibility into what the agent is working on. See [Task Management toolsets](https://ai.pydantic.dev/toolsets/#task-management). - **File system operations** — reading, writing, and editing files with proper abstraction layers that work across in-memory storage, real file systems, and sandboxed containers. See [File Operations toolsets](https://ai.pydantic.dev/toolsets/#file-operations). - **Task delegation** — spawning specialized sub-agents for specific tasks, with isolated context to prevent recursive delegation issues. See [Agent Delegation](#agent-delegation) above. - **Sandboxed code execution** — running AI-generated code in isolated environments (typically Docker containers) to prevent accidents. See [Code Execution toolsets](https://ai.pydantic.dev/toolsets/#code-execution). - **Context management** — automatic conversation summarization to handle long sessions that would otherwise exceed token limits. See [Processing Message History](https://ai.pydantic.dev/message-history/#processing-message-history). - **Human-in-the-loop** — approval workflows for dangerous operations like code execution or file deletion. See [Requiring Tool Approval](https://ai.pydantic.dev/toolsets/#requiring-tool-approval). - **Durable execution** — preserving agent state across transient API failures and application errors or restarts. See [Durable Execution](https://ai.pydantic.dev/durable_execution/overview/index.md). In addition, the community maintains packages that bring these concepts together in a more opinionated way: - [`pydantic-deep`](https://github.com/vstorm-co/pydantic-deepagents) by [Vstorm](https://vstorm.co/) ## Observing Multi-Agent Systems Multi-agent systems can be challenging to debug due to their complexity; when multiple agents interact, understanding the flow of execution becomes essential. ### Tracing Agent Delegation With [Logfire](https://ai.pydantic.dev/logfire/index.md), you can trace the entire flow across multiple agents: ```python import logfire logfire.configure() logfire.instrument_pydantic_ai() # Your multi-agent code here... 
``` Logfire shows you: - **Which agent handled which part** of the request - **Delegation decisions**—when and why one agent called another - **End-to-end latency** broken down by agent - **Token usage and costs** per agent - **What triggered the agent run**—the HTTP request, scheduled job, or user action that started it all - **What happened inside tool calls**—database queries, HTTP requests, file operations, and any other instrumented code that tools execute This is essential for understanding and optimizing complex agent workflows. When something goes wrong in a multi-agent system, you'll see exactly which agent failed and what it was trying to do, and whether the problem was in the agent's reasoning or in the backend systems it called. ### Full-Stack Visibility If your Pydantic AI application includes a TypeScript frontend, API gateway, or services in other languages, Logfire can trace them too—Logfire provides SDKs for Python, JavaScript/TypeScript, and Rust, plus compatibility with any OpenTelemetry-instrumented application. See traces from your entire stack in a unified view. For details on sending data from other languages using standard OpenTelemetry, see the [alternative clients guide](https://logfire.pydantic.dev/docs/how-to-guides/alternative-clients/). Pydantic AI's instrumentation is built on [OpenTelemetry](https://opentelemetry.io/), so you can also use any OTel-compatible backend. See the [Logfire integration guide](https://ai.pydantic.dev/logfire/index.md) for details. ## Examples The following examples demonstrate how to use multi-agent patterns in Pydantic AI: - [Flight booking](https://ai.pydantic.dev/examples/flight-booking/index.md) # Thinking Thinking (or reasoning) is the process by which a model works through a problem step-by-step before providing its final answer. This capability is typically disabled by default and depends on the specific model being used. See the sections below for how to enable thinking for each provider. ## OpenAI When using the OpenAIChatModel, text output inside `<think>` tags is converted to ThinkingPart objects. You can customize the tags using the thinking_tags field on the [model profile](https://ai.pydantic.dev/models/openai/#model-profile). Some [OpenAI-compatible model providers](https://ai.pydantic.dev/models/openai/#openai-compatible-models) might also support native thinking parts that are not delimited by tags. Instead, they are sent and received as separate, custom fields in the API. Typically, if you are calling the model via the `<provider>:<model>` shorthand, Pydantic AI handles it for you. Nonetheless, you can still configure the fields with openai_chat_thinking_field. If your provider recommends sending these custom fields back unchanged, for example for caching or interleaved thinking benefits, you can achieve this with openai_chat_send_back_thinking_parts. ### OpenAI Responses The OpenAIResponsesModel can generate native thinking parts. To enable this functionality, you need to set the OpenAIResponsesModelSettings.openai_reasoning_effort and OpenAIResponsesModelSettings.openai_reasoning_summary [model settings](https://ai.pydantic.dev/agent/#model-run-settings).
By default, the unique IDs of reasoning, text, and function call parts from the message history are sent to the model, which can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](https://ai.pydantic.dev/message-history/#processing-message-history). To disable this, you can disable the OpenAIResponsesModelSettings.openai_send_reasoning_ids [model setting](https://ai.pydantic.dev/agent/#model-run-settings). openai_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings model = OpenAIResponsesModel('gpt-5.2') settings = OpenAIResponsesModelSettings( openai_reasoning_effort='low', openai_reasoning_summary='detailed', ) agent = Agent(model, model_settings=settings) ... ``` Raw reasoning without summaries Some OpenAI-compatible APIs (such as LM Studio, vLLM, or OpenRouter with gpt-oss models) may return raw reasoning content without reasoning summaries. In this case, ThinkingPart.content will be empty, but the raw reasoning is available in `provider_details['raw_content']`. Following [OpenAI's guidance](https://cookbook.openai.com/examples/responses_api/reasoning_items) that raw reasoning should not be shown directly to users, we store it in `provider_details` rather than in the main `content` field. ## Anthropic To enable thinking, use the AnthropicModelSettings.anthropic_thinking [model setting](https://ai.pydantic.dev/agent/#model-run-settings). Note Extended thinking (`type: 'enabled'` with `budget_tokens`) is deprecated on `claude-opus-4-6`+. For those models, use [adaptive thinking](#adaptive-thinking--effort) instead. anthropic_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings model = AnthropicModel('claude-sonnet-4-5') settings = AnthropicModelSettings( anthropic_thinking={'type': 'enabled', 'budget_tokens': 1024}, ) agent = Agent(model, model_settings=settings) ... ``` ### Interleaved Thinking To enable [interleaved thinking](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#interleaved-thinking), you need to include the beta header in your model settings: anthropic_interleaved_thinking.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings model = AnthropicModel('claude-sonnet-4-5') settings = AnthropicModelSettings( anthropic_thinking={'type': 'enabled', 'budget_tokens': 10000}, extra_headers={'anthropic-beta': 'interleaved-thinking-2025-05-14'}, ) agent = Agent(model, model_settings=settings) ... ``` ### Adaptive Thinking & Effort Starting with `claude-opus-4-6`, Anthropic supports [adaptive thinking](https://docs.anthropic.com/en/docs/build-with-claude/adaptive-thinking), where the model dynamically decides when and how much to think based on the complexity of each request. This replaces extended thinking (`type: 'enabled'` with `budget_tokens`) which is deprecated on Opus 4.6. Adaptive thinking also automatically enables interleaved thinking. 
anthropic_adaptive_thinking.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings model = AnthropicModel('claude-opus-4-6') settings = AnthropicModelSettings( anthropic_thinking={'type': 'adaptive'}, anthropic_effort='high', ) agent = Agent(model, model_settings=settings) ... ``` The anthropic_effort setting controls how much effort the model puts into its response (independent of thinking). See the [Anthropic effort docs](https://docs.anthropic.com/en/docs/build-with-claude/effort) for details. Note Older models (`claude-sonnet-4-5`, `claude-opus-4-5`, etc.) do not support adaptive thinking and require `{'type': 'enabled', 'budget_tokens': N}` as shown [above](#anthropic). ## Google To enable thinking, use the GoogleModelSettings.google_thinking_config [model setting](https://ai.pydantic.dev/agent/#model-run-settings). google_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings model = GoogleModel('gemini-3-pro-preview') settings = GoogleModelSettings(google_thinking_config={'include_thoughts': True}) agent = Agent(model, model_settings=settings) ... ``` ## xAI xAI reasoning models (Grok) support native thinking. To preserve the thinking content for multi-turn conversations, enable XaiModelSettings.xai_include_encrypted_content. xai_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.xai import XaiModel, XaiModelSettings model = XaiModel('grok-4-fast-reasoning') settings = XaiModelSettings(xai_include_encrypted_content=True) agent = Agent(model, model_settings=settings) ... ``` ## Bedrock Although Bedrock Converse doesn't provide a unified API to enable thinking, you can still use BedrockModelSettings.bedrock_additional_model_requests_fields [model setting](https://ai.pydantic.dev/agent/#model-run-settings) to pass provider-specific configuration: bedrock_claude_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings model = BedrockConverseModel('us.anthropic.claude-sonnet-4-5-20250929-v1:0') model_settings = BedrockModelSettings( bedrock_additional_model_requests_fields={ 'thinking': {'type': 'enabled', 'budget_tokens': 1024} } ) agent = Agent(model=model, model_settings=model_settings) ``` bedrock_openai_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings model = BedrockConverseModel('openai.gpt-oss-120b-1:0') model_settings = BedrockModelSettings( bedrock_additional_model_requests_fields={'reasoning_effort': 'low'} ) agent = Agent(model=model, model_settings=model_settings) ``` bedrock_qwen_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings model = BedrockConverseModel('qwen.qwen3-32b-v1:0') model_settings = BedrockModelSettings( bedrock_additional_model_requests_fields={'reasoning_config': 'high'} ) agent = Agent(model=model, model_settings=model_settings) ``` Reasoning is [always enabled](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-reasoning.html) for Deepseek model bedrock_deepseek_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel model = BedrockConverseModel('us.deepseek.r1-v1:0') agent = Agent(model=model) ``` ## Groq Groq supports different formats to receive thinking 
parts: - `"raw"`: The thinking part is included in the text content inside `<think>` tags, which are automatically converted to ThinkingPart objects. - `"hidden"`: The thinking part is not included in the text content. - `"parsed"`: The thinking part has its own structured part in the response which is converted into a ThinkingPart object. To enable thinking, use the GroqModelSettings.groq_reasoning_format [model setting](https://ai.pydantic.dev/agent/#model-run-settings): groq_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel, GroqModelSettings model = GroqModel('qwen-qwq-32b') settings = GroqModelSettings(groq_reasoning_format='parsed') agent = Agent(model, model_settings=settings) ... ``` ## OpenRouter To enable thinking, use the OpenRouterModelSettings.openrouter_reasoning [model setting](https://ai.pydantic.dev/agent/#model-run-settings). openrouter_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.openrouter import OpenRouterModel, OpenRouterModelSettings model = OpenRouterModel('openai/gpt-5.2') settings = OpenRouterModelSettings(openrouter_reasoning={'effort': 'high'}) agent = Agent(model, model_settings=settings) ... ``` ## Mistral Thinking is supported by the `magistral` family of models. It does not need to be specifically enabled. ## Cohere Thinking is supported by the `command-a-reasoning-08-2025` model. It does not need to be specifically enabled. ## Hugging Face Text output inside `<think>` tags is automatically converted to ThinkingPart objects. You can customize the tags using the thinking_tags field on the [model profile](https://ai.pydantic.dev/models/openai/#model-profile). ## Outlines Some local models run through Outlines include in their text output a thinking part delimited by tags. In that case, Pydantic AI will separate the thinking part from the final answer automatically, without thinking needing to be specifically enabled. The thinking tags used by default are `"<think>"` and `"</think>"`. If your model uses different tags, you can specify them in the [model profile](https://ai.pydantic.dev/models/openai/#model-profile) using the thinking_tags field. Outlines currently does not support thinking along with structured output. If you provide an `output_type`, the model text output will not contain a thinking part with the associated tags, and you may experience degraded performance. # Third-Party Tools Pydantic AI supports integration with various third-party tool libraries, allowing you to leverage existing tool ecosystems in your agents. ## MCP Tools See the [MCP Client](https://ai.pydantic.dev/mcp/client/index.md) documentation for how to use MCP servers with Pydantic AI as [toolsets](https://ai.pydantic.dev/toolsets/index.md). ## LangChain Tools If you'd like to use a tool from LangChain's [community tool library](https://python.langchain.com/docs/integrations/tools/) with Pydantic AI, you can use the tool_from_langchain convenience method. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the LangChain tool, and up to the LangChain tool to raise an error if the arguments are invalid. You will need to install the `langchain-community` package and any others required by the tool in question.
Here is how you can use the LangChain `DuckDuckGoSearchRun` tool, which requires the `ddgs` package: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from langchain_community.tools import DuckDuckGoSearchRun from pydantic_ai import Agent from pydantic_ai.ext.langchain import tool_from_langchain search = DuckDuckGoSearchRun() search_tool = tool_from_langchain(search) agent = Agent( 'gateway/gemini:gemini-3-flash-preview', tools=[search_tool], ) result = agent.run_sync('What is the release date of Elden Ring Nightreign?') # (1)! print(result.output) #> Elden Ring Nightreign is planned to be released on May 30, 2025. ``` 1. The release date of this game is the 30th of May 2025, which is after the knowledge cutoff for Gemini 2.0 (August 2024). ```python from langchain_community.tools import DuckDuckGoSearchRun from pydantic_ai import Agent from pydantic_ai.ext.langchain import tool_from_langchain search = DuckDuckGoSearchRun() search_tool = tool_from_langchain(search) agent = Agent( 'google-gla:gemini-3-flash-preview', tools=[search_tool], ) result = agent.run_sync('What is the release date of Elden Ring Nightreign?') # (1)! print(result.output) #> Elden Ring Nightreign is planned to be released on May 30, 2025. ``` 1. The release date of this game is the 30th of May 2025, which is after the knowledge cutoff for Gemini 2.0 (August 2024). If you'd like to use multiple LangChain tools or a LangChain [toolkit](https://python.langchain.com/docs/concepts/tools/#toolkits), you can use the LangChainToolset [toolset](https://ai.pydantic.dev/toolsets/index.md) which takes a list of LangChain tools: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from langchain_community.agent_toolkits import SlackToolkit from pydantic_ai import Agent from pydantic_ai.ext.langchain import LangChainToolset toolkit = SlackToolkit() toolset = LangChainToolset(toolkit.get_tools()) agent = Agent('gateway/openai:gpt-5.2', toolsets=[toolset]) # ... ``` ```python from langchain_community.agent_toolkits import SlackToolkit from pydantic_ai import Agent from pydantic_ai.ext.langchain import LangChainToolset toolkit = SlackToolkit() toolset = LangChainToolset(toolkit.get_tools()) agent = Agent('openai:gpt-5.2', toolsets=[toolset]) # ... ``` ## ACI.dev Tools If you'd like to use a tool from the [ACI.dev tool library](https://www.aci.dev/tools) with Pydantic AI, you can use the tool_from_aci convenience method. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the ACI tool, and up to the ACI tool to raise an error if the arguments are invalid. You will need to install the `aci-sdk` package, set your ACI API key in the `ACI_API_KEY` environment variable, and pass your ACI "linked account owner ID" to the function. Here is how you can use the ACI.dev `TAVILY__SEARCH` tool: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python import os from pydantic_ai import Agent from pydantic_ai.ext.aci import tool_from_aci tavily_search = tool_from_aci( 'TAVILY__SEARCH', linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'), ) agent = Agent( 'gateway/gemini:gemini-3-flash-preview', tools=[tavily_search], ) result = agent.run_sync('What is the release date of Elden Ring Nightreign?') # (1)! print(result.output) #> Elden Ring Nightreign is planned to be released on May 30, 2025. ``` 1. The release date of this game is the 30th of May 2025, which is after the knowledge cutoff for Gemini 2.0 (August 2024). 
```python import os from pydantic_ai import Agent from pydantic_ai.ext.aci import tool_from_aci tavily_search = tool_from_aci( 'TAVILY__SEARCH', linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'), ) agent = Agent( 'google-gla:gemini-3-flash-preview', tools=[tavily_search], ) result = agent.run_sync('What is the release date of Elden Ring Nightreign?') # (1)! print(result.output) #> Elden Ring Nightreign is planned to be released on May 30, 2025. ``` 1. The release date of this game is the 30th of May 2025, which is after the knowledge cutoff for Gemini 2.0 (August 2024). If you'd like to use multiple ACI.dev tools, you can use the ACIToolset [toolset](https://ai.pydantic.dev/toolsets/index.md) which takes a list of ACI tool names as well as the `linked_account_owner_id`: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python import os from pydantic_ai import Agent from pydantic_ai.ext.aci import ACIToolset toolset = ACIToolset( [ 'OPEN_WEATHER_MAP__CURRENT_WEATHER', 'OPEN_WEATHER_MAP__FORECAST', ], linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'), ) agent = Agent('gateway/openai:gpt-5.2', toolsets=[toolset]) ``` ```python import os from pydantic_ai import Agent from pydantic_ai.ext.aci import ACIToolset toolset = ACIToolset( [ 'OPEN_WEATHER_MAP__CURRENT_WEATHER', 'OPEN_WEATHER_MAP__FORECAST', ], linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'), ) agent = Agent('openai:gpt-5.2', toolsets=[toolset]) ``` ## See Also - [Function Tools](https://ai.pydantic.dev/tools/index.md) - Basic tool concepts and registration - [Toolsets](https://ai.pydantic.dev/toolsets/index.md) - Managing collections of tools - [MCP Client](https://ai.pydantic.dev/mcp/client/index.md) - Using MCP servers with Pydantic AI - [LangChain Toolsets](https://ai.pydantic.dev/toolsets/#langchain-tools) - Using LangChain toolsets - [ACI.dev Toolsets](https://ai.pydantic.dev/toolsets/#aci-tools) - Using ACI.dev toolsets # Advanced Tool Features This page covers advanced features for function tools in Pydantic AI. For basic tool usage, see the [Function Tools](https://ai.pydantic.dev/tools/index.md) documentation. ## Tool Output Tools can return anything that Pydantic can serialize to JSON, as well as audio, video, image or document content depending on the types of [multi-modal input](https://ai.pydantic.dev/input/index.md) the model supports: function_tool_output.py ```python from datetime import datetime from pydantic import BaseModel from pydantic_ai import Agent, DocumentUrl, ImageUrl from pydantic_ai.models.openai import OpenAIResponsesModel class User(BaseModel): name: str age: int agent = Agent(model=OpenAIResponsesModel('gpt-5.2')) @agent.tool_plain def get_current_time() -> datetime: return datetime.now() @agent.tool_plain def get_user() -> User: return User(name='John', age=30) @agent.tool_plain def get_company_logo() -> ImageUrl: return ImageUrl(url='https://iili.io/3Hs4FMg.png') @agent.tool_plain def get_document() -> DocumentUrl: return DocumentUrl(url='https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf') result = agent.run_sync('What time is it?') print(result.output) #> The current time is 10:45 PM on April 17, 2025. result = agent.run_sync('What is the user name?') print(result.output) #> The user's name is John. result = agent.run_sync('What is the company name in the logo?') print(result.output) #> The company name in the logo is "Pydantic." 
result = agent.run_sync('What is the main content of the document?') print(result.output) #> The document contains just the text "Dummy PDF file." ``` *(This example is complete, it can be run "as is")* Some models (e.g. Gemini) natively support semi-structured return values, while some expect text (OpenAI) but seem to be just as good at extracting meaning from the data. If a Python object is returned and the model expects a string, the value will be serialized to JSON. ### Advanced Tool Returns For scenarios where you need more control over both the tool's return value and the content sent to the model, you can use ToolReturn. This is particularly useful when you want to: - Provide rich multi-modal content (images, documents, etc.) to the model as context - Separate the programmatic return value from the model's context - Include additional metadata that shouldn't be sent to the LLM Here's an example of a computer automation tool that captures screenshots and provides visual feedback: [Learn about Gateway](https://ai.pydantic.dev/gateway) advanced_tool_return.py ```python import time from pydantic_ai import Agent from pydantic_ai import ToolReturn, BinaryContent agent = Agent('gateway/openai:gpt-5.2') @agent.tool_plain def click_and_capture(x: int, y: int) -> ToolReturn: """Click at coordinates and show before/after screenshots.""" # Take screenshot before action before_screenshot = capture_screen() # Perform click operation perform_click(x, y) time.sleep(0.5) # Wait for UI to update # Take screenshot after action after_screenshot = capture_screen() return ToolReturn( return_value=f'Successfully clicked at ({x}, {y})', content=[ f'Clicked at coordinates ({x}, {y}). Here\'s the comparison:', 'Before:', BinaryContent(data=before_screenshot, media_type='image/png'), 'After:', BinaryContent(data=after_screenshot, media_type='image/png'), 'Please analyze the changes and suggest next steps.' ], metadata={ 'coordinates': {'x': x, 'y': y}, 'action_type': 'click_and_capture', 'timestamp': time.time() } ) # The model receives the rich visual content for analysis # while your application can access the structured return_value and metadata result = agent.run_sync('Click on the submit button and tell me what happened') print(result.output) # The model can analyze the screenshots and provide detailed feedback ``` advanced_tool_return.py ```python import time from pydantic_ai import Agent from pydantic_ai import ToolReturn, BinaryContent agent = Agent('openai:gpt-5.2') @agent.tool_plain def click_and_capture(x: int, y: int) -> ToolReturn: """Click at coordinates and show before/after screenshots.""" # Take screenshot before action before_screenshot = capture_screen() # Perform click operation perform_click(x, y) time.sleep(0.5) # Wait for UI to update # Take screenshot after action after_screenshot = capture_screen() return ToolReturn( return_value=f'Successfully clicked at ({x}, {y})', content=[ f'Clicked at coordinates ({x}, {y}). Here\'s the comparison:', 'Before:', BinaryContent(data=before_screenshot, media_type='image/png'), 'After:', BinaryContent(data=after_screenshot, media_type='image/png'), 'Please analyze the changes and suggest next steps.' 
], metadata={ 'coordinates': {'x': x, 'y': y}, 'action_type': 'click_and_capture', 'timestamp': time.time() } ) # The model receives the rich visual content for analysis # while your application can access the structured return_value and metadata result = agent.run_sync('Click on the submit button and tell me what happened') print(result.output) # The model can analyze the screenshots and provide detailed feedback ``` - **`return_value`**: The actual return value used in the tool response. This is what gets serialized and sent back to the model as the tool's result. - **`content`**: A sequence of content (text, images, documents, etc.) that provides additional context to the model. This appears as a separate user message. - **`metadata`**: Optional metadata that your application can access but is not sent to the LLM. Useful for logging, debugging, or additional processing. Some other AI frameworks call this feature "artifacts". This separation allows you to provide rich context to the model while maintaining clean, structured return values for your application logic. ## Custom Tool Schema If you have a function that lacks appropriate documentation (i.e. poorly named, no type information, poor docstring, use of \*args or \*\*kwargs and suchlike) then you can still turn it into a tool that can be effectively used by the agent with the Tool.from_schema function. With this you provide the name, description, JSON schema, and whether the function takes a `RunContext` for the function directly: ```python from pydantic_ai import Agent, Tool from pydantic_ai.models.test import TestModel def foobar(**kwargs) -> str: return kwargs['a'] + kwargs['b'] tool = Tool.from_schema( function=foobar, name='sum', description='Sum two numbers.', json_schema={ 'additionalProperties': False, 'properties': { 'a': {'description': 'the first number', 'type': 'integer'}, 'b': {'description': 'the second number', 'type': 'integer'}, }, 'required': ['a', 'b'], 'type': 'object', }, takes_ctx=False, ) test_model = TestModel() agent = Agent(test_model, tools=[tool]) result = agent.run_sync('testing...') print(result.output) #> {"sum":0} ``` Please note that validation of the tool arguments will not be performed, and this will pass all arguments as keyword arguments. ## Dynamic Tools Tools can optionally be defined with another function: `prepare`, which is called at each step of a run to customize the definition of the tool passed to the model, or omit the tool completely from that step. A `prepare` method can be registered via the `prepare` kwarg to any of the tool registration mechanisms: - @agent.tool decorator - @agent.tool_plain decorator - Tool dataclass The `prepare` method, should be of type ToolPrepareFunc, a function which takes RunContext and a pre-built ToolDefinition, and should either return that `ToolDefinition` with or without modifying it, return a new `ToolDefinition`, or return `None` to indicate this tools should not be registered for that step. Here's a simple `prepare` method that only includes the tool if the value of the dependency is `42`. As with the previous example, we use TestModel to demonstrate the behavior without calling a real model. 
tool_only_if_42.py ```python from pydantic_ai import Agent, RunContext, ToolDefinition agent = Agent('test') async def only_if_42( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: if ctx.deps == 42: return tool_def @agent.tool(prepare=only_if_42) def hitchhiker(ctx: RunContext[int], answer: str) -> str: return f'{ctx.deps} {answer}' result = agent.run_sync('testing...', deps=41) print(result.output) #> success (no tool calls) result = agent.run_sync('testing...', deps=42) print(result.output) #> {"hitchhiker":"42 a"} ``` *(This example is complete, it can be run "as is")* Here's a more complex example where we change the description of the `name` parameter to based on the value of `deps` For the sake of variation, we create this tool using the Tool dataclass. customize_name.py ```python from __future__ import annotations from typing import Literal from pydantic_ai import Agent, RunContext, Tool, ToolDefinition from pydantic_ai.models.test import TestModel def greet(name: str) -> str: return f'hello {name}' async def prepare_greet( ctx: RunContext[Literal['human', 'machine']], tool_def: ToolDefinition ) -> ToolDefinition | None: d = f'Name of the {ctx.deps} to greet.' tool_def.parameters_json_schema['properties']['name']['description'] = d return tool_def greet_tool = Tool(greet, prepare=prepare_greet) test_model = TestModel() agent = Agent(test_model, tools=[greet_tool], deps_type=Literal['human', 'machine']) result = agent.run_sync('testing...', deps='human') print(result.output) #> {"greet":"hello a"} print(test_model.last_model_request_parameters.function_tools) """ [ ToolDefinition( name='greet', parameters_json_schema={ 'additionalProperties': False, 'properties': { 'name': {'type': 'string', 'description': 'Name of the human to greet.'} }, 'required': ['name'], 'type': 'object', }, ) ] """ ``` *(This example is complete, it can be run "as is")* ### Agent-wide Dynamic Tools In addition to per-tool `prepare` methods, you can also define an agent-wide `prepare_tools` function. This function is called at each step of a run and allows you to filter or modify the list of all tool definitions available to the agent for that step. This is especially useful if you want to enable or disable multiple tools at once, or apply global logic based on the current context. The `prepare_tools` function should be of type ToolsPrepareFunc, which takes the RunContext and a list of ToolDefinition, and returns a new list of tool definitions (or `None` to disable all tools for that step). Note The list of tool definitions passed to `prepare_tools` includes both regular function tools and tools from any [toolsets](https://ai.pydantic.dev/toolsets/index.md) registered on the agent, but not [output tools](https://ai.pydantic.dev/output/#tool-output). To modify output tools, you can set a `prepare_output_tools` function instead. 
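The note above mentions the agent-wide `prepare_output_tools` hook. As a minimal sketch (assuming it accepts the same `ToolsPrepareFunc` signature as `prepare_tools`; the function name and the description tweak are purely illustrative), it can be wired up like this:

prepare_output_tools_sketch.py

```python
from dataclasses import replace

from pydantic_ai import Agent, RunContext, ToolDefinition


async def describe_output_tools(
    ctx: RunContext[None], tool_defs: list[ToolDefinition]
) -> list[ToolDefinition] | None:
    # Only output tool definitions are passed here; function tools go through `prepare_tools` instead.
    return [
        replace(tool_def, description=(tool_def.description or '') + ' Use this to return the final answer.')
        for tool_def in tool_defs
    ]


agent = Agent('test', output_type=int, prepare_output_tools=describe_output_tools)
```

The example below returns to the agent-wide `prepare_tools` hook, which applies to function tools.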
Here's an example that makes all tools strict if the model is an OpenAI model: agent_prepare_tools_customize.py ```python from dataclasses import replace from pydantic_ai import Agent, RunContext, ToolDefinition from pydantic_ai.models.test import TestModel async def turn_on_strict_if_openai( ctx: RunContext[None], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: if ctx.model.system == 'openai': return [replace(tool_def, strict=True) for tool_def in tool_defs] return tool_defs test_model = TestModel() agent = Agent(test_model, prepare_tools=turn_on_strict_if_openai) @agent.tool_plain def echo(message: str) -> str: return message agent.run_sync('testing...') assert test_model.last_model_request_parameters.function_tools[0].strict is None # Set the system attribute of the test_model to 'openai' test_model._system = 'openai' agent.run_sync('testing with openai...') assert test_model.last_model_request_parameters.function_tools[0].strict ``` *(This example is complete, it can be run "as is")* Here's another example that conditionally filters out the tools by name if the dependency (`ctx.deps`) is `True`: agent_prepare_tools_filter_out.py ```python from pydantic_ai import Agent, RunContext, Tool, ToolDefinition def launch_potato(target: str) -> str: return f'Potato launched at {target}!' async def filter_out_tools_by_name( ctx: RunContext[bool], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: if ctx.deps: return [tool_def for tool_def in tool_defs if tool_def.name != 'launch_potato'] return tool_defs agent = Agent( 'test', tools=[Tool(launch_potato)], prepare_tools=filter_out_tools_by_name, deps_type=bool, ) result = agent.run_sync('testing...', deps=False) print(result.output) #> {"launch_potato":"Potato launched at a!"} result = agent.run_sync('testing...', deps=True) print(result.output) #> success (no tool calls) ``` *(This example is complete, it can be run "as is")* You can use `prepare_tools` to: - Dynamically enable or disable tools based on the current model, dependencies, or other context - Modify tool definitions globally (e.g., set all tools to strict mode, change descriptions, etc.) If both per-tool `prepare` and agent-wide `prepare_tools` are used, the per-tool `prepare` is applied first to each tool, and then `prepare_tools` is called with the resulting list of tool definitions. ## Tool Execution and Retries When a tool is executed, its arguments (provided by the LLM) are first validated against the function's signature using Pydantic (with optional [validation context](https://ai.pydantic.dev/output/#validation-context)). If validation fails (e.g., due to incorrect types or missing required arguments), a `ValidationError` is raised, and the framework automatically generates a RetryPromptPart containing the validation details. This prompt is sent back to the LLM, informing it of the error and allowing it to correct the parameters and retry the tool call. Beyond automatic validation errors, the tool's own internal logic can also explicitly request a retry by raising the ModelRetry exception. This is useful for situations where the parameters were technically valid, but an issue occurred during execution (like a transient network error, or the tool determining the initial attempt needs modification). ```python from pydantic_ai import ModelRetry def my_flaky_tool(query: str) -> str: if query == 'bad': # Tell the LLM the query was bad and it should try again raise ModelRetry("The query 'bad' is not allowed. Please provide a different query.") # ... 
process query ... return 'Success!' ``` Raising `ModelRetry` also generates a `RetryPromptPart` containing the exception message, which is sent back to the LLM to guide its next attempt. Both `ValidationError` and `ModelRetry` respect the `retries` setting configured on the `Tool` or `Agent`. ### Tool Timeout You can set a timeout for tool execution to prevent tools from running indefinitely. If a tool exceeds its timeout, it is treated as a failure and a retry prompt is sent to the model (counting towards the retry limit). ```python import asyncio from pydantic_ai import Agent # Set a default timeout for all tools on the agent agent = Agent('test', tool_timeout=30) @agent.tool_plain async def slow_tool() -> str: """This tool will use the agent's default timeout (30 seconds).""" await asyncio.sleep(10) return 'Done' @agent.tool_plain(timeout=5) async def fast_tool() -> str: """This tool has its own timeout (5 seconds) that overrides the agent default.""" await asyncio.sleep(1) return 'Done' ``` - **Agent-level timeout**: Set `tool_timeout` on the Agent to apply a default timeout to all tools. - **Per-tool timeout**: Set `timeout` on individual tools via @agent.tool, @agent.tool_plain, or the Tool dataclass. This overrides the agent-level default. When a timeout occurs, the tool is considered to have failed and the model receives a retry prompt with the message `"Timed out after {timeout} seconds."`. This counts towards the tool's retry limit just like validation errors or explicit ModelRetry exceptions. ### Parallel tool calls & concurrency When a model returns multiple tool calls in one response, Pydantic AI schedules them concurrently using `asyncio.create_task`. If a tool requires sequential/serial execution, you can pass the sequential flag when registering the tool, or wrap the agent run in the agent.parallel_tool_call_execution_mode('sequential') context manager. Async functions are run on the event loop, while sync functions are offloaded to threads. To get the best performance, *always* use an async function *unless* you're doing blocking I/O (and there's no way to use a non-blocking library instead) or CPU-bound work (like `numpy` or `scikit-learn` operations), so that simple functions are not offloaded to threads unnecessarily. Limiting tool executions You can cap tool executions within a run using [`UsageLimits(tool_calls_limit=...)`](https://ai.pydantic.dev/agent/#usage-limits). The counter increments only after a successful tool invocation. Output tools (used for [structured output](https://ai.pydantic.dev/output/index.md)) are not counted in the `tool_calls` metric. #### Output Tool Calls When a model calls an [output tool](https://ai.pydantic.dev/output/#tool-output) in parallel with other tools, the agent's end_strategy parameter controls how these tool calls are executed. The `'exhaustive'` strategy ensures all tools are executed even after a final result is found, which is useful when tools have side effects (like logging, sending notifications, or updating metrics) that should always execute. For more information on how `end_strategy` works with both function tools and output tools, see the [Output Tool](https://ai.pydantic.dev/output/#parallel-output-tool-calls) docs.
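Putting the two settings just mentioned together, here's a minimal sketch of `end_strategy` and a per-run tool call cap (the prompt and the `log_metric` tool are illustrative only, and the built-in `'test'` model stands in for a real one):

end_strategy_sketch.py

```python
from pydantic_ai import Agent, UsageLimits

agent = Agent(
    'test',
    end_strategy='exhaustive',  # keep executing remaining tool calls even after an output tool returns a final result
)


@agent.tool_plain
def log_metric(name: str) -> str:
    """A side-effectful tool we always want to run, even alongside an output tool call."""
    return f'logged {name}'


result = agent.run_sync(
    'Log the "latency" metric, then finish.',
    usage_limits=UsageLimits(tool_calls_limit=5),  # cap successful function tool invocations for this run
)
print(result.output)
```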
## See Also - [Function Tools](https://ai.pydantic.dev/tools/index.md) - Basic tool concepts and registration - [Toolsets](https://ai.pydantic.dev/toolsets/index.md) - Managing collections of tools - [Deferred Tools](https://ai.pydantic.dev/deferred-tools/index.md) - Tools requiring approval or external execution - [Third-Party Tools](https://ai.pydantic.dev/third-party-tools/index.md) - Integrations with external tool libraries # Toolsets A toolset represents a collection of [tools](https://ai.pydantic.dev/tools/index.md) that can be registered with an agent in one go. They can be reused by different agents, swapped out at runtime or during testing, and composed in order to dynamically filter which tools are available, modify tool definitions, or change tool execution behavior. A toolset can contain locally defined functions, depend on an external service to provide them, or implement custom logic to list available tools and handle them being called. Toolsets are used (among many other things) to define [MCP servers](https://ai.pydantic.dev/mcp/client/index.md) available to an agent. Pydantic AI includes many kinds of toolsets which are described below, and you can define a [custom toolset](#building-a-custom-toolset) by inheriting from the AbstractToolset class. The toolsets that will be available during an agent run can be specified in four different ways: - at agent construction time, via the toolsets keyword argument to `Agent`, which takes toolset instances as well as functions that generate toolsets [dynamically](#dynamically-building-a-toolset) based on the agent run context - at agent run time, via the `toolsets` keyword argument to agent.run(), agent.run_sync(), agent.run_stream(), or agent.iter(). These toolsets will be additional to those registered on the `Agent` - [dynamically](#dynamically-building-a-toolset), via the @agent.toolset decorator which lets you build a toolset based on the agent run context - as a contextual override, via the `toolsets` keyword argument to the agent.override() context manager. These toolsets will replace those provided at agent construction or run time during the life of the context manager toolsets.py ```python from pydantic_ai import Agent, FunctionToolset from pydantic_ai.models.test import TestModel def agent_tool(): return "I'm registered directly on the agent" def extra_tool(): return "I'm passed as an extra tool for a specific run" def override_tool(): return 'I override all other tools' agent_toolset = FunctionToolset(tools=[agent_tool]) # (1)! extra_toolset = FunctionToolset(tools=[extra_tool]) override_toolset = FunctionToolset(tools=[override_tool]) test_model = TestModel() # (2)! agent = Agent(test_model, toolsets=[agent_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['agent_tool'] result = agent.run_sync('What tools are available?', toolsets=[extra_toolset]) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['agent_tool', 'extra_tool'] with agent.override(toolsets=[override_toolset]): result = agent.run_sync('What tools are available?', toolsets=[extra_toolset]) # (3)! print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['override_tool'] ``` 1. The FunctionToolset will be explained in detail in the next section. 1. We're using TestModel here because it makes it easy to see which tools were available on each run. 1. 
This `extra_toolset` will be ignored because we're inside an override context. *(This example is complete, it can be run "as is")* ## Function Toolset As the name suggests, a FunctionToolset makes locally defined functions available as tools. Functions can be added as tools in three different ways: - via the @toolset.tool decorator - via the tools keyword argument to the constructor which can take either plain functions, or instances of Tool - via the toolset.add_function() and toolset.add_tool() methods which can take a plain function or an instance of Tool respectively Functions registered in any of these ways can define an initial `ctx: RunContext` argument in order to receive the agent run context. The `add_function()` and `add_tool()` methods can also be used from a tool function to dynamically register new tools during a run to be available in future run steps. function_toolset.py ```python from datetime import datetime from pydantic_ai import Agent, FunctionToolset, RunContext from pydantic_ai.models.test import TestModel def temperature_celsius(city: str) -> float: return 21.0 def temperature_fahrenheit(city: str) -> float: return 69.8 weather_toolset = FunctionToolset(tools=[temperature_celsius, temperature_fahrenheit]) @weather_toolset.tool def conditions(ctx: RunContext, city: str) -> str: if ctx.run_step % 2 == 0: return "It's sunny" else: return "It's raining" datetime_toolset = FunctionToolset() datetime_toolset.add_function(lambda: datetime.now(), name='now') test_model = TestModel() # (1)! agent = Agent(test_model) result = agent.run_sync('What tools are available?', toolsets=[weather_toolset]) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['temperature_celsius', 'temperature_fahrenheit', 'conditions'] result = agent.run_sync('What tools are available?', toolsets=[datetime_toolset]) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['now'] ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. *(This example is complete, it can be run "as is")* ## Toolset Composition Toolsets can be composed to dynamically filter which tools are available, modify tool definitions, or change tool execution behavior. Multiple toolsets can also be combined into one. ### Combining Toolsets CombinedToolset takes a list of toolsets and lets them be used as one. combined_toolset.py ```python from pydantic_ai import Agent, CombinedToolset from pydantic_ai.models.test import TestModel from function_toolset import datetime_toolset, weather_toolset combined_toolset = CombinedToolset([weather_toolset, datetime_toolset]) test_model = TestModel() # (1)! agent = Agent(test_model, toolsets=[combined_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['temperature_celsius', 'temperature_fahrenheit', 'conditions', 'now'] ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. *(This example is complete, it can be run "as is")* ### Filtering Tools FilteredToolset wraps a toolset and filters available tools ahead of each step of the run based on a user-defined function that is passed the agent run context and each tool's ToolDefinition and returns a boolean to indicate whether or not a given tool should be available. 
To easily chain different modifications, you can also call filtered() on any toolset instead of directly constructing a `FilteredToolset`. filtered_toolset.py ```python from pydantic_ai import Agent from pydantic_ai.models.test import TestModel from combined_toolset import combined_toolset filtered_toolset = combined_toolset.filtered(lambda ctx, tool_def: 'fahrenheit' not in tool_def.name) test_model = TestModel() # (1)! agent = Agent(test_model, toolsets=[filtered_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['weather_temperature_celsius', 'weather_conditions', 'datetime_now'] ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. *(This example is complete, it can be run "as is")* ### Prefixing Tool Names PrefixedToolset wraps a toolset and adds a prefix to each tool name to prevent tool name conflicts between different toolsets. To easily chain different modifications, you can also call prefixed() on any toolset instead of directly constructing a `PrefixedToolset`. combined_toolset.py ```python from pydantic_ai import Agent, CombinedToolset from pydantic_ai.models.test import TestModel from function_toolset import datetime_toolset, weather_toolset combined_toolset = CombinedToolset( [ weather_toolset.prefixed('weather'), datetime_toolset.prefixed('datetime') ] ) test_model = TestModel() # (1)! agent = Agent(test_model, toolsets=[combined_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) """ [ 'weather_temperature_celsius', 'weather_temperature_fahrenheit', 'weather_conditions', 'datetime_now', ] """ ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. *(This example is complete, it can be run "as is")* ### Renaming Tools RenamedToolset wraps a toolset and lets you rename tools using a dictionary mapping new names to original names. This is useful when the names provided by a toolset are ambiguous or would conflict with tools defined by other toolsets, but [prefixing them](#prefixing-tool-names) creates a name that is unnecessarily long or could be confusing to the model. To easily chain different modifications, you can also call renamed() on any toolset instead of directly constructing a `RenamedToolset`. renamed_toolset.py ```python from pydantic_ai import Agent from pydantic_ai.models.test import TestModel from combined_toolset import combined_toolset renamed_toolset = combined_toolset.renamed( { 'current_time': 'datetime_now', 'temperature_celsius': 'weather_temperature_celsius', 'temperature_fahrenheit': 'weather_temperature_fahrenheit' } ) test_model = TestModel() # (1)! agent = Agent(test_model, toolsets=[renamed_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) """ ['temperature_celsius', 'temperature_fahrenheit', 'weather_conditions', 'current_time'] """ ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. *(This example is complete, it can be run "as is")* ### Dynamic Tool Definitions PreparedToolset lets you modify the entire list of available tools ahead of each step of the agent run using a user-defined function that takes the agent run context and a list of ToolDefinitions and returns a list of modified `ToolDefinition`s. 
This is the toolset-specific equivalent of the [`prepare_tools`](https://ai.pydantic.dev/tools-advanced/#prepare-tools) argument to `Agent` that prepares all tool definitions registered on an agent across toolsets. Note that it is not possible to add or rename tools using `PreparedToolset`. Instead, you can use [`FunctionToolset.add_function()`](#function-toolset) or [`RenamedToolset`](#renaming-tools). To easily chain different modifications, you can also call prepared() on any toolset instead of directly constructing a `PreparedToolset`. prepared_toolset.py ```python from dataclasses import replace from pydantic_ai import Agent, RunContext, ToolDefinition from pydantic_ai.models.test import TestModel from renamed_toolset import renamed_toolset descriptions = { 'temperature_celsius': 'Get the temperature in degrees Celsius', 'temperature_fahrenheit': 'Get the temperature in degrees Fahrenheit', 'weather_conditions': 'Get the current weather conditions', 'current_time': 'Get the current time', } async def add_descriptions(ctx: RunContext, tool_defs: list[ToolDefinition]) -> list[ToolDefinition] | None: return [ replace(tool_def, description=description) if (description := descriptions.get(tool_def.name, None)) else tool_def for tool_def in tool_defs ] prepared_toolset = renamed_toolset.prepared(add_descriptions) test_model = TestModel() # (1)! agent = Agent(test_model, toolsets=[prepared_toolset]) result = agent.run_sync('What tools are available?') print(test_model.last_model_request_parameters.function_tools) """ [ ToolDefinition( name='temperature_celsius', parameters_json_schema={ 'additionalProperties': False, 'properties': {'city': {'type': 'string'}}, 'required': ['city'], 'type': 'object', }, description='Get the temperature in degrees Celsius', ), ToolDefinition( name='temperature_fahrenheit', parameters_json_schema={ 'additionalProperties': False, 'properties': {'city': {'type': 'string'}}, 'required': ['city'], 'type': 'object', }, description='Get the temperature in degrees Fahrenheit', ), ToolDefinition( name='weather_conditions', parameters_json_schema={ 'additionalProperties': False, 'properties': {'city': {'type': 'string'}}, 'required': ['city'], 'type': 'object', }, description='Get the current weather conditions', ), ToolDefinition( name='current_time', parameters_json_schema={ 'additionalProperties': False, 'properties': {}, 'type': 'object', }, description='Get the current time', ), ] """ ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. ### Requiring Tool Approval ApprovalRequiredToolset wraps a toolset and lets you dynamically [require approval](https://ai.pydantic.dev/deferred-tools/#human-in-the-loop-tool-approval) for a given tool call based on a user-defined function that is passed the agent run context, the tool's ToolDefinition, and the validated tool call arguments. If no function is provided, all tool calls will require approval. To easily chain different modifications, you can also call approval_required() on any toolset instead of directly constructing a `ApprovalRequiredToolset`. See the [Human-in-the-Loop Tool Approval](https://ai.pydantic.dev/deferred-tools/#human-in-the-loop-tool-approval) documentation for more information on how to handle agent runs that call tools that require approval and how to pass in the results. 
approval_required_toolset.py ```python from pydantic_ai import Agent, DeferredToolRequests, DeferredToolResults from pydantic_ai.models.test import TestModel from prepared_toolset import prepared_toolset approval_required_toolset = prepared_toolset.approval_required(lambda ctx, tool_def, tool_args: tool_def.name.startswith('temperature')) test_model = TestModel(call_tools=['temperature_celsius', 'temperature_fahrenheit']) # (1)! agent = Agent( test_model, toolsets=[approval_required_toolset], output_type=[str, DeferredToolRequests], ) result = agent.run_sync('Call the temperature tools') messages = result.all_messages() print(result.output) """ DeferredToolRequests( calls=[], approvals=[ ToolCallPart( tool_name='temperature_celsius', args={'city': 'a'}, tool_call_id='pyd_ai_tool_call_id__temperature_celsius', ), ToolCallPart( tool_name='temperature_fahrenheit', args={'city': 'a'}, tool_call_id='pyd_ai_tool_call_id__temperature_fahrenheit', ), ], metadata={}, ) """ result = agent.run_sync( message_history=messages, deferred_tool_results=DeferredToolResults( approvals={ 'pyd_ai_tool_call_id__temperature_celsius': True, 'pyd_ai_tool_call_id__temperature_fahrenheit': False, } ) ) print(result.output) #> {"temperature_celsius":21.0,"temperature_fahrenheit":"The tool call was denied."} ``` 1. We're using TestModel here because it makes it easy to specify which tools to call. *(This example is complete, it can be run "as is")* ### Changing Tool Execution WrapperToolset wraps another toolset and delegates all responsibility to it. It is is a no-op by default, but you can subclass `WrapperToolset` to change the wrapped toolset's tool execution behavior by overriding the call_tool() method. logging_toolset.py ```python import asyncio from typing_extensions import Any from pydantic_ai import Agent, RunContext, ToolsetTool, WrapperToolset from pydantic_ai.models.test import TestModel from prepared_toolset import prepared_toolset LOG = [] class LoggingToolset(WrapperToolset): async def call_tool(self, name: str, tool_args: dict[str, Any], ctx: RunContext, tool: ToolsetTool) -> Any: LOG.append(f'Calling tool {name!r} with args: {tool_args!r}') try: await asyncio.sleep(0.1 * len(LOG)) # (1)! result = await super().call_tool(name, tool_args, ctx, tool) LOG.append(f'Finished calling tool {name!r} with result: {result!r}') except Exception as e: LOG.append(f'Error calling tool {name!r}: {e}') raise e else: return result logging_toolset = LoggingToolset(prepared_toolset) agent = Agent(TestModel(), toolsets=[logging_toolset]) # (2)! result = agent.run_sync('Call all the tools') print(LOG) """ [ "Calling tool 'temperature_celsius' with args: {'city': 'a'}", "Calling tool 'temperature_fahrenheit' with args: {'city': 'a'}", "Calling tool 'weather_conditions' with args: {'city': 'a'}", "Calling tool 'current_time' with args: {}", "Finished calling tool 'temperature_celsius' with result: 21.0", "Finished calling tool 'temperature_fahrenheit' with result: 69.8", 'Finished calling tool \'weather_conditions\' with result: "It\'s raining"', "Finished calling tool 'current_time' with result: datetime.datetime(...)", ] """ ``` 1. All docs examples are tested in CI and their their output is verified, so we need `LOG` to always have the same order whenever this code is run. Since the tools could finish in any order, we sleep an increasing amount of time based on which number tool call we are to ensure that they finish (and log) in the same order they were called in. 1. 
We use TestModel here as it will automatically call each tool. *(This example is complete, it can be run "as is")* ## External Toolset If your agent needs to be able to call [external tools](https://ai.pydantic.dev/deferred-tools/#external-tool-execution) that are provided and executed by an upstream service or frontend, you can build an ExternalToolset from a list of ToolDefinitions containing the tool names, arguments JSON schemas, and descriptions. When the model calls an external tool, the call is considered to be ["deferred"](https://ai.pydantic.dev/deferred-tools/#deferred-tools), and the agent run will end with a DeferredToolRequests output object with a `calls` list holding ToolCallParts containing the tool name, validated arguments, and a unique tool call ID, which are expected to be passed to the upstream service or frontend that will produce the results. When the tool call results are received from the upstream service or frontend, you can build a DeferredToolResults object with a `calls` dictionary that maps each tool call ID to an arbitrary value to be returned to the model, a [`ToolReturn`](https://ai.pydantic.dev/tools-advanced/#advanced-tool-returns) object, or a ModelRetry exception in case the tool call failed and the model should [try again](https://ai.pydantic.dev/tools-advanced/#tool-retries). This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](https://ai.pydantic.dev/message-history/index.md). Note that you need to add `DeferredToolRequests` to the `Agent`'s or `agent.run()`'s [`output_type`](https://ai.pydantic.dev/output/#structured-output) so that the possible types of the agent run output are correctly inferred. For more information, see the [Deferred Tools](https://ai.pydantic.dev/deferred-tools/#deferred-tools) documentation. To demonstrate, let us first define a simple agent *without* deferred tools: [Learn about Gateway](https://ai.pydantic.dev/gateway) deferred_toolset_agent.py ```python from pydantic import BaseModel from pydantic_ai import Agent, FunctionToolset toolset = FunctionToolset() @toolset.tool def get_default_language(): return 'en-US' @toolset.tool def get_user_name(): return 'David' class PersonalizedGreeting(BaseModel): greeting: str language_code: str agent = Agent('gateway/openai:gpt-5.2', toolsets=[toolset], output_type=PersonalizedGreeting) result = agent.run_sync('Greet the user in a personalized way') print(repr(result.output)) #> PersonalizedGreeting(greeting='Hello, David!', language_code='en-US') ``` deferred_toolset_agent.py ```python from pydantic import BaseModel from pydantic_ai import Agent, FunctionToolset toolset = FunctionToolset() @toolset.tool def get_default_language(): return 'en-US' @toolset.tool def get_user_name(): return 'David' class PersonalizedGreeting(BaseModel): greeting: str language_code: str agent = Agent('openai:gpt-5.2', toolsets=[toolset], output_type=PersonalizedGreeting) result = agent.run_sync('Greet the user in a personalized way') print(repr(result.output)) #> PersonalizedGreeting(greeting='Hello, David!', language_code='en-US') ``` Next, let's define a function that represents a hypothetical "run agent" API endpoint that can be called by the frontend and takes a list of messages to send to the model, a list of frontend tool definitions, and optional deferred tool results. 
This is where `ExternalToolset`, `DeferredToolRequests`, and `DeferredToolResults` come in: deferred_toolset_api.py ```python from pydantic_ai import ( DeferredToolRequests, DeferredToolResults, ExternalToolset, ModelMessage, ToolDefinition, ) from deferred_toolset_agent import PersonalizedGreeting, agent def run_agent( messages: list[ModelMessage] = [], frontend_tools: list[ToolDefinition] = [], deferred_tool_results: DeferredToolResults | None = None, ) -> tuple[PersonalizedGreeting | DeferredToolRequests, list[ModelMessage]]: deferred_toolset = ExternalToolset(frontend_tools) result = agent.run_sync( toolsets=[deferred_toolset], # (1)! output_type=[agent.output_type, DeferredToolRequests], # (2)! message_history=messages, # (3)! deferred_tool_results=deferred_tool_results, ) return result.output, result.new_messages() ``` 1. As mentioned in the [Deferred Tools](https://ai.pydantic.dev/deferred-tools/#deferred-tools) documentation, these `toolsets` are additional to those provided to the `Agent` constructor 1. As mentioned in the [Deferred Tools](https://ai.pydantic.dev/deferred-tools/#deferred-tools) documentation, this `output_type` overrides the one provided to the `Agent` constructor, so we have to make sure not to lose it 1. We don't include a `user_prompt` keyword argument as we expect the frontend to provide it via `messages` Now, imagine that the code below is implemented on the frontend, and `run_agent` stands in for an API call to the backend that runs the agent. This is where we actually execute the deferred tool calls and start a new run with the new result included: deferred_tools.py ```python from pydantic_ai import ( DeferredToolRequests, DeferredToolResults, ModelMessage, ModelRequest, ModelRetry, ToolDefinition, UserPromptPart, ) from deferred_toolset_api import run_agent frontend_tool_definitions = [ ToolDefinition( name='get_preferred_language', parameters_json_schema={'type': 'object', 'properties': {'default_language': {'type': 'string'}}}, description="Get the user's preferred language from their browser", ) ] def get_preferred_language(default_language: str) -> str: return 'es-MX' # (1)! frontend_tool_functions = {'get_preferred_language': get_preferred_language} messages: list[ModelMessage] = [ ModelRequest( parts=[ UserPromptPart(content='Greet the user in a personalized way') ] ) ] deferred_tool_results: DeferredToolResults | None = None final_output = None while True: output, new_messages = run_agent(messages, frontend_tool_definitions, deferred_tool_results) messages += new_messages if not isinstance(output, DeferredToolRequests): final_output = output break print(output.calls) """ [ ToolCallPart( tool_name='get_preferred_language', args={'default_language': 'en-US'}, tool_call_id='pyd_ai_tool_call_id', ) ] """ deferred_tool_results = DeferredToolResults() for tool_call in output.calls: if function := frontend_tool_functions.get(tool_call.tool_name): result = function(**tool_call.args_as_dict()) else: result = ModelRetry(f'Unknown tool {tool_call.tool_name!r}') deferred_tool_results.calls[tool_call.tool_call_id] = result print(repr(final_output)) """ PersonalizedGreeting(greeting='Hola, David! Espero que tengas un gran día!', language_code='es-MX') """ ``` 1. Imagine that this returns the frontend [`navigator.language`](https://developer.mozilla.org/en-US/docs/Web/API/Navigator/language). 
*(This example is complete, it can be run "as is")* ## Dynamically Building a Toolset Toolsets can be built dynamically ahead of each agent run or run step using a function that takes the agent run context and returns a toolset or `None`. This is useful when a toolset (like an MCP server) depends on information specific to an agent run, like its [dependencies](https://ai.pydantic.dev/dependencies/index.md). To register a dynamic toolset, you can pass a function that takes RunContext to the `toolsets` argument of the `Agent` constructor, or you can wrap a compliant function in the @agent.toolset decorator. By default, the function will be called again ahead of each agent run step. If you are using the decorator, you can optionally provide a `per_run_step=False` argument to indicate that the toolset only needs to be built once for the entire run. dynamic_toolset.py ```python from dataclasses import dataclass from typing import Literal from pydantic_ai import Agent, RunContext from pydantic_ai.models.test import TestModel from function_toolset import datetime_toolset, weather_toolset @dataclass class ToggleableDeps: active: Literal['weather', 'datetime'] def toggle(self): if self.active == 'weather': self.active = 'datetime' else: self.active = 'weather' test_model = TestModel() # (1)! agent = Agent( test_model, deps_type=ToggleableDeps # (2)! ) @agent.toolset def toggleable_toolset(ctx: RunContext[ToggleableDeps]): if ctx.deps.active == 'weather': return weather_toolset else: return datetime_toolset @agent.tool def toggle(ctx: RunContext[ToggleableDeps]): ctx.deps.toggle() deps = ToggleableDeps('weather') result = agent.run_sync('Toggle the toolset', deps=deps) print([t.name for t in test_model.last_model_request_parameters.function_tools]) # (3)! #> ['toggle', 'now'] result = agent.run_sync('Toggle the toolset', deps=deps) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['toggle', 'temperature_celsius', 'temperature_fahrenheit', 'conditions'] ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. 1. We're using the agent's dependencies to give the `toggle` tool access to the `active` field via the `RunContext` argument. 1. This shows the available tools *after* the `toggle` tool was executed, as the "last model request" was the one that returned the `toggle` tool result to the model. *(This example is complete, it can be run "as is")* ## Building a Custom Toolset To define a fully custom toolset with its own logic to list available tools and handle them being called, you can subclass AbstractToolset and implement the get_tools() and call_tool() methods. If you want to reuse a network connection or session across tool listings and calls during an agent run, you can implement __aenter__() and __aexit__(). ## Third-Party Toolsets ### MCP Servers Pydantic AI provides two toolsets that allow an agent to connect to and call tools on local and remote MCP Servers: 1. `MCPServer`: the [MCP SDK-based Client](https://ai.pydantic.dev/mcp/client/index.md) which offers more direct control by leveraging the MCP SDK directly (a minimal usage sketch follows below) 1. `FastMCPToolset`: the [FastMCP-based Client](https://ai.pydantic.dev/mcp/fastmcp-client/index.md) which offers additional capabilities like Tool Transformation, simpler OAuth configuration, and more.
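To give a feel for how these plug in, here's a minimal sketch that attaches a local MCP server to an agent as a toolset using the MCP SDK-based `MCPServerStdio` client. The server command is the `mcp-run-python` server mentioned under Code Execution below, used purely as an example; any stdio-based MCP server command would work, and the model name is just the one used in the other examples on this page. See the [MCP client documentation](https://ai.pydantic.dev/mcp/client/index.md) for full details and other transports.

```python
from pydantic_ai import Agent
from pydantic_ai.mcp import MCPServerStdio

# Example stdio-based MCP server command (mcp-run-python, see the Code Execution section below);
# swap in whatever MCP server you actually want to connect to.
server = MCPServerStdio('uv', args=['run', 'mcp-run-python', 'stdio'])

# The server's tools are exposed to the agent like any other toolset.
agent = Agent('openai:gpt-5.2', toolsets=[server])
...
```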
### Agent Skills Toolsets that implement [Agent Skills](https://agentskills.io) support help agents efficiently discover and perform specific tasks: - [`pydantic-ai-skills`](https://github.com/DougTrajano/pydantic-ai-skills) - `SkillsToolset` implements Agent Skills support with progressive disclosure (load skills on-demand to reduce tokens). Supports filesystem and programmatic skills; compatible with [agentskills.io](https://agentskills.io). ### Task Management Toolsets for task planning and progress tracking help agents organize complex work and provide visibility into agent progress: - [`pydantic-ai-todo`](https://github.com/vstorm-co/pydantic-ai-todo) - `TodoToolset` with `read_todos` and `write_todos` tools. Included in the third-party [`pydantic-deep`](https://github.com/vstorm-co/pydantic-deepagents) [deep agent](https://ai.pydantic.dev/multi-agent-applications/#deep-agents) framework. ### File Operations Toolsets for file operations help agents read, write, and edit files: - [`pydantic-ai-filesystem-sandbox`](https://github.com/zby/pydantic-ai-filesystem-sandbox) - `FileSystemToolset` with a sandbox and LLM-friendly errors - [`pydantic-deep`](https://github.com/vstorm-co/pydantic-deepagents) — Deep agent framework that includes a `FilesystemToolset` with multiple backends (in-memory, real filesystem, Docker sandbox). ### Code Execution Toolsets for code execution help agents run code in a sandboxed environment: - [`mcp-run-python`](https://github.com/pydantic/mcp-run-python) - MCP server by the Pydantic team that runs Python code in a sandboxed environment. Can be used as `MCPServerStdio('uv', args=['run', 'mcp-run-python', 'stdio'])`. ### LangChain Tools If you'd like to use tools or a [toolkit](https://python.langchain.com/docs/concepts/tools/#toolkits) from LangChain's [community tool library](https://python.langchain.com/docs/integrations/tools/) with Pydantic AI, you can use the LangChainToolset which takes a list of LangChain tools. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the LangChain tool, and up to the LangChain tool to raise an error if the arguments are invalid. You will need to install the `langchain-community` package and any others required by the tools in question. [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from langchain_community.agent_toolkits import SlackToolkit from pydantic_ai import Agent from pydantic_ai.ext.langchain import LangChainToolset toolkit = SlackToolkit() toolset = LangChainToolset(toolkit.get_tools()) agent = Agent('gateway/openai:gpt-5.2', toolsets=[toolset]) # ... ``` ```python from langchain_community.agent_toolkits import SlackToolkit from pydantic_ai import Agent from pydantic_ai.ext.langchain import LangChainToolset toolkit = SlackToolkit() toolset = LangChainToolset(toolkit.get_tools()) agent = Agent('openai:gpt-5.2', toolsets=[toolset]) # ... ``` ### ACI.dev Tools If you'd like to use tools from the [ACI.dev tool library](https://www.aci.dev/tools) with Pydantic AI, you can use the ACIToolset [toolset](https://ai.pydantic.dev/toolsets/index.md) which takes a list of ACI tool names as well as the `linked_account_owner_id`. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the ACI tool, and up to the ACI tool to raise an error if the arguments are invalid. 
You will need to install the `aci-sdk` package, set your ACI API key in the `ACI_API_KEY` environment variable, and pass your ACI "linked account owner ID" to the function. [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python import os from pydantic_ai import Agent from pydantic_ai.ext.aci import ACIToolset toolset = ACIToolset( [ 'OPEN_WEATHER_MAP__CURRENT_WEATHER', 'OPEN_WEATHER_MAP__FORECAST', ], linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'), ) agent = Agent('gateway/openai:gpt-5.2', toolsets=[toolset]) ``` ```python import os from pydantic_ai import Agent from pydantic_ai.ext.aci import ACIToolset toolset = ACIToolset( [ 'OPEN_WEATHER_MAP__CURRENT_WEATHER', 'OPEN_WEATHER_MAP__FORECAST', ], linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'), ) agent = Agent('openai:gpt-5.2', toolsets=[toolset]) ``` # Models # Anthropic ## Install To use `AnthropicModel` models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `anthropic` optional group: ```bash pip install "pydantic-ai-slim[anthropic]" ``` ```bash uv add "pydantic-ai-slim[anthropic]" ``` ## Configuration To use [Anthropic](https://anthropic.com) through their API, go to [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys) to generate an API key. `AnthropicModelName` contains a list of available Anthropic models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export ANTHROPIC_API_KEY='your-api-key' ``` You can then use `AnthropicModel` by name: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent agent = Agent('gateway/anthropic:claude-sonnet-4-5') ... ``` ```python from pydantic_ai import Agent agent = Agent('anthropic:claude-sonnet-4-5') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel model = AnthropicModel('claude-sonnet-4-5') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider model = AnthropicModel( 'claude-sonnet-4-5', provider=AnthropicProvider(api_key='your-api-key') ) agent = Agent(model) ... ``` ## Custom HTTP Client You can customize the `AnthropicProvider` with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider custom_http_client = AsyncClient(timeout=30) model = AnthropicModel( 'claude-sonnet-4-5', provider=AnthropicProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` ## Cloud Platform Integrations You can use Anthropic models through cloud platforms by passing a custom client to AnthropicProvider. 
### AWS Bedrock To use Claude models via [AWS Bedrock](https://aws.amazon.com/bedrock/claude/), follow the [Anthropic documentation](https://docs.anthropic.com/en/api/claude-on-amazon-bedrock) on how to set up an `AsyncAnthropicBedrock` client and then pass it to `AnthropicProvider`: ```python from anthropic import AsyncAnthropicBedrock from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider bedrock_client = AsyncAnthropicBedrock() # Uses AWS credentials from environment provider = AnthropicProvider(anthropic_client=bedrock_client) model = AnthropicModel('us.anthropic.claude-sonnet-4-5-20250929-v1:0', provider=provider) agent = Agent(model) ... ``` Bedrock vs BedrockConverseModel This approach uses Anthropic's SDK with AWS Bedrock credentials. For an alternative using AWS SDK (boto3) directly, see [`BedrockConverseModel`](https://ai.pydantic.dev/models/bedrock/index.md). ### Google Vertex AI To use Claude models via [Google Cloud Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude), follow the [Anthropic documentation](https://docs.anthropic.com/en/api/claude-on-vertex-ai) on how to set up an `AsyncAnthropicVertex` client and then pass it to `AnthropicProvider`: ```python from anthropic import AsyncAnthropicVertex from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider vertex_client = AsyncAnthropicVertex(region='us-east5', project_id='your-project-id') provider = AnthropicProvider(anthropic_client=vertex_client) model = AnthropicModel('claude-sonnet-4-5', provider=provider) agent = Agent(model) ... ``` Vertex vs GoogleModel This approach uses Anthropic's SDK with Vertex AI credentials. For an alternative using Google's Vertex AI SDK directly, see [`GoogleModel`](https://ai.pydantic.dev/models/google/index.md). ### Microsoft Foundry To use Claude models via [Microsoft Foundry](https://ai.azure.com/), follow the [Anthropic documentation](https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry) on how to set up an `AsyncAnthropicFoundry` client and then pass it to `AnthropicProvider`: ```python from anthropic import AsyncAnthropicFoundry from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider foundry_client = AsyncAnthropicFoundry( api_key='your-foundry-api-key', # Or set ANTHROPIC_FOUNDRY_API_KEY resource='your-resource-name', ) provider = AnthropicProvider(anthropic_client=foundry_client) model = AnthropicModel('claude-sonnet-4-5', provider=provider) agent = Agent(model) ... ``` See [Anthropic's Microsoft Foundry documentation](https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry) for setup instructions including Entra ID authentication. ## Prompt Caching Anthropic supports [prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) to reduce costs by caching parts of your prompts. Pydantic AI provides four ways to use prompt caching: 1. **Cache User Messages with CachePoint**: Insert a `CachePoint` marker in your user messages to cache everything before it 1. **Cache System Instructions**: Set AnthropicModelSettings.anthropic_cache_instructions to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly 1. 
**Cache Tool Definitions**: Set AnthropicModelSettings.anthropic_cache_tool_definitions to `True` (uses 5m TTL by default) or specify `'5m'` / `'1h'` directly 1. **Cache All Messages**: Set AnthropicModelSettings.anthropic_cache_messages to `True` to automatically cache all messages Amazon Bedrock When using `AsyncAnthropicBedrock`, the TTL parameter is automatically omitted from all cache control settings (including `CachePoint`, `anthropic_cache_instructions`, `anthropic_cache_tool_definitions`, and `anthropic_cache_messages`) because Bedrock doesn't support explicit TTL. ### Example 1: Automatic Message Caching Use `anthropic_cache_messages` to automatically cache all messages up to and including the newest user message: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'gateway/anthropic:claude-sonnet-4-5', instructions='You are a helpful assistant.', model_settings=AnthropicModelSettings( anthropic_cache_messages=True, # Automatically caches the last message ), ) # The last message is automatically cached - no need for manual CachePoint result1 = agent.run_sync('What is the capital of France?') # Subsequent calls with similar conversation benefit from cache result2 = agent.run_sync('What is the capital of Germany?') print(f'Cache write: {result1.usage().cache_write_tokens}') print(f'Cache read: {result2.usage().cache_read_tokens}') ``` ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-5', instructions='You are a helpful assistant.', model_settings=AnthropicModelSettings( anthropic_cache_messages=True, # Automatically caches the last message ), ) # The last message is automatically cached - no need for manual CachePoint result1 = agent.run_sync('What is the capital of France?') # Subsequent calls with similar conversation benefit from cache result2 = agent.run_sync('What is the capital of Germany?') print(f'Cache write: {result1.usage().cache_write_tokens}') print(f'Cache read: {result2.usage().cache_read_tokens}') ``` ### Example 2: Comprehensive Caching Strategy Combine multiple cache settings for maximum savings: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent, RunContext from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'gateway/anthropic:claude-sonnet-4-5', instructions='Detailed instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True, # Cache system instructions anthropic_cache_tool_definitions='1h', # Cache tool definitions with 1h TTL anthropic_cache_messages=True, # Also cache the last message ), ) @agent.tool def search_docs(ctx: RunContext, query: str) -> str: """Search documentation.""" return f'Results for {query}' result = agent.run_sync('Search for Python best practices') print(result.output) ``` ```python from pydantic_ai import Agent, RunContext from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-5', instructions='Detailed instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True, # Cache system instructions anthropic_cache_tool_definitions='1h', # Cache tool definitions with 1h TTL anthropic_cache_messages=True, # Also cache the last message ), ) @agent.tool def search_docs(ctx: RunContext, query: str) -> str: """Search documentation.""" return f'Results for 
{query}' result = agent.run_sync('Search for Python best practices') print(result.output) ``` ### Example 3: Fine-Grained Control with CachePoint Use manual `CachePoint` markers to control cache locations precisely: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent, CachePoint agent = Agent( 'gateway/anthropic:claude-sonnet-4-5', instructions='Instructions...', ) # Manually control cache points for specific content blocks result = agent.run_sync([ 'Long context from documentation...', CachePoint(), # Cache everything up to this point 'First question' ]) print(result.output) ``` ```python from pydantic_ai import Agent, CachePoint agent = Agent( 'anthropic:claude-sonnet-4-5', instructions='Instructions...', ) # Manually control cache points for specific content blocks result = agent.run_sync([ 'Long context from documentation...', CachePoint(), # Cache everything up to this point 'First question' ]) print(result.output) ``` ### Accessing Cache Usage Statistics Access cache usage statistics via `result.usage()`: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'gateway/anthropic:claude-sonnet-4-5', instructions='Instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True # Default 5m TTL ), ) result = agent.run_sync('Your question') usage = result.usage() print(f'Cache write tokens: {usage.cache_write_tokens}') print(f'Cache read tokens: {usage.cache_read_tokens}') ``` ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-5', instructions='Instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True # Default 5m TTL ), ) result = agent.run_sync('Your question') usage = result.usage() print(f'Cache write tokens: {usage.cache_write_tokens}') print(f'Cache read tokens: {usage.cache_read_tokens}') ``` ### Cache Point Limits Anthropic enforces a maximum of 4 cache points per request. Pydantic AI automatically manages this limit to ensure your requests always comply without errors. #### How Cache Points Are Allocated Cache points can be placed in three locations: 1. **System Prompt**: Via `anthropic_cache_instructions` setting (adds cache point to last system prompt block) 1. **Tool Definitions**: Via `anthropic_cache_tool_definitions` setting (adds cache point to last tool definition) 1. **Messages**: Via `CachePoint` markers or `anthropic_cache_messages` setting (adds cache points to message content) Each setting uses **at most 1 cache point**, but you can combine them. 
#### Example: Using All 3 Cache Point Sources Define an agent with all cache settings enabled: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent, CachePoint from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'gateway/anthropic:claude-sonnet-4-5', instructions='Detailed instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True, # 1 cache point anthropic_cache_tool_definitions=True, # 1 cache point anthropic_cache_messages=True, # 1 cache point ), ) @agent.tool_plain def my_tool() -> str: return 'result' # This uses 3 cache points (instructions + tools + last message) # You can add 1 more CachePoint marker before hitting the limit result = agent.run_sync([ 'Context', CachePoint(), # 4th cache point - OK 'Question' ]) print(result.output) usage = result.usage() print(f'Cache write tokens: {usage.cache_write_tokens}') print(f'Cache read tokens: {usage.cache_read_tokens}') ``` ```python from pydantic_ai import Agent, CachePoint from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-5', instructions='Detailed instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True, # 1 cache point anthropic_cache_tool_definitions=True, # 1 cache point anthropic_cache_messages=True, # 1 cache point ), ) @agent.tool_plain def my_tool() -> str: return 'result' # This uses 3 cache points (instructions + tools + last message) # You can add 1 more CachePoint marker before hitting the limit result = agent.run_sync([ 'Context', CachePoint(), # 4th cache point - OK 'Question' ]) print(result.output) usage = result.usage() print(f'Cache write tokens: {usage.cache_write_tokens}') print(f'Cache read tokens: {usage.cache_read_tokens}') ``` #### Automatic Cache Point Limiting When cache points from all sources (settings + `CachePoint` markers) exceed 4, Pydantic AI automatically removes excess cache points from **older message content** (keeping the most recent ones). 
Define an agent with 2 cache points from settings: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent, CachePoint from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'gateway/anthropic:claude-sonnet-4-5', instructions='Instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True, # 1 cache point anthropic_cache_tool_definitions=True, # 1 cache point ), ) @agent.tool_plain def search() -> str: return 'data' # Already using 2 cache points (instructions + tools) # Can add 2 more CachePoint markers (4 total limit) result = agent.run_sync([ 'Context 1', CachePoint(), # Oldest - will be removed 'Context 2', CachePoint(), # Will be kept (3rd point) 'Context 3', CachePoint(), # Will be kept (4th point) 'Question' ]) # Final cache points: instructions + tools + Context 2 + Context 3 = 4 print(result.output) usage = result.usage() print(f'Cache write tokens: {usage.cache_write_tokens}') print(f'Cache read tokens: {usage.cache_read_tokens}') ``` ```python from pydantic_ai import Agent, CachePoint from pydantic_ai.models.anthropic import AnthropicModelSettings agent = Agent( 'anthropic:claude-sonnet-4-5', instructions='Instructions...', model_settings=AnthropicModelSettings( anthropic_cache_instructions=True, # 1 cache point anthropic_cache_tool_definitions=True, # 1 cache point ), ) @agent.tool_plain def search() -> str: return 'data' # Already using 2 cache points (instructions + tools) # Can add 2 more CachePoint markers (4 total limit) result = agent.run_sync([ 'Context 1', CachePoint(), # Oldest - will be removed 'Context 2', CachePoint(), # Will be kept (3rd point) 'Context 3', CachePoint(), # Will be kept (4th point) 'Question' ]) # Final cache points: instructions + tools + Context 2 + Context 3 = 4 print(result.output) usage = result.usage() print(f'Cache write tokens: {usage.cache_write_tokens}') print(f'Cache read tokens: {usage.cache_read_tokens}') ``` **Key Points**: - System and tool cache points are **always preserved** - The cache point created by `anthropic_cache_messages` is **always preserved** (as it's the newest message cache point) - Additional `CachePoint` markers in messages are removed from oldest to newest when the limit is exceeded - This ensures critical caching (instructions/tools) is maintained while still benefiting from message-level caching # Bedrock ## Install To use `BedrockConverseModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `bedrock` optional group: ```bash pip install "pydantic-ai-slim[bedrock]" ``` ```bash uv add "pydantic-ai-slim[bedrock]" ``` ## Configuration To use [AWS Bedrock](https://aws.amazon.com/bedrock/), you'll need an AWS account with Bedrock enabled and appropriate credentials. You can use either AWS credentials directly or a pre-configured boto3 client. `BedrockModelName` contains a list of available Bedrock models, including models from Anthropic, Amazon, Cohere, Meta, and Mistral. 
## Environment variables You can set your AWS credentials as environment variables ([among other options](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables)): ```bash export AWS_BEARER_TOKEN_BEDROCK='your-api-key' # or: export AWS_ACCESS_KEY_ID='your-access-key' export AWS_SECRET_ACCESS_KEY='your-secret-key' export AWS_DEFAULT_REGION='us-east-1' # or your preferred region ``` You can then use `BedrockConverseModel` by name: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent agent = Agent('gateway/bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0') ... ``` ```python from pydantic_ai import Agent agent = Agent('bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0') ... ``` Or initialize the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel model = BedrockConverseModel('anthropic.claude-sonnet-4-5-20250929-v1:0') agent = Agent(model) ... ``` ## Customizing Bedrock Runtime API You can customize the Bedrock Runtime API calls by adding additional parameters, such as [guardrail configurations](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) and [performance settings](https://docs.aws.amazon.com/bedrock/latest/userguide/latency-optimized-inference.html). For a complete list of configurable parameters, refer to the documentation for BedrockModelSettings. customize_bedrock_model_settings.py ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings # Define Bedrock model settings with guardrail and performance configurations bedrock_model_settings = BedrockModelSettings( bedrock_guardrail_config={ 'guardrailIdentifier': 'v1', 'guardrailVersion': 'v1', 'trace': 'enabled' }, bedrock_performance_configuration={ 'latency': 'optimized' } ) model = BedrockConverseModel(model_name='us.amazon.nova-pro-v1:0') agent = Agent(model=model, model_settings=bedrock_model_settings) ``` ## Prompt Caching Bedrock supports [prompt caching](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html) on Anthropic models so you can reuse expensive context across requests. Pydantic AI provides four ways to use prompt caching: 1. **Cache User Messages with CachePoint**: Insert a `CachePoint` marker to cache everything before it in the current user message. 1. **Cache System Instructions**: Enable BedrockModelSettings.bedrock_cache_instructions to append a cache point after the system prompt. 1. **Cache Tool Definitions**: Enable BedrockModelSettings.bedrock_cache_tool_definitions to cache your tool schemas. 1. **Cache All Messages**: Set BedrockModelSettings.bedrock_cache_messages to `True` to automatically cache the last user message. No TTL Support Unlike the direct Anthropic API, Bedrock manages cache TTL automatically. All cache settings are boolean only — no `'5m'` or `'1h'` options. Minimum Token Threshold AWS only serves cached content once a segment crosses the provider-specific minimum token thresholds (see the [Bedrock prompt caching docs](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html)). Short prompts or tool definitions below those limits will bypass the cache, so don't expect savings for tiny payloads. 
### Example 1: Automatic Message Caching Use `bedrock_cache_messages` to automatically cache the last user message: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockModelSettings agent = Agent( 'gateway/bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', system_prompt='You are a helpful assistant.', model_settings=BedrockModelSettings( bedrock_cache_messages=True, # Automatically caches the last message ), ) # The last message is automatically cached - no need for manual CachePoint result1 = agent.run_sync('What is the capital of France?') # Subsequent calls with similar conversation benefit from cache result2 = agent.run_sync('What is the capital of Germany?') print(f'Cache write: {result1.usage().cache_write_tokens}') print(f'Cache read: {result2.usage().cache_read_tokens}') ``` ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockModelSettings agent = Agent( 'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', system_prompt='You are a helpful assistant.', model_settings=BedrockModelSettings( bedrock_cache_messages=True, # Automatically caches the last message ), ) # The last message is automatically cached - no need for manual CachePoint result1 = agent.run_sync('What is the capital of France?') # Subsequent calls with similar conversation benefit from cache result2 = agent.run_sync('What is the capital of Germany?') print(f'Cache write: {result1.usage().cache_write_tokens}') print(f'Cache read: {result2.usage().cache_read_tokens}') ``` ### Example 2: Comprehensive Caching Strategy Combine multiple cache settings for maximum savings: ```python from pydantic_ai import Agent, RunContext from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings model = BedrockConverseModel('us.anthropic.claude-sonnet-4-5-20250929-v1:0') agent = Agent( model, system_prompt='Detailed instructions...', model_settings=BedrockModelSettings( bedrock_cache_instructions=True, # Cache system instructions bedrock_cache_tool_definitions=True, # Cache tool definitions bedrock_cache_messages=True, # Also cache the last message ), ) @agent.tool def search_docs(ctx: RunContext, query: str) -> str: """Search documentation.""" return f'Results for {query}' result = agent.run_sync('Search for Python best practices') print(result.output) ``` ### Example 3: Fine-Grained Control with CachePoint Use manual `CachePoint` markers to control cache locations precisely: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent, CachePoint agent = Agent( 'gateway/bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', system_prompt='Instructions...', ) # Manually control cache points for specific content blocks result = agent.run_sync([ 'Long context from documentation...', CachePoint(), # Cache everything up to this point 'First question' ]) print(result.output) ``` ```python from pydantic_ai import Agent, CachePoint agent = Agent( 'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', system_prompt='Instructions...', ) # Manually control cache points for specific content blocks result = agent.run_sync([ 'Long context from documentation...', CachePoint(), # Cache everything up to this point 'First question' ]) print(result.output) ``` ### Accessing Cache Usage Statistics Access cache usage statistics via RequestUsage: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent, CachePoint agent = 
Agent('gateway/bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0') async def main(): result = await agent.run( [ 'Reference material...', CachePoint(), 'What changed since last time?', ] ) usage = result.usage() print(f'Cache writes: {usage.cache_write_tokens}') print(f'Cache reads: {usage.cache_read_tokens}') ``` ```python from pydantic_ai import Agent, CachePoint agent = Agent('bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0') async def main(): result = await agent.run( [ 'Reference material...', CachePoint(), 'What changed since last time?', ] ) usage = result.usage() print(f'Cache writes: {usage.cache_write_tokens}') print(f'Cache reads: {usage.cache_read_tokens}') ``` ### Cache Point Limits Bedrock enforces a maximum of 4 cache points per request. Pydantic AI automatically manages this limit to ensure your requests always comply without errors. #### How Cache Points Are Allocated Cache points can be placed in three locations: 1. **System Prompt**: Via `bedrock_cache_instructions` setting (adds cache point to last system prompt block) 1. **Tool Definitions**: Via `bedrock_cache_tool_definitions` setting (adds cache point to last tool definition) 1. **Messages**: Via `CachePoint` markers or `bedrock_cache_messages` setting (adds cache points to message content) Each setting uses **at most 1 cache point**, but you can combine them. #### Automatic Cache Point Limiting When cache points from all sources (settings + `CachePoint` markers) exceed 4, Pydantic AI automatically removes excess cache points from **older message content** (keeping the most recent ones). [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent, CachePoint from pydantic_ai.models.bedrock import BedrockModelSettings agent = Agent( 'gateway/bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', system_prompt='Instructions...', model_settings=BedrockModelSettings( bedrock_cache_instructions=True, # 1 cache point bedrock_cache_tool_definitions=True, # 1 cache point ), ) @agent.tool_plain def search() -> str: return 'data' # Already using 2 cache points (instructions + tools) # Can add 2 more CachePoint markers (4 total limit) result = agent.run_sync([ 'Context 1', CachePoint(), # Oldest - will be removed 'Context 2', CachePoint(), # Will be kept (3rd point) 'Context 3', CachePoint(), # Will be kept (4th point) 'Question' ]) # Final cache points: instructions + tools + Context 2 + Context 3 = 4 print(result.output) ``` ```python from pydantic_ai import Agent, CachePoint from pydantic_ai.models.bedrock import BedrockModelSettings agent = Agent( 'bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0', system_prompt='Instructions...', model_settings=BedrockModelSettings( bedrock_cache_instructions=True, # 1 cache point bedrock_cache_tool_definitions=True, # 1 cache point ), ) @agent.tool_plain def search() -> str: return 'data' # Already using 2 cache points (instructions + tools) # Can add 2 more CachePoint markers (4 total limit) result = agent.run_sync([ 'Context 1', CachePoint(), # Oldest - will be removed 'Context 2', CachePoint(), # Will be kept (3rd point) 'Context 3', CachePoint(), # Will be kept (4th point) 'Question' ]) # Final cache points: instructions + tools + Context 2 + Context 3 = 4 print(result.output) ``` **Key Points**: - System and tool cache points are **always preserved** - The cache point created by `bedrock_cache_messages` is **always preserved** (as it's the newest message cache point) - Additional `CachePoint` markers in messages are removed from oldest 
to newest when the limit is exceeded - This ensures critical caching (instructions/tools) is maintained while still benefiting from message-level caching ## `provider` argument You can provide a custom `BedrockProvider` via the `provider` argument. This is useful when you want to specify credentials directly or use a custom boto3 client: ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Using AWS credentials directly model = BedrockConverseModel( 'anthropic.claude-sonnet-4-5-20250929-v1:0', provider=BedrockProvider( region_name='us-east-1', aws_access_key_id='your-access-key', aws_secret_access_key='your-secret-key', ), ) agent = Agent(model) ... ``` You can also pass a pre-configured boto3 client: ```python import boto3 from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Using a pre-configured boto3 client bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1') model = BedrockConverseModel( 'anthropic.claude-sonnet-4-5-20250929-v1:0', provider=BedrockProvider(bedrock_client=bedrock_client), ) agent = Agent(model) ... ``` ## Using AWS Application Inference Profiles AWS Bedrock supports [custom application inference profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-create.html) for cost tracking and resource management. When using these profiles, you should specify a [model profile](https://ai.pydantic.dev/models/overview/#models-and-providers) to ensure Pydantic AI can correctly identify model capabilities (streaming, tool use, caching, etc.) while still using the custom inference profile for cost tracking. Without explicit configuration, an inference profile ARN like `arn:aws:bedrock:us-east-2:*****:application-inference-profile/****` doesn't contain enough information for Pydantic AI to determine the underlying model. You can work around this by: 1. Passing the inference profile ARN as the model name to BedrockConverseModel 1. Using the `profile` parameter to specify the logical model name for feature detection ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Create provider with your AWS configuration provider = BedrockProvider(region_name='us-east-2') # Create a profile with the logical model name for feature detection profile = provider.model_profile('us.anthropic.claude-opus-4-5-20251101-v1:0') # Pass the inference profile ARN as the model name model = BedrockConverseModel( 'arn:aws:bedrock:us-east-2:123456789012:application-inference-profile/my-profile', provider=provider, profile=profile, # Provides the logical model name for feature detection ) agent = Agent(model) ``` ## Configuring Retries Bedrock uses boto3's built-in retry mechanisms. 
You can configure retry behavior by passing a custom boto3 client with retry settings: ```python import boto3 from botocore.config import Config from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Configure retry settings config = Config( retries={ 'max_attempts': 5, 'mode': 'adaptive' # Recommended for rate limiting } ) bedrock_client = boto3.client( 'bedrock-runtime', region_name='us-east-1', config=config ) model = BedrockConverseModel( 'us.amazon.nova-micro-v1:0', provider=BedrockProvider(bedrock_client=bedrock_client), ) agent = Agent(model) ``` ### Retry Modes - `'legacy'` (default): 5 attempts, basic retry behavior - `'standard'`: 3 attempts, more comprehensive error coverage - `'adaptive'`: 3 attempts with client-side rate limiting (recommended for handling `ThrottlingException`) For more details on boto3 retry configuration, see the [AWS boto3 documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html). Note Unlike other providers that use httpx for HTTP requests, Bedrock uses boto3's native retry mechanisms. The retry strategies described in [HTTP Request Retries](https://ai.pydantic.dev/retries/index.md) do not apply to Bedrock. # Cerebras ## Install To use `CerebrasModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cerebras` optional group: ```bash pip install "pydantic-ai-slim[cerebras]" ``` ```bash uv add "pydantic-ai-slim[cerebras]" ``` ## Configuration To use [Cerebras](https://cerebras.ai/) through their API, go to [cloud.cerebras.ai](https://cloud.cerebras.ai/?utm_source=3pi_pydantic-ai&utm_campaign=partner_doc) and generate an API key. For a list of available models, see the [Cerebras models documentation](https://inference-docs.cerebras.ai/models). ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export CEREBRAS_API_KEY='your-api-key' ``` You can then use `CerebrasModel` by name: ```python from pydantic_ai import Agent agent = Agent('cerebras:llama-3.3-70b') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.cerebras import CerebrasModel model = CerebrasModel('llama-3.3-70b') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.cerebras import CerebrasModel from pydantic_ai.providers.cerebras import CerebrasProvider model = CerebrasModel( 'llama-3.3-70b', provider=CerebrasProvider(api_key='your-api-key') ) agent = Agent(model) ... ``` You can also customize the `CerebrasProvider` with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.cerebras import CerebrasModel from pydantic_ai.providers.cerebras import CerebrasProvider custom_http_client = AsyncClient(timeout=30) model = CerebrasModel( 'llama-3.3-70b', provider=CerebrasProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... 
``` # Cohere ## Install To use `CohereModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cohere` optional group: ```bash pip install "pydantic-ai-slim[cohere]" ``` ```bash uv add "pydantic-ai-slim[cohere]" ``` ## Configuration To use [Cohere](https://cohere.com/) through their API, go to [dashboard.cohere.com/api-keys](https://dashboard.cohere.com/api-keys) and follow your nose until you find the place to generate an API key. `CohereModelName` contains a list of the most popular Cohere models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export CO_API_KEY='your-api-key' ``` You can then use `CohereModel` by name: ```python from pydantic_ai import Agent agent = Agent('cohere:command-r7b-12-2024') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel model = CohereModel('command-r7b-12-2024') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel from pydantic_ai.providers.cohere import CohereProvider model = CohereModel('command-r7b-12-2024', provider=CohereProvider(api_key='your-api-key')) agent = Agent(model) ... ``` You can also customize the `CohereProvider` with a custom `http_client`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel from pydantic_ai.providers.cohere import CohereProvider custom_http_client = AsyncClient(timeout=30) model = CohereModel( 'command-r7b-12-2024', provider=CohereProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` # Google The `GoogleModel` is a model that uses the [`google-genai`](https://pypi.org/project/google-genai/) package under the hood to access Google's Gemini models via both the Generative Language API and Vertex AI. ## Install To use `GoogleModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `google` optional group: ```bash pip install "pydantic-ai-slim[google]" ``` ```bash uv add "pydantic-ai-slim[google]" ``` ## Configuration `GoogleModel` lets you use Google's Gemini models through their [Generative Language API](https://ai.google.dev/api/all-methods) (`generativelanguage.googleapis.com`) or [Vertex AI API](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models) (`*-aiplatform.googleapis.com`). ### API Key (Generative Language API) To use Gemini via the Generative Language API, go to [aistudio.google.com](https://aistudio.google.com/apikey) and create an API key. Once you have the API key, set it as an environment variable: ```bash export GOOGLE_API_KEY=your-api-key ``` You can then use `GoogleModel` by name (where GLA stands for Generative Language API): [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent agent = Agent('gateway/gemini:gemini-3-pro-preview') ... ``` ```python from pydantic_ai import Agent agent = Agent('google-gla:gemini-3-pro-preview') ... ``` Or you can explicitly create the provider: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(api_key='your-api-key') model = GoogleModel('gemini-3-pro-preview', provider=provider) agent = Agent(model) ... 
``` ### Vertex AI (Enterprise/Cloud) If you are an enterprise user, you can also use `GoogleModel` to access Gemini via Vertex AI. This interface has a number of advantages over the Generative Language API: 1. The VertexAI API comes with more enterprise readiness guarantees. 1. You can [purchase provisioned throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput#purchase-provisioned-throughput) with Vertex AI to guarantee capacity. 1. If you're running Pydantic AI inside GCP, you don't need to set up authentication, it should "just work". 1. You can decide which region to use, which might be important from a regulatory perspective, and might improve latency. You can authenticate using [application default credentials](https://cloud.google.com/docs/authentication/application-default-credentials), a service account, or an [API key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/api-keys?usertype=expressmode). Whichever way you authenticate, you'll need to have Vertex AI enabled in your GCP account. #### Application Default Credentials If you have the [`gcloud` CLI](https://cloud.google.com/sdk/gcloud) installed and configured, you can use `GoogleProvider` in Vertex AI mode by name: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent agent = Agent('gateway/google-vertex:gemini-3-pro-preview') ... ``` ```python from pydantic_ai import Agent agent = Agent('google-vertex:gemini-3-pro-preview') ... ``` Or you can explicitly create the provider and model: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(vertexai=True) model = GoogleModel('gemini-3-pro-preview', provider=provider) agent = Agent(model) ... ``` #### Service Account To use a service account JSON file, explicitly create the provider and model: google_model_service_account.py ```python from google.oauth2 import service_account from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider credentials = service_account.Credentials.from_service_account_file( 'path/to/service-account.json', scopes=['https://www.googleapis.com/auth/cloud-platform'], ) provider = GoogleProvider(credentials=credentials, project='your-project-id') model = GoogleModel('gemini-3-flash-preview', provider=provider) agent = Agent(model) ... ``` #### API Key To use Vertex AI with an API key, [create a key](https://cloud.google.com/vertex-ai/generative-ai/docs/start/api-keys?usertype=expressmode) and set it as an environment variable: ```bash export GOOGLE_API_KEY=your-api-key ``` You can then use `GoogleModel` in Vertex AI mode by name: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent agent = Agent('gateway/google-vertex:gemini-3-pro-preview') ... ``` ```python from pydantic_ai import Agent agent = Agent('google-vertex:gemini-3-pro-preview') ... ``` Or you can explicitly create the provider and model: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(vertexai=True, api_key='your-api-key') model = GoogleModel('gemini-3-pro-preview', provider=provider) agent = Agent(model) ... 
``` #### Customizing Location or Project You can specify the location and/or project when using Vertex AI: google_model_location.py ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(vertexai=True, location='asia-east1', project='your-gcp-project-id') model = GoogleModel('gemini-3-pro-preview', provider=provider) agent = Agent(model) ... ``` #### Model Garden You can access models from the [Model Garden](https://cloud.google.com/model-garden?hl=en) that support the `generateContent` API and are available under your GCP project, including but not limited to Gemini, using one of the following `model_name` patterns: - `{model_id}` for Gemini models - `{publisher}/{model_id}` - `publishers/{publisher}/models/{model_id}` - `projects/{project}/locations/{location}/publishers/{publisher}/models/{model_id}` ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider( project='your-gcp-project-id', location='us-central1', # the region where the model is available ) model = GoogleModel('meta/llama-3.3-70b-instruct-maas', provider=provider) agent = Agent(model) ... ``` ## Custom HTTP Client You can customize the `GoogleProvider` with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider custom_http_client = AsyncClient(timeout=30) model = GoogleModel( 'gemini-3-pro-preview', provider=GoogleProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` ## Document, Image, Audio, and Video Input `GoogleModel` supports multi-modal input, including documents, images, audio, and video. YouTube video URLs can be passed directly to Google models: youtube_input.py ```python from pydantic_ai import Agent, VideoUrl from pydantic_ai.models.google import GoogleModel agent = Agent(GoogleModel('gemini-3-flash-preview')) result = agent.run_sync( [ 'What is this video about?', VideoUrl(url='https://www.youtube.com/watch?v=dQw4w9WgXcQ'), ] ) print(result.output) ``` Files can be uploaded via the [Files API](https://ai.google.dev/gemini-api/docs/files) and passed as URLs: file_upload.py ```python from pydantic_ai import Agent, DocumentUrl from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider() file = provider.client.files.upload(file='pydantic-ai-logo.png') assert file.uri is not None agent = Agent(GoogleModel('gemini-3-flash-preview', provider=provider)) result = agent.run_sync( [ 'What company is this logo from?', DocumentUrl(url=file.uri, media_type=file.mime_type), ] ) print(result.output) ``` See the [input documentation](https://ai.pydantic.dev/input/index.md) for more details and examples. 
## Model settings You can customize model behavior using GoogleModelSettings: ```python from google.genai.types import HarmBlockThreshold, HarmCategory from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings settings = GoogleModelSettings( temperature=0.2, max_tokens=1024, google_thinking_config={'thinking_level': 'low'}, google_safety_settings=[ { 'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH, 'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE, } ] ) model = GoogleModel('gemini-3-pro-preview') agent = Agent(model, model_settings=settings) ... ``` ### Configure thinking Gemini 3 models use `thinking_level` to control thinking behavior: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings # Set thinking level for Gemini 3 models model_settings = GoogleModelSettings(google_thinking_config={'thinking_level': 'low'}) # 'low' or 'high' model = GoogleModel('gemini-3-flash-preview') agent = Agent(model, model_settings=model_settings) ... ``` For older models (pre-Gemini 3), you can use `thinking_budget` instead: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings # Disable thinking on older models by setting budget to 0 model_settings = GoogleModelSettings(google_thinking_config={'thinking_budget': 0}) model = GoogleModel('gemini-2.5-flash') # Older model agent = Agent(model, model_settings=model_settings) ... ``` Check out the [Gemini API docs](https://ai.google.dev/gemini-api/docs/thinking) for more on thinking. ### Safety settings You can customize the safety settings by setting the `google_safety_settings` field. ```python from google.genai.types import HarmBlockThreshold, HarmCategory from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings model_settings = GoogleModelSettings( google_safety_settings=[ { 'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH, 'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE, } ] ) model = GoogleModel('gemini-3-flash-preview') agent = Agent(model, model_settings=model_settings) ... ``` See the [Gemini API docs](https://ai.google.dev/gemini-api/docs/safety-settings) for more on safety settings. # Groq ## Install To use `GroqModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `groq` optional group: ```bash pip install "pydantic-ai-slim[groq]" ``` ```bash uv add "pydantic-ai-slim[groq]" ``` ## Configuration To use [Groq](https://groq.com/) through their API, go to [console.groq.com/keys](https://console.groq.com/keys) and follow your nose until you find the place to generate an API key. `GroqModelName` contains a list of available Groq models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export GROQ_API_KEY='your-api-key' ``` You can then use `GroqModel` by name: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent agent = Agent('gateway/groq:llama-3.3-70b-versatile') ... ``` ```python from pydantic_ai import Agent agent = Agent('groq:llama-3.3-70b-versatile') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel model = GroqModel('llama-3.3-70b-versatile') agent = Agent(model) ... 
``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel from pydantic_ai.providers.groq import GroqProvider model = GroqModel( 'llama-3.3-70b-versatile', provider=GroqProvider(api_key='your-api-key') ) agent = Agent(model) ... ``` You can also customize the `GroqProvider` with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel from pydantic_ai.providers.groq import GroqProvider custom_http_client = AsyncClient(timeout=30) model = GroqModel( 'llama-3.3-70b-versatile', provider=GroqProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` # Hugging Face [Hugging Face](https://huggingface.co/) is an AI platform with all major open source models, datasets, MCPs, and demos. You can use [Inference Providers](https://huggingface.co/docs/inference-providers) to run open source models like DeepSeek R1 on scalable serverless infrastructure. ## Install To use `HuggingFaceModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `huggingface` optional group: ```bash pip install "pydantic-ai-slim[huggingface]" ``` ```bash uv add "pydantic-ai-slim[huggingface]" ``` ## Configuration To use [Hugging Face](https://huggingface.co/) inference, you'll need to set up an account which will give you [free tier](https://huggingface.co/docs/inference-providers/pricing) allowance on [Inference Providers](https://huggingface.co/docs/inference-providers). To set up inference, follow these steps: 1. Go to [Hugging Face](https://huggingface.co/join) and sign up for an account. 1. Create a new access token in [Hugging Face](https://huggingface.co/settings/tokens). 1. Set the `HF_TOKEN` environment variable to the token you just created. Once you have a Hugging Face access token, you can set it as an environment variable: ```bash export HF_TOKEN='hf_token' ``` ## Usage You can then use HuggingFaceModel by name: ```python from pydantic_ai import Agent agent = Agent('huggingface:Qwen/Qwen3-235B-A22B') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel model = HuggingFaceModel('Qwen/Qwen3-235B-A22B') agent = Agent(model) ... ``` By default, the HuggingFaceModel uses the HuggingFaceProvider, which automatically selects the first inference provider (Cerebras, Together AI, Cohere, etc.) available for the model, following your preferred order at https://hf.co/settings/inference-providers. ## Configure the provider If you want to pass parameters in code to the provider, you can programmatically instantiate the HuggingFaceProvider and pass it to the model: ```python from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel from pydantic_ai.providers.huggingface import HuggingFaceProvider model = HuggingFaceModel('Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(api_key='hf_token', provider_name='nebius')) agent = Agent(model) ...
``` ## Custom Hugging Face client HuggingFaceProvider also accepts a custom [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client via the `hf_client` parameter, so you can customise the `headers`, `bill_to` (billing to an HF organization you're a member of), `base_url` etc. as defined in the [Hugging Face Hub python library docs](https://huggingface.co/docs/huggingface_hub/package_reference/inference_client). ```python from huggingface_hub import AsyncInferenceClient from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel from pydantic_ai.providers.huggingface import HuggingFaceProvider client = AsyncInferenceClient( bill_to='openai', api_key='hf_token', provider='fireworks-ai', ) model = HuggingFaceModel( 'Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(hf_client=client), ) agent = Agent(model) ... ``` # Mistral ## Install To use `MistralModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `mistral` optional group: ```bash pip install "pydantic-ai-slim[mistral]" ``` ```bash uv add "pydantic-ai-slim[mistral]" ``` ## Configuration To use [Mistral](https://mistral.ai) through their API, go to [console.mistral.ai/api-keys/](https://console.mistral.ai/api-keys/) and follow your nose until you find the place to generate an API key. `LatestMistralModelNames` contains a list of the most popular Mistral models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export MISTRAL_API_KEY='your-api-key' ``` You can then use `MistralModel` by name: ```python from pydantic_ai import Agent agent = Agent('mistral:mistral-large-latest') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel model = MistralModel('mistral-small-latest') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel from pydantic_ai.providers.mistral import MistralProvider model = MistralModel( 'mistral-large-latest', provider=MistralProvider(api_key='your-api-key', base_url='https://') ) agent = Agent(model) ... ``` You can also customize the provider with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel from pydantic_ai.providers.mistral import MistralProvider custom_http_client = AsyncClient(timeout=30) model = MistralModel( 'mistral-large-latest', provider=MistralProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` # OpenAI ## Install To use OpenAI models or OpenAI-compatible APIs, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openai` optional group: ```bash pip install "pydantic-ai-slim[openai]" ``` ```bash uv add "pydantic-ai-slim[openai]" ``` ## Configuration To use `OpenAIChatModel` with the OpenAI API, go to [platform.openai.com](https://platform.openai.com/) and follow your nose until you find the place to generate an API key. 
## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export OPENAI_API_KEY='your-api-key' ``` You can then use `OpenAIChatModel` by name: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2') ... ``` ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel model = OpenAIChatModel('gpt-5.2') agent = Agent(model) ... ``` By default, the `OpenAIChatModel` uses the `OpenAIProvider` with the `base_url` set to `https://api.openai.com/v1`. ## Configure the provider If you want to pass parameters in code to the provider, you can programmatically instantiate the OpenAIProvider and pass it to the model: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(api_key='your-api-key')) agent = Agent(model) ... ``` ## Custom OpenAI Client `OpenAIProvider` also accepts a custom `AsyncOpenAI` client via the `openai_client` parameter, so you can customise the `organization`, `project`, `base_url` etc. as defined in the [OpenAI API docs](https://platform.openai.com/docs/api-reference). custom_openai_client.py ```python from openai import AsyncOpenAI from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider client = AsyncOpenAI(max_retries=3) model = OpenAIChatModel('gpt-5.2', provider=OpenAIProvider(openai_client=client)) agent = Agent(model) ... ``` You could also use the [`AsyncAzureOpenAI`](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/switching-endpoints) client to use the Azure OpenAI API. Note that the `AsyncAzureOpenAI` is a subclass of `AsyncOpenAI`. ```python from openai import AsyncAzureOpenAI from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider client = AsyncAzureOpenAI( azure_endpoint='...', api_version='2024-07-01-preview', api_key='your-api-key', ) model = OpenAIChatModel( 'gpt-5.2', provider=OpenAIProvider(openai_client=client), ) agent = Agent(model) ... ``` ## OpenAI Responses API Pydantic AI also supports OpenAI's [Responses API](https://platform.openai.com/docs/api-reference/responses) through the `OpenAIResponsesModel` class. You can use OpenAIResponsesModel by name: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent agent = Agent('gateway/openai-responses:gpt-5.2') ... ``` ```python from pydantic_ai import Agent agent = Agent('openai-responses:gpt-5.2') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel model = OpenAIResponsesModel('gpt-5.2') agent = Agent(model) ... ``` You can learn more about the differences between the Responses API and Chat Completions API in the [OpenAI API docs](https://platform.openai.com/docs/guides/migrate-to-responses). ### Built-in tools The Responses API has built-in tools that you can use instead of building your own: - [Web search](https://platform.openai.com/docs/guides/tools-web-search): allow models to search the web for the latest information before generating a response.
- [Code interpreter](https://platform.openai.com/docs/guides/tools-code-interpreter): allow models to write and run Python code in a sandboxed environment before generating a response. - [Image generation](https://platform.openai.com/docs/guides/tools-image-generation): allow models to generate images based on a text prompt. - [File search](https://platform.openai.com/docs/guides/tools-file-search): allow models to search your files for relevant information before generating a response. - [Computer use](https://platform.openai.com/docs/guides/tools-computer-use): allow models to use a computer to perform tasks on your behalf. Web search, Code interpreter, Image generation, and File search are natively supported through the [Built-in tools](https://ai.pydantic.dev/builtin-tools/index.md) feature. Computer use can be enabled by passing an [`openai.types.responses.ComputerToolParam`](https://github.com/openai/openai-python/blob/main/src/openai/types/responses/computer_tool_param.py) in the `openai_builtin_tools` setting on OpenAIResponsesModelSettings. It doesn't currently generate BuiltinToolCallPart or BuiltinToolReturnPart parts in the message history, or streamed events; please submit an issue if you need native support for this built-in tool. computer_use_tool.py ```python from openai.types.responses import ComputerToolParam from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings model_settings = OpenAIResponsesModelSettings( openai_builtin_tools=[ ComputerToolParam( type='computer_use', ) ], ) model = OpenAIResponsesModel('gpt-5.2') agent = Agent(model=model, model_settings=model_settings) result = agent.run_sync('Open a new browser tab') print(result.output) ``` #### Referencing earlier responses The Responses API supports referencing earlier model responses in a new request using a `previous_response_id` parameter, to ensure the full [conversation state](https://platform.openai.com/docs/guides/conversation-state?api-mode=responses#passing-context-from-the-previous-response) including [reasoning items](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) is kept in context. This is available through the `openai_previous_response_id` field in OpenAIResponsesModelSettings. ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings model = OpenAIResponsesModel('gpt-5.2') agent = Agent(model=model) result = agent.run_sync('The secret is 1234') model_settings = OpenAIResponsesModelSettings( openai_previous_response_id=result.all_messages()[-1].provider_response_id ) result = agent.run_sync('What is the secret code?', model_settings=model_settings) print(result.output) #> 1234 ``` By passing the `provider_response_id` from an earlier run, you can allow the model to build on its own prior reasoning without needing to resend the full message history. ##### Automatically referencing earlier responses When the `openai_previous_response_id` field is set to `'auto'`, Pydantic AI will automatically select the most recent `provider_response_id` from message history and omit messages that came before it, letting the OpenAI API leverage server-side history instead for improved efficiency.
```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings model = OpenAIResponsesModel('gpt-5.2') agent = Agent(model=model) result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. # When set to 'auto', the most recent provider_response_id # and messages after it are sent as request. model_settings = OpenAIResponsesModelSettings(openai_previous_response_id='auto') result2 = agent.run_sync( 'Explain?', message_history=result1.new_messages(), model_settings=model_settings ) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. ``` ## OpenAI-compatible Models Many providers and models are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI. Before getting started, check the [installation and configuration](#install) instructions above. To use another OpenAI-compatible API, you can make use of the `base_url` and `api_key` arguments from `OpenAIProvider`: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel( 'model_name', provider=OpenAIProvider( base_url='https://<api-endpoint>', api_key='your-api-key' ), ) agent = Agent(model) ... ``` Various providers also have their own provider classes so that you don't need to specify the base URL yourself and you can use the standard `<PROVIDER>_API_KEY` environment variable to set the API key. When a provider has its own provider class, you can use the `Agent("<provider>:<model>")` shorthand, e.g. `Agent("deepseek:deepseek-chat")` or `Agent("moonshotai:kimi-k2-0711-preview")`, instead of building the `OpenAIChatModel` explicitly. Similarly, you can pass the provider name as a string to the `provider` argument on `OpenAIChatModel` instead of instantiating the provider class explicitly. #### Model Profile Sometimes, the provider or model you're using will have slightly different requirements than OpenAI's API or models, like having different restrictions on JSON schemas for tool definitions, or not supporting tool definitions to be marked as strict. When using an alternative provider class provided by Pydantic AI, an appropriate model profile is typically selected automatically based on the model name. If the model you're using is not working correctly out of the box, you can tweak various aspects of how model requests are constructed by providing your own ModelProfile (for behaviors shared among all model classes) or OpenAIModelProfile (for behaviors specific to `OpenAIChatModel`): ```python from pydantic_ai import Agent, InlineDefsJsonSchemaTransformer from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.profiles.openai import OpenAIModelProfile from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel( 'model_name', provider=OpenAIProvider( base_url='https://<api-endpoint>.com', api_key='your-api-key' ), profile=OpenAIModelProfile( json_schema_transformer=InlineDefsJsonSchemaTransformer, # Supported by any model class on a plain ModelProfile openai_supports_strict_tool_definition=False # Supported by OpenAIModel only, requires OpenAIModelProfile ) ) agent = Agent(model) ``` ### DeepSeek To use the [DeepSeek](https://deepseek.com) provider, first create an API key by following the [Quick Start guide](https://api-docs.deepseek.com/).
You can then set the `DEEPSEEK_API_KEY` environment variable and use DeepSeekProvider by name: ```python from pydantic_ai import Agent agent = Agent('deepseek:deepseek-chat') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.deepseek import DeepSeekProvider model = OpenAIChatModel( 'deepseek-chat', provider=DeepSeekProvider(api_key='your-deepseek-api-key'), ) agent = Agent(model) ... ``` You can also customize any provider with a custom `http_client`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.deepseek import DeepSeekProvider custom_http_client = AsyncClient(timeout=30) model = OpenAIChatModel( 'deepseek-chat', provider=DeepSeekProvider( api_key='your-deepseek-api-key', http_client=custom_http_client ), ) agent = Agent(model) ... ``` ### Alibaba Cloud Model Studio (DashScope) To use Qwen models via [Alibaba Cloud Model Studio (DashScope)](https://www.alibabacloud.com/en/product/modelstudio), you can set the `ALIBABA_API_KEY` (or `DASHSCOPE_API_KEY`) environment variable and use AlibabaProvider by name: ```python from pydantic_ai import Agent agent = Agent('alibaba:qwen-max') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.alibaba import AlibabaProvider model = OpenAIChatModel( 'qwen-max', provider=AlibabaProvider(api_key='your-api-key'), ) agent = Agent(model) ... ``` The `AlibabaProvider` uses the international DashScope compatible endpoint `https://dashscope-intl.aliyuncs.com/compatible-mode/v1` by default. You can override this by passing a custom `base_url`: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.alibaba import AlibabaProvider model = OpenAIChatModel( 'qwen-max', provider=AlibabaProvider( api_key='your-api-key', base_url='https://dashscope.aliyuncs.com/compatible-mode/v1', # China region ), ) agent = Agent(model) ... ``` ### Ollama Pydantic AI supports both self-hosted [Ollama](https://ollama.com/) servers (running locally or remotely) and [Ollama Cloud](https://ollama.com/cloud). For servers running locally, use the `http://localhost:11434/v1` base URL. For Ollama Cloud, use `https://ollama.com/v1` and ensure an API key is set. You can set the `OLLAMA_BASE_URL` and (optionally) `OLLAMA_API_KEY` environment variables and use OllamaProvider by name: ```python from pydantic_ai import Agent agent = Agent('ollama:gpt-oss:20b') ... ``` Or initialise the model and provider directly: ```python from pydantic import BaseModel from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.ollama import OllamaProvider class CityLocation(BaseModel): city: str country: str ollama_model = OpenAIChatModel( model_name='gpt-oss:20b', provider=OllamaProvider(base_url='http://localhost:11434/v1'), # (1)! ) agent = Agent(ollama_model, output_type=CityLocation) result = agent.run_sync('Where were the olympics held in 2012?') print(result.output) #> city='London' country='United Kingdom' print(result.usage()) #> RunUsage(input_tokens=57, output_tokens=8, requests=1) ``` 1. For Ollama Cloud, use the `base_url='https://ollama.com/v1'` and set the `OLLAMA_API_KEY` environment variable. 
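For example, here's a minimal sketch pointing the same provider at Ollama Cloud, assuming `OLLAMA_API_KEY` is already set in your environment as the note above describes (the model name is reused from the example above):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ollama import OllamaProvider

# Ollama Cloud endpoint; the API key comes from the OLLAMA_API_KEY
# environment variable mentioned in the note above.
cloud_model = OpenAIChatModel(
    'gpt-oss:20b',
    provider=OllamaProvider(base_url='https://ollama.com/v1'),
)
agent = Agent(cloud_model)
...
```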
### Azure AI Foundry To use [Azure AI Foundry](https://ai.azure.com/) as your provider, you can set the `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION` environment variables and use AzureProvider by name: ```python from pydantic_ai import Agent agent = Agent('azure:gpt-5.2') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.azure import AzureProvider model = OpenAIChatModel( 'gpt-5.2', provider=AzureProvider( azure_endpoint='your-azure-endpoint', api_version='your-api-version', api_key='your-api-key', ), ) agent = Agent(model) ... ``` ### Vercel AI Gateway To use [Vercel's AI Gateway](https://vercel.com/docs/ai-gateway), first follow the [documentation](https://vercel.com/docs/ai-gateway) instructions on obtaining an API key or OIDC token. You can set the `VERCEL_AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` environment variables and use VercelProvider by name: ```python from pydantic_ai import Agent agent = Agent('vercel:anthropic/claude-sonnet-4-5') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.vercel import VercelProvider model = OpenAIChatModel( 'anthropic/claude-sonnet-4-5', provider=VercelProvider(api_key='your-vercel-ai-gateway-api-key'), ) agent = Agent(model) ... ``` ### MoonshotAI Create an API key in the [Moonshot Console](https://platform.moonshot.ai/console). You can set the `MOONSHOTAI_API_KEY` environment variable and use MoonshotAIProvider by name: ```python from pydantic_ai import Agent agent = Agent('moonshotai:kimi-k2-0711-preview') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.moonshotai import MoonshotAIProvider model = OpenAIChatModel( 'kimi-k2-0711-preview', provider=MoonshotAIProvider(api_key='your-moonshot-api-key'), ) agent = Agent(model) ... ``` ### GitHub Models To use [GitHub Models](https://docs.github.com/en/github-models), you'll need a GitHub personal access token with the `models: read` permission. You can set the `GITHUB_API_KEY` environment variable and use GitHubProvider by name: ```python from pydantic_ai import Agent agent = Agent('github:xai/grok-3-mini') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.github import GitHubProvider model = OpenAIChatModel( 'xai/grok-3-mini', # GitHub Models uses prefixed model names provider=GitHubProvider(api_key='your-github-token'), ) agent = Agent(model) ... ``` GitHub Models supports various model families with different prefixes. You can see the full list on the [GitHub Marketplace](https://github.com/marketplace?type=models) or the public [catalog endpoint](https://models.github.ai/catalog/models). ### Perplexity Follow the Perplexity [getting started](https://docs.perplexity.ai/guides/getting-started) guide to create an API key. ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel( 'sonar-pro', provider=OpenAIProvider( base_url='https://api.perplexity.ai', api_key='your-perplexity-api-key', ), ) agent = Agent(model) ... 
``` ### Fireworks AI Go to [Fireworks.AI](https://fireworks.ai/) and create an API key in your account settings. You can set the `FIREWORKS_API_KEY` environment variable and use FireworksProvider by name: ```python from pydantic_ai import Agent agent = Agent('fireworks:accounts/fireworks/models/qwq-32b') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.fireworks import FireworksProvider model = OpenAIChatModel( 'accounts/fireworks/models/qwq-32b', # model library available at https://fireworks.ai/models provider=FireworksProvider(api_key='your-fireworks-api-key'), ) agent = Agent(model) ... ``` ### Together AI Go to [Together.ai](https://www.together.ai/) and create an API key in your account settings. You can set the `TOGETHER_API_KEY` environment variable and use TogetherProvider by name: ```python from pydantic_ai import Agent agent = Agent('together:meta-llama/Llama-3.3-70B-Instruct-Turbo-Free') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.together import TogetherProvider model = OpenAIChatModel( 'meta-llama/Llama-3.3-70B-Instruct-Turbo-Free', # model library available at https://www.together.ai/models provider=TogetherProvider(api_key='your-together-api-key'), ) agent = Agent(model) ... ``` ### Heroku AI To use [Heroku AI](https://www.heroku.com/ai), first create an API key. You can set the `HEROKU_INFERENCE_KEY` and (optionally) `HEROKU_INFERENCE_URL` environment variables and use HerokuProvider by name: ```python from pydantic_ai import Agent agent = Agent('heroku:claude-sonnet-4-5') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.heroku import HerokuProvider model = OpenAIChatModel( 'claude-sonnet-4-5', provider=HerokuProvider(api_key='your-heroku-inference-key'), ) agent = Agent(model) ... ``` ### LiteLLM To use [LiteLLM](https://www.litellm.ai/), set the configs as outlined in the [doc](https://docs.litellm.ai/docs/set_keys). In `LiteLLMProvider`, you can pass `api_base` and `api_key`. The value of these configs will depend on your setup. For example, if you are using OpenAI models, then you need to pass `https://api.openai.com/v1` as the `api_base` and your OpenAI API key as the `api_key`. If you are using a LiteLLM proxy server running on your local machine, then you need to pass `http://localhost:<port>` as the `api_base` and your LiteLLM API key (or a placeholder) as the `api_key`. To use custom LLMs, use the `custom/` prefix in the model name. Once you have the configs, use the LiteLLMProvider as follows: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.litellm import LiteLLMProvider model = OpenAIChatModel( 'openai/gpt-5.2', provider=LiteLLMProvider( api_base='', api_key='' ) ) agent = Agent(model) result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ... ``` ### Nebius AI Studio Go to [Nebius AI Studio](https://studio.nebius.com/) and create an API key.
You can set the `NEBIUS_API_KEY` environment variable and use NebiusProvider by name: ```python from pydantic_ai import Agent agent = Agent('nebius:Qwen/Qwen3-32B-fast') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.nebius import NebiusProvider model = OpenAIChatModel( 'Qwen/Qwen3-32B-fast', provider=NebiusProvider(api_key='your-nebius-api-key'), ) agent = Agent(model) result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` ### OVHcloud AI Endpoints To use OVHcloud AI Endpoints, you need to create a new API key. To do so, go to the [OVHcloud manager](https://ovh.com/manager), then in Public Cloud > AI Endpoints > API keys. Click on `Create a new API key` and copy your new key. You can explore the [catalog](https://endpoints.ai.cloud.ovh.net/catalog) to find which models are available. You can set the `OVHCLOUD_API_KEY` environment variable and use OVHcloudProvider by name: ```python from pydantic_ai import Agent agent = Agent('ovhcloud:gpt-oss-120b') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` If you need to configure the provider, you can use the OVHcloudProvider class: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.ovhcloud import OVHcloudProvider model = OpenAIChatModel( 'gpt-oss-120b', provider=OVHcloudProvider(api_key='your-api-key'), ) agent = Agent(model) result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` ### SambaNova To use [SambaNova Cloud](https://cloud.sambanova.ai/), you need to obtain an API key from the [SambaNova Cloud dashboard](https://cloud.sambanova.ai/dashboard). SambaNova provides access to multiple model families including Meta Llama, DeepSeek, Qwen, and Mistral models with fast inference speeds. You can set the `SAMBANOVA_API_KEY` environment variable and use SambaNovaProvider by name: ```python from pydantic_ai import Agent agent = Agent('sambanova:Meta-Llama-3.1-8B-Instruct') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.sambanova import SambaNovaProvider model = OpenAIChatModel( 'Meta-Llama-3.1-8B-Instruct', provider=SambaNovaProvider(api_key='your-api-key'), ) agent = Agent(model) result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` For a complete list of available models, see the [SambaNova supported models documentation](https://docs.sambanova.ai/docs/en/models/sambacloud-models). You can customize the base URL if needed: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.sambanova import SambaNovaProvider model = OpenAIChatModel( 'DeepSeek-R1-0528', provider=SambaNovaProvider( api_key='your-api-key', base_url='https://custom.endpoint.com/v1', ), ) agent = Agent(model) ... 
``` # OpenRouter ## Install To use `OpenRouterModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openrouter` optional group: ```bash pip install "pydantic-ai-slim[openrouter]" ``` ```bash uv add "pydantic-ai-slim[openrouter]" ``` ## Configuration To use [OpenRouter](https://openrouter.ai), first create an API key at [openrouter.ai/keys](https://openrouter.ai/keys). You can set the `OPENROUTER_API_KEY` environment variable and use OpenRouterProvider by name: ```python from pydantic_ai import Agent agent = Agent('openrouter:anthropic/claude-sonnet-4-5') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openrouter import OpenRouterModel from pydantic_ai.providers.openrouter import OpenRouterProvider model = OpenRouterModel( 'anthropic/claude-sonnet-4-5', provider=OpenRouterProvider(api_key='your-openrouter-api-key'), ) agent = Agent(model) ... ``` ## App Attribution OpenRouter has an [app attribution](https://openrouter.ai/docs/app-attribution) feature to track your application in their public ranking and analytics. You can pass in an `app_url` and `app_title` when initializing the provider to enable app attribution. ```python from pydantic_ai.providers.openrouter import OpenRouterProvider provider = OpenRouterProvider( api_key='your-openrouter-api-key', app_url='https://your-app.com', app_title='Your App', ) ... ``` ## Model Settings You can customize model behavior using OpenRouterModelSettings: ```python from pydantic_ai import Agent from pydantic_ai.models.openrouter import OpenRouterModel, OpenRouterModelSettings settings = OpenRouterModelSettings( openrouter_reasoning={ 'effort': 'high', }, openrouter_usage={ 'include': True, } ) model = OpenRouterModel('openai/gpt-5.2') agent = Agent(model, model_settings=settings) ... ``` # Outlines ## Install As Outlines is a library allowing you to run models from various providers, it does not include the necessary dependencies for any provider by default. As a result, to use the OutlinesModel, you must install `pydantic-ai-slim` with an optional group named `outlines-<provider>`, where `<provider>` is the specific model provider you will use through Outlines. For instance: ```bash pip install "pydantic-ai-slim[outlines-transformers]" ``` ```bash uv add "pydantic-ai-slim[outlines-transformers]" ``` Or ```bash pip install "pydantic-ai-slim[outlines-mlxlm]" ``` ```bash uv add "pydantic-ai-slim[outlines-mlxlm]" ``` There are 5 optional groups for the 5 model providers supported through Outlines: - `outlines-transformers` - `outlines-llamacpp` - `outlines-mlxlm` - `outlines-sglang` - `outlines-vllm-offline` ## Model Initialization As Outlines is not an inference provider, but instead a library allowing you to run both local and API-based models, instantiating the model is a bit different from the other models available on Pydantic AI. To initialize the `OutlinesModel` through the `__init__` method, the first argument must be an `outlines.Model` or an `outlines.AsyncModel` instance.
For instance: ```python import outlines from transformers import AutoModelForCausalLM, AutoTokenizer from pydantic_ai.models.outlines import OutlinesModel outlines_model = outlines.from_transformers( AutoModelForCausalLM.from_pretrained('erwanf/gpt2-mini'), AutoTokenizer.from_pretrained('erwanf/gpt2-mini') ) model = OutlinesModel(outlines_model) ``` As you are already providing an Outlines model instance, there is no need to provide an `OutlinesProvider` yourself. ### Model Loading Methods Alternatively, you can use the `OutlinesModel` class methods that load a specific type of Outlines model directly. To do so, provide the same arguments you would have given to the associated Outlines model loading function (except in the case of SGLang). There are methods for the 5 Outlines models that are officially supported by the Pydantic AI integration: - from_transformers - from_llamacpp - from_mlxlm - from_sglang - from_vllm_offline #### Transformers ```python from transformers import AutoModelForCausalLM, AutoTokenizer from pydantic_ai.models.outlines import OutlinesModel model = OutlinesModel.from_transformers( AutoModelForCausalLM.from_pretrained('microsoft/Phi-3-mini-4k-instruct'), AutoTokenizer.from_pretrained('microsoft/Phi-3-mini-4k-instruct') ) ``` #### LlamaCpp ```python from llama_cpp import Llama from pydantic_ai.models.outlines import OutlinesModel model = OutlinesModel.from_llamacpp( Llama.from_pretrained( repo_id='TheBloke/Mistral-7B-Instruct-v0.2-GGUF', filename='mistral-7b-instruct-v0.2.Q5_K_M.gguf', ) ) ``` #### MLXLM ```python from mlx_lm import load from pydantic_ai.models.outlines import OutlinesModel model = OutlinesModel.from_mlxlm( *load('mlx-community/TinyLlama-1.1B-Chat-v1.0-4bit') ) ``` #### SGLang ```python from pydantic_ai.models.outlines import OutlinesModel model = OutlinesModel.from_sglang( 'http://localhost:11434', 'api_key', 'meta-llama/Llama-3.1-8B' ) ``` #### vLLM Offline ```python from vllm import LLM from pydantic_ai.models.outlines import OutlinesModel model = OutlinesModel.from_vllm_offline( LLM('microsoft/Phi-3-mini-4k-instruct') ) ``` ## Running the model Once you have initialized an `OutlinesModel`, you can use it with an Agent as with all other Pydantic AI models. As Outlines is focused on structured output, this provider supports the `output_type` component through the NativeOutput format. There is no need to include information on the required output format in your prompt; instructions based on the `output_type` will be included automatically. ```python from pydantic import BaseModel from transformers import AutoModelForCausalLM, AutoTokenizer from pydantic_ai import Agent from pydantic_ai.models.outlines import OutlinesModel from pydantic_ai.settings import ModelSettings class Box(BaseModel): """Class representing a box""" width: int height: int depth: int units: str model = OutlinesModel.from_transformers( AutoModelForCausalLM.from_pretrained('microsoft/Phi-3-mini-4k-instruct'), AutoTokenizer.from_pretrained('microsoft/Phi-3-mini-4k-instruct') ) agent = Agent(model, output_type=Box) result = agent.run_sync( 'Give me the dimensions of a box', model_settings=ModelSettings(extra_body={'max_new_tokens': 100}) ) print(result.output) # width=20 height=30 depth=40 units='cm' ``` Outlines does not support tools yet, but support for that feature will be added in the near future.
## Multimodal models If the model you are running through Outlines and the provider selected support it, you can include images in your prompts using ImageUrl or BinaryImage. In that case, the prompt you provide when running the agent should be a list containing a string and one or several images. See the [input documentation](https://ai.pydantic.dev/input/index.md) for details and examples on using assets in model inputs. This feature is supported in Outlines for the `SGLang` and `Transformers` models. If you want to run a multimodal model through `transformers`, you must provide a processor instead of a tokenizer as the second argument when initializing the model with the `OutlinesModel.from_transformers` method. ```python from datetime import date from typing import Literal import torch from pydantic import BaseModel from transformers import AutoProcessor, Qwen2VLForConditionalGeneration from pydantic_ai import Agent, ModelSettings from pydantic_ai.messages import ImageUrl from pydantic_ai.models.outlines import OutlinesModel MODEL_NAME = 'Qwen/Qwen2-VL-7B-Instruct' class Item(BaseModel): name: str quantity: int | None price_per_unit: float | None total_price: float | None class ReceiptSummary(BaseModel): store_name: str store_address: str store_number: int | None items: list[Item] tax: float | None total: float | None date: date payment_method: Literal['cash', 'credit', 'debit', 'check', 'other'] tf_model = Qwen2VLForConditionalGeneration.from_pretrained( MODEL_NAME, device_map='auto', dtype=torch.bfloat16 ) tf_processor = AutoProcessor.from_pretrained( MODEL_NAME, device_map='auto' ) model = OutlinesModel.from_transformers(tf_model, tf_processor) agent = Agent(model, output_type=ReceiptSummary) result = agent.run_sync( [ 'You are an expert at extracting information from receipts. Please extract the information from the receipt.
Be as detailed as possible, do not miss any information', ImageUrl('https://raw.githubusercontent.com/dottxt-ai/outlines/refs/heads/main/docs/examples/images/trader-joes-receipt.jpg') ], model_settings=ModelSettings(extra_body={'max_new_tokens': 1000}) ) print(result.output) # store_name="Trader Joe's" # store_address='401 Bay Street, San Francisco, CA 94133' # store_number=0 # items=[ # Item(name='BANANA EACH', quantity=7, price_per_unit=0.23, total_price=1.61), # Item(name='BAREBELLS CHOCOLATE DOUG',quantity=1, price_per_unit=2.29, total_price=2.29), # Item(name='BAREBELLS CREAMY CRISP', quantity=1, price_per_unit=2.29, total_price=2.29), # Item(name='BAREBELLS CHOCOLATE DOUG', quantity=1, price_per_unit=2.29, total_price=2.29), # Item(name='BAREBELLS CARAMEL CASHEW', quantity=2, price_per_unit=2.29, total_price=4.58), # Item(name='BAREBELLS CREAMY CRISP', quantity=1, price_per_unit=2.29, total_price=2.29), # Item(name='T SPINDRIFT ORANGE MANGO 8', quantity=1, price_per_unit=7.49, total_price=7.49), # Item(name='T Bottle Deposit', quantity=8, price_per_unit=0.05, total_price=0.4), # Item(name='MILK ORGANIC GALLON WHOL', quantity=1, price_per_unit=6.79, total_price=6.79), # Item(name='CLASSIC GREEK SALAD', quantity=1, price_per_unit=3.49, total_price=3.49), # Item(name='COBB SALAD', quantity=1, price_per_unit=5.99, total_price=5.99), # Item(name='PEPPER BELL RED XL EACH', quantity=1, price_per_unit=1.29, total_price=1.29), # Item(name='BAG FEE.', quantity=1, price_per_unit=0.25, total_price=0.25), # Item(name='BAG FEE.', quantity=1, price_per_unit=0.25, total_price=0.25)] # tax=7.89 # total=41.98 # date='2023-04-01' # payment_method='credit' ``` # Model Providers Pydantic AI is model-agnostic and has built-in support for multiple model providers: - [OpenAI](https://ai.pydantic.dev/models/openai/index.md) - [Anthropic](https://ai.pydantic.dev/models/anthropic/index.md) - [Gemini](https://ai.pydantic.dev/models/google/index.md) (via two different APIs: Generative Language API and VertexAI API) - [xAI](https://ai.pydantic.dev/models/xai/index.md) - [Bedrock](https://ai.pydantic.dev/models/bedrock/index.md) - [Cerebras](https://ai.pydantic.dev/models/cerebras/index.md) - [Cohere](https://ai.pydantic.dev/models/cohere/index.md) - [Groq](https://ai.pydantic.dev/models/groq/index.md) - [Hugging Face](https://ai.pydantic.dev/models/huggingface/index.md) - [Mistral](https://ai.pydantic.dev/models/mistral/index.md) - [OpenRouter](https://ai.pydantic.dev/models/openrouter/index.md) - [Outlines](https://ai.pydantic.dev/models/outlines/index.md) ## OpenAI-compatible Providers In addition, many providers are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI: - [Alibaba Cloud Model Studio (DashScope)](https://ai.pydantic.dev/models/openai/#alibaba-cloud-model-studio-dashscope) - [Azure AI Foundry](https://ai.pydantic.dev/models/openai/#azure-ai-foundry) - [DeepSeek](https://ai.pydantic.dev/models/openai/#deepseek) - [Fireworks AI](https://ai.pydantic.dev/models/openai/#fireworks-ai) - [GitHub Models](https://ai.pydantic.dev/models/openai/#github-models) - [Heroku](https://ai.pydantic.dev/models/openai/#heroku-ai) - [LiteLLM](https://ai.pydantic.dev/models/openai/#litellm) - [Nebius AI Studio](https://ai.pydantic.dev/models/openai/#nebius-ai-studio) - [Ollama](https://ai.pydantic.dev/models/openai/#ollama) - [OVHcloud AI Endpoints](https://ai.pydantic.dev/models/openai/#ovhcloud-ai-endpoints) - [Perplexity](https://ai.pydantic.dev/models/openai/#perplexity) - 
[SambaNova](https://ai.pydantic.dev/models/openai/#sambanova) - [Together AI](https://ai.pydantic.dev/models/openai/#together-ai) - [Vercel AI Gateway](https://ai.pydantic.dev/models/openai/#vercel-ai-gateway) Pydantic AI also comes with [`TestModel`](https://ai.pydantic.dev/api/models/test/index.md) and [`FunctionModel`](https://ai.pydantic.dev/api/models/function/index.md) for testing and development. To use each model provider, you need to configure your local environment and make sure you have the right packages installed. If you try to use the model without having done so, you'll be told what to install. ## Models and Providers Pydantic AI uses a few key terms to describe how it interacts with different LLMs: - **Model**: This refers to the Pydantic AI class used to make requests following a specific LLM API (generally by wrapping a vendor-provided SDK, like the `openai` python SDK). These classes implement a vendor-SDK-agnostic API, ensuring a single Pydantic AI agent is portable to different LLM vendors without any other code changes just by swapping out the Model it uses. Model classes are named roughly in the format `<VendorSdk>Model`, for example, we have `OpenAIChatModel`, `AnthropicModel`, `GoogleModel`, etc. When using a Model class, you specify the actual LLM model name (e.g., `gpt-5`, `claude-sonnet-4-5`, `gemini-3-flash-preview`) as a parameter. - **Provider**: This refers to provider-specific classes which handle the authentication and connections to an LLM vendor. Passing a non-default *Provider* as a parameter to a Model is how you can ensure that your agent will make requests to a specific endpoint, or make use of a specific approach to authentication (e.g., you can use Azure auth with the `OpenAIChatModel` by way of the `AzureProvider`). In particular, this is how you can make use of an AI gateway, or an LLM vendor that offers API compatibility with the vendor SDK used by an existing Model (such as `OpenAIChatModel`). - **Profile**: This refers to a description of how requests to a specific model or family of models need to be constructed to get the best results, independent of the model and provider classes used. For example, different models have different restrictions on the JSON schemas that can be used for tools, and the same schema transformer needs to be used for Gemini models whether you're using `GoogleModel` with model name `gemini-3-pro-preview`, or `OpenAIChatModel` with `OpenRouterProvider` and model name `google/gemini-3-pro-preview`. When you instantiate an Agent with just a name formatted as `<provider>:<model>`, e.g. `openai:gpt-5.2` or `openrouter:google/gemini-3-pro-preview`, Pydantic AI will automatically select the appropriate model class, provider, and profile. If you want to use a different provider or profile, you can instantiate a model class directly and pass in `provider` and/or `profile` arguments. ## Custom Models Note If a model API is compatible with the OpenAI API, you do not need a custom model class and can provide your own [custom provider](https://ai.pydantic.dev/models/openai/#openai-compatible-models) instead. To implement support for a model API that's not already supported, you will need to subclass the Model abstract base class. For streaming, you'll also need to implement the StreamedResponse abstract base class. The best place to start is to review the source code for existing implementations, e.g. [`OpenAIChatModel`](https://github.com/pydantic/pydantic-ai/blob/main/pydantic_ai_slim/pydantic_ai/models/openai.py).
For details on when we'll accept contributions adding new models to Pydantic AI, see the [contributing guidelines](https://ai.pydantic.dev/contributing/#new-model-rules). ## HTTP Request Concurrency You can limit the number of concurrent HTTP requests to a model using the ConcurrencyLimitedModel wrapper. This is useful for respecting rate limits or managing resource usage when running many agents in parallel. model_concurrency.py ```python import asyncio from pydantic_ai import Agent, ConcurrencyLimitedModel # Wrap a model with concurrency limiting model = ConcurrencyLimitedModel('openai:gpt-4o', limiter=5) # Multiple agents can share this rate-limited model agent = Agent(model) async def main(): # These will be rate-limited to 5 concurrent HTTP requests results = await asyncio.gather( *[agent.run(f'Question {i}') for i in range(20)] ) print(len(results)) #> 20 ``` The `limiter` parameter accepts: - An integer for simple limiting (e.g., `limiter=5`) - A ConcurrencyLimit for advanced configuration with backpressure control - A ConcurrencyLimiter for sharing limits across multiple models ### Shared Concurrency Limits To share a concurrency limit across multiple models (e.g., different models from the same provider), you can create a ConcurrencyLimiter and pass it to multiple `ConcurrencyLimitedModel` instances: shared_concurrency.py ```python import asyncio from pydantic_ai import Agent, ConcurrencyLimitedModel, ConcurrencyLimiter # Create a shared limiter with a descriptive name shared_limiter = ConcurrencyLimiter(max_running=10, name='openai-pool') # Both models share the same concurrency limit model1 = ConcurrencyLimitedModel('openai:gpt-4o', limiter=shared_limiter) model2 = ConcurrencyLimitedModel('openai:gpt-4o-mini', limiter=shared_limiter) agent1 = Agent(model1) agent2 = Agent(model2) async def main(): # Total concurrent requests across both agents limited to 10 results = await asyncio.gather( *[agent1.run(f'Question {i}') for i in range(10)], *[agent2.run(f'Question {i}') for i in range(10)], ) print(len(results)) #> 20 ``` When instrumentation is enabled, requests waiting for a concurrency slot appear as spans with attributes showing the queue depth and configured limits. The `name` parameter on `ConcurrencyLimiter` helps identify shared limiters in traces. ## Fallback Model You can use FallbackModel to attempt multiple models in sequence until one successfully returns a result. Under the hood, Pydantic AI automatically switches from one model to the next if the current model returns a 4xx or 5xx status code. Note The provider SDKs on which Models are based (like OpenAI, Anthropic, etc.) often have built-in retry logic that can delay the `FallbackModel` from activating. When using `FallbackModel`, it's recommended to disable provider SDK retries to ensure immediate fallback, for example by setting `max_retries=0` on a [custom OpenAI client](https://ai.pydantic.dev/models/openai/#custom-openai-client). In the following example, the agent first makes a request to the OpenAI model (which fails due to an invalid API key), and then falls back to the Anthropic model. 
fallback_model.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel openai_model = OpenAIChatModel('gpt-5.2') anthropic_model = AnthropicModel('claude-sonnet-4-5') fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) response = agent.run_sync('What is the capital of France?') print(response.output) #> Paris print(response.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), part_kind='user-prompt', ) ], kind='request', ), ModelResponse( parts=[TextPart(content='Paris', part_kind='text')], model_name='claude-sonnet-4-5', timestamp=datetime.datetime(...), kind='response', provider_response_id=None, ), ] """ ``` The `ModelResponse` message above indicates in the `model_name` field that the output was returned by the Anthropic model, which is the second model specified in the `FallbackModel`. Note Each model's options should be configured individually. For example, `base_url`, `api_key`, and custom clients should be set on each model itself, not on the `FallbackModel`. ### Per-Model Settings You can configure different ModelSettings for each model in a fallback chain by passing the `settings` parameter when creating each model. This is particularly useful when different providers have different optimal configurations: fallback_model_per_settings.py ```python from pydantic_ai import Agent, ModelSettings from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel # Configure each model with provider-specific optimal settings openai_model = OpenAIChatModel( 'gpt-5.2', settings=ModelSettings(temperature=0.7, max_tokens=1000) # Higher creativity for OpenAI ) anthropic_model = AnthropicModel( 'claude-sonnet-4-5', settings=ModelSettings(temperature=0.2, max_tokens=1000) # Lower temperature for consistency ) fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) result = agent.run_sync('Write a creative story about space exploration') print(result.output) """ In the year 2157, Captain Maya Chen piloted her spacecraft through the vast expanse of the Andromeda Galaxy. As she discovered a planet with crystalline mountains that sang in harmony with the cosmic winds, she realized that space exploration was not just about finding new worlds, but about finding new ways to understand the universe and our place within it. """ ``` In this example, if the OpenAI model fails, the agent will automatically fall back to the Anthropic model with its own configured settings. The `FallbackModel` itself doesn't have settings - it uses the individual settings of whichever model successfully handles the request. ### Exception Handling The next example demonstrates the exception-handling capabilities of `FallbackModel`. If all models fail, a FallbackExceptionGroup is raised, which contains all the exceptions encountered during the `run` execution.
fallback_model_failure.py ```python from pydantic_ai import Agent, ModelAPIError from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel openai_model = OpenAIChatModel('gpt-5.2') anthropic_model = AnthropicModel('claude-sonnet-4-5') fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) try: response = agent.run_sync('What is the capital of France?') except* ModelAPIError as exc_group: for exc in exc_group.exceptions: print(exc) ``` Since [`except*`](https://docs.python.org/3/reference/compound_stmts.html#except-star) is only supported in Python 3.11+, we use the [`exceptiongroup`](https://github.com/agronholm/exceptiongroup) backport package for earlier Python versions: fallback_model_failure.py ```python from exceptiongroup import catch from pydantic_ai import Agent, ModelAPIError from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.fallback import FallbackModel from pydantic_ai.models.openai import OpenAIChatModel def model_status_error_handler(exc_group: BaseExceptionGroup) -> None: for exc in exc_group.exceptions: print(exc) openai_model = OpenAIChatModel('gpt-5.2') anthropic_model = AnthropicModel('claude-sonnet-4-5') fallback_model = FallbackModel(openai_model, anthropic_model) agent = Agent(fallback_model) with catch({ModelAPIError: model_status_error_handler}): response = agent.run_sync('What is the capital of France?') ``` By default, the `FallbackModel` only moves on to the next model if the current model raises a ModelAPIError, which includes ModelHTTPError. You can customize this behavior by passing a custom `fallback_on` argument to the `FallbackModel` constructor. Note Validation errors (from [structured output](https://ai.pydantic.dev/output/#structured-output) or [tool parameters](https://ai.pydantic.dev/tools/index.md)) do **not** trigger fallback. These errors use the [retry mechanism](https://ai.pydantic.dev/agent/#reflection-and-self-correction) instead, which re-prompts the same model to try again. This is intentional: validation errors stem from the non-deterministic nature of LLMs and may succeed on retry, whereas API errors (4xx/5xx) generally indicate issues that won't resolve by retrying the same request. # xAI ## Install To use XaiModel, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `xai` optional group: ```bash pip install "pydantic-ai-slim[xai]" ``` ```bash uv add "pydantic-ai-slim[xai]" ``` ## Configuration To use xAI models from [xAI](https://x.ai/api) through their API, go to [console.x.ai](https://console.x.ai/team/default/api-keys) to create an API key. [docs.x.ai](https://docs.x.ai/docs/models) contains a list of available xAI models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export XAI_API_KEY='your-api-key' ``` You can then use XaiModel by name: ```python from pydantic_ai import Agent agent = Agent('xai:grok-4-1-fast-non-reasoning') ... ``` Or initialise the model directly: ```python from pydantic_ai import Agent from pydantic_ai.models.xai import XaiModel # Uses XAI_API_KEY environment variable model = XaiModel('grok-4-1-fast-non-reasoning') agent = Agent(model) ... 
``` You can also customize the XaiModel with a custom provider: ```python from pydantic_ai import Agent from pydantic_ai.models.xai import XaiModel from pydantic_ai.providers.xai import XaiProvider # Custom API key provider = XaiProvider(api_key='your-api-key') model = XaiModel('grok-4-1-fast-non-reasoning', provider=provider) agent = Agent(model) ... ``` Or with a custom `xai_sdk.AsyncClient`: ```python from xai_sdk import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.xai import XaiModel from pydantic_ai.providers.xai import XaiProvider xai_client = AsyncClient(api_key='your-api-key') provider = XaiProvider(xai_client=xai_client) model = XaiModel('grok-4-1-fast-non-reasoning', provider=provider) agent = Agent(model) ... ``` # Graphs # Graphs Don't use a nail gun unless you need a nail gun If Pydantic AI [agents](https://ai.pydantic.dev/agent/index.md) are a hammer, and [multi-agent workflows](https://ai.pydantic.dev/multi-agent-applications/index.md) are a sledgehammer, then graphs are a nail gun: - sure, nail guns look cooler than hammers - but nail guns take a lot more setup than hammers - and nail guns don't make you a better builder, they make you a builder with a nail gun - lastly (and at the risk of torturing this metaphor), if you're a fan of medieval tools like mallets and untyped Python, you probably won't like nail guns or our approach to graphs. (But then again, if you're not a fan of type hints in Python, you've probably already bounced off Pydantic AI to use one of the toy agent frameworks — good luck, and feel free to borrow my sledgehammer when you realize you need it) In short, graphs are a powerful tool, but they're not the right tool for every job. Please consider other [multi-agent approaches](https://ai.pydantic.dev/multi-agent-applications/index.md) before proceeding. If you're not confident a graph-based approach is a good idea, it might be unnecessary. Graphs and finite state machines (FSMs) are a powerful abstraction to model, execute, control and visualize complex workflows. Alongside Pydantic AI, we've developed `pydantic-graph` — an async graph and state machine library for Python where nodes and edges are defined using type hints. While this library is developed as part of Pydantic AI, it has no dependency on `pydantic-ai` and can be considered a pure graph-based state machine library. You may find it useful whether or not you're using Pydantic AI or even building with GenAI. `pydantic-graph` is designed for advanced users and makes heavy use of Python generics and type hints. It is not designed to be as beginner-friendly as Pydantic AI. ## Installation `pydantic-graph` is a required dependency of `pydantic-ai`, and an optional dependency of `pydantic-ai-slim`; see [installation instructions](https://ai.pydantic.dev/install/#slim-install) for more information. You can also install it directly: ```bash pip install pydantic-graph ``` ```bash uv add pydantic-graph ``` ## Graph Types `pydantic-graph` is made up of a few key components: ### GraphRunContext GraphRunContext — The context for the graph run, similar to Pydantic AI's RunContext. This holds the state of the graph and dependencies and is passed to nodes when they're run. `GraphRunContext` is generic in the state type of the graph it's used in, StateT. ### End End — return value to indicate the graph run should end. `End` is generic in the graph return type of the graph it's used in, RunEndT. ### Nodes Subclasses of BaseNode define nodes for execution in the graph.
Nodes, which are generally dataclasses, generally consist of: - fields containing any parameters required/optional when calling the node - the business logic to execute the node, in the run method - return annotations of the run method, which are read by `pydantic-graph` to determine the outgoing edges of the node Nodes are generic in: - **state**, which must have the same type as the state of graphs they're included in, StateT has a default of `None`, so if you're not using state you can omit this generic parameter, see [stateful graphs](#stateful-graphs) for more information - **deps**, which must have the same type as the deps of the graph they're included in, DepsT has a default of `None`, so if you're not using deps you can omit this generic parameter, see [dependency injection](#dependency-injection) for more information - **graph return type** — this only applies if the node returns End. RunEndT has a default of Never so this generic parameter can be omitted if the node doesn't return `End`, but must be included if it does. Here's an example of a start or intermediate node in a graph — it can't end the run as it doesn't return End: intermediate_node.py ```python from dataclasses import dataclass from pydantic_graph import BaseNode, GraphRunContext @dataclass class MyNode(BaseNode[MyState]): # (1)! foo: int # (2)! async def run( self, ctx: GraphRunContext[MyState], # (3)! ) -> AnotherNode: # (4)! ... return AnotherNode() ``` 1. State in this example is `MyState` (not shown), hence `BaseNode` is parameterized with `MyState`. This node can't end the run, so the `RunEndT` generic parameter is omitted and defaults to `Never`. 1. `MyNode` is a dataclass and has a single field `foo`, an `int`. 1. The `run` method takes a `GraphRunContext` parameter, again parameterized with state `MyState`. 1. The return type of the `run` method is `AnotherNode` (not shown), this is used to determine the outgoing edges of the node. We could extend `MyNode` to optionally end the run if `foo` is divisible by 5: intermediate_or_end_node.py ```python from dataclasses import dataclass from pydantic_graph import BaseNode, End, GraphRunContext @dataclass class MyNode(BaseNode[MyState, None, int]): # (1)! foo: int async def run( self, ctx: GraphRunContext[MyState], ) -> AnotherNode | End[int]: # (2)! if self.foo % 5 == 0: return End(self.foo) else: return AnotherNode() ``` 1. We parameterize the node with the return type (`int` in this case) as well as state. Because generic parameters are positional-only, we have to include `None` as the second parameter representing deps. 1. The return type of the `run` method is now a union of `AnotherNode` and `End[int]`, this allows the node to end the run if `foo` is divisible by 5. ### Graph Graph — this is the execution graph itself, made up of a set of [node classes](#nodes) (i.e., `BaseNode` subclasses). `Graph` is generic in: - **state** the state type of the graph, StateT - **deps** the deps type of the graph, DepsT - **graph return type** the return type of the graph run, RunEndT Here's an example of a simple graph: graph_example.py ```python from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class DivisibleBy5(BaseNode[None, None, int]): # (1)! foo: int async def run( self, ctx: GraphRunContext, ) -> Increment | End[int]: if self.foo % 5 == 0: return End(self.foo) else: return Increment(self.foo) @dataclass class Increment(BaseNode): # (2)! 
foo: int async def run(self, ctx: GraphRunContext) -> DivisibleBy5: return DivisibleBy5(self.foo + 1) fives_graph = Graph(nodes=[DivisibleBy5, Increment]) # (3)! result = fives_graph.run_sync(DivisibleBy5(4)) # (4)! print(result.output) #> 5 ``` 1. The `DivisibleBy5` node is parameterized with `None` for the state param and `None` for the deps param as this graph doesn't use state or deps, and `int` as it can end the run. 1. The `Increment` node doesn't return `End`, so the `RunEndT` generic parameter is omitted, state can also be omitted as the graph doesn't use state. 1. The graph is created with a sequence of nodes. 1. The graph is run synchronously with run_sync. The initial node is `DivisibleBy5(4)`. Because the graph doesn't use external state or deps, we don't pass `state` or `deps`. *(This example is complete, it can be run "as is")* A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code: graph_example_diagram.py ```python from graph_example import DivisibleBy5, fives_graph fives_graph.mermaid_code(start_node=DivisibleBy5) ``` ``` --- title: fives_graph --- stateDiagram-v2 [*] --> DivisibleBy5 DivisibleBy5 --> Increment DivisibleBy5 --> [*] Increment --> DivisibleBy5 ``` In order to visualize a graph within a `jupyter-notebook`, `IPython.display` needs to be used: jupyter_display_mermaid.py ```python from graph_example import DivisibleBy5, fives_graph from IPython.display import Image, display display(Image(fives_graph.mermaid_image(start_node=DivisibleBy5))) ``` ## Stateful Graphs The "state" concept in `pydantic-graph` provides an optional way to access and mutate an object (often a `dataclass` or Pydantic model) as nodes run in a graph. If you think of Graphs as a production line, then your state is the engine being passed along the line and built up by each node as the graph is run. `pydantic-graph` provides state persistence, with the state recorded after each node is run. (See [State Persistence](#state-persistence).) Here's an example of a graph which represents a vending machine where the user may insert coins and select a product to purchase. vending_machine.py ```python from __future__ import annotations from dataclasses import dataclass from rich.prompt import Prompt from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MachineState: # (1)! user_balance: float = 0.0 product: str | None = None @dataclass class InsertCoin(BaseNode[MachineState]): # (3)! async def run(self, ctx: GraphRunContext[MachineState]) -> CoinsInserted: # (16)! return CoinsInserted(float(Prompt.ask('Insert coins'))) # (4)! @dataclass class CoinsInserted(BaseNode[MachineState]): amount: float # (5)! async def run( self, ctx: GraphRunContext[MachineState] ) -> SelectProduct | Purchase: # (17)! ctx.state.user_balance += self.amount # (6)! if ctx.state.product is not None: # (7)! return Purchase(ctx.state.product) else: return SelectProduct() @dataclass class SelectProduct(BaseNode[MachineState]): async def run(self, ctx: GraphRunContext[MachineState]) -> Purchase: return Purchase(Prompt.ask('Select product')) PRODUCT_PRICES = { # (2)! 'water': 1.25, 'soda': 1.50, 'crisps': 1.75, 'chocolate': 2.00, } @dataclass class Purchase(BaseNode[MachineState, None, None]): # (18)! product: str async def run( self, ctx: GraphRunContext[MachineState] ) -> End | InsertCoin | SelectProduct: if price := PRODUCT_PRICES.get(self.product): # (8)! ctx.state.product = self.product # (9)! if ctx.state.user_balance >= price: # (10)! 
ctx.state.user_balance -= price return End(None) else: diff = price - ctx.state.user_balance print(f'Not enough money for {self.product}, need {diff:0.2f} more') #> Not enough money for crisps, need 0.75 more return InsertCoin() # (11)! else: print(f'No such product: {self.product}, try again') return SelectProduct() # (12)! vending_machine_graph = Graph( # (13)! nodes=[InsertCoin, CoinsInserted, SelectProduct, Purchase] ) async def main(): state = MachineState() # (14)! await vending_machine_graph.run(InsertCoin(), state=state) # (15)! print(f'purchase successful item={state.product} change={state.user_balance:0.2f}') #> purchase successful item=crisps change=0.25 ``` 1. The state of the vending machine is defined as a dataclass with the user's balance and the product they've selected, if any. 1. A dictionary of products mapped to prices. 1. The `InsertCoin` node, BaseNode is parameterized with `MachineState` as that's the state used in this graph. 1. The `InsertCoin` node prompts the user to insert coins. We keep things simple by just entering a monetary amount as a float. Before you start thinking this is a toy too since it's using rich's Prompt.ask within nodes, see [below](#example-human-in-the-loop) for how control flow can be managed when nodes require external input. 1. The `CoinsInserted` node; again this is a dataclass with one field `amount`. 1. Update the user's balance with the amount inserted. 1. If the user has already selected a product, go to `Purchase`, otherwise go to `SelectProduct`. 1. In the `Purchase` node, look up the price of the product if the user entered a valid product. 1. If the user did enter a valid product, set the product in the state so we don't revisit `SelectProduct`. 1. If the balance is enough to purchase the product, adjust the balance to reflect the purchase and return End to end the graph. We're not using the run return type, so we call `End` with `None`. 1. If the balance is insufficient, go to `InsertCoin` to prompt the user to insert more coins. 1. If the product is invalid, go to `SelectProduct` to prompt the user to select a product again. 1. The graph is created by passing a list of nodes to Graph. Order of nodes is not important, but it can affect how [diagrams](#mermaid-diagrams) are displayed. 1. Initialize the state. This will be passed to the graph run and mutated as the graph runs. 1. Run the graph with the initial state. Since the graph can be run from any node, we must pass the start node — in this case, `InsertCoin`. Graph.run returns a GraphRunResult that provides the final data and a history of the run. 1. The return type of the node's run method is important as it is used to determine the outgoing edges of the node. This information in turn is used to render [mermaid diagrams](#mermaid-diagrams) and is enforced at runtime to detect misbehavior as soon as possible. 1. The return type of `CoinsInserted`'s run method is a union, meaning multiple outgoing edges are possible. 1. Unlike other nodes, `Purchase` can end the run, so the RunEndT generic parameter must be set. In this case it's `None` since the graph run return type is `None`. 
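Because the graph can be run from any node (as noted above), the same vending machine graph could also be started part-way through, for example at `CoinsInserted` when a product has already been chosen. The following is a minimal sketch reusing the example's definitions (the `vending_machine` module name follows the example's file name); it is not part of the original example:

```python
import asyncio

from vending_machine import CoinsInserted, MachineState, vending_machine_graph


async def main():
    # The product is pre-selected, so the run starts at CoinsInserted rather than InsertCoin.
    state = MachineState(product='water')
    await vending_machine_graph.run(CoinsInserted(1.25), state=state)
    print(f'item={state.product} change={state.user_balance:0.2f}')
    #> item=water change=0.00


asyncio.run(main())
```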
*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code: vending_machine_diagram.py ```python from vending_machine import InsertCoin, vending_machine_graph vending_machine_graph.mermaid_code(start_node=InsertCoin) ``` The diagram generated by the above code is: ``` --- title: vending_machine_graph --- stateDiagram-v2 [*] --> InsertCoin InsertCoin --> CoinsInserted CoinsInserted --> SelectProduct CoinsInserted --> Purchase SelectProduct --> Purchase Purchase --> InsertCoin Purchase --> SelectProduct Purchase --> [*] ``` See [below](#mermaid-diagrams) for more information on generating diagrams. ## GenAI Example So far we haven't shown an example of a Graph that actually uses Pydantic AI or GenAI at all. In this example, one agent generates a welcome email to a user and the other agent provides feedback on the email. This graph has a very simple structure: ``` --- title: feedback_graph --- stateDiagram-v2 [*] --> WriteEmail WriteEmail --> Feedback Feedback --> WriteEmail Feedback --> [*] ``` [Learn about Gateway](https://ai.pydantic.dev/gateway) genai_email_feedback.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass, field from pydantic import BaseModel, EmailStr from pydantic_ai import Agent, ModelMessage, format_as_xml from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class User: name: str email: EmailStr interests: list[str] @dataclass class Email: subject: str body: str @dataclass class State: user: User write_agent_messages: list[ModelMessage] = field(default_factory=list) email_writer_agent = Agent( 'gateway/gemini:gemini-3-pro-preview', output_type=Email, instructions='Write a welcome email to our tech blog.', ) @dataclass class WriteEmail(BaseNode[State]): email_feedback: str | None = None async def run(self, ctx: GraphRunContext[State]) -> Feedback: if self.email_feedback: prompt = ( f'Rewrite the email for the user:\n' f'{format_as_xml(ctx.state.user)}\n' f'Feedback: {self.email_feedback}' ) else: prompt = ( f'Write a welcome email for the user:\n' f'{format_as_xml(ctx.state.user)}' ) result = await email_writer_agent.run( prompt, message_history=ctx.state.write_agent_messages, ) ctx.state.write_agent_messages += result.new_messages() return Feedback(result.output) class EmailRequiresWrite(BaseModel): feedback: str class EmailOk(BaseModel): pass feedback_agent = Agent[None, EmailRequiresWrite | EmailOk]( 'openai:gpt-5.2', output_type=EmailRequiresWrite | EmailOk, # type: ignore instructions=( 'Review the email and provide feedback, email must reference the users specific interests.' ), ) @dataclass class Feedback(BaseNode[State, None, Email]): email: Email async def run( self, ctx: GraphRunContext[State], ) -> WriteEmail | End[Email]: prompt = format_as_xml({'user': ctx.state.user, 'email': self.email}) result = await feedback_agent.run(prompt) if isinstance(result.output, EmailRequiresWrite): return WriteEmail(email_feedback=result.output.feedback) else: return End(self.email) async def main(): user = User( name='John Doe', email='john.joe@example.com', interests=['Haskel', 'Lisp', 'Fortran'], ) state = State(user) feedback_graph = Graph(nodes=(WriteEmail, Feedback)) result = await feedback_graph.run(WriteEmail(), state=state) print(result.output) """ Email( subject='Welcome to our tech blog!', body='Hello John, Welcome to our tech blog! 
...', ) """ ``` genai_email_feedback.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass, field from pydantic import BaseModel, EmailStr from pydantic_ai import Agent, ModelMessage, format_as_xml from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class User: name: str email: EmailStr interests: list[str] @dataclass class Email: subject: str body: str @dataclass class State: user: User write_agent_messages: list[ModelMessage] = field(default_factory=list) email_writer_agent = Agent( 'google-gla:gemini-3-pro-preview', output_type=Email, instructions='Write a welcome email to our tech blog.', ) @dataclass class WriteEmail(BaseNode[State]): email_feedback: str | None = None async def run(self, ctx: GraphRunContext[State]) -> Feedback: if self.email_feedback: prompt = ( f'Rewrite the email for the user:\n' f'{format_as_xml(ctx.state.user)}\n' f'Feedback: {self.email_feedback}' ) else: prompt = ( f'Write a welcome email for the user:\n' f'{format_as_xml(ctx.state.user)}' ) result = await email_writer_agent.run( prompt, message_history=ctx.state.write_agent_messages, ) ctx.state.write_agent_messages += result.new_messages() return Feedback(result.output) class EmailRequiresWrite(BaseModel): feedback: str class EmailOk(BaseModel): pass feedback_agent = Agent[None, EmailRequiresWrite | EmailOk]( 'openai:gpt-5.2', output_type=EmailRequiresWrite | EmailOk, # type: ignore instructions=( 'Review the email and provide feedback, email must reference the users specific interests.' ), ) @dataclass class Feedback(BaseNode[State, None, Email]): email: Email async def run( self, ctx: GraphRunContext[State], ) -> WriteEmail | End[Email]: prompt = format_as_xml({'user': ctx.state.user, 'email': self.email}) result = await feedback_agent.run(prompt) if isinstance(result.output, EmailRequiresWrite): return WriteEmail(email_feedback=result.output.feedback) else: return End(self.email) async def main(): user = User( name='John Doe', email='john.joe@example.com', interests=['Haskel', 'Lisp', 'Fortran'], ) state = State(user) feedback_graph = Graph(nodes=(WriteEmail, Feedback)) result = await feedback_graph.run(WriteEmail(), state=state) print(result.output) """ Email( subject='Welcome to our tech blog!', body='Hello John, Welcome to our tech blog! ...', ) """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Iterating Over a Graph ### Using `Graph.iter` for `async for` iteration Sometimes you want direct control or insight into each node as the graph executes. The easiest way to do that is with the Graph.iter method, which returns a **context manager** that yields a GraphRun object. The `GraphRun` is an async-iterable over the nodes of your graph, allowing you to record or modify them as they execute. Here's an example: count_down.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass from pydantic_graph import Graph, BaseNode, End, GraphRunContext @dataclass class CountDownState: counter: int @dataclass class CountDown(BaseNode[CountDownState, None, int]): async def run(self, ctx: GraphRunContext[CountDownState]) -> CountDown | End[int]: if ctx.state.counter <= 0: return End(ctx.state.counter) ctx.state.counter -= 1 return CountDown() count_down_graph = Graph(nodes=[CountDown]) async def main(): state = CountDownState(counter=3) async with count_down_graph.iter(CountDown(), state=state) as run: # (1)! async for node in run: # (2)! 
print('Node:', node) #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: End(data=0) print('Final output:', run.result.output) # (3)! #> Final output: 0 ``` 1. `Graph.iter(...)` returns a GraphRun. 1. Here, we step through each node as it is executed. 1. Once the graph returns an End, the loop ends, and `run.result` becomes a GraphRunResult containing the final outcome (`0` here). ### Using `GraphRun.next(node)` manually Alternatively, you can drive iteration manually with the GraphRun.next method, which allows you to pass in whichever node you want to run next. You can modify or selectively skip nodes this way. Below is a contrived example that stops whenever the counter is at 2, ignoring any node runs beyond that: count_down_next.py ```python from pydantic_graph import End, FullStatePersistence from count_down import CountDown, CountDownState, count_down_graph async def main(): state = CountDownState(counter=5) persistence = FullStatePersistence() # (7)! async with count_down_graph.iter( CountDown(), state=state, persistence=persistence ) as run: node = run.next_node # (1)! while not isinstance(node, End): # (2)! print('Node:', node) #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() if state.counter == 2: break # (3)! node = await run.next(node) # (4)! print(run.result) # (5)! #> None for step in persistence.history: # (6)! print('History Step:', step.state, step.state) #> History Step: CountDownState(counter=5) CountDownState(counter=5) #> History Step: CountDownState(counter=4) CountDownState(counter=4) #> History Step: CountDownState(counter=3) CountDownState(counter=3) #> History Step: CountDownState(counter=2) CountDownState(counter=2) ``` 1. We start by grabbing the first node that will be run in the agent's graph. 1. The agent run is finished once an `End` node has been produced; instances of `End` cannot be passed to `next`. 1. If the user decides to stop early, we break out of the loop. The graph run won't have a real final result in that case (`run.result` remains `None`). 1. At each step, we call `await run.next(node)` to run it and get the next node (or an `End`). 1. Because we did not continue the run until it finished, the `result` is not set. 1. The run's history is still populated with the steps we executed so far. 1. Use FullStatePersistence so we can show the history of the run, see [State Persistence](#state-persistence) below for more information. ## State Persistence One of the biggest benefits of finite state machine (FSM) graphs is how they simplify the handling of interrupted execution. This might happen for a variety of reasons: - the state machine logic might fundamentally need to be paused — e.g. the returns workflow for an e-commerce order needs to wait for the item to be posted to the returns center or because execution of the next node needs input from a user so needs to wait for a new http request, - the execution takes so long that the entire graph can't reliably be executed in a single continuous run — e.g. a deep research agent that might take hours to run, - you want to run multiple graph nodes in parallel in different processes / hardware instances (note: parallel node execution is not yet supported in `pydantic-graph`, see [#704](https://github.com/pydantic/pydantic-ai/issues/704)). 
Trying to make a conventional control flow (i.e., boolean logic and nested function calls) implementation compatible with these usage scenarios generally results in brittle and over-complicated spaghetti code, with the logic required to interrupt and resume execution dominating the implementation. To allow graph runs to be interrupted and resumed, `pydantic-graph` provides state persistence — a system for snapshotting the state of a graph run before and after each node is run, allowing a graph run to be resumed from any point in the graph. `pydantic-graph` includes three state persistence implementations: - SimpleStatePersistence — Simple in-memory state persistence that just holds the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default. - FullStatePersistence — In-memory state persistence that holds a list of snapshots. - FileStatePersistence — File-based state persistence that saves snapshots to a JSON file. In production applications, developers should implement their own state persistence by subclassing the BaseStatePersistence abstract base class, which might persist runs in a relational database like PostgreSQL. At a high level, the role of `StatePersistence` implementations is to store and retrieve NodeSnapshot and EndSnapshot objects. graph.iter_from_persistence() may be used to run the graph based on the state stored in persistence. We can run the `count_down_graph` from [above](#iterating-over-a-graph), using graph.iter_from_persistence() and FileStatePersistence. As you can see in this code, `run_node` requires no external application state (apart from state persistence) to be run, meaning graphs can easily be executed by distributed execution and queueing systems. count_down_from_persistence.py ```python from pathlib import Path from pydantic_graph import End from pydantic_graph.persistence.file import FileStatePersistence from count_down import CountDown, CountDownState, count_down_graph async def main(): run_id = 'run_abc123' persistence = FileStatePersistence(Path(f'count_down_{run_id}.json')) # (1)! state = CountDownState(counter=5) await count_down_graph.initialize( # (2)! CountDown(), state=state, persistence=persistence ) done = False while not done: done = await run_node(run_id) async def run_node(run_id: str) -> bool: # (3)! persistence = FileStatePersistence(Path(f'count_down_{run_id}.json')) async with count_down_graph.iter_from_persistence(persistence) as run: # (4)! node_or_end = await run.next() # (5)! print('Node:', node_or_end) #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: End(data=0) return isinstance(node_or_end, End) # (6)! ``` 1. Create a FileStatePersistence to use to start the graph. 1. Call graph.initialize() to set the initial graph state in the persistence object. 1. `run_node` is a pure function that doesn't need access to any other process state to run the next node of the graph, except the ID of the run. 1. Call graph.iter_from_persistence() to create a GraphRun object that will run the next node of the graph from the state stored in persistence. This will return either a node or an `End` object. 1. `run.next()` will return either a node or an `End` object. 1. Check if the node is an `End` object; if it is, the graph run is complete. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### Example: Human in the loop
As noted above, state persistence allows graphs to be interrupted and resumed. One use case of this is to allow user input to continue. In this example, an AI asks the user a question, the user provides an answer, the AI evaluates the answer and ends if the user got it right or asks another question if they got it wrong. Instead of running the entire graph in a single process invocation, we run the graph by running the process repeatedly, optionally providing an answer to the question as a command line argument. `ai_q_and_a_graph.py` — `question_graph` definition [Learn about Gateway](https://ai.pydantic.dev/gateway) ai_q_and_a_graph.py ```python from __future__ import annotations as _annotations from typing import Annotated from pydantic_graph import Edge from dataclasses import dataclass, field from pydantic import BaseModel from pydantic_graph import ( BaseNode, End, Graph, GraphRunContext, ) from pydantic_ai import Agent, format_as_xml from pydantic_ai import ModelMessage ask_agent = Agent('gateway/openai:gpt-5.2', output_type=str, instrument=True) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): """Generate question using GPT-5.""" docstring_notes = True async def run( self, ctx: GraphRunContext[QuestionState] ) -> Annotated[Answer, Edge(label='Ask the question')]: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.new_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationResult(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'gateway/openai:gpt-5.2', output_type=EvaluationResult, instructions='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> Annotated[End[str], Edge(label='success')] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.new_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) ``` ai_q_and_a_graph.py ```python from __future__ import annotations as _annotations from typing import Annotated from pydantic_graph import Edge from dataclasses import dataclass, field from pydantic import BaseModel from pydantic_graph import ( BaseNode, End, Graph, GraphRunContext, ) from pydantic_ai import Agent, format_as_xml from pydantic_ai import ModelMessage ask_agent = 
Agent('openai:gpt-5.2', output_type=str, instrument=True) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): """Generate question using GPT-5.""" docstring_notes = True async def run( self, ctx: GraphRunContext[QuestionState] ) -> Annotated[Answer, Edge(label='Ask the question')]: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.new_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationResult(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-5.2', output_type=EvaluationResult, instructions='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> Annotated[End[str], Edge(label='success')] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.new_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) ``` *(This example is complete, it can be run "as is")* ai_q_and_a_run.py ```python import sys from pathlib import Path from pydantic_graph import End from pydantic_graph.persistence.file import FileStatePersistence from pydantic_ai import ModelMessage # noqa: F401 from ai_q_and_a_graph import Ask, question_graph, Evaluate, QuestionState, Answer async def main(): answer: str | None = sys.argv[1] if len(sys.argv) > 1 else None # (1)! persistence = FileStatePersistence(Path('question_graph.json')) # (2)! persistence.set_graph_types(question_graph) # (3)! if snapshot := await persistence.load_next(): # (4)! state = snapshot.state assert answer is not None node = Evaluate(answer) else: state = QuestionState() node = Ask() # (5)! async with question_graph.iter(node, state=state, persistence=persistence) as run: while True: node = await run.next() # (6)! if isinstance(node, End): # (7)! print('END:', node.data) history = await persistence.load_all() # (8)! print([e.node for e in history]) break elif isinstance(node, Answer): # (9)! print(node.question) #> What is the capital of France? break # otherwise just continue ``` 1. Get the user's answer from the command line, if provided. See [question graph example](https://ai.pydantic.dev/examples/question-graph/index.md) for a complete example. 1. Create a state persistence instance the `'question_graph.json'` file may or may not already exist. 1. 
Since we're using the persistence interface outside a graph, we need to call set_graph_types to set the graph generic types `StateT` and `RunEndT` for the persistence instance. This is necessary to allow the persistence instance to know how to serialize and deserialize graph nodes. 1. If we've run the graph before, load_next will return a snapshot of the next node to run; here we use `state` from that snapshot and create a new `Evaluate` node with the answer provided on the command line. 1. If the graph hasn't been run before, we create a new `QuestionState` and start with the `Ask` node. 1. Call GraphRun.next() to run the node. This will return either a node or an `End` object. 1. If the node is an `End` object, the graph run is complete. The `data` field of the `End` object contains the comment returned by the `evaluate_agent` about the correct answer. 1. To demonstrate the state persistence, we call load_all to get all the snapshots from the persistence instance. This will return a list of Snapshot objects. 1. If the node is an `Answer` object, we print the question and break out of the loop to end the process and wait for user input. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* For a complete example of this graph, see the [question graph example](https://ai.pydantic.dev/examples/question-graph/index.md). ## Dependency Injection As with Pydantic AI, `pydantic-graph` supports dependency injection via a generic parameter on Graph and BaseNode, and the GraphRunContext.deps field. As an example of dependency injection, let's modify the `DivisibleBy5` example [above](#graph) to use a ProcessPoolExecutor to run the compute load in a separate process (this is a contrived example, `ProcessPoolExecutor` wouldn't actually improve performance in this example): deps_example.py ```python from __future__ import annotations import asyncio from concurrent.futures import ProcessPoolExecutor from dataclasses import dataclass from pydantic_graph import BaseNode, End, FullStatePersistence, Graph, GraphRunContext @dataclass class GraphDeps: executor: ProcessPoolExecutor @dataclass class DivisibleBy5(BaseNode[None, GraphDeps, int]): foo: int async def run( self, ctx: GraphRunContext[None, GraphDeps], ) -> Increment | End[int]: if self.foo % 5 == 0: return End(self.foo) else: return Increment(self.foo) @dataclass class Increment(BaseNode[None, GraphDeps]): foo: int async def run(self, ctx: GraphRunContext[None, GraphDeps]) -> DivisibleBy5: loop = asyncio.get_running_loop() compute_result = await loop.run_in_executor( ctx.deps.executor, self.compute, ) return DivisibleBy5(compute_result) def compute(self) -> int: return self.foo + 1 fives_graph = Graph(nodes=[DivisibleBy5, Increment]) async def main(): with ProcessPoolExecutor() as executor: deps = GraphDeps(executor) result = await fives_graph.run(DivisibleBy5(3), deps=deps, persistence=FullStatePersistence()) print(result.output) #> 5 # the full history is quite verbose (see below), so we'll just print the summary print([item.node for item in result.persistence.history]) """ [ DivisibleBy5(foo=3), Increment(foo=3), DivisibleBy5(foo=4), Increment(foo=4), DivisibleBy5(foo=5), End(data=5), ] """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Mermaid Diagrams Pydantic Graph can generate [mermaid](https://mermaid.js.org/) [`stateDiagram-v2`](https://mermaid.js.org/syntax/stateDiagram.html) diagrams for graphs, as shown above.
These diagrams can be generated with: - Graph.mermaid_code to generate the mermaid code for a graph - Graph.mermaid_image to generate an image of the graph using [mermaid.ink](https://mermaid.ink/) - Graph.mermaid_save to generate an image of the graph using [mermaid.ink](https://mermaid.ink/) and save it to a file Beyond the diagrams shown above, you can also customize mermaid diagrams with the following options: - Edge allows you to apply a label to an edge - BaseNode.docstring_notes and BaseNode.get_note allows you to add notes to nodes - The highlighted_nodes parameter allows you to highlight specific node(s) in the diagram Putting that together, we can edit the last [`ai_q_and_a_graph.py`](#example-human-in-the-loop) example to: - add labels to some edges - add a note to the `Ask` node - highlight the `Answer` node - save the diagram as a `PNG` image to file [Learn about Gateway](https://ai.pydantic.dev/gateway) ai_q_and_a_graph_extra.py ```python from typing import Annotated from pydantic_graph import BaseNode, End, Graph, GraphRunContext, Edge ask_agent = Agent('gateway/openai:gpt-5.2', output_type=str, instrument=True) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): """Generate question using GPT-5.""" docstring_notes = True async def run( self, ctx: GraphRunContext[QuestionState] ) -> Annotated[Answer, Edge(label='Ask the question')]: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.new_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationResult(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'gateway/openai:gpt-5.2', output_type=EvaluationResult, instructions='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> Annotated[End[str], Edge(label='success')] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.new_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) ``` ai_q_and_a_graph_extra.py ```python from typing import Annotated from pydantic_graph import BaseNode, End, Graph, GraphRunContext, Edge ask_agent = Agent('openai:gpt-5.2', output_type=str, instrument=True) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = 
field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): """Generate question using GPT-5.""" docstring_notes = True async def run( self, ctx: GraphRunContext[QuestionState] ) -> Annotated[Answer, Edge(label='Ask the question')]: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.new_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationResult(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-5.2', output_type=EvaluationResult, instructions='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> Annotated[End[str], Edge(label='success')] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.new_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) ``` *(This example is not complete and cannot be run directly)* This would generate an image that looks like this: ``` --- title: question_graph --- stateDiagram-v2 Ask --> Answer: Ask the question note right of Ask Judge the answer. Decide on next step. end note Answer --> Evaluate Evaluate --> Reprimand Evaluate --> [*]: success Reprimand --> Ask classDef highlighted fill:#fdff32 class Answer highlighted ``` ### Setting Direction of the State Diagram You can specify the direction of the state diagram using one of the following values: - `'TB'`: Top to bottom, the diagram flows vertically from top to bottom. - `'LR'`: Left to right, the diagram flows horizontally from left to right. - `'RL'`: Right to left, the diagram flows horizontally from right to left. - `'BT'`: Bottom to top, the diagram flows vertically from bottom to top. Here is an example of how to do this using 'Left to Right' (LR) instead of the default 'Top to Bottom' (TB): vending_machine_diagram.py ```python from vending_machine import InsertCoin, vending_machine_graph vending_machine_graph.mermaid_code(start_node=InsertCoin, direction='LR') ``` ``` --- title: vending_machine_graph --- stateDiagram-v2 direction LR [*] --> InsertCoin InsertCoin --> CoinsInserted CoinsInserted --> SelectProduct CoinsInserted --> Purchase SelectProduct --> Purchase Purchase --> InsertCoin Purchase --> SelectProduct Purchase --> [*] ``` # API Reference # `pydantic_ai.ag_ui` Provides an AG-UI protocol adapter for the Pydantic AI agent. 
This package provides seamless integration between pydantic-ai agents and ag-ui for building interactive AI applications with streaming event-based communication. ### SSE_CONTENT_TYPE ```python SSE_CONTENT_TYPE = 'text/event-stream' ``` Content type header value for Server-Sent Events (SSE). ### OnCompleteFunc ```python OnCompleteFunc: TypeAlias = ( Callable[[AgentRunResult[Any]], None] | Callable[[AgentRunResult[Any]], Awaitable[None]] | Callable[[AgentRunResult[Any]], AsyncIterator[EventT]] ) ``` Callback function type that receives the `AgentRunResult` of the completed run. Can be sync, async, or an async generator of protocol-specific events. ### StateDeps Bases: `Generic[StateT]` Dependency type that holds state. This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`. The state is set using the `state` setter by the `Adapter` when the run starts. Implements the `StateHandler` protocol. Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python @dataclass class StateDeps(Generic[StateT]): """Dependency type that holds state. This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`. The state is set using the `state` setter by the `Adapter` when the run starts. Implements the `StateHandler` protocol. """ state: StateT ``` ### StateHandler Bases: `Protocol` Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field. Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python @runtime_checkable class StateHandler(Protocol): """Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field.""" # Has to be a dataclass so we can use `replace` to update the state. # From https://github.com/python/typeshed/blob/9ab7fde0a0cd24ed7a72837fcb21093b811b80d8/stdlib/_typeshed/__init__.pyi#L352 __dataclass_fields__: ClassVar[dict[str, Field[Any]]] @property def state(self) -> Any: """Get the current state of the agent run.""" ... @state.setter def state(self, state: Any) -> None: """Set the state of the agent run. This method is called to update the state of the agent run with the provided state. Args: state: The run state. """ ... ``` #### state ```python state: Any ``` Get the current state of the agent run. ### AGUIApp Bases: `Generic[AgentDepsT, OutputDataT]`, `Starlette` ASGI application for running Pydantic AI agents with AG-UI protocol support. 
Source code in `pydantic_ai_slim/pydantic_ai/ui/ag_ui/app.py` ```python class AGUIApp(Generic[AgentDepsT, OutputDataT], Starlette): """ASGI application for running Pydantic AI agents with AG-UI protocol support.""" def __init__( self, agent: AbstractAgent[AgentDepsT, OutputDataT], *, # AGUIAdapter.dispatch_request parameters output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[Any] | None = None, # Starlette parameters debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[Self] | None = None, ) -> None: """An ASGI application that handles every request by running the agent and streaming the response. Note that the `deps` will be the same for each request, with the exception of the frontend state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`AGUIAdapter.run_stream()`][pydantic_ai.ui.ag_ui.AGUIAdapter.run_stream] or [`AGUIAdapter.dispatch_request()`][pydantic_ai.ui.ag_ui.AGUIAdapter.dispatch_request] instead. Args: agent: The agent to run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can access `all_messages()` and other result data. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. 
`ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. """ super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) async def run_agent(request: Request) -> Response: """Endpoint to run the agent with the provided input data.""" # `dispatch_request` will store the frontend state from the request on `deps.state` (if it implements the `StateHandler` protocol), # so we need to copy the deps to avoid different requests mutating the same deps object. nonlocal deps if isinstance(deps, StateHandler): # pragma: no branch deps = replace(deps) return await AGUIAdapter[AgentDepsT, OutputDataT].dispatch_request( request, agent=agent, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, on_complete=on_complete, ) self.router.add_route('/', run_agent, methods=['POST']) ``` #### __init__ ```python __init__( agent: AbstractAgent[AgentDepsT, OutputDataT], *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, on_complete: OnCompleteFunc[Any] | None = None, debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: ( Mapping[Any, ExceptionHandler] | None ) = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[Self] | None = None ) -> None ``` An ASGI application that handles every request by running the agent and streaming the response. Note that the `deps` will be the same for each request, with the exception of the frontend state that's injected into the `state` field of a `deps` object that implements the StateHandler protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use AGUIAdapter.run_stream() or AGUIAdapter.dispatch_request() instead. 
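Putting the pieces above together, here's a minimal sketch of constructing an `AGUIApp` whose `deps` carry AG-UI frontend state via `StateDeps`. The `ChatState` model, the model name, and the import path are illustrative assumptions rather than part of this reference:

```python
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.ag_ui import AGUIApp, StateDeps  # import path assumed from this page's module


class ChatState(BaseModel):
    """Hypothetical frontend state shared with the AG-UI client."""

    draft: str = ''


agent = Agent('openai:gpt-5.2', deps_type=StateDeps[ChatState], instructions='Be concise.')

# Because StateDeps implements the StateHandler protocol, the frontend state
# from each AG-UI request is injected into `deps.state` before the run starts.
app = AGUIApp(agent, deps=StateDeps(ChatState()))
# `app` is a Starlette/ASGI application; serve it with e.g. `uvicorn my_module:app`.
```

Each POST to `/` then runs the agent and streams the response back; as noted above, the `deps` object is copied per request when it implements `StateHandler`, so injected state doesn't leak between requests.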
Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `agent` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to run. | *required* | | `output_type` | \`OutputSpec[Any] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence[AbstractBuiltinTool] | None\` | Optional additional builtin tools for this run. | | `on_complete` | \`OnCompleteFunc[Any] | None\` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can access all_messages() and other result data. | | `debug` | `bool` | Boolean indicating if debug tracebacks should be returned on errors. | `False` | | `routes` | \`Sequence[BaseRoute] | None\` | A list of routes to serve incoming HTTP and WebSocket requests. | | `middleware` | \`Sequence[Middleware] | None\` | A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. ServerErrorMiddleware is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. ExceptionMiddleware is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. | | `exception_handlers` | \`Mapping[Any, ExceptionHandler] | None\` | A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form handler(request, exc) -> response and may be either standard functions, or async functions. | | `on_startup` | \`Sequence\[Callable\[[], Any\]\] | None\` | A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. | | `on_shutdown` | \`Sequence\[Callable\[[], Any\]\] | None\` | A list of callables to run on application shutdown. 
Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. | | `lifespan` | \`Lifespan[Self] | None\` | A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the on_startup and on_shutdown handlers. Use one or the other, not both. | Source code in `pydantic_ai_slim/pydantic_ai/ui/ag_ui/app.py` ```python def __init__( self, agent: AbstractAgent[AgentDepsT, OutputDataT], *, # AGUIAdapter.dispatch_request parameters output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[Any] | None = None, # Starlette parameters debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[Self] | None = None, ) -> None: """An ASGI application that handles every request by running the agent and streaming the response. Note that the `deps` will be the same for each request, with the exception of the frontend state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`AGUIAdapter.run_stream()`][pydantic_ai.ui.ag_ui.AGUIAdapter.run_stream] or [`AGUIAdapter.dispatch_request()`][pydantic_ai.ui.ag_ui.AGUIAdapter.dispatch_request] instead. Args: agent: The agent to run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can access `all_messages()` and other result data. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. 
`ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. """ super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) async def run_agent(request: Request) -> Response: """Endpoint to run the agent with the provided input data.""" # `dispatch_request` will store the frontend state from the request on `deps.state` (if it implements the `StateHandler` protocol), # so we need to copy the deps to avoid different requests mutating the same deps object. nonlocal deps if isinstance(deps, StateHandler): # pragma: no branch deps = replace(deps) return await AGUIAdapter[AgentDepsT, OutputDataT].dispatch_request( request, agent=agent, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, on_complete=on_complete, ) self.router.add_route('/', run_agent, methods=['POST']) ``` ### handle_ag_ui_request ```python handle_ag_ui_request( agent: AbstractAgent[AgentDepsT, Any], request: Request, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, on_complete: OnCompleteFunc[BaseEvent] | None = None ) -> Response ``` Handle an AG-UI request by running the agent and returning a streaming response. Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `agent` | `AbstractAgent[AgentDepsT, Any]` | The agent to run. | *required* | | `request` | `Request` | The Starlette request (e.g. from FastAPI) containing the AG-UI run input. 
| *required* | | `output_type` | \`OutputSpec[Any] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `on_complete` | \`OnCompleteFunc[BaseEvent] | None\` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can access all_messages() and other result data. | Returns: | Type | Description | | ---------- | ---------------------------------------------------------- | | `Response` | A streaming Starlette response with AG-UI protocol events. | Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python async def handle_ag_ui_request( agent: AbstractAgent[AgentDepsT, Any], request: Request, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, on_complete: OnCompleteFunc[BaseEvent] | None = None, ) -> Response: """Handle an AG-UI request by running the agent and returning a streaming response. Args: agent: The agent to run. request: The Starlette request (e.g. from FastAPI) containing the AG-UI run input. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. 
Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can access `all_messages()` and other result data. Returns: A streaming Starlette response with AG-UI protocol events. """ return await AGUIAdapter[AgentDepsT].dispatch_request( request, agent=agent, deps=deps, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, on_complete=on_complete, ) ``` ### run_ag_ui ```python run_ag_ui( agent: AbstractAgent[AgentDepsT, Any], run_input: RunAgentInput, accept: str = SSE_CONTENT_TYPE, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, on_complete: OnCompleteFunc[BaseEvent] | None = None ) -> AsyncIterator[str] ``` Run the agent with the AG-UI run input and stream AG-UI protocol events. Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `agent` | `AbstractAgent[AgentDepsT, Any]` | The agent to run. | *required* | | `run_input` | `RunAgentInput` | The AG-UI run input containing thread_id, run_id, messages, etc. | *required* | | `accept` | `str` | The accept header value for the run. | `SSE_CONTENT_TYPE` | | `output_type` | \`OutputSpec[Any] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. 
| | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `on_complete` | \`OnCompleteFunc[BaseEvent] | None\` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can access all_messages() and other result data. | Yields: | Type | Description | | -------------------- | ------------------------------------------------------------------------------- | | `AsyncIterator[str]` | Streaming event chunks encoded as strings according to the accept header value. | Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python def run_ag_ui( agent: AbstractAgent[AgentDepsT, Any], run_input: RunAgentInput, accept: str = SSE_CONTENT_TYPE, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, on_complete: OnCompleteFunc[BaseEvent] | None = None, ) -> AsyncIterator[str]: """Run the agent with the AG-UI run input and stream AG-UI protocol events. Args: agent: The agent to run. run_input: The AG-UI run input containing thread_id, run_id, messages, etc. accept: The accept header value for the run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can access `all_messages()` and other result data. Yields: Streaming event chunks encoded as strings according to the accept header value. 
""" adapter = AGUIAdapter(agent=agent, run_input=run_input, accept=accept) return adapter.encode_stream( adapter.run_stream( output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, on_complete=on_complete, ), ) ``` # `pydantic_ai.agent` ### Agent Bases: `AbstractAgent[AgentDepsT, OutputDataT]` Class for defining "agents" - a way to have a specific type of "conversation" with an LLM. Agents are generic in the dependency type they take AgentDepsT and the output type they return, OutputDataT. By default, if neither generic parameter is customised, agents have type `Agent[None, str]`. Minimal usage example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python @dataclasses.dataclass(init=False) class Agent(AbstractAgent[AgentDepsT, OutputDataT]): """Class for defining "agents" - a way to have a specific type of "conversation" with an LLM. Agents are generic in the dependency type they take [`AgentDepsT`][pydantic_ai.tools.AgentDepsT] and the output type they return, [`OutputDataT`][pydantic_ai.output.OutputDataT]. By default, if neither generic parameter is customised, agents have type `Agent[None, str]`. Minimal usage example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` """ _model: models.Model | models.KnownModelName | str | None _name: str | None end_strategy: EndStrategy """The strategy for handling multiple tool calls when a final result is found. - `'early'` (default): Output tools are executed first. Once a valid final result is found, remaining function and output tool calls are skipped - `'exhaustive'`: Output tools are executed first, then all function tools are executed. The first valid output tool result becomes the final output """ model_settings: ModelSettings | None """Optional model request settings to use for this agents's runs, by default. Note, if `model_settings` is provided by `run`, `run_sync`, or `run_stream`, those settings will be merged with this value, with the runtime argument taking priority. """ _output_type: OutputSpec[OutputDataT] instrument: InstrumentationSettings | bool | None """Options to automatically instrument with OpenTelemetry.""" _instrument_default: ClassVar[InstrumentationSettings | bool] = False _metadata: AgentMetadata[AgentDepsT] | None = dataclasses.field(repr=False) _deps_type: type[AgentDepsT] = dataclasses.field(repr=False) _output_schema: _output.OutputSchema[OutputDataT] = dataclasses.field(repr=False) _output_validators: list[_output.OutputValidator[AgentDepsT, OutputDataT]] = dataclasses.field(repr=False) _instructions: list[str | _system_prompt.SystemPromptFunc[AgentDepsT]] = dataclasses.field(repr=False) _system_prompts: tuple[str, ...] 
= dataclasses.field(repr=False) _system_prompt_functions: list[_system_prompt.SystemPromptRunner[AgentDepsT]] = dataclasses.field(repr=False) _system_prompt_dynamic_functions: dict[str, _system_prompt.SystemPromptRunner[AgentDepsT]] = dataclasses.field( repr=False ) _function_toolset: FunctionToolset[AgentDepsT] = dataclasses.field(repr=False) _output_toolset: OutputToolset[AgentDepsT] | None = dataclasses.field(repr=False) _user_toolsets: list[AbstractToolset[AgentDepsT]] = dataclasses.field(repr=False) _prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = dataclasses.field(repr=False) _prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = dataclasses.field(repr=False) _max_result_retries: int = dataclasses.field(repr=False) _max_tool_retries: int = dataclasses.field(repr=False) _tool_timeout: float | None = dataclasses.field(repr=False) _validation_context: Any | Callable[[RunContext[AgentDepsT]], Any] = dataclasses.field(repr=False) _event_stream_handler: EventStreamHandler[AgentDepsT] | None = dataclasses.field(repr=False) _concurrency_limiter: _concurrency.AbstractConcurrencyLimiter | None = dataclasses.field(repr=False) _enter_lock: Lock = dataclasses.field(repr=False) _entered_count: int = dataclasses.field(repr=False) _exit_stack: AsyncExitStack | None = dataclasses.field(repr=False) @overload def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, validation_context: Any | Callable[[RunContext[AgentDepsT]], Any] = None, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, tool_timeout: float | None = None, max_concurrency: _concurrency.AnyConcurrencyLimit = None, ) -> None: ... 
@overload @deprecated('`mcp_servers` is deprecated, use `toolsets` instead.') def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, validation_context: Any | Callable[[RunContext[AgentDepsT]], Any] = None, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, mcp_servers: Sequence[MCPServer] = (), defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, tool_timeout: float | None = None, max_concurrency: _concurrency.AnyConcurrencyLimit = None, ) -> None: ... def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, validation_context: Any | Callable[[RunContext[AgentDepsT]], Any] = None, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, tool_timeout: float | None = None, max_concurrency: _concurrency.AnyConcurrencyLimit = None, **_deprecated_kwargs: Any, ): """Create an agent. Args: model: The default model to use for this agent, if not provided, you must provide the model when calling it. We allow `str` here since the actual list of allowed models changes frequently. output_type: The type of the output data, used to validate the data returned by the model, defaults to `str`. instructions: Instructions to use for this agent, you can also register instructions via a function with [`instructions`][pydantic_ai.agent.Agent.instructions] or pass additional, temporary, instructions when executing a run. system_prompt: Static system prompts to use for this agent, you can also register system prompts via a function with [`system_prompt`][pydantic_ai.agent.Agent.system_prompt]. deps_type: The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. 
If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright or add a type hint `: Agent[None, ]`. name: The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. model_settings: Optional model request settings to use for this agent's runs, by default. retries: The default number of retries to allow for tool calls and output validation, before raising an error. For model request retries, see the [HTTP Request Retries](../retries.md) documentation. validation_context: Pydantic [validation context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) used to validate tool arguments and outputs. output_retries: The maximum number of retries to allow for output validation, defaults to `retries`. tools: Tools to register with the agent, you can also register tools via the decorators [`@agent.tool`][pydantic_ai.agent.Agent.tool] and [`@agent.tool_plain`][pydantic_ai.agent.Agent.tool_plain]. builtin_tools: The builtin tools that the agent will use. This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised. prepare_tools: Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] prepare_output_tools: Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] toolsets: Toolsets to register with the agent, including MCP servers and functions which take a run context and return a toolset. See [`ToolsetFunc`][pydantic_ai.toolsets.ToolsetFunc] for more information. defer_model_check: by default, if you provide a [named][pydantic_ai.models.KnownModelName] model, it's evaluated to create a [`Model`][pydantic_ai.models.Model] instance immediately, which checks for the necessary environment variables. Set this to `false` to defer the evaluation until the first run. Useful if you want to [override the model][pydantic_ai.agent.Agent.override] for testing. end_strategy: Strategy for handling tool calls that are requested alongside a final result. See [`EndStrategy`][pydantic_ai.agent.EndStrategy] for more information. instrument: Set to True to automatically instrument with OpenTelemetry, which will use Logfire if it's configured. Set to an instance of [`InstrumentationSettings`][pydantic_ai.agent.InstrumentationSettings] to customize. If this isn't set, then the last value set by [`Agent.instrument_all()`][pydantic_ai.agent.Agent.instrument_all] will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. metadata: Optional metadata to store with each run. Provide a dictionary of primitives, or a callable returning one computed from the [`RunContext`][pydantic_ai.tools.RunContext] on each run. Metadata is resolved when a run starts and recomputed after a successful run finishes so it can reflect the final state. 
Resolved metadata can be read after the run completes via [`AgentRun.metadata`][pydantic_ai.agent.AgentRun], [`AgentRunResult.metadata`][pydantic_ai.agent.AgentRunResult], and [`StreamedRunResult.metadata`][pydantic_ai.result.StreamedRunResult], and is attached to the agent run span when instrumentation is enabled. history_processors: Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools. tool_timeout: Default timeout in seconds for tool execution. If a tool takes longer than this, the tool is considered to have failed and a retry prompt is returned to the model (counting towards the retry limit). Individual tools can override this with their own timeout. Defaults to None (no timeout). max_concurrency: Optional limit on concurrent agent runs. Can be an integer for simple limiting, a [`ConcurrencyLimit`][pydantic_ai.ConcurrencyLimit] for advanced configuration with backpressure, a [`ConcurrencyLimiter`][pydantic_ai.ConcurrencyLimiter] for sharing limits across multiple agents, or None (default) for no limiting. When the limit is reached, additional calls to `run()` or `iter()` will wait until a slot becomes available. """ if model is None or defer_model_check: self._model = model else: self._model = models.infer_model(model) self._name = name self.end_strategy = end_strategy self.model_settings = model_settings self._output_type = output_type self.instrument = instrument self._metadata = metadata self._deps_type = deps_type if mcp_servers := _deprecated_kwargs.pop('mcp_servers', None): if toolsets is not None: # pragma: no cover raise TypeError('`mcp_servers` and `toolsets` cannot be set at the same time.') warnings.warn('`mcp_servers` is deprecated, use `toolsets` instead', DeprecationWarning) toolsets = mcp_servers _utils.validate_empty_kwargs(_deprecated_kwargs) self._output_schema = _output.OutputSchema[OutputDataT].build(output_type) self._output_validators = [] self._instructions = self._normalize_instructions(instructions) self._system_prompts = (system_prompt,) if isinstance(system_prompt, str) else tuple(system_prompt) self._system_prompt_functions = [] self._system_prompt_dynamic_functions = {} self._max_result_retries = output_retries if output_retries is not None else retries self._max_tool_retries = retries self._tool_timeout = tool_timeout self._validation_context = validation_context self._builtin_tools = builtin_tools self._prepare_tools = prepare_tools self._prepare_output_tools = prepare_output_tools self._output_toolset = self._output_schema.toolset if self._output_toolset: self._output_toolset.max_retries = self._max_result_retries self._function_toolset = _AgentFunctionToolset( tools, max_retries=self._max_tool_retries, timeout=self._tool_timeout, output_schema=self._output_schema, ) self._dynamic_toolsets = [ DynamicToolset[AgentDepsT](toolset_func=toolset) for toolset in toolsets or [] if not isinstance(toolset, AbstractToolset) ] self._user_toolsets = [toolset for toolset in toolsets or [] if isinstance(toolset, AbstractToolset)] self.history_processors = history_processors or [] self._event_stream_handler = event_stream_handler self._concurrency_limiter = _concurrency.normalize_to_limiter(max_concurrency) self._override_name: ContextVar[_utils.Option[str]] = 
ContextVar('_override_name', default=None) self._override_deps: ContextVar[_utils.Option[AgentDepsT]] = ContextVar('_override_deps', default=None) self._override_model: ContextVar[_utils.Option[models.Model]] = ContextVar('_override_model', default=None) self._override_toolsets: ContextVar[_utils.Option[Sequence[AbstractToolset[AgentDepsT]]]] = ContextVar( '_override_toolsets', default=None ) self._override_tools: ContextVar[ _utils.Option[Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]]] ] = ContextVar('_override_tools', default=None) self._override_instructions: ContextVar[ _utils.Option[list[str | _system_prompt.SystemPromptFunc[AgentDepsT]]] ] = ContextVar('_override_instructions', default=None) self._override_metadata: ContextVar[_utils.Option[AgentMetadata[AgentDepsT]]] = ContextVar( '_override_metadata', default=None ) self._enter_lock = Lock() self._entered_count = 0 self._exit_stack = None @staticmethod def instrument_all(instrument: InstrumentationSettings | bool = True) -> None: """Set the instrumentation options for all agents where `instrument` is not set.""" Agent._instrument_default = instrument @property def model(self) -> models.Model | models.KnownModelName | str | None: """The default model configured for this agent.""" return self._model @model.setter def model(self, value: models.Model | models.KnownModelName | str | None) -> None: """Set the default model configured for this agent. We allow `str` here since the actual list of allowed models changes frequently. """ self._model = value @property def name(self) -> str | None: """The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. """ name_ = self._override_name.get() return name_.value if name_ else self._name @name.setter def name(self, value: str | None) -> None: """Set the name of the agent, used for logging.""" self._name = value @property def deps_type(self) -> type: """The type of dependencies used by the agent.""" return self._deps_type @property def output_type(self) -> OutputSpec[OutputDataT]: """The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`.""" return self._output_type @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: """Optional handler for events from the model's streaming response and the agent's execution of tools.""" return self._event_stream_handler def __repr__(self) -> str: return f'{type(self).__name__}(model={self.model!r}, name={self.name!r}, end_strategy={self.end_strategy!r}, model_settings={self.model_settings!r}, output_type={self.output_type!r}, instrument={self.instrument!r})' @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... 
@overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. 
output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) model_used = self._get_model(model) del model deps = self._get_deps(deps) output_schema = self._prepare_output_schema(output_type) output_type_ = output_type or self.output_type # We consider it a user error if a user tries to restrict the result type while having an output validator that # may change the result type from the restricted type to something else. Therefore, we consider the following # typecast reasonable, even though it is possible to violate it with otherwise-type-checked code. 
output_validators = self._output_validators output_toolset = self._output_toolset if output_schema != self._output_schema or output_validators: output_toolset = output_schema.toolset if output_toolset: output_toolset.max_retries = self._max_result_retries output_toolset.output_validators = output_validators toolset = self._get_toolset(output_toolset=output_toolset, additional_toolsets=toolsets) tool_manager = ToolManager[AgentDepsT](toolset, default_max_retries=self._max_tool_retries) # Build the graph graph = _agent_graph.build_agent_graph(self.name, self._deps_type, output_type_) # Build the initial state usage = usage or _usage.RunUsage() state = _agent_graph.GraphAgentState( message_history=list(message_history) if message_history else [], usage=usage, retries=0, run_step=0, ) # Merge model settings in order of precedence: run > agent > model merged_settings = merge_model_settings(model_used.settings, self.model_settings) model_settings = merge_model_settings(merged_settings, model_settings) usage_limits = usage_limits or _usage.UsageLimits() instructions_literal, instructions_functions = self._get_instructions(additional_instructions=instructions) async def get_instructions(run_context: RunContext[AgentDepsT]) -> str | None: parts = [ instructions_literal, *[await func.run(run_context) for func in instructions_functions], ] parts = [p for p in parts if p] if not parts: return None return '\n\n'.join(parts).strip() if isinstance(model_used, InstrumentedModel): instrumentation_settings = model_used.instrumentation_settings tracer = model_used.instrumentation_settings.tracer else: instrumentation_settings = None tracer = NoOpTracer() graph_deps = _agent_graph.GraphAgentDeps[AgentDepsT, OutputDataT]( user_deps=deps, prompt=user_prompt, new_message_index=len(message_history) if message_history else 0, model=model_used, model_settings=model_settings, usage_limits=usage_limits, max_result_retries=self._max_result_retries, end_strategy=self.end_strategy, output_schema=output_schema, output_validators=output_validators, validation_context=self._validation_context, history_processors=self.history_processors, builtin_tools=[*self._builtin_tools, *(builtin_tools or [])], tool_manager=tool_manager, tracer=tracer, get_instructions=get_instructions, instrumentation_settings=instrumentation_settings, ) user_prompt_node = _agent_graph.UserPromptNode[AgentDepsT]( user_prompt=user_prompt, deferred_tool_results=deferred_tool_results, instructions=instructions_literal, instructions_functions=instructions_functions, system_prompts=self._system_prompts, system_prompt_functions=self._system_prompt_functions, system_prompt_dynamic_functions=self._system_prompt_dynamic_functions, ) agent_name = self.name or 'agent' instrumentation_names = InstrumentationNames.for_version( instrumentation_settings.version if instrumentation_settings else DEFAULT_INSTRUMENTATION_VERSION ) run_span = tracer.start_span( instrumentation_names.get_agent_run_span_name(agent_name), attributes={ 'model_name': model_used.model_name if model_used else 'no-model', 'agent_name': agent_name, 'gen_ai.agent.name': agent_name, 'logfire.msg': f'{agent_name} run', }, ) run_metadata: dict[str, Any] | None = None try: async with ( _concurrency.get_concurrency_context(self._concurrency_limiter, f'agent:{agent_name}'), graph.iter( inputs=user_prompt_node, state=state, deps=graph_deps, span=use_span(run_span) if run_span.is_recording() else None, infer_name=False, ) as graph_run, ): async with toolset: agent_run = AgentRun(graph_run) run_metadata = 
self._resolve_and_store_metadata(agent_run.ctx, metadata) try: yield agent_run finally: if agent_run.result is not None: run_metadata = self._resolve_and_store_metadata(agent_run.ctx, metadata) else: run_metadata = graph_run.state.metadata final_result = agent_run.result if ( instrumentation_settings and instrumentation_settings.include_content and run_span.is_recording() and final_result is not None ): run_span.set_attribute( 'final_result', ( final_result.output if isinstance(final_result.output, str) else json.dumps(InstrumentedModel.serialize_any(final_result.output)) ), ) finally: try: if instrumentation_settings and run_span.is_recording(): run_span.set_attributes( self._run_span_end_attributes( instrumentation_settings, usage, state.message_history, graph_deps.new_message_index, run_metadata, ) ) finally: run_span.end() def _get_metadata( self, ctx: RunContext[AgentDepsT], additional_metadata: AgentMetadata[AgentDepsT] | None = None, ) -> dict[str, Any] | None: metadata_override = self._override_metadata.get() if metadata_override is not None: return self._resolve_metadata_config(metadata_override.value, ctx) base_metadata = self._resolve_metadata_config(self._metadata, ctx) run_metadata = self._resolve_metadata_config(additional_metadata, ctx) if base_metadata and run_metadata: return {**base_metadata, **run_metadata} return run_metadata or base_metadata def _resolve_metadata_config( self, config: AgentMetadata[AgentDepsT] | None, ctx: RunContext[AgentDepsT], ) -> dict[str, Any] | None: if config is None: return None metadata = config(ctx) if callable(config) else config return metadata def _resolve_and_store_metadata( self, graph_run_ctx: GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]], metadata: AgentMetadata[AgentDepsT] | None, ) -> dict[str, Any] | None: run_context = build_run_context(graph_run_ctx) resolved_metadata = self._get_metadata(run_context, metadata) graph_run_ctx.state.metadata = resolved_metadata return resolved_metadata def _run_span_end_attributes( self, settings: InstrumentationSettings, usage: _usage.RunUsage, message_history: list[_messages.ModelMessage], new_message_index: int, metadata: dict[str, Any] | None = None, ): if settings.version == 1: attrs = { 'all_messages_events': json.dumps( [InstrumentedModel.event_to_dict(e) for e in settings.messages_to_otel_events(message_history)] ) } else: # Store the last instructions here for convenience last_instructions = InstrumentedModel._get_instructions(message_history) # pyright: ignore[reportPrivateUsage] attrs: dict[str, Any] = { 'pydantic_ai.all_messages': json.dumps(settings.messages_to_otel_messages(list(message_history))), **settings.system_instructions_attributes(last_instructions), } # If this agent run was provided with existing history, store an attribute indicating the point at which the # new messages begin. if new_message_index > 0: attrs['pydantic_ai.new_message_index'] = new_message_index # If the instructions for this agent run were not always the same, store an attribute that indicates that. # This can signal to an observability UI that different steps in the agent run had different instructions. # Note: We purposely only look at "new" messages because they are the only ones produced by this agent run. 
if any( ( isinstance(m, _messages.ModelRequest) and m.instructions is not None and m.instructions != last_instructions ) for m in message_history[new_message_index:] ): attrs['pydantic_ai.variable_instructions'] = True if metadata is not None: attrs['metadata'] = json.dumps(InstrumentedModel.serialize_any(metadata)) return { **usage.opentelemetry_attributes(), **attrs, 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { **{k: {'type': 'array'} if isinstance(v, str) else {} for k, v in attrs.items()}, 'final_result': {'type': 'object'}, }, } ), } @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, metadata: AgentMetadata[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. metadata: The metadata to use instead of the metadata passed to the agent constructor. When set, any per-run `metadata` argument is ignored. 
""" if _utils.is_set(name): name_token = self._override_name.set(_utils.Some(name)) else: name_token = None if _utils.is_set(deps): deps_token = self._override_deps.set(_utils.Some(deps)) else: deps_token = None if _utils.is_set(model): model_token = self._override_model.set(_utils.Some(models.infer_model(model))) else: model_token = None if _utils.is_set(toolsets): toolsets_token = self._override_toolsets.set(_utils.Some(toolsets)) else: toolsets_token = None if _utils.is_set(tools): tools_token = self._override_tools.set(_utils.Some(tools)) else: tools_token = None if _utils.is_set(instructions): normalized_instructions = self._normalize_instructions(instructions) instructions_token = self._override_instructions.set(_utils.Some(normalized_instructions)) else: instructions_token = None if _utils.is_set(metadata): metadata_token = self._override_metadata.set(_utils.Some(metadata)) else: metadata_token = None try: yield finally: if name_token is not None: self._override_name.reset(name_token) if deps_token is not None: self._override_deps.reset(deps_token) if model_token is not None: self._override_model.reset(model_token) if toolsets_token is not None: self._override_toolsets.reset(toolsets_token) if tools_token is not None: self._override_tools.reset(tools_token) if instructions_token is not None: self._override_instructions.reset(instructions_token) if metadata_token is not None: self._override_metadata.reset(metadata_token) @overload def instructions( self, func: Callable[[RunContext[AgentDepsT]], str | None], / ) -> Callable[[RunContext[AgentDepsT]], str | None]: ... @overload def instructions( self, func: Callable[[RunContext[AgentDepsT]], Awaitable[str | None]], / ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str | None]]: ... @overload def instructions(self, func: Callable[[], str | None], /) -> Callable[[], str | None]: ... @overload def instructions(self, func: Callable[[], Awaitable[str | None]], /) -> Callable[[], Awaitable[str | None]]: ... @overload def instructions( self, / ) -> Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]]: ... def instructions( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register an instructions function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.instructions`). Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.instructions def simple_instructions() -> str: return 'foobar' @agent.instructions async def async_instructions(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: self._instructions.append(func_) return func_ return decorator else: self._instructions.append(func) return func @overload def system_prompt( self, func: Callable[[RunContext[AgentDepsT]], str | None], / ) -> Callable[[RunContext[AgentDepsT]], str | None]: ... 
@overload def system_prompt( self, func: Callable[[RunContext[AgentDepsT]], Awaitable[str | None]], / ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str | None]]: ... @overload def system_prompt(self, func: Callable[[], str | None], /) -> Callable[[], str | None]: ... @overload def system_prompt(self, func: Callable[[], Awaitable[str | None]], /) -> Callable[[], Awaitable[str | None]]: ... @overload def system_prompt( self, /, *, dynamic: bool = False ) -> Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]]: ... def system_prompt( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, *, dynamic: bool = False, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register a system prompt function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`), see the examples below. Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Args: func: The function to decorate dynamic: If True, the system prompt will be reevaluated even when `messages_history` is provided, see [`SystemPromptPart.dynamic_ref`][pydantic_ai.messages.SystemPromptPart.dynamic_ref] Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.system_prompt def simple_system_prompt() -> str: return 'foobar' @agent.system_prompt(dynamic=True) async def async_system_prompt(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: runner = _system_prompt.SystemPromptRunner[AgentDepsT](func_, dynamic=dynamic) self._system_prompt_functions.append(runner) if dynamic: # pragma: lax no cover self._system_prompt_dynamic_functions[func_.__qualname__] = runner return func_ return decorator else: assert not dynamic, "dynamic can't be True in this case" self._system_prompt_functions.append(_system_prompt.SystemPromptRunner[AgentDepsT](func, dynamic=dynamic)) return func @overload def output_validator( self, func: Callable[[RunContext[AgentDepsT], OutputDataT], OutputDataT], / ) -> Callable[[RunContext[AgentDepsT], OutputDataT], OutputDataT]: ... @overload def output_validator( self, func: Callable[[RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT]], / ) -> Callable[[RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT]]: ... @overload def output_validator( self, func: Callable[[OutputDataT], OutputDataT], / ) -> Callable[[OutputDataT], OutputDataT]: ... @overload def output_validator( self, func: Callable[[OutputDataT], Awaitable[OutputDataT]], / ) -> Callable[[OutputDataT], Awaitable[OutputDataT]]: ... def output_validator( self, func: _output.OutputValidatorFunc[AgentDepsT, OutputDataT], / ) -> _output.OutputValidatorFunc[AgentDepsT, OutputDataT]: """Decorator to register an output validator function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. 
Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Example: ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('test', deps_type=str) @agent.output_validator def output_validator_simple(data: str) -> str: if 'wrong' in data: raise ModelRetry('wrong response') return data @agent.output_validator async def output_validator_deps(ctx: RunContext[str], data: str) -> str: if ctx.deps in data: raise ModelRetry('wrong response') return data result = agent.run_sync('foobar', deps='spam') print(result.output) #> success (no tool calls) ``` """ self._output_validators.append(_output.OutputValidator[AgentDepsT, Any](func)) return func @overload def tool(self, func: ToolFuncContext[AgentDepsT, ToolParams], /) -> ToolFuncContext[AgentDepsT, ToolParams]: ... @overload def tool( self, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Callable[[ToolFuncContext[AgentDepsT, ToolParams]], ToolFuncContext[AgentDepsT, ToolParams]]: ... def tool( self, func: ToolFuncContext[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=int) @agent.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @agent.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. 
docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. timeout: Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Overrides the agent-level `tool_timeout` if set. Defaults to None (no timeout). """ def tool_decorator( func_: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, takes_ctx=True, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, timeout=timeout, ) return func_ return tool_decorator if func is None else tool_decorator(func) @overload def tool_plain(self, func: ToolFuncPlain[ToolParams], /) -> ToolFuncPlain[ToolParams]: ... @overload def tool_plain( self, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Callable[[ToolFuncPlain[ToolParams]], ToolFuncPlain[ToolParams]]: ... def tool_plain( self, func: ToolFuncPlain[ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Any: """Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. 
Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test') @agent.tool def foobar(ctx: RunContext[int]) -> int: return 123 @agent.tool(retries=2) async def spam(ctx: RunContext[str]) -> float: return 3.14 result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":123,"spam":3.14} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. timeout: Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Overrides the agent-level `tool_timeout` if set. Defaults to None (no timeout). """ def tool_decorator(func_: ToolFuncPlain[ToolParams]) -> ToolFuncPlain[ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, takes_ctx=False, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, timeout=timeout, ) return func_ return tool_decorator if func is None else tool_decorator(func) @overload def toolset(self, func: ToolsetFunc[AgentDepsT], /) -> ToolsetFunc[AgentDepsT]: ... @overload def toolset( self, /, *, per_run_step: bool = True, id: str | None = None, ) -> Callable[[ToolsetFunc[AgentDepsT]], ToolsetFunc[AgentDepsT]]: ... def toolset( self, func: ToolsetFunc[AgentDepsT] | None = None, /, *, per_run_step: bool = True, id: str | None = None, ) -> Any: """Decorator to register a toolset function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.toolset`). Example: ```python from pydantic_ai import AbstractToolset, Agent, FunctionToolset, RunContext agent = Agent('test', deps_type=str) @agent.toolset async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]: return FunctionToolset() ``` Args: func: The toolset function to register. 
per_run_step: Whether to re-evaluate the toolset for each run step. Defaults to True. id: An optional unique ID for the dynamic toolset. Required for use with durable execution environments like Temporal, where the ID identifies the toolset's activities within the workflow. """ def toolset_decorator(func_: ToolsetFunc[AgentDepsT]) -> ToolsetFunc[AgentDepsT]: self._dynamic_toolsets.append(DynamicToolset(func_, per_run_step=per_run_step, id=id)) return func_ return toolset_decorator if func is None else toolset_decorator(func) def _get_model(self, model: models.Model | models.KnownModelName | str | None) -> models.Model: """Create a model configured for this agent. Args: model: model to use for this run, required if `model` was not set when creating the agent. Returns: The model used """ model_: models.Model if some_model := self._override_model.get(): # we don't want `override()` to cover up errors from the model not being defined, hence this check if model is None and self.model is None: raise exceptions.UserError( '`model` must either be set on the agent or included when calling it. ' '(Even when `override(model=...)` is customizing the model that will actually be called)' ) model_ = some_model.value elif model is not None: model_ = models.infer_model(model) elif self.model is not None: # noinspection PyTypeChecker model_ = self.model = models.infer_model(self.model) else: raise exceptions.UserError('`model` must either be set on the agent or included when calling it.') instrument = self.instrument if instrument is None: instrument = self._instrument_default return instrument_model(model_, instrument) def _get_deps(self: Agent[T, OutputDataT], deps: T) -> T: """Get deps for a run. If we've overridden deps via `_override_deps`, use that, otherwise use the deps passed to the call. We could do runtime type checking of deps against `self._deps_type`, but that's a slippery slope. """ if some_deps := self._override_deps.get(): return some_deps.value else: return deps def _normalize_instructions( self, instructions: Instructions[AgentDepsT], ) -> list[str | _system_prompt.SystemPromptFunc[AgentDepsT]]: if instructions is None: return [] if isinstance(instructions, str) or callable(instructions): return [instructions] return list(instructions) def _get_instructions( self, additional_instructions: Instructions[AgentDepsT] = None, ) -> tuple[str | None, list[_system_prompt.SystemPromptRunner[AgentDepsT]]]: override_instructions = self._override_instructions.get() if override_instructions: instructions = override_instructions.value else: instructions = self._instructions.copy() if additional_instructions is not None: instructions.extend(self._normalize_instructions(additional_instructions)) literal_parts: list[str] = [] functions: list[_system_prompt.SystemPromptRunner[AgentDepsT]] = [] for instruction in instructions: if isinstance(instruction, str): literal_parts.append(instruction) else: functions.append(_system_prompt.SystemPromptRunner[AgentDepsT](instruction)) literal = '\n'.join(literal_parts).strip() or None return literal, functions def _get_toolset( self, output_toolset: AbstractToolset[AgentDepsT] | None | _utils.Unset = _utils.UNSET, additional_toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AbstractToolset[AgentDepsT]: """Get the complete toolset. Args: output_toolset: The output toolset to use instead of the one built at agent construction time. additional_toolsets: Additional toolsets to add, unless toolsets have been overridden. 
""" toolsets = self.toolsets # Don't add additional toolsets if the toolsets have been overridden if additional_toolsets and self._override_toolsets.get() is None: toolsets = [*toolsets, *additional_toolsets] toolset = CombinedToolset(toolsets) def copy_dynamic_toolsets(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: if isinstance(toolset, DynamicToolset): return toolset.copy() else: return toolset toolset = toolset.visit_and_replace(copy_dynamic_toolsets) if self._prepare_tools: toolset = PreparedToolset(toolset, self._prepare_tools) output_toolset = output_toolset if _utils.is_set(output_toolset) else self._output_toolset if output_toolset is not None: if self._prepare_output_tools: output_toolset = PreparedToolset(output_toolset, self._prepare_output_tools) toolset = CombinedToolset([output_toolset, toolset]) return toolset @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: """All toolsets registered on the agent, including a function toolset holding tools that were registered on the agent directly. Output tools are not included. """ toolsets: list[AbstractToolset[AgentDepsT]] = [] if some_tools := self._override_tools.get(): function_toolset = _AgentFunctionToolset( some_tools.value, max_retries=self._max_tool_retries, timeout=self._tool_timeout, output_schema=self._output_schema, ) else: function_toolset = self._function_toolset toolsets.append(function_toolset) if some_user_toolsets := self._override_toolsets.get(): user_toolsets = some_user_toolsets.value else: user_toolsets = [*self._user_toolsets, *self._dynamic_toolsets] toolsets.extend(user_toolsets) return toolsets @overload def _prepare_output_schema(self, output_type: None) -> _output.OutputSchema[OutputDataT]: ... @overload def _prepare_output_schema( self, output_type: OutputSpec[RunOutputDataT] ) -> _output.OutputSchema[RunOutputDataT]: ... def _prepare_output_schema(self, output_type: OutputSpec[Any] | None) -> _output.OutputSchema[Any]: if output_type is not None: if self._output_validators: raise exceptions.UserError('Cannot set a custom run `output_type` when the agent has output validators') schema = _output.OutputSchema.build(output_type) else: schema = self._output_schema return schema async def __aenter__(self) -> Self: """Enter the agent context. This will start all [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] registered as `toolsets` so they are ready to be used. This is a no-op if the agent has already been entered. """ async with self._enter_lock: if self._entered_count == 0: async with AsyncExitStack() as exit_stack: toolset = self._get_toolset() await exit_stack.enter_async_context(toolset) self._exit_stack = exit_stack.pop_all() self._entered_count += 1 return self async def __aexit__(self, *args: Any) -> bool | None: async with self._enter_lock: self._entered_count -= 1 if self._entered_count == 0 and self._exit_stack is not None: await self._exit_stack.aclose() self._exit_stack = None def set_mcp_sampling_model(self, model: models.Model | models.KnownModelName | str | None = None) -> None: """Set the sampling model on all MCP servers registered with the agent. If no sampling model is provided, the agent's model will be used. 
""" try: sampling_model = models.infer_model(model) if model else self._get_model(None) except exceptions.UserError as e: raise exceptions.UserError('No sampling model provided and no model set on the agent.') from e from ..mcp import MCPServer def _set_sampling_model(toolset: AbstractToolset[AgentDepsT]) -> None: if isinstance(toolset, MCPServer): toolset.sampling_model = sampling_model self._get_toolset().apply(_set_sampling_model) def to_web( self, *, models: ModelsParam = None, builtin_tools: list[AbstractBuiltinTool] | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, instructions: str | None = None, html_source: str | Path | None = None, ) -> Starlette: """Create a Starlette app that serves a web chat UI for this agent. This method returns a pre-configured Starlette application that provides a web-based chat interface for interacting with the agent. By default, the UI is fetched from a CDN and cached on first use. The returned Starlette application can be mounted into a FastAPI app or run directly with any ASGI server (uvicorn, hypercorn, etc.). Note that the `deps` and `model_settings` will be the same for each request. To provide different `deps` for each request use the lower-level adapters directly. Args: models: Additional models to make available in the UI. Can be: - A sequence of model names/instances (e.g., `['openai:gpt-5', 'anthropic:claude-sonnet-4-5']`) - A dict mapping display labels to model names/instances (e.g., `{'GPT 5': 'openai:gpt-5', 'Claude': 'anthropic:claude-sonnet-4-5'}`) The agent's model is always included. Builtin tool support is automatically determined from each model's profile. builtin_tools: Additional builtin tools to make available in the UI. The agent's configured builtin tools are always included. Tool labels in the UI are derived from the tool's `label` property. deps: Optional dependencies to use for all requests. model_settings: Optional settings to use for all model requests. instructions: Optional extra instructions to pass to each agent run. html_source: Path or URL for the chat UI HTML. Can be: - None (default): Fetches from CDN and caches locally - A Path instance: Reads from the local file - A URL string (http:// or https://): Fetches from the URL - A file path string: Reads from the local file Returns: A configured Starlette application ready to be served (e.g., with uvicorn) Example: ```python from pydantic_ai import Agent from pydantic_ai.builtin_tools import WebSearchTool agent = Agent('openai:gpt-5', builtin_tools=[WebSearchTool()]) # Simple usage - uses agent's model and builtin tools app = agent.to_web() # Or provide additional models for UI selection app = agent.to_web(models=['openai:gpt-5', 'anthropic:claude-sonnet-4-5']) # Then run with: uvicorn app:app --reload ``` """ from ..ui._web import create_web_app return create_web_app( self, models=models, builtin_tools=builtin_tools, deps=deps, model_settings=model_settings, instructions=instructions, html_source=html_source, ) @asynccontextmanager @deprecated( '`run_mcp_servers` is deprecated, use `async with agent:` instead. If you need to set a sampling model on all MCP servers, use `agent.set_mcp_sampling_model()`.' ) async def run_mcp_servers( self, model: models.Model | models.KnownModelName | str | None = None ) -> AsyncIterator[None]: """Run [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] so they can be used by the agent. Deprecated: use [`async with agent`][pydantic_ai.agent.Agent.__aenter__] instead. 
If you need to set a sampling model on all MCP servers, use [`agent.set_mcp_sampling_model()`][pydantic_ai.agent.Agent.set_mcp_sampling_model]. Returns: a context manager to start and shutdown the servers. """ try: self.set_mcp_sampling_model(model) except exceptions.UserError: if model is not None: raise async with self: yield ```` #### __init__ ```python __init__( model: Model | KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, validation_context: ( Any | Callable[[RunContext[AgentDepsT]], Any] ) = None, output_retries: int | None = None, tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] = (), builtin_tools: Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] = (), prepare_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, prepare_output_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, toolsets: ( Sequence[ AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT] ] | None ) = None, defer_model_check: bool = False, end_strategy: EndStrategy = "early", instrument: ( InstrumentationSettings | bool | None ) = None, metadata: AgentMetadata[AgentDepsT] | None = None, history_processors: ( Sequence[HistoryProcessor[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, tool_timeout: float | None = None, max_concurrency: AnyConcurrencyLimit = None ) -> None ``` ```python __init__( model: Model | KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, validation_context: ( Any | Callable[[RunContext[AgentDepsT]], Any] ) = None, output_retries: int | None = None, tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] = (), builtin_tools: Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] = (), prepare_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, prepare_output_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, mcp_servers: Sequence[MCPServer] = (), defer_model_check: bool = False, end_strategy: EndStrategy = "early", instrument: ( InstrumentationSettings | bool | None ) = None, metadata: AgentMetadata[AgentDepsT] | None = None, history_processors: ( Sequence[HistoryProcessor[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, tool_timeout: float | None = None, max_concurrency: AnyConcurrencyLimit = None ) -> None ``` ```python __init__( model: Model | KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, validation_context: ( Any | Callable[[RunContext[AgentDepsT]], Any] ) = None, output_retries: int | None = None, tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] 
] = (), builtin_tools: Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] = (), prepare_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, prepare_output_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, toolsets: ( Sequence[ AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT] ] | None ) = None, defer_model_check: bool = False, end_strategy: EndStrategy = "early", instrument: ( InstrumentationSettings | bool | None ) = None, metadata: AgentMetadata[AgentDepsT] | None = None, history_processors: ( Sequence[HistoryProcessor[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, tool_timeout: float | None = None, max_concurrency: AnyConcurrencyLimit = None, **_deprecated_kwargs: Any ) ``` Create an agent. Parameters: | Name | Type | Description | Default | | ---------------------- | ------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model` | \`Model | KnownModelName | str | | `output_type` | `OutputSpec[OutputDataT]` | The type of the output data, used to validate the data returned by the model, defaults to str. | `str` | | `instructions` | `Instructions[AgentDepsT]` | Instructions to use for this agent, you can also register instructions via a function with instructions or pass additional, temporary, instructions when executing a run. | `None` | | `system_prompt` | \`str | Sequence[str]\` | Static system prompts to use for this agent, you can also register system prompts via a function with system_prompt. | | `deps_type` | `type[AgentDepsT]` | The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set deps=None to satisfy Pyright or add a type hint : Agent\[None, \]. | `NoneType` | | `name` | \`str | None\` | The name of the agent, used for logging. If None, we try to infer the agent name from the call frame when the agent is first run. | | `model_settings` | \`ModelSettings | None\` | Optional model request settings to use for this agent's runs, by default. | | `retries` | `int` | The default number of retries to allow for tool calls and output validation, before raising an error. For model request retries, see the HTTP Request Retries documentation. | `1` | | `validation_context` | \`Any | Callable\[\[RunContext[AgentDepsT]\], Any\]\` | Pydantic validation context used to validate tool arguments and outputs. | | `output_retries` | \`int | None\` | The maximum number of retries to allow for output validation, defaults to retries. 
| | `tools` | \`Sequence\[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]\]\` | Tools to register with the agent, you can also register tools via the decorators @agent.tool and @agent.tool_plain. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\]\` | The builtin tools that the agent will use. This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised. | | `prepare_tools` | \`ToolsPrepareFunc[AgentDepsT] | None\` | Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See ToolsPrepareFunc | | `prepare_output_tools` | \`ToolsPrepareFunc[AgentDepsT] | None\` | Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. See ToolsPrepareFunc | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]\] | None\` | | `defer_model_check` | `bool` | by default, if you provide a named model, it's evaluated to create a Model instance immediately, which checks for the necessary environment variables. Set this to false to defer the evaluation until the first run. Useful if you want to override the model for testing. | `False` | | `end_strategy` | `EndStrategy` | Strategy for handling tool calls that are requested alongside a final result. See EndStrategy for more information. | `'early'` | | `instrument` | \`InstrumentationSettings | bool | None\` | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to store with each run. Provide a dictionary of primitives, or a callable returning one computed from the RunContext on each run. Metadata is resolved when a run starts and recomputed after a successful run finishes so it can reflect the final state. Resolved metadata can be read after the run completes via AgentRun.metadata, AgentRunResult.metadata, and StreamedRunResult.metadata, and is attached to the agent run span when instrumentation is enabled. | | `history_processors` | \`Sequence\[HistoryProcessor[AgentDepsT]\] | None\` | Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence. | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional handler for events from the model's streaming response and the agent's execution of tools. | | `tool_timeout` | \`float | None\` | Default timeout in seconds for tool execution. If a tool takes longer than this, the tool is considered to have failed and a retry prompt is returned to the model (counting towards the retry limit). Individual tools can override this with their own timeout. Defaults to None (no timeout). | | `max_concurrency` | `AnyConcurrencyLimit` | Optional limit on concurrent agent runs. Can be an integer for simple limiting, a ConcurrencyLimit for advanced configuration with backpressure, a ConcurrencyLimiter for sharing limits across multiple agents, or None (default) for no limiting. When the limit is reached, additional calls to run() or iter() will wait until a slot becomes available. 
| `None` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, validation_context: Any | Callable[[RunContext[AgentDepsT]], Any] = None, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, tool_timeout: float | None = None, max_concurrency: _concurrency.AnyConcurrencyLimit = None, **_deprecated_kwargs: Any, ): """Create an agent. Args: model: The default model to use for this agent, if not provided, you must provide the model when calling it. We allow `str` here since the actual list of allowed models changes frequently. output_type: The type of the output data, used to validate the data returned by the model, defaults to `str`. instructions: Instructions to use for this agent, you can also register instructions via a function with [`instructions`][pydantic_ai.agent.Agent.instructions] or pass additional, temporary, instructions when executing a run. system_prompt: Static system prompts to use for this agent, you can also register system prompts via a function with [`system_prompt`][pydantic_ai.agent.Agent.system_prompt]. deps_type: The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright or add a type hint `: Agent[None, ]`. name: The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. model_settings: Optional model request settings to use for this agent's runs, by default. retries: The default number of retries to allow for tool calls and output validation, before raising an error. For model request retries, see the [HTTP Request Retries](../retries.md) documentation. validation_context: Pydantic [validation context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) used to validate tool arguments and outputs. output_retries: The maximum number of retries to allow for output validation, defaults to `retries`. tools: Tools to register with the agent, you can also register tools via the decorators [`@agent.tool`][pydantic_ai.agent.Agent.tool] and [`@agent.tool_plain`][pydantic_ai.agent.Agent.tool_plain]. builtin_tools: The builtin tools that the agent will use. This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised. 
prepare_tools: Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] prepare_output_tools: Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] toolsets: Toolsets to register with the agent, including MCP servers and functions which take a run context and return a toolset. See [`ToolsetFunc`][pydantic_ai.toolsets.ToolsetFunc] for more information. defer_model_check: by default, if you provide a [named][pydantic_ai.models.KnownModelName] model, it's evaluated to create a [`Model`][pydantic_ai.models.Model] instance immediately, which checks for the necessary environment variables. Set this to `false` to defer the evaluation until the first run. Useful if you want to [override the model][pydantic_ai.agent.Agent.override] for testing. end_strategy: Strategy for handling tool calls that are requested alongside a final result. See [`EndStrategy`][pydantic_ai.agent.EndStrategy] for more information. instrument: Set to True to automatically instrument with OpenTelemetry, which will use Logfire if it's configured. Set to an instance of [`InstrumentationSettings`][pydantic_ai.agent.InstrumentationSettings] to customize. If this isn't set, then the last value set by [`Agent.instrument_all()`][pydantic_ai.agent.Agent.instrument_all] will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. metadata: Optional metadata to store with each run. Provide a dictionary of primitives, or a callable returning one computed from the [`RunContext`][pydantic_ai.tools.RunContext] on each run. Metadata is resolved when a run starts and recomputed after a successful run finishes so it can reflect the final state. Resolved metadata can be read after the run completes via [`AgentRun.metadata`][pydantic_ai.agent.AgentRun], [`AgentRunResult.metadata`][pydantic_ai.agent.AgentRunResult], and [`StreamedRunResult.metadata`][pydantic_ai.result.StreamedRunResult], and is attached to the agent run span when instrumentation is enabled. history_processors: Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools. tool_timeout: Default timeout in seconds for tool execution. If a tool takes longer than this, the tool is considered to have failed and a retry prompt is returned to the model (counting towards the retry limit). Individual tools can override this with their own timeout. Defaults to None (no timeout). max_concurrency: Optional limit on concurrent agent runs. Can be an integer for simple limiting, a [`ConcurrencyLimit`][pydantic_ai.ConcurrencyLimit] for advanced configuration with backpressure, a [`ConcurrencyLimiter`][pydantic_ai.ConcurrencyLimiter] for sharing limits across multiple agents, or None (default) for no limiting. 
When the limit is reached, additional calls to `run()` or `iter()` will wait until a slot becomes available. """ if model is None or defer_model_check: self._model = model else: self._model = models.infer_model(model) self._name = name self.end_strategy = end_strategy self.model_settings = model_settings self._output_type = output_type self.instrument = instrument self._metadata = metadata self._deps_type = deps_type if mcp_servers := _deprecated_kwargs.pop('mcp_servers', None): if toolsets is not None: # pragma: no cover raise TypeError('`mcp_servers` and `toolsets` cannot be set at the same time.') warnings.warn('`mcp_servers` is deprecated, use `toolsets` instead', DeprecationWarning) toolsets = mcp_servers _utils.validate_empty_kwargs(_deprecated_kwargs) self._output_schema = _output.OutputSchema[OutputDataT].build(output_type) self._output_validators = [] self._instructions = self._normalize_instructions(instructions) self._system_prompts = (system_prompt,) if isinstance(system_prompt, str) else tuple(system_prompt) self._system_prompt_functions = [] self._system_prompt_dynamic_functions = {} self._max_result_retries = output_retries if output_retries is not None else retries self._max_tool_retries = retries self._tool_timeout = tool_timeout self._validation_context = validation_context self._builtin_tools = builtin_tools self._prepare_tools = prepare_tools self._prepare_output_tools = prepare_output_tools self._output_toolset = self._output_schema.toolset if self._output_toolset: self._output_toolset.max_retries = self._max_result_retries self._function_toolset = _AgentFunctionToolset( tools, max_retries=self._max_tool_retries, timeout=self._tool_timeout, output_schema=self._output_schema, ) self._dynamic_toolsets = [ DynamicToolset[AgentDepsT](toolset_func=toolset) for toolset in toolsets or [] if not isinstance(toolset, AbstractToolset) ] self._user_toolsets = [toolset for toolset in toolsets or [] if isinstance(toolset, AbstractToolset)] self.history_processors = history_processors or [] self._event_stream_handler = event_stream_handler self._concurrency_limiter = _concurrency.normalize_to_limiter(max_concurrency) self._override_name: ContextVar[_utils.Option[str]] = ContextVar('_override_name', default=None) self._override_deps: ContextVar[_utils.Option[AgentDepsT]] = ContextVar('_override_deps', default=None) self._override_model: ContextVar[_utils.Option[models.Model]] = ContextVar('_override_model', default=None) self._override_toolsets: ContextVar[_utils.Option[Sequence[AbstractToolset[AgentDepsT]]]] = ContextVar( '_override_toolsets', default=None ) self._override_tools: ContextVar[ _utils.Option[Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]]] ] = ContextVar('_override_tools', default=None) self._override_instructions: ContextVar[ _utils.Option[list[str | _system_prompt.SystemPromptFunc[AgentDepsT]]] ] = ContextVar('_override_instructions', default=None) self._override_metadata: ContextVar[_utils.Option[AgentMetadata[AgentDepsT]]] = ContextVar( '_override_metadata', default=None ) self._enter_lock = Lock() self._entered_count = 0 self._exit_stack = None ``` #### end_strategy ```python end_strategy: EndStrategy = end_strategy ``` The strategy for handling multiple tool calls when a final result is found. - `'early'` (default): Output tools are executed first. Once a valid final result is found, remaining function and output tool calls are skipped - `'exhaustive'`: Output tools are executed first, then all function tools are executed. 
The first valid output tool result becomes the final output #### model_settings ```python model_settings: ModelSettings | None = model_settings ``` Optional model request settings to use for this agents's runs, by default. Note, if `model_settings` is provided by `run`, `run_sync`, or `run_stream`, those settings will be merged with this value, with the runtime argument taking priority. #### instrument ```python instrument: InstrumentationSettings | bool | None = ( instrument ) ``` Options to automatically instrument with OpenTelemetry. #### instrument_all ```python instrument_all( instrument: InstrumentationSettings | bool = True, ) -> None ``` Set the instrumentation options for all agents where `instrument` is not set. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python @staticmethod def instrument_all(instrument: InstrumentationSettings | bool = True) -> None: """Set the instrumentation options for all agents where `instrument` is not set.""" Agent._instrument_default = instrument ``` #### model ```python model: Model | KnownModelName | str | None ``` The default model configured for this agent. #### name ```python name: str | None ``` The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. #### deps_type ```python deps_type: type ``` The type of dependencies used by the agent. #### output_type ```python output_type: OutputSpec[OutputDataT] ``` The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`. #### event_stream_handler ```python event_stream_handler: EventStreamHandler[AgentDepsT] | None ``` Optional handler for events from the model's streaming response and the agent's execution of tools. #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( 
DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[Any] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. 
| | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | ------------------------------------------ | ---------------------- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. 
``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) model_used = self._get_model(model) del model deps = self._get_deps(deps) output_schema = self._prepare_output_schema(output_type) output_type_ = output_type or self.output_type # We consider it a user error if a user tries to restrict the result type while having an output validator that # may change the result type from the restricted type to something else. Therefore, we consider the following # typecast reasonable, even though it is possible to violate it with otherwise-type-checked code. 
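# A run-specific `output_type` or any output validators mean the agent's default output toolset
# can't be reused as-is, so a fresh one is derived from the run's output schema below.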
output_validators = self._output_validators output_toolset = self._output_toolset if output_schema != self._output_schema or output_validators: output_toolset = output_schema.toolset if output_toolset: output_toolset.max_retries = self._max_result_retries output_toolset.output_validators = output_validators toolset = self._get_toolset(output_toolset=output_toolset, additional_toolsets=toolsets) tool_manager = ToolManager[AgentDepsT](toolset, default_max_retries=self._max_tool_retries) # Build the graph graph = _agent_graph.build_agent_graph(self.name, self._deps_type, output_type_) # Build the initial state usage = usage or _usage.RunUsage() state = _agent_graph.GraphAgentState( message_history=list(message_history) if message_history else [], usage=usage, retries=0, run_step=0, ) # Merge model settings in order of precedence: run > agent > model merged_settings = merge_model_settings(model_used.settings, self.model_settings) model_settings = merge_model_settings(merged_settings, model_settings) usage_limits = usage_limits or _usage.UsageLimits() instructions_literal, instructions_functions = self._get_instructions(additional_instructions=instructions) async def get_instructions(run_context: RunContext[AgentDepsT]) -> str | None: parts = [ instructions_literal, *[await func.run(run_context) for func in instructions_functions], ] parts = [p for p in parts if p] if not parts: return None return '\n\n'.join(parts).strip() if isinstance(model_used, InstrumentedModel): instrumentation_settings = model_used.instrumentation_settings tracer = model_used.instrumentation_settings.tracer else: instrumentation_settings = None tracer = NoOpTracer() graph_deps = _agent_graph.GraphAgentDeps[AgentDepsT, OutputDataT]( user_deps=deps, prompt=user_prompt, new_message_index=len(message_history) if message_history else 0, model=model_used, model_settings=model_settings, usage_limits=usage_limits, max_result_retries=self._max_result_retries, end_strategy=self.end_strategy, output_schema=output_schema, output_validators=output_validators, validation_context=self._validation_context, history_processors=self.history_processors, builtin_tools=[*self._builtin_tools, *(builtin_tools or [])], tool_manager=tool_manager, tracer=tracer, get_instructions=get_instructions, instrumentation_settings=instrumentation_settings, ) user_prompt_node = _agent_graph.UserPromptNode[AgentDepsT]( user_prompt=user_prompt, deferred_tool_results=deferred_tool_results, instructions=instructions_literal, instructions_functions=instructions_functions, system_prompts=self._system_prompts, system_prompt_functions=self._system_prompt_functions, system_prompt_dynamic_functions=self._system_prompt_dynamic_functions, ) agent_name = self.name or 'agent' instrumentation_names = InstrumentationNames.for_version( instrumentation_settings.version if instrumentation_settings else DEFAULT_INSTRUMENTATION_VERSION ) run_span = tracer.start_span( instrumentation_names.get_agent_run_span_name(agent_name), attributes={ 'model_name': model_used.model_name if model_used else 'no-model', 'agent_name': agent_name, 'gen_ai.agent.name': agent_name, 'logfire.msg': f'{agent_name} run', }, ) run_metadata: dict[str, Any] | None = None try: async with ( _concurrency.get_concurrency_context(self._concurrency_limiter, f'agent:{agent_name}'), graph.iter( inputs=user_prompt_node, state=state, deps=graph_deps, span=use_span(run_span) if run_span.is_recording() else None, infer_name=False, ) as graph_run, ): async with toolset: agent_run = AgentRun(graph_run) run_metadata = 
self._resolve_and_store_metadata(agent_run.ctx, metadata) try: yield agent_run finally: if agent_run.result is not None: run_metadata = self._resolve_and_store_metadata(agent_run.ctx, metadata) else: run_metadata = graph_run.state.metadata final_result = agent_run.result if ( instrumentation_settings and instrumentation_settings.include_content and run_span.is_recording() and final_result is not None ): run_span.set_attribute( 'final_result', ( final_result.output if isinstance(final_result.output, str) else json.dumps(InstrumentedModel.serialize_any(final_result.output)) ), ) finally: try: if instrumentation_settings and run_span.is_recording(): run_span.set_attributes( self._run_span_end_attributes( instrumentation_settings, usage, state.message_history, graph_deps.new_message_index, run_metadata, ) ) finally: run_span.end() ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET, metadata: AgentMetadata[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](https://ai.pydantic.dev/testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | -------------- | ------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------- | ------- | | `name` | \`str \| Unset\` | The name to use instead of the name passed to the agent constructor and agent run. | `UNSET` | | `deps` | \`AgentDepsT \| Unset\` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | \`Model \| KnownModelName \| str \| Unset\` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] \| Unset\` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | \`Sequence\[Tool[AgentDepsT] \| ToolFuncEither[AgentDepsT, ...]\] \| Unset\` | The tools to use instead of the tools registered with the agent. | `UNSET` | | `instructions` | \`Instructions[AgentDepsT] \| Unset\` | The instructions to use instead of the instructions registered with the agent. | `UNSET` | | `metadata` | \`AgentMetadata[AgentDepsT] \| Unset\` | The metadata to use instead of the metadata passed to the agent constructor. When set, any per-run metadata argument is ignored. | `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, metadata: AgentMetadata[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures).
Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. metadata: The metadata to use instead of the metadata passed to the agent constructor. When set, any per-run `metadata` argument is ignored. """ if _utils.is_set(name): name_token = self._override_name.set(_utils.Some(name)) else: name_token = None if _utils.is_set(deps): deps_token = self._override_deps.set(_utils.Some(deps)) else: deps_token = None if _utils.is_set(model): model_token = self._override_model.set(_utils.Some(models.infer_model(model))) else: model_token = None if _utils.is_set(toolsets): toolsets_token = self._override_toolsets.set(_utils.Some(toolsets)) else: toolsets_token = None if _utils.is_set(tools): tools_token = self._override_tools.set(_utils.Some(tools)) else: tools_token = None if _utils.is_set(instructions): normalized_instructions = self._normalize_instructions(instructions) instructions_token = self._override_instructions.set(_utils.Some(normalized_instructions)) else: instructions_token = None if _utils.is_set(metadata): metadata_token = self._override_metadata.set(_utils.Some(metadata)) else: metadata_token = None try: yield finally: if name_token is not None: self._override_name.reset(name_token) if deps_token is not None: self._override_deps.reset(deps_token) if model_token is not None: self._override_model.reset(model_token) if toolsets_token is not None: self._override_toolsets.reset(toolsets_token) if tools_token is not None: self._override_tools.reset(tools_token) if instructions_token is not None: self._override_instructions.reset(instructions_token) if metadata_token is not None: self._override_metadata.reset(metadata_token) ``` #### instructions ```python instructions( func: Callable[[RunContext[AgentDepsT]], str | None], ) -> Callable[[RunContext[AgentDepsT]], str | None] ``` ```python instructions( func: Callable[ [RunContext[AgentDepsT]], Awaitable[str | None] ], ) -> Callable[ [RunContext[AgentDepsT]], Awaitable[str | None] ] ``` ```python instructions( func: Callable[[], str | None], ) -> Callable[[], str | None] ``` ```python instructions( func: Callable[[], Awaitable[str | None]], ) -> Callable[[], Awaitable[str | None]] ``` ```python instructions() -> Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] ``` ```python instructions( func: SystemPromptFunc[AgentDepsT] | None = None, ) -> ( Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] | SystemPromptFunc[AgentDepsT] ) ``` Decorator to register an instructions function. Optionally takes RunContext as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.instructions`). Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function. 
Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.instructions def simple_instructions() -> str: return 'foobar' @agent.instructions async def async_instructions(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def instructions( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register an instructions function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.instructions`). Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.instructions def simple_instructions() -> str: return 'foobar' @agent.instructions async def async_instructions(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: self._instructions.append(func_) return func_ return decorator else: self._instructions.append(func) return func ```` #### system_prompt ```python system_prompt( func: Callable[[RunContext[AgentDepsT]], str | None], ) -> Callable[[RunContext[AgentDepsT]], str | None] ``` ```python system_prompt( func: Callable[ [RunContext[AgentDepsT]], Awaitable[str | None] ], ) -> Callable[ [RunContext[AgentDepsT]], Awaitable[str | None] ] ``` ```python system_prompt( func: Callable[[], str | None], ) -> Callable[[], str | None] ``` ```python system_prompt( func: Callable[[], Awaitable[str | None]], ) -> Callable[[], Awaitable[str | None]] ``` ```python system_prompt(*, dynamic: bool = False) -> Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] ``` ```python system_prompt( func: SystemPromptFunc[AgentDepsT] | None = None, /, *, dynamic: bool = False, ) -> ( Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] | SystemPromptFunc[AgentDepsT] ) ``` Decorator to register a system prompt function. Optionally takes RunContext as its only argument. Can decorate a sync or async functions. The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`), see the examples below. Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. 
Parameters: | Name | Type | Description | Default | | --------- | ------------------------------ | ----------------------------------------------------------------------------------------------------------------------- | ------------------------ | | `func` | \`SystemPromptFunc[AgentDepsT] | None\` | The function to decorate | | `dynamic` | `bool` | If True, the system prompt will be reevaluated even when messages_history is provided, see SystemPromptPart.dynamic_ref | `False` | Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.system_prompt def simple_system_prompt() -> str: return 'foobar' @agent.system_prompt(dynamic=True) async def async_system_prompt(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def system_prompt( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, *, dynamic: bool = False, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register a system prompt function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`), see the examples below. Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Args: func: The function to decorate dynamic: If True, the system prompt will be reevaluated even when `messages_history` is provided, see [`SystemPromptPart.dynamic_ref`][pydantic_ai.messages.SystemPromptPart.dynamic_ref] Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.system_prompt def simple_system_prompt() -> str: return 'foobar' @agent.system_prompt(dynamic=True) async def async_system_prompt(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: runner = _system_prompt.SystemPromptRunner[AgentDepsT](func_, dynamic=dynamic) self._system_prompt_functions.append(runner) if dynamic: # pragma: lax no cover self._system_prompt_dynamic_functions[func_.__qualname__] = runner return func_ return decorator else: assert not dynamic, "dynamic can't be True in this case" self._system_prompt_functions.append(_system_prompt.SystemPromptRunner[AgentDepsT](func, dynamic=dynamic)) return func ```` #### output_validator ```python output_validator( func: Callable[ [RunContext[AgentDepsT], OutputDataT], OutputDataT ], ) -> Callable[ [RunContext[AgentDepsT], OutputDataT], OutputDataT ] ``` ```python output_validator( func: Callable[ [RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT], ], ) -> Callable[ [RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT], ] ``` ```python output_validator( func: Callable[[OutputDataT], OutputDataT], ) -> Callable[[OutputDataT], OutputDataT] ``` ```python output_validator( func: Callable[[OutputDataT], Awaitable[OutputDataT]], ) -> Callable[[OutputDataT], Awaitable[OutputDataT]] ``` ```python output_validator( func: OutputValidatorFunc[AgentDepsT, OutputDataT], ) -> OutputValidatorFunc[AgentDepsT, OutputDataT] ``` Decorator to register an output validator function. 
Optionally takes RunContext as its first argument. Can decorate a sync or async functions. Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Example: ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('test', deps_type=str) @agent.output_validator def output_validator_simple(data: str) -> str: if 'wrong' in data: raise ModelRetry('wrong response') return data @agent.output_validator async def output_validator_deps(ctx: RunContext[str], data: str) -> str: if ctx.deps in data: raise ModelRetry('wrong response') return data result = agent.run_sync('foobar', deps='spam') print(result.output) #> success (no tool calls) ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def output_validator( self, func: _output.OutputValidatorFunc[AgentDepsT, OutputDataT], / ) -> _output.OutputValidatorFunc[AgentDepsT, OutputDataT]: """Decorator to register an output validator function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Example: ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('test', deps_type=str) @agent.output_validator def output_validator_simple(data: str) -> str: if 'wrong' in data: raise ModelRetry('wrong response') return data @agent.output_validator async def output_validator_deps(ctx: RunContext[str], data: str) -> str: if ctx.deps in data: raise ModelRetry('wrong response') return data result = agent.run_sync('foobar', deps='spam') print(result.output) #> success (no tool calls) ``` """ self._output_validators.append(_output.OutputValidator[AgentDepsT, Any](func)) return func ```` #### tool ```python tool( func: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams] ``` ```python tool( *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None ) -> Callable[ [ToolFuncContext[AgentDepsT, ToolParams]], ToolFuncContext[AgentDepsT, ToolParams], ] ``` ```python tool( func: ( ToolFuncContext[AgentDepsT, ToolParams] | None ) = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Any ``` Decorator to register a tool function which takes RunContext as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](https://ai.pydantic.dev/tools/#function-tools-and-schema). 
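As a rough illustration of that docstring extraction (the `get_weather` tool, its parameter, and its docstring below are made up for this sketch; only `Agent`, `RunContext`, and the `@agent.tool` keyword arguments come from the API documented here):

```python
from pydantic_ai import Agent, RunContext

agent = Agent('test', deps_type=str)


@agent.tool(require_parameter_descriptions=True)
def get_weather(ctx: RunContext[str], city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    # Hypothetical tool for illustration: the first docstring line becomes the tool
    # description, and the `Args:` entry for `city` becomes that parameter's
    # description in the JSON schema sent to the model.
    return f'The weather in {city} is sunny ({ctx.deps})'
```

Because `require_parameter_descriptions=True` is set here, removing the `Args:` entry for `city` would raise an error at registration rather than silently sending an undocumented parameter to the model.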
We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=int) @agent.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @agent.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Parameters: | Name | Type | Description | Default | | -------------------------------- | ----------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `func` | \`ToolFuncContext[AgentDepsT, ToolParams] | None\` | The tool function to register. | | `name` | \`str | None\` | The name of the tool, defaults to the function name. | | `description` | \`str | None\` | The description of the tool, defaults to the function docstring. | | `retries` | \`int | None\` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. | | `prepare` | \`ToolPrepareFunc[AgentDepsT] | None\` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | | `docstring_format` | `DocstringFormat` | The format of the docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use for this tool. Defaults to GenerateToolJsonSchema. | `GenerateToolJsonSchema` | | `strict` | \`bool | None\` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. | `False` | | `metadata` | \`dict[str, Any] | None\` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. | | `timeout` | \`float | None\` | Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Overrides the agent-level tool_timeout if set. Defaults to None (no timeout). 
| Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def tool( self, func: ToolFuncContext[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=int) @agent.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @agent.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. timeout: Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Overrides the agent-level `tool_timeout` if set. Defaults to None (no timeout). 
""" def tool_decorator( func_: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, takes_ctx=True, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, timeout=timeout, ) return func_ return tool_decorator if func is None else tool_decorator(func) ```` #### tool_plain ```python tool_plain( func: ToolFuncPlain[ToolParams], ) -> ToolFuncPlain[ToolParams] ``` ```python tool_plain( *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None ) -> Callable[ [ToolFuncPlain[ToolParams]], ToolFuncPlain[ToolParams] ] ``` ```python tool_plain( func: ToolFuncPlain[ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Any ``` Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](https://ai.pydantic.dev/tools/#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test') @agent.tool def foobar(ctx: RunContext[int]) -> int: return 123 @agent.tool(retries=2) async def spam(ctx: RunContext[str]) -> float: return 3.14 result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":123,"spam":3.14} ``` Parameters: | Name | Type | Description | Default | | -------------------------------- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `func` | \`ToolFuncPlain[ToolParams] | None\` | The tool function to register. | | `name` | \`str | None\` | The name of the tool, defaults to the function name. | | `description` | \`str | None\` | The description of the tool, defaults to the function docstring. | | `retries` | \`int | None\` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. 
| | `prepare` | \`ToolPrepareFunc[AgentDepsT] | None\` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | | `docstring_format` | `DocstringFormat` | The format of the docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use for this tool. Defaults to GenerateToolJsonSchema. | `GenerateToolJsonSchema` | | `strict` | \`bool | None\` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. | `False` | | `metadata` | \`dict[str, Any] | None\` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. | | `timeout` | \`float | None\` | Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Overrides the agent-level tool_timeout if set. Defaults to None (no timeout). | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def tool_plain( self, func: ToolFuncPlain[ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Any: """Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test') @agent.tool def foobar(ctx: RunContext[int]) -> int: return 123 @agent.tool(retries=2) async def spam(ctx: RunContext[str]) -> float: return 3.14 result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":123,"spam":3.14} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. 
See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. timeout: Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Overrides the agent-level `tool_timeout` if set. Defaults to None (no timeout). """ def tool_decorator(func_: ToolFuncPlain[ToolParams]) -> ToolFuncPlain[ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, takes_ctx=False, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, timeout=timeout, ) return func_ return tool_decorator if func is None else tool_decorator(func) ```` #### toolset ```python toolset( func: ToolsetFunc[AgentDepsT], ) -> ToolsetFunc[AgentDepsT] ``` ```python toolset( *, per_run_step: bool = True, id: str | None = None ) -> Callable[ [ToolsetFunc[AgentDepsT]], ToolsetFunc[AgentDepsT] ] ``` ```python toolset( func: ToolsetFunc[AgentDepsT] | None = None, /, *, per_run_step: bool = True, id: str | None = None, ) -> Any ``` Decorator to register a toolset function which takes RunContext as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.toolset`). Example: ```python from pydantic_ai import AbstractToolset, Agent, FunctionToolset, RunContext agent = Agent('test', deps_type=str) @agent.toolset async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]: return FunctionToolset() ``` Parameters: | Name | Type | Description | Default | | -------------- | ------------------------- | ----------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `func` | \`ToolsetFunc[AgentDepsT] | None\` | The toolset function to register. | | `per_run_step` | `bool` | Whether to re-evaluate the toolset for each run step. Defaults to True. | `True` | | `id` | \`str | None\` | An optional unique ID for the dynamic toolset. Required for use with durable execution environments like Temporal, where the ID identifies the toolset's activities within the workflow. 
| Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def toolset( self, func: ToolsetFunc[AgentDepsT] | None = None, /, *, per_run_step: bool = True, id: str | None = None, ) -> Any: """Decorator to register a toolset function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.toolset`). Example: ```python from pydantic_ai import AbstractToolset, Agent, FunctionToolset, RunContext agent = Agent('test', deps_type=str) @agent.toolset async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]: return FunctionToolset() ``` Args: func: The toolset function to register. per_run_step: Whether to re-evaluate the toolset for each run step. Defaults to True. id: An optional unique ID for the dynamic toolset. Required for use with durable execution environments like Temporal, where the ID identifies the toolset's activities within the workflow. """ def toolset_decorator(func_: ToolsetFunc[AgentDepsT]) -> ToolsetFunc[AgentDepsT]: self._dynamic_toolsets.append(DynamicToolset(func_, per_run_step=per_run_step, id=id)) return func_ return toolset_decorator if func is None else toolset_decorator(func) ```` #### toolsets ```python toolsets: Sequence[AbstractToolset[AgentDepsT]] ``` All toolsets registered on the agent, including a function toolset holding tools that were registered on the agent directly. Output tools are not included. #### __aenter__ ```python __aenter__() -> Self ``` Enter the agent context. This will start all MCPServerStdios registered as `toolsets` so they are ready to be used. This is a no-op if the agent has already been entered. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python async def __aenter__(self) -> Self: """Enter the agent context. This will start all [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] registered as `toolsets` so they are ready to be used. This is a no-op if the agent has already been entered. """ async with self._enter_lock: if self._entered_count == 0: async with AsyncExitStack() as exit_stack: toolset = self._get_toolset() await exit_stack.enter_async_context(toolset) self._exit_stack = exit_stack.pop_all() self._entered_count += 1 return self ``` #### set_mcp_sampling_model ```python set_mcp_sampling_model( model: Model | KnownModelName | str | None = None, ) -> None ``` Set the sampling model on all MCP servers registered with the agent. If no sampling model is provided, the agent's model will be used. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python def set_mcp_sampling_model(self, model: models.Model | models.KnownModelName | str | None = None) -> None: """Set the sampling model on all MCP servers registered with the agent. If no sampling model is provided, the agent's model will be used. 
""" try: sampling_model = models.infer_model(model) if model else self._get_model(None) except exceptions.UserError as e: raise exceptions.UserError('No sampling model provided and no model set on the agent.') from e from ..mcp import MCPServer def _set_sampling_model(toolset: AbstractToolset[AgentDepsT]) -> None: if isinstance(toolset, MCPServer): toolset.sampling_model = sampling_model self._get_toolset().apply(_set_sampling_model) ``` #### to_web ```python to_web( *, models: ModelsParam = None, builtin_tools: list[AbstractBuiltinTool] | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, instructions: str | None = None, html_source: str | Path | None = None ) -> Starlette ``` Create a Starlette app that serves a web chat UI for this agent. This method returns a pre-configured Starlette application that provides a web-based chat interface for interacting with the agent. By default, the UI is fetched from a CDN and cached on first use. The returned Starlette application can be mounted into a FastAPI app or run directly with any ASGI server (uvicorn, hypercorn, etc.). Note that the `deps` and `model_settings` will be the same for each request. To provide different `deps` for each request use the lower-level adapters directly. Parameters: | Name | Type | Description | Default | | ---------------- | --------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `models` | `ModelsParam` | Additional models to make available in the UI. Can be: - A sequence of model names/instances (e.g., ['openai:gpt-5', 'anthropic:claude-sonnet-4-5']) - A dict mapping display labels to model names/instances (e.g., {'GPT 5': 'openai:gpt-5', 'Claude': 'anthropic:claude-sonnet-4-5'}) The agent's model is always included. Builtin tool support is automatically determined from each model's profile. | `None` | | `builtin_tools` | \`list[AbstractBuiltinTool] | None\` | Additional builtin tools to make available in the UI. The agent's configured builtin tools are always included. Tool labels in the UI are derived from the tool's label property. | | `deps` | `AgentDepsT` | Optional dependencies to use for all requests. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for all model requests. | | `instructions` | \`str | None\` | Optional extra instructions to pass to each agent run. 
| | `html_source` | \`str \| Path \| None\` | Path or URL for the chat UI HTML. `None` (the default) fetches the UI from a CDN and caches it locally; a `Path` or a file path string reads a local file; an `http://` or `https://` URL is fetched from that URL. | `None` | Returns: | Type | Description | | ----------- | -------------------------------------------------------------------------- | | `Starlette` | A configured Starlette application ready to be served (e.g., with uvicorn) | Example ```python from pydantic_ai import Agent from pydantic_ai.builtin_tools import WebSearchTool agent = Agent('openai:gpt-5', builtin_tools=[WebSearchTool()]) # Simple usage - uses agent's model and builtin tools app = agent.to_web() # Or provide additional models for UI selection app = agent.to_web(models=['openai:gpt-5', 'anthropic:claude-sonnet-4-5']) # Then run with: uvicorn app:app --reload ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def to_web( self, *, models: ModelsParam = None, builtin_tools: list[AbstractBuiltinTool] | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, instructions: str | None = None, html_source: str | Path | None = None, ) -> Starlette: """Create a Starlette app that serves a web chat UI for this agent. This method returns a pre-configured Starlette application that provides a web-based chat interface for interacting with the agent. By default, the UI is fetched from a CDN and cached on first use. The returned Starlette application can be mounted into a FastAPI app or run directly with any ASGI server (uvicorn, hypercorn, etc.). Note that the `deps` and `model_settings` will be the same for each request. To provide different `deps` for each request use the lower-level adapters directly. Args: models: Additional models to make available in the UI. Can be: - A sequence of model names/instances (e.g., `['openai:gpt-5', 'anthropic:claude-sonnet-4-5']`) - A dict mapping display labels to model names/instances (e.g., `{'GPT 5': 'openai:gpt-5', 'Claude': 'anthropic:claude-sonnet-4-5'}`) The agent's model is always included. Builtin tool support is automatically determined from each model's profile. builtin_tools: Additional builtin tools to make available in the UI. The agent's configured builtin tools are always included. Tool labels in the UI are derived from the tool's `label` property. deps: Optional dependencies to use for all requests. model_settings: Optional settings to use for all model requests. instructions: Optional extra instructions to pass to each agent run. html_source: Path or URL for the chat UI HTML. Can be: - None (default): Fetches from CDN and caches locally - A Path instance: Reads from the local file - A URL string (http:// or https://): Fetches from the URL - A file path string: Reads from the local file Returns: A configured Starlette application ready to be served (e.g., with uvicorn) Example: ```python from pydantic_ai import Agent from pydantic_ai.builtin_tools import WebSearchTool agent = Agent('openai:gpt-5', builtin_tools=[WebSearchTool()]) # Simple usage - uses agent's model and builtin tools app = agent.to_web() # Or provide additional models for UI selection app = agent.to_web(models=['openai:gpt-5', 'anthropic:claude-sonnet-4-5']) # Then run with: uvicorn app:app --reload ``` """ from ..ui._web import create_web_app return create_web_app( self, models=models, builtin_tools=builtin_tools, deps=deps, model_settings=model_settings, instructions=instructions, html_source=html_source, ) ```` #### run_mcp_servers ```python run_mcp_servers( model: Model | KnownModelName | str | None = None, ) -> AsyncIterator[None] ``` Deprecated `run_mcp_servers` is deprecated, use `async with agent:` instead.
If you need to set a sampling model on all MCP servers, use `agent.set_mcp_sampling_model()`. Run MCPServerStdios so they can be used by the agent. Deprecated: use async with agent instead. If you need to set a sampling model on all MCP servers, use agent.set_mcp_sampling_model(). Returns: a context manager to start and shutdown the servers. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python @asynccontextmanager @deprecated( '`run_mcp_servers` is deprecated, use `async with agent:` instead. If you need to set a sampling model on all MCP servers, use `agent.set_mcp_sampling_model()`.' ) async def run_mcp_servers( self, model: models.Model | models.KnownModelName | str | None = None ) -> AsyncIterator[None]: """Run [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] so they can be used by the agent. Deprecated: use [`async with agent`][pydantic_ai.agent.Agent.__aenter__] instead. If you need to set a sampling model on all MCP servers, use [`agent.set_mcp_sampling_model()`][pydantic_ai.agent.Agent.set_mcp_sampling_model]. Returns: a context manager to start and shutdown the servers. """ try: self.set_mcp_sampling_model(model) except exceptions.UserError: if model is not None: raise async with self: yield ``` ### AbstractAgent Bases: `Generic[AgentDepsT, OutputDataT]`, `ABC` Abstract superclass for Agent, WrapperAgent, and your own custom agent implementations. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python class AbstractAgent(Generic[AgentDepsT, OutputDataT], ABC): """Abstract superclass for [`Agent`][pydantic_ai.agent.Agent], [`WrapperAgent`][pydantic_ai.agent.WrapperAgent], and your own custom agent implementations.""" @property @abstractmethod def model(self) -> models.Model | models.KnownModelName | str | None: """The default model configured for this agent.""" raise NotImplementedError @property @abstractmethod def name(self) -> str | None: """The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. """ raise NotImplementedError @name.setter @abstractmethod def name(self, value: str | None) -> None: """Set the name of the agent, used for logging.""" raise NotImplementedError @property @abstractmethod def deps_type(self) -> type: """The type of dependencies used by the agent.""" raise NotImplementedError @property @abstractmethod def output_type(self) -> OutputSpec[OutputDataT]: """The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`.""" raise NotImplementedError @property @abstractmethod def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: """Optional handler for events from the model's streaming response and the agent's execution of tools.""" raise NotImplementedError @property @abstractmethod def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: """All toolsets registered on the agent. Output tools are not included. 
""" raise NotImplementedError def output_json_schema(self, output_type: OutputSpec[OutputDataT | RunOutputDataT] | None = None) -> JsonSchema: """The output return JSON schema.""" if output_type is None: output_type = self.output_type return_types = types_from_output_spec(output_spec=output_type) json_schemas: list[JsonSchema] = [] for return_type in return_types: json_schema = TypeAdapter(return_type).json_schema(mode='serialization') if json_schema not in json_schemas: json_schemas.append(json_schema) if len(json_schemas) == 1: return json_schemas[0] else: json_schemas, all_defs = _utils.merge_json_schema_defs(json_schemas) json_schema: JsonSchema = {'anyOf': json_schemas} if all_defs: json_schema['$defs'] = all_defs return json_schema @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) event_stream_handler = event_stream_handler or self.event_stream_handler async with self.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, toolsets=toolsets, builtin_tools=builtin_tools, ) as agent_run: async for node in agent_run: if event_stream_handler is not None and ( self.is_model_request_node(node) or self.is_call_tools_node(node) ): async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) assert agent_run.result is not None, 'The graph run did not finish properly' return agent_run.result @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... 
@overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. 
builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) return _utils.get_event_loop().run_until_complete( self.run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=False, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, ) ) @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[result.StreamedRunResult[AgentDepsT, OutputDataT]]: ... @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[result.StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def run_stream( # noqa: C901 self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AsyncIterator[result.StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async streaming mode. 
This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`][pydantic_ai.agent.AbstractAgent.run] with an `event_stream_handler` or [`agent.iter()`][pydantic_ai.agent.AbstractAgent.iter] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found. Returns: The result of the run. 
""" if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) event_stream_handler = event_stream_handler or self.event_stream_handler yielded = False async with self.iter( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, instructions=instructions, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=False, toolsets=toolsets, builtin_tools=builtin_tools, ) as agent_run: first_node = agent_run.next_node # start with the first node assert isinstance(first_node, _agent_graph.UserPromptNode) # the first node should be a user prompt node node = first_node while True: graph_ctx = agent_run.ctx if self.is_model_request_node(node): async with node.stream(graph_ctx) as stream: final_result_event = None async def stream_to_final( stream: AgentStream, ) -> AsyncIterator[_messages.ModelResponseStreamEvent]: nonlocal final_result_event async for event in stream: yield event if isinstance(event, _messages.FinalResultEvent): final_result_event = event break if event_stream_handler is not None: await event_stream_handler( _agent_graph.build_run_context(graph_ctx), stream_to_final(stream) ) else: async for _ in stream_to_final(stream): pass if final_result_event is not None: final_result = FinalResult( None, final_result_event.tool_name, final_result_event.tool_call_id ) if yielded: raise exceptions.AgentRunError('Agent run produced final results') # pragma: no cover yielded = True messages = graph_ctx.state.message_history.copy() async def on_complete() -> None: """Called when the stream has completed. The model response will have been added to messages by now by `StreamedRunResult._marked_completed`. """ nonlocal final_result final_result = FinalResult( await stream.get_output(), final_result.tool_name, final_result.tool_call_id ) # When we get here, the `ModelRequestNode` has completed streaming after the final result was found. # When running an agent with `agent.run`, we'd then move to `CallToolsNode` to execute the tool calls and # find the final result. # We also want to execute tool calls (in case `agent.end_strategy == 'exhaustive'`) here, but # we don't want to use run the `CallToolsNode` logic to determine the final output, as it would be # wasteful and could produce a different result (e.g. when text output is followed by tool calls). # So we call `process_tool_calls` directly and then end the run with the found final result. 
parts: list[_messages.ModelRequestPart] = [] async for _event in _agent_graph.process_tool_calls( tool_manager=graph_ctx.deps.tool_manager, tool_calls=stream.response.tool_calls, tool_call_results=None, tool_call_metadata=None, final_result=final_result, ctx=graph_ctx, output_parts=parts, ): pass # To allow this message history to be used in a future run without dangling tool calls, # append a new ModelRequest using the tool returns and retries if parts: messages.append( _messages.ModelRequest( parts, run_id=graph_ctx.state.run_id, timestamp=_utils.now_utc() ) ) await agent_run.next(_agent_graph.SetFinalResult(final_result)) yield StreamedRunResult( messages, graph_ctx.deps.new_message_index, stream, on_complete, ) break elif self.is_call_tools_node(node) and event_stream_handler is not None: async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) next_node = await agent_run.next(node) if isinstance(next_node, End) and agent_run.result is not None: # A final output could have been produced by the CallToolsNode rather than the ModelRequestNode, # if a tool function raised CallDeferred or ApprovalRequired. # In this case there's no response to stream, but we still let the user access the output etc as normal. yield StreamedRunResult( graph_ctx.state.message_history, graph_ctx.deps.new_message_index, run_result=agent_run.result, ) yielded = True break if not isinstance(next_node, _agent_graph.AgentNode): raise exceptions.AgentRunError( # pragma: no cover 'Should have produced a StreamedRunResult before getting here' ) node = cast(_agent_graph.AgentNode[Any, Any], next_node) if not yielded: raise exceptions.AgentRunError('Agent run finished without producing a final result') # pragma: no cover @overload def run_stream_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> result.StreamedRunResultSync[AgentDepsT, OutputDataT]: ... @overload def run_stream_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> result.StreamedRunResultSync[AgentDepsT, RunOutputDataT]: ... 
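# --- Illustrative usage sketch (editorial annotation, not part of abstract.py) ---
# The `run_stream` method above yields a `StreamedRunResult`; a minimal sketch of
# streaming the text output as it arrives, assuming the `stream_text()` helper on
# `StreamedRunResult` (pass `delta=True` to receive only the new chunks):
#
#     from pydantic_ai import Agent
#
#     agent = Agent('openai:gpt-5.2')
#
#     async def main():
#         async with agent.run_stream('Tell me a short joke.') as response:
#             async for text in response.stream_text():
#                 print(text)  # progressively longer snapshots of the streamed text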
def run_stream_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> result.StreamedRunResultSync[AgentDepsT, Any]: """Run the agent with a user prompt in sync streaming mode. This is a convenience method that wraps [`run_stream()`][pydantic_ai.agent.AbstractAgent.run_stream] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`][pydantic_ai.agent.AbstractAgent.run] with an `event_stream_handler` or [`agent.iter()`][pydantic_ai.agent.AbstractAgent.iter] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') def main(): response = agent.run_stream_sync('What is the capital of the UK?') print(response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. 
event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) async def _consume_stream(): async with self.run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, ) as stream_result: yield stream_result async_result = _utils.get_event_loop().run_until_complete(anext(_consume_stream())) return result.StreamedRunResultSync(async_result) @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]]: ... @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]]: ... 
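# --- Illustrative usage sketch (editorial annotation, not part of abstract.py) ---
# One way to consume `run_stream_events`: collect text deltas and pick out the final
# result event. The event and attribute names (`PartDeltaEvent`, `TextPartDelta`,
# `AgentRunResultEvent`) follow the example in the `run_stream_events` docstring below.
#
#     from pydantic_ai import Agent, AgentRunResultEvent
#     from pydantic_ai.messages import PartDeltaEvent, TextPartDelta
#
#     agent = Agent('openai:gpt-5.2')
#
#     async def main():
#         chunks: list[str] = []
#         async for event in agent.run_stream_events('What is the capital of France?'):
#             if isinstance(event, PartDeltaEvent) and isinstance(event.delta, TextPartDelta):
#                 chunks.append(event.delta.content_delta)  # accumulate streamed text
#             elif isinstance(event, AgentRunResultEvent):
#                 print(event.result.output)  # the completed run result
#         print(''.join(chunks))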
def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. 
""" if infer_name and self.name is None: self._infer_name(inspect.currentframe()) # unfortunately this hack of returning a generator rather than defining it right here is # required to allow overloads of this method to work in python's typing system, or at least with pyright # or at least I couldn't make it work without return self._run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, toolsets=toolsets, builtin_tools=builtin_tools, ) async def _run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: send_stream, receive_stream = anyio.create_memory_object_stream[ _messages.AgentStreamEvent | AgentRunResultEvent[Any] ]() async def event_stream_handler( _: RunContext[AgentDepsT], events: AsyncIterable[_messages.AgentStreamEvent] ) -> None: async for event in events: await send_stream.send(event) async def run_agent() -> AgentRunResult[Any]: async with send_stream: return await self.run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=False, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, ) task = asyncio.create_task(run_agent()) try: async with receive_stream: async for message in receive_stream: yield message result = await task except asyncio.CancelledError as e: task.cancel(msg=e.args[0] if len(e.args) != 0 else None) raise yield AgentRunResultEvent(result) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... 
@overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager @abstractmethod async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. 
output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ raise NotImplementedError yield @contextmanager @abstractmethod def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ raise NotImplementedError yield def _infer_name(self, function_frame: FrameType | None) -> None: """Infer the agent name from the call frame. RunUsage should be `self._infer_name(inspect.currentframe())`. """ assert self.name is None, 'Name already set' if function_frame is not None: # pragma: no branch if parent_frame := function_frame.f_back: # pragma: no branch for name, item in parent_frame.f_locals.items(): if item is self: self.name = name return if parent_frame.f_locals != parent_frame.f_globals: # pragma: no branch # if we couldn't find the agent in locals and globals are a different dict, try globals for name, item in parent_frame.f_globals.items(): if item is self: self.name = name return @staticmethod @contextmanager def parallel_tool_call_execution_mode(mode: _tool_manager.ParallelExecutionMode = 'parallel') -> Iterator[None]: """Set the parallel execution mode during the context. Args: mode: The execution mode for tool calls: - 'parallel': Run tool calls in parallel, yielding events as they complete (default). 
- 'sequential': Run tool calls one at a time in order. - 'parallel_ordered_events': Run tool calls in parallel, but events are emitted in order, after all calls complete. """ with ToolManager.parallel_execution_mode(mode): yield @staticmethod @contextmanager @deprecated('Use `parallel_execution_mode("sequential")` instead.') def sequential_tool_calls() -> Iterator[None]: """Run tool calls sequentially during the context.""" with ToolManager.parallel_execution_mode('sequential'): yield @staticmethod def is_model_request_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.ModelRequestNode[T, S]]: """Check if the node is a `ModelRequestNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.ModelRequestNode) @staticmethod def is_call_tools_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.CallToolsNode[T, S]]: """Check if the node is a `CallToolsNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.CallToolsNode) @staticmethod def is_user_prompt_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.UserPromptNode[T, S]]: """Check if the node is a `UserPromptNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.UserPromptNode) @staticmethod def is_end_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[End[result.FinalResult[S]]]: """Check if the node is a `End`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, End) @abstractmethod async def __aenter__(self) -> AbstractAgent[AgentDepsT, OutputDataT]: raise NotImplementedError @abstractmethod async def __aexit__(self, *args: Any) -> bool | None: raise NotImplementedError # TODO (v2): Remove in favor of using `AGUIApp` directly -- we don't have `to_temporal()` or `to_vercel_ai()` either. def to_ag_ui( self, *, # Agent.iter parameters output_type: OutputSpec[OutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, # Starlette debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> AGUIApp[AgentDepsT, OutputDataT]: """Returns an ASGI application that handles every AG-UI request by running the agent. 
Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ag_ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`][pydantic_ai.ag_ui.run_ag_ui] or [`pydantic_ai.ag_ui.handle_ag_ui_request`][pydantic_ai.ag_ui.handle_ag_ui_request] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') app = agent.to_ag_ui() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` See [AG-UI docs](../ui/ag-ui.md) for more information. Args: output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. Returns: An ASGI application for running Pydantic AI agents with AG-UI protocol support. 
""" from pydantic_ai.ui.ag_ui.app import AGUIApp return AGUIApp( agent=self, # Agent.iter parameters output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, # Starlette debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) def to_a2a( self, *, storage: Storage | None = None, broker: Broker | None = None, # Agent card name: str | None = None, url: str = 'http://localhost:8000', version: str = '1.0.0', description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, # Starlette debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: dict[Any, ExceptionHandler] | None = None, lifespan: Lifespan[FastA2A] | None = None, ) -> FastA2A: """Convert the agent to a FastA2A application. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') app = agent.to_a2a() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` """ from .._a2a import agent_to_a2a return agent_to_a2a( self, storage=storage, broker=broker, name=name, url=url, version=version, description=description, provider=provider, skills=skills, debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, lifespan=lifespan, ) async def to_cli( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: Sequence[_messages.ModelMessage] | None = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, ) -> None: """Run the agent in a CLI chat interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. Example: ```python {title="agent_to_cli.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` """ from rich.console import Console from pydantic_ai._cli import run_chat await run_chat( stream=True, agent=self, deps=deps, console=Console(), code_theme='monokai', prog_name=prog_name, message_history=message_history, model_settings=model_settings, usage_limits=usage_limits, ) def to_cli_sync( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: Sequence[_messages.ModelMessage] | None = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, ) -> None: """Run the agent in a CLI chat interface with the non-async interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. 
```python {title="agent_to_cli_sync.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') agent.to_cli_sync() agent.to_cli_sync(prog_name='assistant') ``` """ return _utils.get_event_loop().run_until_complete( self.to_cli( deps=deps, prog_name=prog_name, message_history=message_history, model_settings=model_settings, usage_limits=usage_limits, ) ) ```` #### model ```python model: Model | KnownModelName | str | None ``` The default model configured for this agent. #### name ```python name: str | None ``` The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. #### deps_type ```python deps_type: type ``` The type of dependencies used by the agent. #### output_type ```python output_type: OutputSpec[OutputDataT] ``` The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`. #### event_stream_handler ```python event_stream_handler: EventStreamHandler[AgentDepsT] | None ``` Optional handler for events from the model's streaming response and the agent's execution of tools. #### toolsets ```python toolsets: Sequence[AbstractToolset[AgentDepsT]] ``` All toolsets registered on the agent. Output tools are not included. #### output_json_schema ```python output_json_schema( output_type: ( OutputSpec[OutputDataT | RunOutputDataT] | None ) = None, ) -> JsonSchema ``` The output return JSON schema. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python def output_json_schema(self, output_type: OutputSpec[OutputDataT | RunOutputDataT] | None = None) -> JsonSchema: """The output return JSON schema.""" if output_type is None: output_type = self.output_type return_types = types_from_output_spec(output_spec=output_type) json_schemas: list[JsonSchema] = [] for return_type in return_types: json_schema = TypeAdapter(return_type).json_schema(mode='serialization') if json_schema not in json_schemas: json_schemas.append(json_schema) if len(json_schemas) == 1: return json_schemas[0] else: json_schemas, all_defs = _utils.merge_json_schema_defs(json_schemas) json_schema: JsonSchema = {'anyOf': json_schemas} if all_defs: json_schema['$defs'] = all_defs return json_schema ``` #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, 
usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[Any] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. 
| `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | --------------------- | ---------------------- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" if infer_name and self.name is None: self._infer_name(inspect.currentframe()) event_stream_handler = event_stream_handler or self.event_stream_handler async with self.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, toolsets=toolsets, builtin_tools=builtin_tools, ) as agent_run: async for node in agent_run: if event_stream_handler is not None and ( self.is_model_request_node(node) or self.is_call_tools_node(node) ): async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) assert agent_run.result is not None, 'The graph run did not finish properly' return agent_run.result ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[Any] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. 
You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | --------------------- | ---------------------- | | `AgentRunResult[Any]` | The result of the run. 
| Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" if infer_name and self.name is None: self._infer_name(inspect.currentframe()) return _utils.get_event_loop().run_until_complete( self.run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=False, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, ) ) ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async streaming mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. 
At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage.

As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use `agent.run()` with an `event_stream_handler` or `agent.iter()` instead.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_output())
        #> The capital of the UK is London.
```

A sketch of streaming partial output appears after the parameters table below.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| `user_prompt` | `str \| Sequence[UserContent] \| None` | User input to start/continue the conversation. | `None` |
| `output_type` | `OutputSpec[RunOutputDataT] \| None` | Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` |
| `message_history` | `Sequence[ModelMessage] \| None` | History of the conversation so far. | `None` |
| `deferred_tool_results` | `DeferredToolResults \| None` | Optional results for deferred tool calls in the message history. | `None` |
| `model` | `Model \| KnownModelName \| str \| None` | Optional model to use for this run, required if `model` was not set when creating the agent. | `None` |
| `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` |
| `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` |
| `model_settings` | `ModelSettings \| None` | Optional settings to use for this model's request. | `None` |
| `usage_limits` | `UsageLimits \| None` | Optional limits on model request count or token usage. | `None` |
| `usage` | `RunUsage \| None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` |
| `metadata` | `AgentMetadata[AgentDepsT] \| None` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking `RunContext`; merged with the agent's configured metadata. | `None` |
| `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` |
| `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] \| None` | Optional additional toolsets for this run. | `None` |
| `builtin_tools` | `Sequence[AbstractBuiltinTool \| BuiltinToolFunc[AgentDepsT]] \| None` | Optional additional builtin tools for this run. | `None` |
| `event_stream_handler` | `EventStreamHandler[AgentDepsT] \| None` | Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does not receive any events after the final result is found. | `None` |
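To make "stream the output as it comes in" concrete, here is a minimal sketch that prints partial text as it arrives rather than only awaiting the final output; it assumes the `stream_text()` and `usage()` helpers on `StreamedRunResult`:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
    async with agent.run_stream('What is the capital of the UK?') as response:
        # Each iteration yields the text accumulated so far; pass delta=True for increments only.
        async for text in response.stream_text():
            print(text)
        # Once streaming has completed, the usual result accessors are available.
        print(response.usage())
```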
| Returns: | Type | Description | | --------------------------------------------------- | ---------------------- | | `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python @asynccontextmanager async def run_stream( # noqa: C901 self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AsyncIterator[result.StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async streaming mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`][pydantic_ai.agent.AbstractAgent.run] with an `event_stream_handler` or [`agent.iter()`][pydantic_ai.agent.AbstractAgent.iter] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. 
toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found. Returns: The result of the run. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) event_stream_handler = event_stream_handler or self.event_stream_handler yielded = False async with self.iter( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, instructions=instructions, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=False, toolsets=toolsets, builtin_tools=builtin_tools, ) as agent_run: first_node = agent_run.next_node # start with the first node assert isinstance(first_node, _agent_graph.UserPromptNode) # the first node should be a user prompt node node = first_node while True: graph_ctx = agent_run.ctx if self.is_model_request_node(node): async with node.stream(graph_ctx) as stream: final_result_event = None async def stream_to_final( stream: AgentStream, ) -> AsyncIterator[_messages.ModelResponseStreamEvent]: nonlocal final_result_event async for event in stream: yield event if isinstance(event, _messages.FinalResultEvent): final_result_event = event break if event_stream_handler is not None: await event_stream_handler( _agent_graph.build_run_context(graph_ctx), stream_to_final(stream) ) else: async for _ in stream_to_final(stream): pass if final_result_event is not None: final_result = FinalResult( None, final_result_event.tool_name, final_result_event.tool_call_id ) if yielded: raise exceptions.AgentRunError('Agent run produced final results') # pragma: no cover yielded = True messages = graph_ctx.state.message_history.copy() async def on_complete() -> None: """Called when the stream has completed. The model response will have been added to messages by now by `StreamedRunResult._marked_completed`. """ nonlocal final_result final_result = FinalResult( await stream.get_output(), final_result.tool_name, final_result.tool_call_id ) # When we get here, the `ModelRequestNode` has completed streaming after the final result was found. # When running an agent with `agent.run`, we'd then move to `CallToolsNode` to execute the tool calls and # find the final result. # We also want to execute tool calls (in case `agent.end_strategy == 'exhaustive'`) here, but # we don't want to use run the `CallToolsNode` logic to determine the final output, as it would be # wasteful and could produce a different result (e.g. when text output is followed by tool calls). # So we call `process_tool_calls` directly and then end the run with the found final result. 
parts: list[_messages.ModelRequestPart] = [] async for _event in _agent_graph.process_tool_calls( tool_manager=graph_ctx.deps.tool_manager, tool_calls=stream.response.tool_calls, tool_call_results=None, tool_call_metadata=None, final_result=final_result, ctx=graph_ctx, output_parts=parts, ): pass # To allow this message history to be used in a future run without dangling tool calls, # append a new ModelRequest using the tool returns and retries if parts: messages.append( _messages.ModelRequest( parts, run_id=graph_ctx.state.run_id, timestamp=_utils.now_utc() ) ) await agent_run.next(_agent_graph.SetFinalResult(final_result)) yield StreamedRunResult( messages, graph_ctx.deps.new_message_index, stream, on_complete, ) break elif self.is_call_tools_node(node) and event_stream_handler is not None: async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) next_node = await agent_run.next(node) if isinstance(next_node, End) and agent_run.result is not None: # A final output could have been produced by the CallToolsNode rather than the ModelRequestNode, # if a tool function raised CallDeferred or ApprovalRequired. # In this case there's no response to stream, but we still let the user access the output etc as normal. yield StreamedRunResult( graph_ctx.state.message_history, graph_ctx.deps.new_message_index, run_result=agent_run.result, ) yielded = True break if not isinstance(next_node, _agent_graph.AgentNode): raise exceptions.AgentRunError( # pragma: no cover 'Should have produced a StreamedRunResult before getting here' ) node = cast(_agent_graph.AgentNode[Any, Any], next_node) if not yielded: raise exceptions.AgentRunError('Agent run finished without producing a final result') # pragma: no cover ```` #### run_stream_sync ```python run_stream_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> StreamedRunResultSync[AgentDepsT, OutputDataT] ``` ```python run_stream_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> StreamedRunResultSync[AgentDepsT, RunOutputDataT] ``` ```python run_stream_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: 
Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> StreamedRunResultSync[AgentDepsT, Any] ``` Run the agent with a user prompt in sync streaming mode. This is a convenience method that wraps run_stream() with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use agent.run() with an `event_stream_handler` or agent.iter() instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') def main(): response = agent.run_stream_sync('What is the capital of the UK?') print(response.get_output()) #> The capital of the UK is London. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. 
Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does not receive any events after the final result is found. | Returns: | Type | Description | | ---------------------------------------- | ---------------------- | | `StreamedRunResultSync[AgentDepsT, Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def run_stream_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> result.StreamedRunResultSync[AgentDepsT, Any]: """Run the agent with a user prompt in sync streaming mode. This is a convenience method that wraps [`run_stream()`][pydantic_ai.agent.AbstractAgent.run_stream] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`][pydantic_ai.agent.AbstractAgent.run] with an `event_stream_handler` or [`agent.iter()`][pydantic_ai.agent.AbstractAgent.iter] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') def main(): response = agent.run_stream_sync('What is the capital of the UK?') print(response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. 
output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) async def _consume_stream(): async with self.run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, ) as stream_result: yield stream_result async_result = _utils.get_event_loop().run_until_complete(anext(_consume_stream())) return result.StreamedRunResultSync(async_result) ```` #### run_stream_events ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[OutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( 
Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[RunOutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[Any] ] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps self.run and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for self.run, except that `event_stream_handler` is now allowed. Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. 
| | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | --------------------------------- | ---------------------------- | | \`AsyncIterator\[AgentStreamEvent | AgentRunResultEvent[Any]\]\` | | \`AsyncIterator\[AgentStreamEvent | AgentRunResultEvent[Any]\]\` | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. 
usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) # unfortunately this hack of returning a generator rather than defining it right here is # required to allow overloads of this method to work in python's typing system, or at least with pyright # or at least I couldn't make it work without return self._run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, toolsets=toolsets, builtin_tools=builtin_tools, ) ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] 
| None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. 
| | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | ------------------------------------------ | ---------------------- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python @asynccontextmanager @abstractmethod async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. 
model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ raise NotImplementedError yield ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](https://ai.pydantic.dev/testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | -------------- | ----------------------------------------- | --------------------------------- | ------------------------------------------------------------------------------------------ | | `name` | \`str | Unset\` | The name to use instead of the name passed to the agent constructor and agent run. | | `deps` | \`AgentDepsT | Unset\` | The dependencies to use instead of the dependencies passed to the agent run. | | `model` | \`Model | KnownModelName | str | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | Unset\` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | | `tools` | \`Sequence\[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]\] | Unset\` | | `instructions` | \`Instructions[AgentDepsT] | Unset\` | The instructions to use instead of the instructions registered with the agent. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @contextmanager @abstractmethod def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. 
toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ raise NotImplementedError yield ``` #### parallel_tool_call_execution_mode ```python parallel_tool_call_execution_mode( mode: ParallelExecutionMode = "parallel", ) -> Iterator[None] ``` Set the parallel execution mode during the context. Parameters: | Name | Type | Description | Default | | ------ | ----------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------ | | `mode` | `ParallelExecutionMode` | The execution mode for tool calls: - 'parallel': Run tool calls in parallel, yielding events as they complete (default). - 'sequential': Run tool calls one at a time in order. - 'parallel_ordered_events': Run tool calls in parallel, but events are emitted in order, after all calls complete. | `'parallel'` | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod @contextmanager def parallel_tool_call_execution_mode(mode: _tool_manager.ParallelExecutionMode = 'parallel') -> Iterator[None]: """Set the parallel execution mode during the context. Args: mode: The execution mode for tool calls: - 'parallel': Run tool calls in parallel, yielding events as they complete (default). - 'sequential': Run tool calls one at a time in order. - 'parallel_ordered_events': Run tool calls in parallel, but events are emitted in order, after all calls complete. """ with ToolManager.parallel_execution_mode(mode): yield ``` #### sequential_tool_calls ```python sequential_tool_calls() -> Iterator[None] ``` Deprecated Use `parallel_execution_mode("sequential")` instead. Run tool calls sequentially during the context. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod @contextmanager @deprecated('Use `parallel_execution_mode("sequential")` instead.') def sequential_tool_calls() -> Iterator[None]: """Run tool calls sequentially during the context.""" with ToolManager.parallel_execution_mode('sequential'): yield ``` #### is_model_request_node ```python is_model_request_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[ModelRequestNode[T, S]] ``` Check if the node is a `ModelRequestNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_model_request_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.ModelRequestNode[T, S]]: """Check if the node is a `ModelRequestNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.ModelRequestNode) ``` #### is_call_tools_node ```python is_call_tools_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[CallToolsNode[T, S]] ``` Check if the node is a `CallToolsNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. 
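These narrowing helpers are mostly useful together with `iter()`. As a rough sketch (assuming the `node.stream()` API shown in the `run_stream` source above), you can stream events from each model request while iterating the graph:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            if Agent.is_model_request_node(node):
                # Narrowed to ModelRequestNode, so we can stream its events.
                async with node.stream(agent_run.ctx) as request_stream:
                    async for event in request_stream:
                        print(type(event).__name__)
        print(agent_run.result.output)
```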
Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_call_tools_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.CallToolsNode[T, S]]: """Check if the node is a `CallToolsNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.CallToolsNode) ``` #### is_user_prompt_node ```python is_user_prompt_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[UserPromptNode[T, S]] ``` Check if the node is a `UserPromptNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_user_prompt_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.UserPromptNode[T, S]]: """Check if the node is a `UserPromptNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.UserPromptNode) ``` #### is_end_node ```python is_end_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[End[FinalResult[S]]] ``` Check if the node is a `End`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_end_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[End[result.FinalResult[S]]]: """Check if the node is a `End`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, End) ``` #### to_ag_ui ```python to_ag_ui( *, output_type: OutputSpec[OutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: ( Mapping[Any, ExceptionHandler] | None ) = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: ( Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None ) = None ) -> AGUIApp[AgentDepsT, OutputDataT] ``` Returns an ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the StateHandler protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use pydantic_ai.ag_ui.run_ag_ui or pydantic_ai.ag_ui.handle_ag_ui_request instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') app = agent.to_ag_ui() ``` The `app` is an ASGI application that can be used with any ASGI server. 
To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` See [AG-UI docs](https://ai.pydantic.dev/ui/ag-ui/index.md) for more information. Parameters: | Name | Type | Description | Default | | ----------------------- | ---------------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `output_type` | \`OutputSpec[OutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `debug` | `bool` | Boolean indicating if debug tracebacks should be returned on errors. | `False` | | `routes` | \`Sequence[BaseRoute] | None\` | A list of routes to serve incoming HTTP and WebSocket requests. | | `middleware` | \`Sequence[Middleware] | None\` | A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. ServerErrorMiddleware is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. ExceptionMiddleware is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. | | `exception_handlers` | \`Mapping[Any, ExceptionHandler] | None\` | A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form handler(request, exc) -> response and may be either standard functions, or async functions. | | `on_startup` | \`Sequence\[Callable\[[], Any\]\] | None\` | A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. | | `on_shutdown` | \`Sequence\[Callable\[[], Any\]\] | None\` | A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. 
| | `lifespan` | \`Lifespan\[AGUIApp[AgentDepsT, OutputDataT]\] | None\` | A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the on_startup and on_shutdown handlers. Use one or the other, not both. | Returns: | Type | Description | | ---------------------------------- | ------------------------------------------------------------------------------- | | `AGUIApp[AgentDepsT, OutputDataT]` | An ASGI application for running Pydantic AI agents with AG-UI protocol support. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def to_ag_ui( self, *, # Agent.iter parameters output_type: OutputSpec[OutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, # Starlette debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> AGUIApp[AgentDepsT, OutputDataT]: """Returns an ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ag_ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`][pydantic_ai.ag_ui.run_ag_ui] or [`pydantic_ai.ag_ui.handle_ag_ui_request`][pydantic_ai.ag_ui.handle_ag_ui_request] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') app = agent.to_ag_ui() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` See [AG-UI docs](../ui/ag-ui.md) for more information. Args: output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. 
A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. Returns: An ASGI application for running Pydantic AI agents with AG-UI protocol support. """ from pydantic_ai.ui.ag_ui.app import AGUIApp return AGUIApp( agent=self, # Agent.iter parameters output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, # Starlette debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) ```` #### to_a2a ```python to_a2a( *, storage: Storage | None = None, broker: Broker | None = None, name: str | None = None, url: str = "http://localhost:8000", version: str = "1.0.0", description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: ( dict[Any, ExceptionHandler] | None ) = None, lifespan: Lifespan[FastA2A] | None = None ) -> FastA2A ``` Convert the agent to a FastA2A application. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') app = agent.to_a2a() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def to_a2a( self, *, storage: Storage | None = None, broker: Broker | None = None, # Agent card name: str | None = None, url: str = 'http://localhost:8000', version: str = '1.0.0', description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, # Starlette debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: dict[Any, ExceptionHandler] | None = None, lifespan: Lifespan[FastA2A] | None = None, ) -> FastA2A: """Convert the agent to a FastA2A application. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') app = agent.to_a2a() ``` The `app` is an ASGI application that can be used with any ASGI server. 
To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` """ from .._a2a import agent_to_a2a return agent_to_a2a( self, storage=storage, broker=broker, name=name, url=url, version=version, description=description, provider=provider, skills=skills, debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, lifespan=lifespan, ) ```` #### to_cli ```python to_cli( deps: AgentDepsT = None, prog_name: str = "pydantic-ai", message_history: Sequence[ModelMessage] | None = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, ) -> None ``` Run the agent in a CLI chat interface. Parameters: | Name | Type | Description | Default | | ----------------- | ------------------------ | ---------------------------------------------------------------------- | ------------------------------------------------------ | | `deps` | `AgentDepsT` | The dependencies to pass to the agent. | `None` | | `prog_name` | `str` | The name of the program to use for the CLI. Defaults to 'pydantic-ai'. | `'pydantic-ai'` | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | Example: agent_to_cli.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python async def to_cli( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: Sequence[_messages.ModelMessage] | None = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, ) -> None: """Run the agent in a CLI chat interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. Example: ```python {title="agent_to_cli.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` """ from rich.console import Console from pydantic_ai._cli import run_chat await run_chat( stream=True, agent=self, deps=deps, console=Console(), code_theme='monokai', prog_name=prog_name, message_history=message_history, model_settings=model_settings, usage_limits=usage_limits, ) ```` #### to_cli_sync ```python to_cli_sync( deps: AgentDepsT = None, prog_name: str = "pydantic-ai", message_history: Sequence[ModelMessage] | None = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, ) -> None ``` Run the agent in a CLI chat interface with the non-async interface. Parameters: | Name | Type | Description | Default | | ----------------- | ------------------------ | ---------------------------------------------------------------------- | ------------------------------------------------------ | | `deps` | `AgentDepsT` | The dependencies to pass to the agent. | `None` | | `prog_name` | `str` | The name of the program to use for the CLI. Defaults to 'pydantic-ai'. 
| `'pydantic-ai'` | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | agent_to_cli_sync.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') agent.to_cli_sync() agent.to_cli_sync(prog_name='assistant') ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def to_cli_sync( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: Sequence[_messages.ModelMessage] | None = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, ) -> None: """Run the agent in a CLI chat interface with the non-async interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. ```python {title="agent_to_cli_sync.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') agent.to_cli_sync() agent.to_cli_sync(prog_name='assistant') ``` """ return _utils.get_event_loop().run_until_complete( self.to_cli( deps=deps, prog_name=prog_name, message_history=message_history, model_settings=model_settings, usage_limits=usage_limits, ) ) ```` ### WrapperAgent Bases: `AbstractAgent[AgentDepsT, OutputDataT]` Agent which wraps another agent. Does nothing on its own, used as a base class. Source code in `pydantic_ai_slim/pydantic_ai/agent/wrapper.py` ````python class WrapperAgent(AbstractAgent[AgentDepsT, OutputDataT]): """Agent which wraps another agent. Does nothing on its own, used as a base class. 
""" def __init__(self, wrapped: AbstractAgent[AgentDepsT, OutputDataT]): self.wrapped = wrapped @property def model(self) -> models.Model | models.KnownModelName | str | None: return self.wrapped.model @property def name(self) -> str | None: return self.wrapped.name @name.setter def name(self, value: str | None) -> None: self.wrapped.name = value @property def deps_type(self) -> type: return self.wrapped.deps_type @property def output_type(self) -> OutputSpec[OutputDataT]: return self.wrapped.output_type @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: return self.wrapped.event_stream_handler @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: return self.wrapped.toolsets async def __aenter__(self) -> AbstractAgent[AgentDepsT, OutputDataT]: return await self.wrapped.__aenter__() async def __aexit__(self, *args: Any) -> bool | None: return await self.wrapped.__aexit__(*args) def output_json_schema(self, output_type: OutputSpec[OutputDataT | RunOutputDataT] | None = None) -> JsonSchema: return self.wrapped.output_json_schema(output_type=output_type) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... 
@asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. infer_name: Whether to try to infer the agent name from the call frame if it's not set. 
toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ async with self.wrapped.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) as run: yield run @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ with self.wrapped.override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = 
None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. 
| `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | ------------------------------------------ | ---------------------- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/wrapper.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ async with self.wrapped.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) as run: yield run ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](https://ai.pydantic.dev/testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | -------------- | ----------------------------------------- | --------------------------------- | ------------------------------------------------------------------------------------------ | | `name` | \`str | Unset\` | The name to use instead of the name passed to the agent constructor and agent run. 
| | `deps` | \`AgentDepsT | Unset\` | The dependencies to use instead of the dependencies passed to the agent run. | | `model` | \`Model | KnownModelName | str | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | Unset\` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | | `tools` | \`Sequence\[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]\] | Unset\` | | `instructions` | \`Instructions[AgentDepsT] | Unset\` | The instructions to use instead of the instructions registered with the agent. | Source code in `pydantic_ai_slim/pydantic_ai/agent/wrapper.py` ```python @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ with self.wrapped.override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ``` ### AgentRun Bases: `Generic[AgentDepsT, OutputDataT]` A stateful, async-iterable run of an Agent. You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an End is reached, the run finishes and result becomes available. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the next method for more granular control. 
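When iterating a run like this, the static `is_user_prompt_node`, `is_call_tools_node`, and `is_end_node` helpers shown earlier in this reference can be used to branch on the node type while keeping the generic parameters narrowed. A minimal sketch (the printed messages are purely illustrative):

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            if Agent.is_user_prompt_node(node):
                print('user prompt submitted')
            elif Agent.is_call_tools_node(node):
                # Inspect the model response before its parts are handled.
                print(f'model returned {len(node.model_response.parts)} part(s)')
            elif Agent.is_end_node(node):
                print('final output:', node.data.output)
```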
Source code in `pydantic_ai_slim/pydantic_ai/run.py` ````python @dataclasses.dataclass(repr=False) class AgentRun(Generic[AgentDepsT, OutputDataT]): """A stateful, async-iterable run of an [`Agent`][pydantic_ai.agent.Agent]. You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an [`End`][pydantic_graph.nodes.End] is reached, the run finishes and [`result`][pydantic_ai.agent.AgentRun.result] becomes available. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the [`next`][pydantic_ai.agent.AgentRun.next] method for more granular control. """ _graph_run: GraphRun[ _agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any], FinalResult[OutputDataT] ] @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: traceparent = self._graph_run._traceparent(required=False) # type: ignore[reportPrivateUsage] if traceparent is None and required: # pragma: no cover raise AttributeError('No span was created for this agent run') return traceparent @property def ctx(self) -> GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]]: """The current context of the agent run.""" return GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]]( state=self._graph_run.state, deps=self._graph_run.deps ) @property def next_node( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. """ task = self._graph_run.next_task return self._task_to_node(task) @property def result(self) -> AgentRunResult[OutputDataT] | None: """The final result of the run if it has ended, otherwise `None`. Once the run returns an [`End`][pydantic_graph.nodes.End] node, `result` is populated with an [`AgentRunResult`][pydantic_ai.agent.AgentRunResult]. 
""" graph_run_output = self._graph_run.output if graph_run_output is None: return None return AgentRunResult( graph_run_output.output, graph_run_output.tool_name, self._graph_run.state, self._graph_run.deps.new_message_index, self._traceparent(required=False), ) def all_messages(self) -> list[_messages.ModelMessage]: """Return all messages for the run so far. Messages from older runs are included. """ return self.ctx.state.message_history def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRun.all_messages] as JSON bytes. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json(self.all_messages()) def new_messages(self) -> list[_messages.ModelMessage]: """Return new messages for the run so far. Messages from older runs are excluded. """ return self.all_messages()[self.ctx.deps.new_message_index :] def new_messages_json(self) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRun.new_messages] as JSON bytes. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json(self.new_messages()) def __aiter__( self, ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]: """Provide async-iteration over the nodes in the agent run.""" return self async def __anext__( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Advance to the next node automatically based on the last returned node.""" task = await anext(self._graph_run) return self._task_to_node(task) def _task_to_node( self, task: EndMarker[FinalResult[OutputDataT]] | JoinItem | Sequence[GraphTaskRequest] ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: if isinstance(task, Sequence) and len(task) == 1: first_task = task[0] if isinstance(first_task.inputs, BaseNode): # pragma: no branch base_node: BaseNode[ # pyright: ignore[reportUnknownVariableType] _agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, OutputDataT], FinalResult[OutputDataT], ] = first_task.inputs # pyright: ignore[reportUnknownMemberType] if _agent_graph.is_agent_node(node=base_node): # pragma: no branch return base_node if isinstance(task, EndMarker): return End(task.value) raise exceptions.AgentRunError(f'Unexpected node: {task}') # pragma: no cover def _node_to_task(self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT]) -> GraphTaskRequest: return GraphTaskRequest(NodeStep(type(node)).id, inputs=node, fork_stack=()) async def next( self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`][pydantic_graph.nodes.End] node. 
Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-5.2') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Args: node: The node to run next in the graph. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ # Note: It might be nice to expose a synchronous interface for iteration, but we shouldn't do it # on this class, or else IDEs won't warn you if you accidentally use `for` instead of `async for` to iterate. task = [self._node_to_task(node)] try: task = await self._graph_run.next(task) except StopAsyncIteration: pass return self._task_to_node(task) # TODO (v2): Make this a property def usage(self) -> _usage.RunUsage: """Get usage statistics for the run so far, including token usage, model requests, and so on.""" return self._graph_run.state.usage @property def metadata(self) -> dict[str, Any] | None: """Metadata associated with this agent run, if configured.""" return self._graph_run.state.metadata @property def run_id(self) -> str: """The unique identifier for the agent run.""" return self._graph_run.state.run_id def __repr__(self) -> str: # pragma: no cover result = self._graph_run.output result_repr = '' if result is None else repr(result.output) return f'<{type(self).__name__} result={result_repr} usage={self.usage()}>' ```` #### ctx ```python ctx: GraphRunContext[ GraphAgentState, GraphAgentDeps[AgentDepsT, Any] ] ``` The current context of the agent run. #### next_node ```python next_node: ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. #### result ```python result: AgentRunResult[OutputDataT] | None ``` The final result of the run if it has ended, otherwise `None`. Once the run returns an End node, `result` is populated with an AgentRunResult. #### all_messages ```python all_messages() -> list[ModelMessage] ``` Return all messages for the run so far. Messages from older runs are included. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages(self) -> list[_messages.ModelMessage]: """Return all messages for the run so far. Messages from older runs are included. 
""" return self.ctx.state.message_history ``` #### all_messages_json ```python all_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return all messages from all_messages as JSON bytes. Returns: | Type | Description | | ------- | ------------------------------------- | | `bytes` | JSON bytes representing the messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRun.all_messages] as JSON bytes. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json(self.all_messages()) ``` #### new_messages ```python new_messages() -> list[ModelMessage] ``` Return new messages for the run so far. Messages from older runs are excluded. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages(self) -> list[_messages.ModelMessage]: """Return new messages for the run so far. Messages from older runs are excluded. """ return self.all_messages()[self.ctx.deps.new_message_index :] ``` #### new_messages_json ```python new_messages_json() -> bytes ``` Return new messages from new_messages as JSON bytes. Returns: | Type | Description | | ------- | ----------------------------------------- | | `bytes` | JSON bytes representing the new messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages_json(self) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRun.new_messages] as JSON bytes. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json(self.new_messages()) ``` #### __aiter__ ```python __aiter__() -> ( AsyncIterator[ AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ] ) ``` Provide async-iteration over the nodes in the agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def __aiter__( self, ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]: """Provide async-iteration over the nodes in the agent run.""" return self ``` #### __anext__ ```python __anext__() -> ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` Advance to the next node automatically based on the last returned node. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python async def __anext__( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Advance to the next node automatically based on the last returned node.""" task = await anext(self._graph_run) return self._task_to_node(task) ``` #### next ```python next( node: AgentNode[AgentDepsT, OutputDataT], ) -> ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an End node. 
Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-5.2') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | ------ | ------------------------------------ | ---------------------------------- | ---------- | | `node` | `AgentNode[AgentDepsT, OutputDataT]` | The node to run next in the graph. | *required* | Returns: | Type | Description | | ------------------------------------ | --------------------------------- | | \`AgentNode[AgentDepsT, OutputDataT] | End\[FinalResult[OutputDataT]\]\` | | \`AgentNode[AgentDepsT, OutputDataT] | End\[FinalResult[OutputDataT]\]\` | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ````python async def next( self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`][pydantic_graph.nodes.End] node. Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-5.2') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Args: node: The node to run next in the graph. 
Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ # Note: It might be nice to expose a synchronous interface for iteration, but we shouldn't do it # on this class, or else IDEs won't warn you if you accidentally use `for` instead of `async for` to iterate. task = [self._node_to_task(node)] try: task = await self._graph_run.next(task) except StopAsyncIteration: pass return self._task_to_node(task) ```` #### usage ```python usage() -> RunUsage ``` Get usage statistics for the run so far, including token usage, model requests, and so on. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def usage(self) -> _usage.RunUsage: """Get usage statistics for the run so far, including token usage, model requests, and so on.""" return self._graph_run.state.usage ``` #### metadata ```python metadata: dict[str, Any] | None ``` Metadata associated with this agent run, if configured. #### run_id ```python run_id: str ``` The unique identifier for the agent run. ### AgentRunResult Bases: `Generic[OutputDataT]` The final result of an agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python @dataclasses.dataclass class AgentRunResult(Generic[OutputDataT]): """The final result of an agent run.""" output: OutputDataT """The output data from the agent run.""" _output_tool_name: str | None = dataclasses.field(repr=False, compare=False, default=None) _state: _agent_graph.GraphAgentState = dataclasses.field( repr=False, compare=False, default_factory=_agent_graph.GraphAgentState ) _new_message_index: int = dataclasses.field(repr=False, compare=False, default=0) _traceparent_value: str | None = dataclasses.field(repr=False, compare=False, default=None) @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: if self._traceparent_value is None and required: # pragma: no cover raise AttributeError('No span was created for this agent run') return self._traceparent_value def _set_output_tool_return(self, return_content: str) -> list[_messages.ModelMessage]: """Set return content for the output tool. Useful if you want to continue the conversation and want to set the response to the output tool call. """ if not self._output_tool_name: raise ValueError('Cannot set output tool return content when the return type is `str`.') messages = self._state.message_history last_message = messages[-1] for idx, part in enumerate(last_message.parts): if isinstance(part, _messages.ToolReturnPart) and part.tool_name == self._output_tool_name: # Only do deepcopy when we have to modify copied_messages = list(messages) copied_last = deepcopy(last_message) copied_last.parts[idx].content = return_content # type: ignore[misc] copied_messages[-1] = copied_last return copied_messages raise LookupError(f'No tool call found with tool name {self._output_tool_name!r}.') def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. 
""" if output_tool_return_content is not None: return self._set_output_tool_return(output_tool_return_content) else: return self._state.message_history def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) @property def response(self) -> _messages.ModelResponse: """Return the last response from the message history.""" # The response may not be the very last item if it contained an output tool call. See `CallToolsNode._handle_final_result`. for message in reversed(self.all_messages()): if isinstance(message, _messages.ModelResponse): return message raise ValueError('No response found in the message history') # pragma: no cover # TODO (v2): Make this a property def usage(self) -> _usage.RunUsage: """Return the usage of the whole run.""" return self._state.usage # TODO (v2): Make this a property def timestamp(self) -> datetime: """Return the timestamp of last response.""" return self.response.timestamp @property def metadata(self) -> dict[str, Any] | None: """Metadata associated with this agent run, if configured.""" return self._state.metadata @property def run_id(self) -> str: """The unique identifier for the agent run.""" return self._state.run_id ``` #### output ```python output: OutputDataT ``` The output data from the agent run. #### all_messages ```python all_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return the history of \_messages. 
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `output_tool_return_content` | `str \| None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `list[ModelMessage]` | List of messages. |

Source code in `pydantic_ai_slim/pydantic_ai/run.py`

```python
def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]:
    """Return the history of _messages.

    Args:
        output_tool_return_content: The return content of the tool call to set in the last message.
            This provides a convenient way to modify the content of the output tool call if you want to continue
            the conversation and want to set the response to the output tool call.
            If `None`, the last message will not be modified.

    Returns:
        List of messages.
    """
    if output_tool_return_content is not None:
        return self._set_output_tool_return(output_tool_return_content)
    else:
        return self._state.message_history
```

#### all_messages_json

```python
all_messages_json(
    *, output_tool_return_content: str | None = None
) -> bytes
```

Return all messages from all_messages as JSON bytes.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `output_tool_return_content` | `str \| None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `bytes` | JSON bytes representing the messages. |

Source code in `pydantic_ai_slim/pydantic_ai/run.py`

```python
def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes:
    """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRunResult.all_messages] as JSON bytes.

    Args:
        output_tool_return_content: The return content of the tool call to set in the last message.
            This provides a convenient way to modify the content of the output tool call if you want to continue
            the conversation and want to set the response to the output tool call.
            If `None`, the last message will not be modified.

    Returns:
        JSON bytes representing the messages.
    """
    return _messages.ModelMessagesTypeAdapter.dump_json(
        self.all_messages(output_tool_return_content=output_tool_return_content)
    )
```

#### new_messages

```python
new_messages(
    *, output_tool_return_content: str | None = None
) -> list[ModelMessage]
```

Return new messages associated with this run.

Messages from older runs are excluded.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `output_tool_return_content` | `str \| None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `list[ModelMessage]` | List of new messages. |

Source code in `pydantic_ai_slim/pydantic_ai/run.py`

```python
def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]:
    """Return new messages associated with this run.

    Messages from older runs are excluded.

    Args:
        output_tool_return_content: The return content of the tool call to set in the last message.
            This provides a convenient way to modify the content of the output tool call if you want to continue
            the conversation and want to set the response to the output tool call.
            If `None`, the last message will not be modified.

    Returns:
        List of new messages.
    """
    return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :]
```

#### new_messages_json

```python
new_messages_json(
    *, output_tool_return_content: str | None = None
) -> bytes
```

Return new messages from new_messages as JSON bytes.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `output_tool_return_content` | `str \| None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `bytes` | JSON bytes representing the new messages. |

Source code in `pydantic_ai_slim/pydantic_ai/run.py`

```python
def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes:
    """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRunResult.new_messages] as JSON bytes.

    Args:
        output_tool_return_content: The return content of the tool call to set in the last message.
            This provides a convenient way to modify the content of the output tool call if you want to continue
            the conversation and want to set the response to the output tool call.
            If `None`, the last message will not be modified.

    Returns:
        JSON bytes representing the new messages.
    """
    return _messages.ModelMessagesTypeAdapter.dump_json(
        self.new_messages(output_tool_return_content=output_tool_return_content)
    )
```

#### response

```python
response: ModelResponse
```

Return the last response from the message history.
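Putting the result API together, here's a minimal sketch using the built-in `'test'` model (as in the `capture_run_messages` example below); the prompts are placeholders. It reads the output and usage, round-trips the history through JSON, and feeds `new_messages()` into a follow-up run:

```python
from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessagesTypeAdapter

agent = Agent('test')  # swap in any real model name

result = agent.run_sync('Tell me a joke.')
print(result.output)   # the validated output of the run
print(result.usage())  # RunUsage: request and token counts

# Persist the conversation as JSON bytes and load it back later.
raw = result.all_messages_json()
restored = ModelMessagesTypeAdapter.validate_json(raw)
print(len(restored), 'messages restored')

# Continue the conversation by passing this run's messages into the next run.
followup = agent.run_sync('Explain it.', message_history=result.new_messages())
print(followup.output)
```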
#### usage ```python usage() -> RunUsage ``` Return the usage of the whole run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def usage(self) -> _usage.RunUsage: """Return the usage of the whole run.""" return self._state.usage ``` #### timestamp ```python timestamp() -> datetime ``` Return the timestamp of last response. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def timestamp(self) -> datetime: """Return the timestamp of last response.""" return self.response.timestamp ``` #### metadata ```python metadata: dict[str, Any] | None ``` Metadata associated with this agent run, if configured. #### run_id ```python run_id: str ``` The unique identifier for the agent run. ### EndStrategy ```python EndStrategy = Literal['early', 'exhaustive'] ``` ### RunOutputDataT ```python RunOutputDataT = TypeVar('RunOutputDataT') ``` Type variable for the result data of a run where `output_type` was customized on the run call. ### capture_run_messages ```python capture_run_messages() -> Iterator[list[ModelMessage]] ``` Context manager to access the messages used in a run, run_sync, or run_stream call. Useful when a run may raise an exception, see [model errors](https://ai.pydantic.dev/agent/#model-errors) for more information. Examples: ```python from pydantic_ai import Agent, capture_run_messages agent = Agent('test') with capture_run_messages() as messages: try: result = agent.run_sync('foobar') except Exception: print(messages) raise ``` Note If you call `run`, `run_sync`, or `run_stream` more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only. Source code in `pydantic_ai_slim/pydantic_ai/_agent_graph.py` ````python @contextmanager def capture_run_messages() -> Iterator[list[_messages.ModelMessage]]: """Context manager to access the messages used in a [`run`][pydantic_ai.agent.AbstractAgent.run], [`run_sync`][pydantic_ai.agent.AbstractAgent.run_sync], or [`run_stream`][pydantic_ai.agent.AbstractAgent.run_stream] call. Useful when a run may raise an exception, see [model errors](../agent.md#model-errors) for more information. Examples: ```python from pydantic_ai import Agent, capture_run_messages agent = Agent('test') with capture_run_messages() as messages: try: result = agent.run_sync('foobar') except Exception: print(messages) raise ``` !!! note If you call `run`, `run_sync`, or `run_stream` more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only. """ token = None messages: list[_messages.ModelMessage] = [] # Try to reuse existing message context if available try: messages = _messages_ctx_var.get().messages except LookupError: # No existing context, create a new one token = _messages_ctx_var.set(_RunMessages(messages)) try: yield messages finally: # Clean up context if we created it if token is not None: _messages_ctx_var.reset(token) ```` ### InstrumentationSettings Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - Agent.instrument_all() - InstrumentedModel See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python @dataclass(init=False) class InstrumentationSettings: """Options for instrumenting models and agents with OpenTelemetry. 
Used in: - `Agent(instrument=...)` - [`Agent.instrument_all()`][pydantic_ai.agent.Agent.instrument_all] - [`InstrumentedModel`][pydantic_ai.models.instrumented.InstrumentedModel] See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. """ tracer: Tracer = field(repr=False) logger: Logger = field(repr=False) event_mode: Literal['attributes', 'logs'] = 'attributes' include_binary_content: bool = True include_content: bool = True version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal['attributes', 'logs'] = 'attributes', logger_provider: LoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. logger_provider: The OpenTelemetry logger provider to use. If not provided, the global logger provider is used. Calling `logfire.configure()` sets the global logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. 
""" from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() logger_provider = logger_provider or get_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.logger = logger_provider.get_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) self.cost_histogram = self.meter.create_histogram( 'operation.cost', unit='{USD}', description='Monetary cost', ) def messages_to_otel_events( self, messages: list[ModelMessage], parameters: ModelRequestParameters | None = None ) -> list[LogRecord]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. parameters: The model request parameters. Returns: A list of OpenTelemetry events. 
""" events: list[LogRecord] = [] instructions = InstrumentedModel._get_instructions(messages, parameters) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( LogRecord( attributes={'event.name': 'gen_ai.system.message'}, body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[LogRecord] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events def messages_to_otel_messages(self, messages: list[ModelMessage]) -> list[_otel_messages.ChatMessage]: result: list[_otel_messages.ChatMessage] = [] for message in messages: if isinstance(message, ModelRequest): for is_system, group in itertools.groupby(message.parts, key=lambda p: isinstance(p, SystemPromptPart)): message_parts: list[_otel_messages.MessagePart] = [] for part in group: if hasattr(part, 'otel_message_parts'): message_parts.extend(part.otel_message_parts(self)) result.append( _otel_messages.ChatMessage(role='system' if is_system else 'user', parts=message_parts) ) elif isinstance(message, ModelResponse): # pragma: no branch otel_message = _otel_messages.OutputMessage(role='assistant', parts=message.otel_message_parts(self)) if message.finish_reason is not None: otel_message['finish_reason'] = message.finish_reason result.append(otel_message) return result def handle_messages( self, input_messages: list[ModelMessage], response: ModelResponse, system: str, span: Span, parameters: ModelRequestParameters | None = None, ): if self.version == 1: events = self.messages_to_otel_events(input_messages, parameters) for event in self.messages_to_otel_events([response], parameters): events.append( LogRecord( attributes={'event.name': 'gen_ai.choice'}, body={ 'index': 0, 'message': event.body, }, ) ) for event in events: event.attributes = { GEN_AI_SYSTEM_ATTRIBUTE: system, **(event.attributes or {}), } self._emit_events(span, events) else: output_messages = self.messages_to_otel_messages([response]) assert len(output_messages) == 1 output_message = output_messages[0] instructions = InstrumentedModel._get_instructions(input_messages, parameters) # pyright: ignore [reportPrivateUsage] system_instructions_attributes = self.system_instructions_attributes(instructions) attributes: dict[str, AttributeValue] = { 'gen_ai.input.messages': json.dumps(self.messages_to_otel_messages(input_messages)), 'gen_ai.output.messages': json.dumps([output_message]), **system_instructions_attributes, 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { 'gen_ai.input.messages': {'type': 'array'}, 'gen_ai.output.messages': {'type': 'array'}, **( {'gen_ai.system_instructions': {'type': 'array'}} if system_instructions_attributes else {} ), 'model_request_parameters': {'type': 'object'}, }, } ), } span.set_attributes(attributes) def system_instructions_attributes(self, instructions: str | None) -> dict[str, str]: if instructions and self.include_content: return { 'gen_ai.system_instructions': json.dumps([_otel_messages.TextPart(type='text', content=instructions)]), } return {} def 
_emit_events(self, span: Span, events: list[LogRecord]) -> None: if self.event_mode == 'logs': for event in events: self.logger.emit(event) else: attr_name = 'events' span.set_attributes( { attr_name: json.dumps([InstrumentedModel.event_to_dict(event) for event in events]), 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { attr_name: {'type': 'array'}, 'model_request_parameters': {'type': 'object'}, }, } ), } ) def record_metrics( self, response: ModelResponse, price_calculation: PriceCalculation | None, attributes: dict[str, AttributeValue], ): for typ in ['input', 'output']: if not (tokens := getattr(response.usage, f'{typ}_tokens', 0)): # pragma: no cover continue token_attributes = {**attributes, 'gen_ai.token.type': typ} self.tokens_histogram.record(tokens, token_attributes) if price_calculation: cost = float(getattr(price_calculation, f'{typ}_price')) self.cost_histogram.record(cost, token_attributes) ``` #### __init__ ```python __init__( *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[ 1, 2, 3 ] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal[ "attributes", "logs" ] = "attributes", logger_provider: LoggerProvider | None = None ) ``` Create instrumentation options. Parameters: | Name | Type | Description | Default | | ------------------------ | ------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `tracer_provider` | \`TracerProvider | None\` | The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling logfire.configure() sets the global tracer provider, so most users don't need this. | | `meter_provider` | \`MeterProvider | None\` | The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling logfire.configure() sets the global meter provider, so most users don't need this. | | `include_binary_content` | `bool` | Whether to include binary content in the instrumentation events. | `True` | | `include_content` | `bool` | Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. | `True` | | `version` | `Literal[1, 2, 3]` | Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters event_mode and logger_provider are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - gen_ai.system_instructions for instructions passed to the agent. 
- gen_ai.input.messages and gen_ai.output.messages on model request spans. - pydantic_ai.all_messages on agent run spans. | `DEFAULT_INSTRUMENTATION_VERSION` | | `event_mode` | `Literal['attributes', 'logs']` | The mode for emitting events in version 1. If 'attributes', events are attached to the span as attributes. If 'logs', events are emitted as OpenTelemetry log-based events. | `'attributes'` | | `logger_provider` | \`LoggerProvider | None\` | The OpenTelemetry logger provider to use. If not provided, the global logger provider is used. Calling logfire.configure() sets the global logger provider, so most users don't need this. This is only used if event_mode='logs' and version=1. | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal['attributes', 'logs'] = 'attributes', logger_provider: LoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. logger_provider: The OpenTelemetry logger provider to use. If not provided, the global logger provider is used. Calling `logfire.configure()` sets the global logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. 
""" from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() logger_provider = logger_provider or get_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.logger = logger_provider.get_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) self.cost_histogram = self.meter.create_histogram( 'operation.cost', unit='{USD}', description='Monetary cost', ) ``` #### messages_to_otel_events ```python messages_to_otel_events( messages: list[ModelMessage], parameters: ModelRequestParameters | None = None, ) -> list[LogRecord] ``` Convert a list of model messages to OpenTelemetry events. Parameters: | Name | Type | Description | Default | | ------------ | ------------------------ | ------------------------ | ----------------------------- | | `messages` | `list[ModelMessage]` | The messages to convert. | *required* | | `parameters` | \`ModelRequestParameters | None\` | The model request parameters. | Returns: | Type | Description | | ----------------- | ------------------------------- | | `list[LogRecord]` | A list of OpenTelemetry events. | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def messages_to_otel_events( self, messages: list[ModelMessage], parameters: ModelRequestParameters | None = None ) -> list[LogRecord]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. parameters: The model request parameters. Returns: A list of OpenTelemetry events. 
""" events: list[LogRecord] = [] instructions = InstrumentedModel._get_instructions(messages, parameters) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( LogRecord( attributes={'event.name': 'gen_ai.system.message'}, body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[LogRecord] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events ``` ### EventStreamHandler ```python EventStreamHandler: TypeAlias = Callable[ [ RunContext[AgentDepsT], AsyncIterable[AgentStreamEvent], ], Awaitable[None], ] ``` A function that receives agent RunContext and an async iterable of events from the model's streaming response and the agent's execution of tools. # `pydantic_ai.builtin_tools` ### BUILTIN_TOOL_TYPES ```python BUILTIN_TOOL_TYPES: dict[str, type[AbstractBuiltinTool]] = ( {} ) ``` Registry of all builtin tool types, keyed by their kind string. This dict is populated automatically via `__init_subclass__` when tool classes are defined. ### ImageAspectRatio ```python ImageAspectRatio = Literal[ "21:9", "16:9", "4:3", "3:2", "1:1", "9:16", "3:4", "2:3", "5:4", "4:5", ] ``` Supported aspect ratios for image generation tools. ### AbstractBuiltinTool Bases: `ABC` A builtin tool that can be used by an agent. This class is abstract and cannot be instantiated directly. The builtin tools are passed to the model as part of the `ModelRequestParameters`. Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class AbstractBuiltinTool(ABC): """A builtin tool that can be used by an agent. This class is abstract and cannot be instantiated directly. The builtin tools are passed to the model as part of the `ModelRequestParameters`. """ kind: str = 'unknown_builtin_tool' """Built-in tool identifier, this should be available on all built-in tools as a discriminator.""" @property def unique_id(self) -> str: """A unique identifier for the builtin tool. If multiple instances of the same builtin tool can be passed to the model, subclasses should override this property to allow them to be distinguished. """ return self.kind @property def label(self) -> str: """Human-readable label for UI display. Subclasses should override this to provide a meaningful label. 
""" return self.kind.replace('_', ' ').title() def __init_subclass__(cls, **kwargs: Any) -> None: super().__init_subclass__(**kwargs) BUILTIN_TOOL_TYPES[cls.kind] = cls @classmethod def __get_pydantic_core_schema__( cls, _source_type: Any, handler: pydantic.GetCoreSchemaHandler ) -> core_schema.CoreSchema: if cls is not AbstractBuiltinTool: return handler(cls) tools = BUILTIN_TOOL_TYPES.values() if len(tools) == 1: # pragma: no cover tools_type = next(iter(tools)) else: tools_annotated = [Annotated[tool, pydantic.Tag(tool.kind)] for tool in tools] tools_type = Annotated[Union[tuple(tools_annotated)], pydantic.Discriminator(_tool_discriminator)] # noqa: UP007 return handler(tools_type) ``` #### kind ```python kind: str = 'unknown_builtin_tool' ``` Built-in tool identifier, this should be available on all built-in tools as a discriminator. #### unique_id ```python unique_id: str ``` A unique identifier for the builtin tool. If multiple instances of the same builtin tool can be passed to the model, subclasses should override this property to allow them to be distinguished. #### label ```python label: str ``` Human-readable label for UI display. Subclasses should override this to provide a meaningful label. ### WebSearchTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to search the web for information. The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models. Supported by: - Anthropic - OpenAI Responses - Groq - Google - xAI Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class WebSearchTool(AbstractBuiltinTool): """A builtin tool that allows your agent to search the web for information. The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models. Supported by: * Anthropic * OpenAI Responses * Groq * Google * xAI """ search_context_size: Literal['low', 'medium', 'high'] = 'medium' """The `search_context_size` parameter controls how much context is retrieved from the web to help the tool formulate a response. Supported by: * OpenAI Responses """ user_location: WebSearchUserLocation | None = None """The `user_location` parameter allows you to localize search results based on a user's location. Supported by: * Anthropic * OpenAI Responses """ blocked_domains: list[str] | None = None """If provided, these domains will never appear in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: * Anthropic, see * Groq, see * xAI, see """ allowed_domains: list[str] | None = None """If provided, only these domains will be included in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: * Anthropic, see * Groq, see * OpenAI Responses, see * xAI, see """ max_uses: int | None = None """If provided, the tool will stop searching the web after the given number of uses. Supported by: * Anthropic """ kind: str = 'web_search' """The kind of tool.""" ``` #### search_context_size ```python search_context_size: Literal["low", "medium", "high"] = ( "medium" ) ``` The `search_context_size` parameter controls how much context is retrieved from the web to help the tool formulate a response. Supported by: - OpenAI Responses #### user_location ```python user_location: WebSearchUserLocation | None = None ``` The `user_location` parameter allows you to localize search results based on a user's location. 
Supported by: - Anthropic - OpenAI Responses #### blocked_domains ```python blocked_domains: list[str] | None = None ``` If provided, these domains will never appear in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see - Groq, see - xAI, see #### allowed_domains ```python allowed_domains: list[str] | None = None ``` If provided, only these domains will be included in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see - Groq, see - OpenAI Responses, see - xAI, see #### max_uses ```python max_uses: int | None = None ``` If provided, the tool will stop searching the web after the given number of uses. Supported by: - Anthropic #### kind ```python kind: str = 'web_search' ``` The kind of tool. ### WebSearchUserLocation Bases: `TypedDict` Allows you to localize search results based on a user's location. Supported by: - Anthropic - OpenAI Responses Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python class WebSearchUserLocation(TypedDict, total=False): """Allows you to localize search results based on a user's location. Supported by: * Anthropic * OpenAI Responses """ city: str """The city where the user is located.""" country: str """The country where the user is located. For OpenAI, this must be a 2-letter country code (e.g., 'US', 'GB').""" region: str """The region or state where the user is located.""" timezone: str """The timezone of the user's location.""" ``` #### city ```python city: str ``` The city where the user is located. #### country ```python country: str ``` The country where the user is located. For OpenAI, this must be a 2-letter country code (e.g., 'US', 'GB'). #### region ```python region: str ``` The region or state where the user is located. #### timezone ```python timezone: str ``` The timezone of the user's location. ### CodeExecutionTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to execute code. Supported by: - Anthropic - OpenAI Responses - Google - Bedrock (Nova2.0) - xAI Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class CodeExecutionTool(AbstractBuiltinTool): """A builtin tool that allows your agent to execute code. Supported by: * Anthropic * OpenAI Responses * Google * Bedrock (Nova2.0) * xAI """ kind: str = 'code_execution' """The kind of tool.""" ``` #### kind ```python kind: str = 'code_execution' ``` The kind of tool. ### WebFetchTool Bases: `AbstractBuiltinTool` Allows your agent to access contents from URLs. The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models. Supported by: - Anthropic - Google Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class WebFetchTool(AbstractBuiltinTool): """Allows your agent to access contents from URLs. The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models. Supported by: * Anthropic * Google """ max_uses: int | None = None """If provided, the tool will stop fetching URLs after the given number of uses. Supported by: * Anthropic """ allowed_domains: list[str] | None = None """If provided, only these domains will be fetched. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. 
Supported by: * Anthropic, see """ blocked_domains: list[str] | None = None """If provided, these domains will never be fetched. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: * Anthropic, see """ enable_citations: bool = False """If True, enables citations for fetched content. Supported by: * Anthropic """ max_content_tokens: int | None = None """Maximum content length in tokens for fetched content. Supported by: * Anthropic """ kind: str = 'web_fetch' """The kind of tool.""" ``` #### max_uses ```python max_uses: int | None = None ``` If provided, the tool will stop fetching URLs after the given number of uses. Supported by: - Anthropic #### allowed_domains ```python allowed_domains: list[str] | None = None ``` If provided, only these domains will be fetched. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see #### blocked_domains ```python blocked_domains: list[str] | None = None ``` If provided, these domains will never be fetched. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see #### enable_citations ```python enable_citations: bool = False ``` If True, enables citations for fetched content. Supported by: - Anthropic #### max_content_tokens ```python max_content_tokens: int | None = None ``` Maximum content length in tokens for fetched content. Supported by: - Anthropic #### kind ```python kind: str = 'web_fetch' ``` The kind of tool. ### UrlContextTool Bases: `WebFetchTool` Deprecated Use `WebFetchTool` instead. Deprecated alias for WebFetchTool. Use WebFetchTool instead. Overrides kind to 'url_context' so old serialized payloads with {"kind": "url_context", ...} can be deserialized to UrlContextTool for backward compatibility. Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @deprecated('Use `WebFetchTool` instead.') @dataclass(kw_only=True) class UrlContextTool(WebFetchTool): """Deprecated alias for WebFetchTool. Use WebFetchTool instead. Overrides kind to 'url_context' so old serialized payloads with {"kind": "url_context", ...} can be deserialized to UrlContextTool for backward compatibility. """ kind: str = 'url_context' """The kind of tool (deprecated value for backward compatibility).""" ``` #### kind ```python kind: str = 'url_context' ``` The kind of tool (deprecated value for backward compatibility). ### ImageGenerationTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to generate images. Supported by: - OpenAI Responses - Google Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class ImageGenerationTool(AbstractBuiltinTool): """A builtin tool that allows your agent to generate images. Supported by: * OpenAI Responses * Google """ background: Literal['transparent', 'opaque', 'auto'] = 'auto' """Background type for the generated image. Supported by: * OpenAI Responses. 'transparent' is only supported for 'png' and 'webp' output formats. """ input_fidelity: Literal['high', 'low'] | None = None """ Control how much effort the model will exert to match the style and features, especially facial features, of input images. Supported by: * OpenAI Responses. Default: 'low'. """ moderation: Literal['auto', 'low'] = 'auto' """Moderation level for the generated image. Supported by: * OpenAI Responses """ output_compression: int | None = None """Compression level for the output image. 
Supported by: * OpenAI Responses. Only supported for 'jpeg' and 'webp' output formats. Default: 100. * Google (Vertex AI only). Only supported for 'jpeg' output format. Default: 75. Setting this will default `output_format` to 'jpeg' if not specified. """ output_format: Literal['png', 'webp', 'jpeg'] | None = None """The output format of the generated image. Supported by: * OpenAI Responses. Default: 'png'. * Google (Vertex AI only). Default: 'png', or 'jpeg' if `output_compression` is set. """ partial_images: int = 0 """ Number of partial images to generate in streaming mode. Supported by: * OpenAI Responses. Supports 0 to 3. """ quality: Literal['low', 'medium', 'high', 'auto'] = 'auto' """The quality of the generated image. Supported by: * OpenAI Responses """ size: Literal['auto', '1024x1024', '1024x1536', '1536x1024', '1K', '2K', '4K'] | None = None """The size of the generated image. * OpenAI Responses: 'auto' (default: model selects the size based on the prompt), '1024x1024', '1024x1536', '1536x1024' * Google (Gemini 3 Pro Image and later): '1K' (default), '2K', '4K' """ aspect_ratio: ImageAspectRatio | None = None """The aspect ratio to use for generated images. Supported by: * Google image-generation models (Gemini) * OpenAI Responses (maps '1:1', '2:3', and '3:2' to supported sizes) """ kind: str = 'image_generation' """The kind of tool.""" ``` #### background ```python background: Literal["transparent", "opaque", "auto"] = ( "auto" ) ``` Background type for the generated image. Supported by: - OpenAI Responses. 'transparent' is only supported for 'png' and 'webp' output formats. #### input_fidelity ```python input_fidelity: Literal['high', 'low'] | None = None ``` Control how much effort the model will exert to match the style and features, especially facial features, of input images. Supported by: - OpenAI Responses. Default: 'low'. #### moderation ```python moderation: Literal['auto', 'low'] = 'auto' ``` Moderation level for the generated image. Supported by: - OpenAI Responses #### output_compression ```python output_compression: int | None = None ``` Compression level for the output image. Supported by: - OpenAI Responses. Only supported for 'jpeg' and 'webp' output formats. Default: 100. - Google (Vertex AI only). Only supported for 'jpeg' output format. Default: 75. Setting this will default `output_format` to 'jpeg' if not specified. #### output_format ```python output_format: Literal['png', 'webp', 'jpeg'] | None = None ``` The output format of the generated image. Supported by: - OpenAI Responses. Default: 'png'. - Google (Vertex AI only). Default: 'png', or 'jpeg' if `output_compression` is set. #### partial_images ```python partial_images: int = 0 ``` Number of partial images to generate in streaming mode. Supported by: - OpenAI Responses. Supports 0 to 3. #### quality ```python quality: Literal['low', 'medium', 'high', 'auto'] = 'auto' ``` The quality of the generated image. Supported by: - OpenAI Responses #### size ```python size: ( Literal[ "auto", "1024x1024", "1024x1536", "1536x1024", "1K", "2K", "4K", ] | None ) = None ``` The size of the generated image. - OpenAI Responses: 'auto' (default: model selects the size based on the prompt), '1024x1024', '1024x1536', '1536x1024' - Google (Gemini 3 Pro Image and later): '1K' (default), '2K', '4K' #### aspect_ratio ```python aspect_ratio: ImageAspectRatio | None = None ``` The aspect ratio to use for generated images. 
Supported by: - Google image-generation models (Gemini) - OpenAI Responses (maps '1:1', '2:3', and '3:2' to supported sizes) #### kind ```python kind: str = 'image_generation' ``` The kind of tool. ### MemoryTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to use memory. Supported by: - Anthropic Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class MemoryTool(AbstractBuiltinTool): """A builtin tool that allows your agent to use memory. Supported by: * Anthropic """ kind: str = 'memory' """The kind of tool.""" ``` #### kind ```python kind: str = 'memory' ``` The kind of tool. ### MCPServerTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to use MCP servers. Supported by: - OpenAI Responses - Anthropic - xAI Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class MCPServerTool(AbstractBuiltinTool): """A builtin tool that allows your agent to use MCP servers. Supported by: * OpenAI Responses * Anthropic * xAI """ id: str """A unique identifier for the MCP server.""" url: str """The URL of the MCP server to use. For OpenAI Responses, it is possible to use `connector_id` by providing it as `x-openai-connector:`. """ authorization_token: str | None = None """Authorization header to use when making requests to the MCP server. Supported by: * OpenAI Responses * Anthropic * xAI """ description: str | None = None """A description of the MCP server. Supported by: * OpenAI Responses * xAI """ allowed_tools: list[str] | None = None """A list of tools that the MCP server can use. Supported by: * OpenAI Responses * Anthropic * xAI """ headers: dict[str, str] | None = None """Optional HTTP headers to send to the MCP server. Use for authentication or other purposes. Supported by: * OpenAI Responses * xAI """ kind: str = 'mcp_server' @property def unique_id(self) -> str: return ':'.join([self.kind, self.id]) @property def label(self) -> str: return f'MCP: {self.id}' ``` #### id ```python id: str ``` A unique identifier for the MCP server. #### url ```python url: str ``` The URL of the MCP server to use. For OpenAI Responses, it is possible to use `connector_id` by providing it as `x-openai-connector:`. #### authorization_token ```python authorization_token: str | None = None ``` Authorization header to use when making requests to the MCP server. Supported by: - OpenAI Responses - Anthropic - xAI #### description ```python description: str | None = None ``` A description of the MCP server. Supported by: - OpenAI Responses - xAI #### allowed_tools ```python allowed_tools: list[str] | None = None ``` A list of tools that the MCP server can use. Supported by: - OpenAI Responses - Anthropic - xAI #### headers ```python headers: dict[str, str] | None = None ``` Optional HTTP headers to send to the MCP server. Use for authentication or other purposes. Supported by: - OpenAI Responses - xAI ### FileSearchTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to search through uploaded files using vector search. This tool provides a fully managed Retrieval-Augmented Generation (RAG) system that handles file storage, chunking, embedding generation, and context injection into prompts. 
Supported by: - OpenAI Responses - Google (Gemini) Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class FileSearchTool(AbstractBuiltinTool): """A builtin tool that allows your agent to search through uploaded files using vector search. This tool provides a fully managed Retrieval-Augmented Generation (RAG) system that handles file storage, chunking, embedding generation, and context injection into prompts. Supported by: * OpenAI Responses * Google (Gemini) """ file_store_ids: Sequence[str] """The file store IDs to search through. For OpenAI, these are the IDs of vector stores created via the OpenAI API. For Google, these are file search store names that have been uploaded and processed via the Gemini Files API. """ kind: str = 'file_search' """The kind of tool.""" ``` #### file_store_ids ```python file_store_ids: Sequence[str] ``` The file store IDs to search through. For OpenAI, these are the IDs of vector stores created via the OpenAI API. For Google, these are file search store names that have been uploaded and processed via the Gemini Files API. #### kind ```python kind: str = 'file_search' ``` The kind of tool. ### DEPRECATED_BUILTIN_TOOLS ```python DEPRECATED_BUILTIN_TOOLS: frozenset[ type[AbstractBuiltinTool] ] = frozenset({UrlContextTool}) ``` Set of deprecated builtin tool IDs that should not be offered in new UIs. ### SUPPORTED_BUILTIN_TOOLS ```python SUPPORTED_BUILTIN_TOOLS = frozenset( cls for cls in (values()) if cls not in DEPRECATED_BUILTIN_TOOLS ) ``` Get the set of all builtin tool types (excluding deprecated tools). # `pydantic_ai.common_tools` ### DuckDuckGoResult Bases: `TypedDict` A DuckDuckGo search result. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/duckduckgo.py` ```python class DuckDuckGoResult(TypedDict): """A DuckDuckGo search result.""" title: str """The title of the search result.""" href: str """The URL of the search result.""" body: str """The body of the search result.""" ``` #### title ```python title: str ``` The title of the search result. #### href ```python href: str ``` The URL of the search result. #### body ```python body: str ``` The body of the search result. ### DuckDuckGoSearchTool The DuckDuckGo search tool. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/duckduckgo.py` ```python @dataclass class DuckDuckGoSearchTool: """The DuckDuckGo search tool.""" client: DDGS """The DuckDuckGo search client.""" _: KW_ONLY max_results: int | None """The maximum number of results. If None, returns results only from the first response.""" async def __call__(self, query: str) -> list[DuckDuckGoResult]: """Searches DuckDuckGo for the given query and returns the results. Args: query: The query to search for. Returns: The search results. """ search = functools.partial(self.client.text, max_results=self.max_results) results = await anyio.to_thread.run_sync(search, query) return duckduckgo_ta.validate_python(results) ``` #### client ```python client: DDGS ``` The DuckDuckGo search client. #### max_results ```python max_results: int | None ``` The maximum number of results. If None, returns results only from the first response. #### __call__ ```python __call__(query: str) -> list[DuckDuckGoResult] ``` Searches DuckDuckGo for the given query and returns the results. Parameters: | Name | Type | Description | Default | | ------- | ----- | ------------------------ | ---------- | | `query` | `str` | The query to search for. 
| *required* | Returns: | Type | Description | | ------------------------ | ------------------- | | `list[DuckDuckGoResult]` | The search results. | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/duckduckgo.py` ```python async def __call__(self, query: str) -> list[DuckDuckGoResult]: """Searches DuckDuckGo for the given query and returns the results. Args: query: The query to search for. Returns: The search results. """ search = functools.partial(self.client.text, max_results=self.max_results) results = await anyio.to_thread.run_sync(search, query) return duckduckgo_ta.validate_python(results) ``` ### duckduckgo_search_tool ```python duckduckgo_search_tool( duckduckgo_client: DDGS | None = None, max_results: int | None = None, ) ``` Creates a DuckDuckGo search tool. Parameters: | Name | Type | Description | Default | | ------------------- | ------ | ----------- | ------------------------------------------------------------------------------------- | | `duckduckgo_client` | \`DDGS | None\` | The DuckDuckGo search client. | | `max_results` | \`int | None\` | The maximum number of results. If None, returns results only from the first response. | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/duckduckgo.py` ```python def duckduckgo_search_tool(duckduckgo_client: DDGS | None = None, max_results: int | None = None): """Creates a DuckDuckGo search tool. Args: duckduckgo_client: The DuckDuckGo search client. max_results: The maximum number of results. If None, returns results only from the first response. """ return Tool[Any]( DuckDuckGoSearchTool(client=duckduckgo_client or DDGS(), max_results=max_results).__call__, name='duckduckgo_search', description='Searches DuckDuckGo for the given query and returns the results.', ) ``` Exa tools for Pydantic AI agents. Provides web search, content retrieval, and AI-powered answer capabilities using the Exa API, a neural search engine that finds high-quality, relevant results across billions of web pages. ### ExaSearchResult Bases: `TypedDict` An Exa search result with content. See [Exa Search API documentation](https://docs.exa.ai/reference/search) for more information. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python class ExaSearchResult(TypedDict): """An Exa search result with content. See [Exa Search API documentation](https://docs.exa.ai/reference/search) for more information. """ title: str """The title of the search result.""" url: str """The URL of the search result.""" published_date: str | None """The published date of the content, if available.""" author: str | None """The author of the content, if available.""" text: str """The text content of the search result.""" ``` #### title ```python title: str ``` The title of the search result. #### url ```python url: str ``` The URL of the search result. #### published_date ```python published_date: str | None ``` The published date of the content, if available. #### author ```python author: str | None ``` The author of the content, if available. #### text ```python text: str ``` The text content of the search result. ### ExaAnswerResult Bases: `TypedDict` An Exa answer result with citations. See [Exa Answer API documentation](https://docs.exa.ai/reference/answer) for more information. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python class ExaAnswerResult(TypedDict): """An Exa answer result with citations. See [Exa Answer API documentation](https://docs.exa.ai/reference/answer) for more information. 
""" answer: str """The AI-generated answer to the query.""" citations: list[dict[str, Any]] """Citations supporting the answer.""" ``` #### answer ```python answer: str ``` The AI-generated answer to the query. #### citations ```python citations: list[dict[str, Any]] ``` Citations supporting the answer. ### ExaContentResult Bases: `TypedDict` Content retrieved from a URL. See [Exa Contents API documentation](https://docs.exa.ai/reference/get-contents) for more information. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python class ExaContentResult(TypedDict): """Content retrieved from a URL. See [Exa Contents API documentation](https://docs.exa.ai/reference/get-contents) for more information. """ url: str """The URL of the content.""" title: str """The title of the page.""" text: str """The text content of the page.""" author: str | None """The author of the content, if available.""" published_date: str | None """The published date of the content, if available.""" ``` #### url ```python url: str ``` The URL of the content. #### title ```python title: str ``` The title of the page. #### text ```python text: str ``` The text content of the page. #### author ```python author: str | None ``` The author of the content, if available. #### published_date ```python published_date: str | None ``` The published date of the content, if available. ### ExaSearchTool The Exa search tool. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python @dataclass class ExaSearchTool: """The Exa search tool.""" client: AsyncExa """The Exa async client.""" num_results: int """The number of results to return.""" max_characters: int | None """Maximum characters of text content per result, or None for no limit.""" async def __call__( self, query: str, search_type: Literal['auto', 'keyword', 'neural', 'fast', 'deep'] = 'auto', ) -> list[ExaSearchResult]: """Searches Exa for the given query and returns the results with content. Args: query: The search query to execute with Exa. search_type: The type of search to perform. 'auto' automatically chooses the best search type, 'keyword' for exact matches, 'neural' for semantic search, 'fast' for speed-optimized search, 'deep' for comprehensive multi-query search. Returns: The search results with text content. """ text_config: bool | dict[str, int] = {'maxCharacters': self.max_characters} if self.max_characters else True response = await self.client.search( # pyright: ignore[reportUnknownMemberType] query, num_results=self.num_results, type=search_type, contents={'text': text_config}, ) return [ ExaSearchResult( title=result.title or '', url=result.url, published_date=result.published_date, author=result.author, text=result.text or '', ) for result in response.results ] ``` #### client ```python client: AsyncExa ``` The Exa async client. #### num_results ```python num_results: int ``` The number of results to return. #### max_characters ```python max_characters: int | None ``` Maximum characters of text content per result, or None for no limit. #### __call__ ```python __call__( query: str, search_type: Literal[ "auto", "keyword", "neural", "fast", "deep" ] = "auto", ) -> list[ExaSearchResult] ``` Searches Exa for the given query and returns the results with content. 
Parameters: | Name | Type | Description | Default | | ------------- | ------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | | `query` | `str` | The search query to execute with Exa. | *required* | | `search_type` | `Literal['auto', 'keyword', 'neural', 'fast', 'deep']` | The type of search to perform. 'auto' automatically chooses the best search type, 'keyword' for exact matches, 'neural' for semantic search, 'fast' for speed-optimized search, 'deep' for comprehensive multi-query search. | `'auto'` | Returns: | Type | Description | | ----------------------- | ------------------------------------- | | `list[ExaSearchResult]` | The search results with text content. | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python async def __call__( self, query: str, search_type: Literal['auto', 'keyword', 'neural', 'fast', 'deep'] = 'auto', ) -> list[ExaSearchResult]: """Searches Exa for the given query and returns the results with content. Args: query: The search query to execute with Exa. search_type: The type of search to perform. 'auto' automatically chooses the best search type, 'keyword' for exact matches, 'neural' for semantic search, 'fast' for speed-optimized search, 'deep' for comprehensive multi-query search. Returns: The search results with text content. """ text_config: bool | dict[str, int] = {'maxCharacters': self.max_characters} if self.max_characters else True response = await self.client.search( # pyright: ignore[reportUnknownMemberType] query, num_results=self.num_results, type=search_type, contents={'text': text_config}, ) return [ ExaSearchResult( title=result.title or '', url=result.url, published_date=result.published_date, author=result.author, text=result.text or '', ) for result in response.results ] ``` ### ExaFindSimilarTool The Exa find similar tool. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python @dataclass class ExaFindSimilarTool: """The Exa find similar tool.""" client: AsyncExa """The Exa async client.""" num_results: int """The number of results to return.""" async def __call__( self, url: str, exclude_source_domain: bool = True, ) -> list[ExaSearchResult]: """Finds pages similar to the given URL and returns them with content. Args: url: The URL to find similar pages for. exclude_source_domain: Whether to exclude results from the same domain as the input URL. Defaults to True. Returns: Similar pages with text content. """ response = await self.client.find_similar( # pyright: ignore[reportUnknownMemberType] url, num_results=self.num_results, exclude_source_domain=exclude_source_domain, contents={'text': True}, ) return [ ExaSearchResult( title=result.title or '', url=result.url, published_date=result.published_date, author=result.author, text=result.text or '', ) for result in response.results ] ``` #### client ```python client: AsyncExa ``` The Exa async client. #### num_results ```python num_results: int ``` The number of results to return. #### __call__ ```python __call__( url: str, exclude_source_domain: bool = True ) -> list[ExaSearchResult] ``` Finds pages similar to the given URL and returns them with content. 
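A similarly hedged sketch for calling this tool directly, under the same assumptions about the `AsyncExa` client:

```python
from exa_py import AsyncExa  # assumed import path for the async Exa client

from pydantic_ai.common_tools.exa import ExaFindSimilarTool

find_similar = ExaFindSimilarTool(client=AsyncExa(api_key='your-api-key'), num_results=3)

# Inside an async context:
#     pages = await find_similar('https://ai.pydantic.dev', exclude_source_domain=True)
#     print([page['url'] for page in pages])
```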
Parameters: | Name | Type | Description | Default | | ----------------------- | ------ | ----------------------------------------------------------------------------------- | ---------- | | `url` | `str` | The URL to find similar pages for. | *required* | | `exclude_source_domain` | `bool` | Whether to exclude results from the same domain as the input URL. Defaults to True. | `True` | Returns: | Type | Description | | ----------------------- | -------------------------------- | | `list[ExaSearchResult]` | Similar pages with text content. | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python async def __call__( self, url: str, exclude_source_domain: bool = True, ) -> list[ExaSearchResult]: """Finds pages similar to the given URL and returns them with content. Args: url: The URL to find similar pages for. exclude_source_domain: Whether to exclude results from the same domain as the input URL. Defaults to True. Returns: Similar pages with text content. """ response = await self.client.find_similar( # pyright: ignore[reportUnknownMemberType] url, num_results=self.num_results, exclude_source_domain=exclude_source_domain, contents={'text': True}, ) return [ ExaSearchResult( title=result.title or '', url=result.url, published_date=result.published_date, author=result.author, text=result.text or '', ) for result in response.results ] ``` ### ExaGetContentsTool The Exa get contents tool. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python @dataclass class ExaGetContentsTool: """The Exa get contents tool.""" client: AsyncExa """The Exa async client.""" async def __call__( self, urls: list[str], ) -> list[ExaContentResult]: """Gets the content of the specified URLs. Args: urls: A list of URLs to get content for. Returns: The content of each URL. """ response = await self.client.get_contents(urls, text=True) # pyright: ignore[reportUnknownMemberType,reportUnknownVariableType] return [ ExaContentResult( url=result.url, # pyright: ignore[reportUnknownMemberType,reportUnknownArgumentType] title=result.title or '', # pyright: ignore[reportUnknownMemberType,reportUnknownArgumentType] text=result.text or '', # pyright: ignore[reportUnknownMemberType,reportUnknownArgumentType] author=result.author, # pyright: ignore[reportUnknownMemberType,reportUnknownArgumentType] published_date=result.published_date, # pyright: ignore[reportUnknownMemberType,reportUnknownArgumentType] ) for result in response.results # pyright: ignore[reportUnknownVariableType,reportUnknownMemberType] ] ``` #### client ```python client: AsyncExa ``` The Exa async client. #### __call__ ```python __call__(urls: list[str]) -> list[ExaContentResult] ``` Gets the content of the specified URLs. Parameters: | Name | Type | Description | Default | | ------ | ----------- | ---------------------------------- | ---------- | | `urls` | `list[str]` | A list of URLs to get content for. | *required* | Returns: | Type | Description | | ------------------------ | ------------------------ | | `list[ExaContentResult]` | The content of each URL. | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python async def __call__( self, urls: list[str], ) -> list[ExaContentResult]: """Gets the content of the specified URLs. Args: urls: A list of URLs to get content for. Returns: The content of each URL. 
""" response = await self.client.get_contents(urls, text=True) # pyright: ignore[reportUnknownMemberType,reportUnknownVariableType] return [ ExaContentResult( url=result.url, # pyright: ignore[reportUnknownMemberType,reportUnknownArgumentType] title=result.title or '', # pyright: ignore[reportUnknownMemberType,reportUnknownArgumentType] text=result.text or '', # pyright: ignore[reportUnknownMemberType,reportUnknownArgumentType] author=result.author, # pyright: ignore[reportUnknownMemberType,reportUnknownArgumentType] published_date=result.published_date, # pyright: ignore[reportUnknownMemberType,reportUnknownArgumentType] ) for result in response.results # pyright: ignore[reportUnknownVariableType,reportUnknownMemberType] ] ``` ### ExaAnswerTool The Exa answer tool. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python @dataclass class ExaAnswerTool: """The Exa answer tool.""" client: AsyncExa """The Exa async client.""" async def __call__( self, query: str, ) -> ExaAnswerResult: """Generates an AI-powered answer to the query with citations. Args: query: The question to answer. Returns: An answer with supporting citations from web sources. """ response = await self.client.answer(query, text=True) return ExaAnswerResult( answer=response.answer, # pyright: ignore[reportUnknownMemberType,reportArgumentType,reportAttributeAccessIssue] citations=[ { 'url': citation.url, # pyright: ignore[reportUnknownMemberType] 'title': citation.title or '', # pyright: ignore[reportUnknownMemberType] 'text': citation.text or '', # pyright: ignore[reportUnknownMemberType] } for citation in response.citations # pyright: ignore[reportUnknownVariableType,reportUnknownMemberType,reportAttributeAccessIssue] ], ) ``` #### client ```python client: AsyncExa ``` The Exa async client. #### __call__ ```python __call__(query: str) -> ExaAnswerResult ``` Generates an AI-powered answer to the query with citations. Parameters: | Name | Type | Description | Default | | ------- | ----- | ----------------------- | ---------- | | `query` | `str` | The question to answer. | *required* | Returns: | Type | Description | | ----------------- | ----------------------------------------------------- | | `ExaAnswerResult` | An answer with supporting citations from web sources. | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python async def __call__( self, query: str, ) -> ExaAnswerResult: """Generates an AI-powered answer to the query with citations. Args: query: The question to answer. Returns: An answer with supporting citations from web sources. 
""" response = await self.client.answer(query, text=True) return ExaAnswerResult( answer=response.answer, # pyright: ignore[reportUnknownMemberType,reportArgumentType,reportAttributeAccessIssue] citations=[ { 'url': citation.url, # pyright: ignore[reportUnknownMemberType] 'title': citation.title or '', # pyright: ignore[reportUnknownMemberType] 'text': citation.text or '', # pyright: ignore[reportUnknownMemberType] } for citation in response.citations # pyright: ignore[reportUnknownVariableType,reportUnknownMemberType,reportAttributeAccessIssue] ], ) ``` ### exa_search_tool ```python exa_search_tool( api_key: str, *, num_results: int = 5, max_characters: int | None = None ) -> Tool[Any] ``` ```python exa_search_tool( *, client: AsyncExa, num_results: int = 5, max_characters: int | None = None ) -> Tool[Any] ``` ```python exa_search_tool( api_key: str | None = None, *, client: AsyncExa | None = None, num_results: int = 5, max_characters: int | None = None ) -> Tool[Any] ``` Creates an Exa search tool. Parameters: | Name | Type | Description | Default | | ---------------- | ---------- | ----------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------ | | `api_key` | \`str | None\` | The Exa API key. Required if client is not provided. You can get one by signing up at https://dashboard.exa.ai. | | `client` | \`AsyncExa | None\` | An existing AsyncExa client. If provided, api_key is ignored. This is useful for sharing a client across multiple tools. | | `num_results` | `int` | The number of results to return. Defaults to 5. | `5` | | `max_characters` | \`int | None\` | Maximum characters of text content per result. Use this to limit token usage. Defaults to None (no limit). | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python def exa_search_tool( api_key: str | None = None, *, client: AsyncExa | None = None, num_results: int = 5, max_characters: int | None = None, ) -> Tool[Any]: """Creates an Exa search tool. Args: api_key: The Exa API key. Required if `client` is not provided. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). client: An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools. num_results: The number of results to return. Defaults to 5. max_characters: Maximum characters of text content per result. Use this to limit token usage. Defaults to None (no limit). """ if client is None: if api_key is None: raise ValueError('Either api_key or client must be provided') client = AsyncExa(api_key=api_key) return Tool[Any]( ExaSearchTool( client=client, num_results=num_results, max_characters=max_characters, ).__call__, name='exa_search', description='Searches Exa for the given query and returns the results with content. Exa is a neural search engine that finds high-quality, relevant results.', ) ``` ### exa_find_similar_tool ```python exa_find_similar_tool( api_key: str, *, num_results: int = 5 ) -> Tool[Any] ``` ```python exa_find_similar_tool( *, client: AsyncExa, num_results: int = 5 ) -> Tool[Any] ``` ```python exa_find_similar_tool( api_key: str | None = None, *, client: AsyncExa | None = None, num_results: int = 5 ) -> Tool[Any] ``` Creates an Exa find similar tool. 
Parameters: | Name | Type | Description | Default | | ------------- | ---------- | ------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------ | | `api_key` | \`str | None\` | The Exa API key. Required if client is not provided. You can get one by signing up at https://dashboard.exa.ai. | | `client` | \`AsyncExa | None\` | An existing AsyncExa client. If provided, api_key is ignored. This is useful for sharing a client across multiple tools. | | `num_results` | `int` | The number of similar results to return. Defaults to 5. | `5` | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python def exa_find_similar_tool( api_key: str | None = None, *, client: AsyncExa | None = None, num_results: int = 5, ) -> Tool[Any]: """Creates an Exa find similar tool. Args: api_key: The Exa API key. Required if `client` is not provided. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). client: An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools. num_results: The number of similar results to return. Defaults to 5. """ if client is None: if api_key is None: raise ValueError('Either api_key or client must be provided') client = AsyncExa(api_key=api_key) return Tool[Any]( ExaFindSimilarTool(client=client, num_results=num_results).__call__, name='exa_find_similar', description='Finds web pages similar to a given URL. Useful for discovering related content, competitors, or alternative sources.', ) ``` ### exa_get_contents_tool ```python exa_get_contents_tool(api_key: str) -> Tool[Any] ``` ```python exa_get_contents_tool(*, client: AsyncExa) -> Tool[Any] ``` ```python exa_get_contents_tool( api_key: str | None = None, *, client: AsyncExa | None = None ) -> Tool[Any] ``` Creates an Exa get contents tool. Parameters: | Name | Type | Description | Default | | --------- | ---------- | ----------- | ------------------------------------------------------------------------------------------------------------------------ | | `api_key` | \`str | None\` | The Exa API key. Required if client is not provided. You can get one by signing up at https://dashboard.exa.ai. | | `client` | \`AsyncExa | None\` | An existing AsyncExa client. If provided, api_key is ignored. This is useful for sharing a client across multiple tools. | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python def exa_get_contents_tool( api_key: str | None = None, *, client: AsyncExa | None = None, ) -> Tool[Any]: """Creates an Exa get contents tool. Args: api_key: The Exa API key. Required if `client` is not provided. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). client: An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools. """ if client is None: if api_key is None: raise ValueError('Either api_key or client must be provided') client = AsyncExa(api_key=api_key) return Tool[Any]( ExaGetContentsTool(client=client).__call__, name='exa_get_contents', description='Gets the full text content of specified URLs. 
Useful for reading articles, documentation, or any web page when you have the exact URL.', ) ``` ### exa_answer_tool ```python exa_answer_tool(api_key: str) -> Tool[Any] ``` ```python exa_answer_tool(*, client: AsyncExa) -> Tool[Any] ``` ```python exa_answer_tool( api_key: str | None = None, *, client: AsyncExa | None = None ) -> Tool[Any] ``` Creates an Exa answer tool. Parameters: | Name | Type | Description | Default | | --------- | ---------- | ----------- | ------------------------------------------------------------------------------------------------------------------------ | | `api_key` | \`str | None\` | The Exa API key. Required if client is not provided. You can get one by signing up at https://dashboard.exa.ai. | | `client` | \`AsyncExa | None\` | An existing AsyncExa client. If provided, api_key is ignored. This is useful for sharing a client across multiple tools. | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python def exa_answer_tool( api_key: str | None = None, *, client: AsyncExa | None = None, ) -> Tool[Any]: """Creates an Exa answer tool. Args: api_key: The Exa API key. Required if `client` is not provided. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). client: An existing AsyncExa client. If provided, `api_key` is ignored. This is useful for sharing a client across multiple tools. """ if client is None: if api_key is None: raise ValueError('Either api_key or client must be provided') client = AsyncExa(api_key=api_key) return Tool[Any]( ExaAnswerTool(client=client).__call__, name='exa_answer', description='Generates an AI-powered answer to a question with citations from web sources. Returns a comprehensive answer backed by real sources.', ) ``` ### ExaToolset Bases: `FunctionToolset` A toolset that provides Exa search tools with a shared client. This is more efficient than creating individual tools when using multiple Exa tools, as it shares a single API client across all tools. Example: ```python from pydantic_ai import Agent from pydantic_ai.common_tools.exa import ExaToolset toolset = ExaToolset(api_key='your-api-key') agent = Agent('openai:gpt-5.2', toolsets=[toolset]) ``` Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ````python class ExaToolset(FunctionToolset): """A toolset that provides Exa search tools with a shared client. This is more efficient than creating individual tools when using multiple Exa tools, as it shares a single API client across all tools. Example: ```python from pydantic_ai import Agent from pydantic_ai.common_tools.exa import ExaToolset toolset = ExaToolset(api_key='your-api-key') agent = Agent('openai:gpt-5.2', toolsets=[toolset]) ``` """ def __init__( self, api_key: str, *, num_results: int = 5, max_characters: int | None = None, include_search: bool = True, include_find_similar: bool = True, include_get_contents: bool = True, include_answer: bool = True, id: str | None = None, ): """Creates an Exa toolset with a shared client. Args: api_key: The Exa API key. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). num_results: The number of results to return for search and find_similar. Defaults to 5. max_characters: Maximum characters of text content per result. Use this to limit token usage. Defaults to None (no limit). include_search: Whether to include the search tool. Defaults to True. include_find_similar: Whether to include the find_similar tool. Defaults to True. 
include_get_contents: Whether to include the get_contents tool. Defaults to True. include_answer: Whether to include the answer tool. Defaults to True. id: Optional ID for the toolset, used for durable execution environments. """ client = AsyncExa(api_key=api_key) tools: list[Tool[Any]] = [] if include_search: tools.append(exa_search_tool(client=client, num_results=num_results, max_characters=max_characters)) if include_find_similar: tools.append(exa_find_similar_tool(client=client, num_results=num_results)) if include_get_contents: tools.append(exa_get_contents_tool(client=client)) if include_answer: tools.append(exa_answer_tool(client=client)) super().__init__(tools, id=id) ```` #### __init__ ```python __init__( api_key: str, *, num_results: int = 5, max_characters: int | None = None, include_search: bool = True, include_find_similar: bool = True, include_get_contents: bool = True, include_answer: bool = True, id: str | None = None ) ``` Creates an Exa toolset with a shared client. Parameters: | Name | Type | Description | Default | | ---------------------- | ------ | --------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- | | `api_key` | `str` | The Exa API key. You can get one by signing up at https://dashboard.exa.ai. | *required* | | `num_results` | `int` | The number of results to return for search and find_similar. Defaults to 5. | `5` | | `max_characters` | \`int | None\` | Maximum characters of text content per result. Use this to limit token usage. Defaults to None (no limit). | | `include_search` | `bool` | Whether to include the search tool. Defaults to True. | `True` | | `include_find_similar` | `bool` | Whether to include the find_similar tool. Defaults to True. | `True` | | `include_get_contents` | `bool` | Whether to include the get_contents tool. Defaults to True. | `True` | | `include_answer` | `bool` | Whether to include the answer tool. Defaults to True. | `True` | | `id` | \`str | None\` | Optional ID for the toolset, used for durable execution environments. | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/exa.py` ```python def __init__( self, api_key: str, *, num_results: int = 5, max_characters: int | None = None, include_search: bool = True, include_find_similar: bool = True, include_get_contents: bool = True, include_answer: bool = True, id: str | None = None, ): """Creates an Exa toolset with a shared client. Args: api_key: The Exa API key. You can get one by signing up at [https://dashboard.exa.ai](https://dashboard.exa.ai). num_results: The number of results to return for search and find_similar. Defaults to 5. max_characters: Maximum characters of text content per result. Use this to limit token usage. Defaults to None (no limit). include_search: Whether to include the search tool. Defaults to True. include_find_similar: Whether to include the find_similar tool. Defaults to True. include_get_contents: Whether to include the get_contents tool. Defaults to True. include_answer: Whether to include the answer tool. Defaults to True. id: Optional ID for the toolset, used for durable execution environments. 
""" client = AsyncExa(api_key=api_key) tools: list[Tool[Any]] = [] if include_search: tools.append(exa_search_tool(client=client, num_results=num_results, max_characters=max_characters)) if include_find_similar: tools.append(exa_find_similar_tool(client=client, num_results=num_results)) if include_get_contents: tools.append(exa_get_contents_tool(client=client)) if include_answer: tools.append(exa_answer_tool(client=client)) super().__init__(tools, id=id) ``` ### TavilySearchResult Bases: `TypedDict` A Tavily search result. See [Tavily Search Endpoint documentation](https://docs.tavily.com/api-reference/endpoint/search) for more information. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/tavily.py` ```python class TavilySearchResult(TypedDict): """A Tavily search result. See [Tavily Search Endpoint documentation](https://docs.tavily.com/api-reference/endpoint/search) for more information. """ title: str """The title of the search result.""" url: str """The URL of the search result..""" content: str """A short description of the search result.""" score: float """The relevance score of the search result.""" ``` #### title ```python title: str ``` The title of the search result. #### url ```python url: str ``` The URL of the search result.. #### content ```python content: str ``` A short description of the search result. #### score ```python score: float ``` The relevance score of the search result. ### TavilySearchTool The Tavily search tool. Source code in `pydantic_ai_slim/pydantic_ai/common_tools/tavily.py` ```python @dataclass class TavilySearchTool: """The Tavily search tool.""" client: AsyncTavilyClient """The Tavily search client.""" async def __call__( self, query: str, search_deep: Literal['basic', 'advanced'] = 'basic', topic: Literal['general', 'news'] = 'general', time_range: Literal['day', 'week', 'month', 'year', 'd', 'w', 'm', 'y'] | None = None, ) -> list[TavilySearchResult]: """Searches Tavily for the given query and returns the results. Args: query: The search query to execute with Tavily. search_deep: The depth of the search. topic: The category of the search. time_range: The time range back from the current date to filter results. Returns: A list of search results from Tavily. """ results = await self.client.search(query, search_depth=search_deep, topic=topic, time_range=time_range) # type: ignore[reportUnknownMemberType] return tavily_search_ta.validate_python(results['results']) ``` #### client ```python client: AsyncTavilyClient ``` The Tavily search client. #### __call__ ```python __call__( query: str, search_deep: Literal["basic", "advanced"] = "basic", topic: Literal["general", "news"] = "general", time_range: ( Literal[ "day", "week", "month", "year", "d", "w", "m", "y", ] | None ) = None, ) -> list[TavilySearchResult] ``` Searches Tavily for the given query and returns the results. Parameters: | Name | Type | Description | Default | | ------------- | ------------------------------------------------------------- | ---------------------------------------- | ------------------------------------------------------------ | | `query` | `str` | The search query to execute with Tavily. | *required* | | `search_deep` | `Literal['basic', 'advanced']` | The depth of the search. | `'basic'` | | `topic` | `Literal['general', 'news']` | The category of the search. | `'general'` | | `time_range` | \`Literal['day', 'week', 'month', 'year', 'd', 'w', 'm', 'y'] | None\` | The time range back from the current date to filter results. 
| Returns: | Type | Description | | -------------------------- | ------------------------------------- | | `list[TavilySearchResult]` | A list of search results from Tavily. | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/tavily.py` ```python async def __call__( self, query: str, search_deep: Literal['basic', 'advanced'] = 'basic', topic: Literal['general', 'news'] = 'general', time_range: Literal['day', 'week', 'month', 'year', 'd', 'w', 'm', 'y'] | None = None, ) -> list[TavilySearchResult]: """Searches Tavily for the given query and returns the results. Args: query: The search query to execute with Tavily. search_deep: The depth of the search. topic: The category of the search. time_range: The time range back from the current date to filter results. Returns: A list of search results from Tavily. """ results = await self.client.search(query, search_depth=search_deep, topic=topic, time_range=time_range) # type: ignore[reportUnknownMemberType] return tavily_search_ta.validate_python(results['results']) ``` ### tavily_search_tool ```python tavily_search_tool(api_key: str) ``` Creates a Tavily search tool. Parameters: | Name | Type | Description | Default | | --------- | ----- | --------------------------------------------------------------------------------- | ---------- | | `api_key` | `str` | The Tavily API key. You can get one by signing up at https://app.tavily.com/home. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/tavily.py` ```python def tavily_search_tool(api_key: str): """Creates a Tavily search tool. Args: api_key: The Tavily API key. You can get one by signing up at [https://app.tavily.com/home](https://app.tavily.com/home). """ return Tool[Any]( TavilySearchTool(client=AsyncTavilyClient(api_key)).__call__, name='tavily_search', description='Searches Tavily for the given query and returns the results.', ) ``` # `pydantic_ai` — Concurrency Bases: `WrapperModel` A model wrapper that limits concurrent requests to the underlying model. This wrapper applies concurrency limiting at the model level, ensuring that the number of concurrent requests to the model does not exceed the configured limit. This is useful for: - Respecting API rate limits - Managing resource usage - Sharing a concurrency pool across multiple models Example usage: ```python from pydantic_ai import Agent from pydantic_ai.models.concurrency import ConcurrencyLimitedModel # Limit to 5 concurrent requests model = ConcurrencyLimitedModel('openai:gpt-4o', limiter=5) agent = Agent(model) # Or share a limiter across multiple models from pydantic_ai import ConcurrencyLimiter # noqa E402 shared_limiter = ConcurrencyLimiter(max_running=10, name='openai-pool') model1 = ConcurrencyLimitedModel('openai:gpt-4o', limiter=shared_limiter) model2 = ConcurrencyLimitedModel('openai:gpt-4o-mini', limiter=shared_limiter) ``` Source code in `pydantic_ai_slim/pydantic_ai/models/concurrency.py` ````python @dataclass(init=False) class ConcurrencyLimitedModel(WrapperModel): """A model wrapper that limits concurrent requests to the underlying model. This wrapper applies concurrency limiting at the model level, ensuring that the number of concurrent requests to the model does not exceed the configured limit. 
This is useful for: - Respecting API rate limits - Managing resource usage - Sharing a concurrency pool across multiple models Example usage: ```python from pydantic_ai import Agent from pydantic_ai.models.concurrency import ConcurrencyLimitedModel # Limit to 5 concurrent requests model = ConcurrencyLimitedModel('openai:gpt-4o', limiter=5) agent = Agent(model) # Or share a limiter across multiple models from pydantic_ai import ConcurrencyLimiter # noqa E402 shared_limiter = ConcurrencyLimiter(max_running=10, name='openai-pool') model1 = ConcurrencyLimitedModel('openai:gpt-4o', limiter=shared_limiter) model2 = ConcurrencyLimitedModel('openai:gpt-4o-mini', limiter=shared_limiter) ``` """ _limiter: AbstractConcurrencyLimiter def __init__( self, wrapped: Model | KnownModelName, limiter: int | ConcurrencyLimit | AbstractConcurrencyLimiter, ): """Initialize the ConcurrencyLimitedModel. Args: wrapped: The model to wrap, either a Model instance or a known model name. limiter: The concurrency limit configuration. Can be: - An `int`: Simple limit on concurrent operations (unlimited queue). - A `ConcurrencyLimit`: Full configuration with optional backpressure. - An `AbstractConcurrencyLimiter`: A pre-created limiter for sharing across models. """ super().__init__(wrapped) if isinstance(limiter, AbstractConcurrencyLimiter): self._limiter = limiter else: self._limiter = ConcurrencyLimiter.from_limit(limiter) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a request to the model with concurrency limiting.""" async with get_concurrency_context(self._limiter, f'model:{self.model_name}'): return await self.wrapped.request(messages, model_settings, model_request_parameters) async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage: """Count tokens with concurrency limiting.""" async with get_concurrency_context(self._limiter, f'model:{self.model_name}'): return await self.wrapped.count_tokens(messages, model_settings, model_request_parameters) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming request to the model with concurrency limiting.""" async with get_concurrency_context(self._limiter, f'model:{self.model_name}'): async with self.wrapped.request_stream( messages, model_settings, model_request_parameters, run_context ) as response_stream: yield response_stream ```` ### __init__ ```python __init__( wrapped: Model | KnownModelName, limiter: ( int | ConcurrencyLimit | AbstractConcurrencyLimiter ), ) ``` Initialize the ConcurrencyLimitedModel. Parameters: | Name | Type | Description | Default | | --------- | ------- | ---------------- | ----------------------------------------------------------------- | | `wrapped` | \`Model | KnownModelName\` | The model to wrap, either a Model instance or a known model name. | | `limiter` | \`int | ConcurrencyLimit | AbstractConcurrencyLimiter\` | Source code in `pydantic_ai_slim/pydantic_ai/models/concurrency.py` ```python def __init__( self, wrapped: Model | KnownModelName, limiter: int | ConcurrencyLimit | AbstractConcurrencyLimiter, ): """Initialize the ConcurrencyLimitedModel. 
Args: wrapped: The model to wrap, either a Model instance or a known model name. limiter: The concurrency limit configuration. Can be: - An `int`: Simple limit on concurrent operations (unlimited queue). - A `ConcurrencyLimit`: Full configuration with optional backpressure. - An `AbstractConcurrencyLimiter`: A pre-created limiter for sharing across models. """ super().__init__(wrapped) if isinstance(limiter, AbstractConcurrencyLimiter): self._limiter = limiter else: self._limiter = ConcurrencyLimiter.from_limit(limiter) ``` ### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a request to the model with concurrency limiting. Source code in `pydantic_ai_slim/pydantic_ai/models/concurrency.py` ```python async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a request to the model with concurrency limiting.""" async with get_concurrency_context(self._limiter, f'model:{self.model_name}'): return await self.wrapped.request(messages, model_settings, model_request_parameters) ``` ### count_tokens ```python count_tokens( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage ``` Count tokens with concurrency limiting. Source code in `pydantic_ai_slim/pydantic_ai/models/concurrency.py` ```python async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage: """Count tokens with concurrency limiting.""" async with get_concurrency_context(self._limiter, f'model:{self.model_name}'): return await self.wrapped.count_tokens(messages, model_settings, model_request_parameters) ``` ### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming request to the model with concurrency limiting. Source code in `pydantic_ai_slim/pydantic_ai/models/concurrency.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming request to the model with concurrency limiting.""" async with get_concurrency_context(self._limiter, f'model:{self.model_name}'): async with self.wrapped.request_stream( messages, model_settings, model_request_parameters, run_context ) as response_stream: yield response_stream ``` Wrap a model with concurrency limiting. This is a convenience function to wrap a model with concurrency limiting. If the limiter is None, the model is returned unchanged. Parameters: | Name | Type | Description | Default | | --------- | --------------------- | ------------------------------------ | ------------------ | | `model` | \`Model | KnownModelName\` | The model to wrap. | | `limiter` | `AnyConcurrencyLimit` | The concurrency limit configuration. 
| *required* | Returns: | Type | Description | | ------- | -------------------------------------------------------------------------------------- | | `Model` | The wrapped model with concurrency limiting, or the original model if limiter is None. | Example: ```python from pydantic_ai.models.concurrency import limit_model_concurrency model = limit_model_concurrency('openai:gpt-4o', limiter=5) ``` Source code in `pydantic_ai_slim/pydantic_ai/models/concurrency.py` ````python def limit_model_concurrency( model: Model | KnownModelName, limiter: AnyConcurrencyLimit, ) -> Model: """Wrap a model with concurrency limiting. This is a convenience function to wrap a model with concurrency limiting. If the limiter is None, the model is returned unchanged. Args: model: The model to wrap. limiter: The concurrency limit configuration. Returns: The wrapped model with concurrency limiting, or the original model if limiter is None. Example: ```python from pydantic_ai.models.concurrency import limit_model_concurrency model = limit_model_concurrency('openai:gpt-4o', limiter=5) ``` """ normalized_limiter = normalize_to_limiter(limiter) if normalized_limiter is None: from . import infer_model return infer_model(model) if isinstance(model, str) else model return ConcurrencyLimitedModel(model, normalized_limiter) ```` Bases: `ABC` Abstract base class for concurrency limiters. Subclass this to create custom concurrency limiters (e.g., Redis-backed distributed limiters). Example: ```python from pydantic_ai.concurrency import AbstractConcurrencyLimiter class RedisConcurrencyLimiter(AbstractConcurrencyLimiter): def __init__(self, redis_client, key: str, max_running: int): self._redis = redis_client self._key = key self._max_running = max_running async def acquire(self, source: str) -> None: # Implement Redis-based distributed locking ... def release(self) -> None: # Release the Redis lock ... ``` Source code in `pydantic_ai_slim/pydantic_ai/concurrency.py` ````python class AbstractConcurrencyLimiter(ABC): """Abstract base class for concurrency limiters. Subclass this to create custom concurrency limiters (e.g., Redis-backed distributed limiters). Example: ```python from pydantic_ai.concurrency import AbstractConcurrencyLimiter class RedisConcurrencyLimiter(AbstractConcurrencyLimiter): def __init__(self, redis_client, key: str, max_running: int): self._redis = redis_client self._key = key self._max_running = max_running async def acquire(self, source: str) -> None: # Implement Redis-based distributed locking ... def release(self) -> None: # Release the Redis lock ... ``` """ @abstractmethod async def acquire(self, source: str) -> None: """Acquire a slot, waiting if necessary. Args: source: Identifier for observability (e.g., 'model:gpt-4o'). """ ... @abstractmethod def release(self) -> None: """Release a slot.""" ... ```` ### acquire ```python acquire(source: str) -> None ``` Acquire a slot, waiting if necessary. Parameters: | Name | Type | Description | Default | | -------- | ----- | ---------------------------------------------------- | ---------- | | `source` | `str` | Identifier for observability (e.g., 'model:gpt-4o'). | *required* | Source code in `pydantic_ai_slim/pydantic_ai/concurrency.py` ```python @abstractmethod async def acquire(self, source: str) -> None: """Acquire a slot, waiting if necessary. Args: source: Identifier for observability (e.g., 'model:gpt-4o'). """ ... ``` ### release ```python release() -> None ``` Release a slot. 
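A minimal sketch of the acquire/release contract, using the concrete `ConcurrencyLimiter` documented below; the guarded operation is a placeholder:

```python
from pydantic_ai.concurrency import ConcurrencyLimiter

limiter = ConcurrencyLimiter(max_running=2, name='shared-pool')


async def limited_operation():
    # Wait for a free slot; a span is emitted if the task has to queue.
    await limiter.acquire('model:example')
    try:
        ...  # call the model or other rate-limited resource here
    finally:
        # Always give the slot back, even if the operation fails.
        limiter.release()
```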
Source code in `pydantic_ai_slim/pydantic_ai/concurrency.py` ```python @abstractmethod def release(self) -> None: """Release a slot.""" ... ``` Bases: `AbstractConcurrencyLimiter` A concurrency limiter that tracks waiting operations for observability. This class wraps an anyio.CapacityLimiter and tracks the number of waiting operations. When an operation has to wait to acquire a slot, a span is created for observability purposes. Source code in `pydantic_ai_slim/pydantic_ai/concurrency.py` ```python class ConcurrencyLimiter(AbstractConcurrencyLimiter): """A concurrency limiter that tracks waiting operations for observability. This class wraps an anyio.CapacityLimiter and tracks the number of waiting operations. When an operation has to wait to acquire a slot, a span is created for observability purposes. """ def __init__( self, max_running: int, *, max_queued: int | None = None, name: str | None = None, tracer: Tracer | None = None, ): """Initialize the ConcurrencyLimiter. Args: max_running: Maximum number of concurrent operations. max_queued: Maximum queue depth before raising ConcurrencyLimitExceeded. name: Optional name for this limiter, used for observability when sharing a limiter across multiple models or agents. tracer: OpenTelemetry tracer for span creation. """ self._limiter = anyio.CapacityLimiter(max_running) self._max_queued = max_queued self._name = name self._tracer = tracer # Lock and counter to atomically check and track waiting tasks for max_queued enforcement self._queue_lock = anyio.Lock() self._waiting_count = 0 @classmethod def from_limit( cls, limit: int | ConcurrencyLimit, *, name: str | None = None, tracer: Tracer | None = None, ) -> Self: """Create a ConcurrencyLimiter from a ConcurrencyLimit configuration. Args: limit: Either an int for simple limiting or a ConcurrencyLimit for full config. name: Optional name for this limiter, used for observability. tracer: OpenTelemetry tracer for span creation. Returns: A configured ConcurrencyLimiter. """ if isinstance(limit, int): return cls(max_running=limit, name=name, tracer=tracer) else: return cls( max_running=limit.max_running, max_queued=limit.max_queued, name=name, tracer=tracer, ) @property def name(self) -> str | None: """Name of the limiter for observability.""" return self._name @property def waiting_count(self) -> int: """Number of operations currently waiting to acquire a slot.""" return self._waiting_count @property def running_count(self) -> int: """Number of operations currently running.""" return self._limiter.statistics().borrowed_tokens @property def available_count(self) -> int: """Number of slots available.""" return int(self._limiter.available_tokens) @property def max_running(self) -> int: """Maximum concurrent operations allowed.""" return int(self._limiter.total_tokens) def _get_tracer(self) -> Tracer: """Get the tracer, falling back to global tracer if not set.""" if self._tracer is not None: return self._tracer return get_tracer('pydantic-ai') async def acquire(self, source: str) -> None: """Acquire a slot, creating a span if waiting is required. Args: source: Identifier for the source of this acquisition (e.g., 'agent:my-agent' or 'model:gpt-4'). 
""" from .exceptions import ConcurrencyLimitExceeded # Try to acquire immediately without blocking try: self._limiter.acquire_nowait() return except anyio.WouldBlock: pass # We need to wait - atomically check queue limits and register ourselves as waiting # This prevents a race condition where multiple tasks could pass the check before # any of them actually start waiting on the limiter async with self._queue_lock: if self._max_queued is not None and self._waiting_count >= self._max_queued: # Use limiter name if set, otherwise use source for error messages display_name = self._name or source raise ConcurrencyLimitExceeded( f'Concurrency queue depth ({self._waiting_count + 1}) exceeds max_queued ({self._max_queued})' + (f' for {display_name}' if display_name else '') ) # Register ourselves as waiting before releasing the lock self._waiting_count += 1 # Now we're registered as waiting, proceed to wait on the limiter # Use try/finally to ensure we decrement the counter even on cancellation try: # Create a span for observability while waiting tracer = self._get_tracer() display_name = self._name or source attributes: dict[str, str | int] = { 'source': source, 'waiting_count': self._waiting_count, 'max_running': int(self._limiter.total_tokens), } if self._name is not None: attributes['limiter_name'] = self._name if self._max_queued is not None: attributes['max_queued'] = self._max_queued # Span name uses limiter name if set, otherwise source span_name = f'waiting for {display_name} concurrency' with tracer.start_as_current_span(span_name, attributes=attributes): await self._limiter.acquire() finally: # We're no longer waiting (either we acquired or we were cancelled) self._waiting_count -= 1 def release(self) -> None: """Release a slot.""" self._limiter.release() ``` ### __init__ ```python __init__( max_running: int, *, max_queued: int | None = None, name: str | None = None, tracer: Tracer | None = None ) ``` Initialize the ConcurrencyLimiter. Parameters: | Name | Type | Description | Default | | ------------- | -------- | ---------------------------------------- | --------------------------------------------------------------------------------------------------------------- | | `max_running` | `int` | Maximum number of concurrent operations. | *required* | | `max_queued` | \`int | None\` | Maximum queue depth before raising ConcurrencyLimitExceeded. | | `name` | \`str | None\` | Optional name for this limiter, used for observability when sharing a limiter across multiple models or agents. | | `tracer` | \`Tracer | None\` | OpenTelemetry tracer for span creation. | Source code in `pydantic_ai_slim/pydantic_ai/concurrency.py` ```python def __init__( self, max_running: int, *, max_queued: int | None = None, name: str | None = None, tracer: Tracer | None = None, ): """Initialize the ConcurrencyLimiter. Args: max_running: Maximum number of concurrent operations. max_queued: Maximum queue depth before raising ConcurrencyLimitExceeded. name: Optional name for this limiter, used for observability when sharing a limiter across multiple models or agents. tracer: OpenTelemetry tracer for span creation. 
""" self._limiter = anyio.CapacityLimiter(max_running) self._max_queued = max_queued self._name = name self._tracer = tracer # Lock and counter to atomically check and track waiting tasks for max_queued enforcement self._queue_lock = anyio.Lock() self._waiting_count = 0 ``` ### from_limit ```python from_limit( limit: int | ConcurrencyLimit, *, name: str | None = None, tracer: Tracer | None = None ) -> Self ``` Create a ConcurrencyLimiter from a ConcurrencyLimit configuration. Parameters: | Name | Type | Description | Default | | -------- | -------- | ------------------ | ------------------------------------------------------------------------ | | `limit` | \`int | ConcurrencyLimit\` | Either an int for simple limiting or a ConcurrencyLimit for full config. | | `name` | \`str | None\` | Optional name for this limiter, used for observability. | | `tracer` | \`Tracer | None\` | OpenTelemetry tracer for span creation. | Returns: | Type | Description | | ------ | -------------------------------- | | `Self` | A configured ConcurrencyLimiter. | Source code in `pydantic_ai_slim/pydantic_ai/concurrency.py` ```python @classmethod def from_limit( cls, limit: int | ConcurrencyLimit, *, name: str | None = None, tracer: Tracer | None = None, ) -> Self: """Create a ConcurrencyLimiter from a ConcurrencyLimit configuration. Args: limit: Either an int for simple limiting or a ConcurrencyLimit for full config. name: Optional name for this limiter, used for observability. tracer: OpenTelemetry tracer for span creation. Returns: A configured ConcurrencyLimiter. """ if isinstance(limit, int): return cls(max_running=limit, name=name, tracer=tracer) else: return cls( max_running=limit.max_running, max_queued=limit.max_queued, name=name, tracer=tracer, ) ``` ### name ```python name: str | None ``` Name of the limiter for observability. ### waiting_count ```python waiting_count: int ``` Number of operations currently waiting to acquire a slot. ### running_count ```python running_count: int ``` Number of operations currently running. ### available_count ```python available_count: int ``` Number of slots available. ### max_running ```python max_running: int ``` Maximum concurrent operations allowed. ### acquire ```python acquire(source: str) -> None ``` Acquire a slot, creating a span if waiting is required. Parameters: | Name | Type | Description | Default | | -------- | ----- | ---------------------------------------------------------------------------------------- | ---------- | | `source` | `str` | Identifier for the source of this acquisition (e.g., 'agent:my-agent' or 'model:gpt-4'). | *required* | Source code in `pydantic_ai_slim/pydantic_ai/concurrency.py` ```python async def acquire(self, source: str) -> None: """Acquire a slot, creating a span if waiting is required. Args: source: Identifier for the source of this acquisition (e.g., 'agent:my-agent' or 'model:gpt-4'). 
""" from .exceptions import ConcurrencyLimitExceeded # Try to acquire immediately without blocking try: self._limiter.acquire_nowait() return except anyio.WouldBlock: pass # We need to wait - atomically check queue limits and register ourselves as waiting # This prevents a race condition where multiple tasks could pass the check before # any of them actually start waiting on the limiter async with self._queue_lock: if self._max_queued is not None and self._waiting_count >= self._max_queued: # Use limiter name if set, otherwise use source for error messages display_name = self._name or source raise ConcurrencyLimitExceeded( f'Concurrency queue depth ({self._waiting_count + 1}) exceeds max_queued ({self._max_queued})' + (f' for {display_name}' if display_name else '') ) # Register ourselves as waiting before releasing the lock self._waiting_count += 1 # Now we're registered as waiting, proceed to wait on the limiter # Use try/finally to ensure we decrement the counter even on cancellation try: # Create a span for observability while waiting tracer = self._get_tracer() display_name = self._name or source attributes: dict[str, str | int] = { 'source': source, 'waiting_count': self._waiting_count, 'max_running': int(self._limiter.total_tokens), } if self._name is not None: attributes['limiter_name'] = self._name if self._max_queued is not None: attributes['max_queued'] = self._max_queued # Span name uses limiter name if set, otherwise source span_name = f'waiting for {display_name} concurrency' with tracer.start_as_current_span(span_name, attributes=attributes): await self._limiter.acquire() finally: # We're no longer waiting (either we acquired or we were cancelled) self._waiting_count -= 1 ``` ### release ```python release() -> None ``` Release a slot. Source code in `pydantic_ai_slim/pydantic_ai/concurrency.py` ```python def release(self) -> None: """Release a slot.""" self._limiter.release() ``` Configuration for concurrency limiting with optional backpressure. Parameters: | Name | Type | Description | Default | | ------------- | ----- | ------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------- | | `max_running` | `int` | Maximum number of concurrent operations allowed. | *required* | | `max_queued` | \`int | None\` | Maximum number of operations waiting in the queue. If None, the queue is unlimited. If exceeded, raises ConcurrencyLimitExceeded. | Source code in `pydantic_ai_slim/pydantic_ai/concurrency.py` ```python @dataclass class ConcurrencyLimit: """Configuration for concurrency limiting with optional backpressure. Args: max_running: Maximum number of concurrent operations allowed. max_queued: Maximum number of operations waiting in the queue. If None, the queue is unlimited. If exceeded, raises `ConcurrencyLimitExceeded`. """ max_running: int max_queued: int | None = None ``` Type alias for concurrency limit configuration. Can be: - An `int`: Simple limit on concurrent operations (unlimited queue). - A `ConcurrencyLimit`: Full configuration with optional backpressure. - An `AbstractConcurrencyLimiter`: A pre-created limiter instance for sharing across multiple models/agents. - `None`: No concurrency limiting (default). Bases: `AgentRunError` Error raised when the concurrency queue depth exceeds max_queued. 
Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ConcurrencyLimitExceeded(AgentRunError): """Error raised when the concurrency queue depth exceeds max_queued.""" ``` # `pydantic_ai.direct` Methods for making imperative requests to language models with minimal abstraction. These methods allow you to make requests to LLMs where the only abstraction is input and output schema translation so you can use all models with the same API. These methods are thin wrappers around Model implementations. ### model_request ```python model_request( model: Model | KnownModelName | str, messages: Sequence[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> ModelResponse ``` Make a non-streamed request to a model. model_request_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request async def main(): model_response = await model_request( 'anthropic:claude-haiku-4-5', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-haiku-4-5', timestamp=datetime.datetime(...), ) ''' ``` 1. See ModelRequest.user_text_prompt for details. Parameters: | Name | Type | Description | Default | | -------------------------- | ------------------------- | ----------------------------- | --------------------------------- | | `model` | \`Model | KnownModelName | str\` | | `messages` | `Sequence[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | \`ModelSettings | None\` | optional model settings | | `model_request_parameters` | \`ModelRequestParameters | None\` | optional model request parameters | | `instrument` | \`InstrumentationSettings | bool | None\` | Returns: | Type | Description | | --------------- | --------------------------------------------------------------- | | `ModelResponse` | The model response and token usage associated with the request. | Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python async def model_request( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> messages.ModelResponse: """Make a non-streamed request to a model. ```py title="model_request_example.py" from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request async def main(): model_response = await model_request( 'anthropic:claude-haiku-4-5', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-haiku-4-5', timestamp=datetime.datetime(...), ) ''' ``` 1. See [`ModelRequest.user_text_prompt`][pydantic_ai.messages.ModelRequest.user_text_prompt] for details. Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. 
messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: The model response and token usage associated with the request. """ model_instance = _prepare_model(model, instrument) return await model_instance.request( list(messages), model_settings, model_request_parameters or models.ModelRequestParameters(), ) ```` ### model_request_sync ```python model_request_sync( model: Model | KnownModelName | str, messages: Sequence[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> ModelResponse ``` Make a Synchronous, non-streamed request to a model. This is a convenience method that wraps model_request with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. model_request_sync_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync model_response = model_request_sync( 'anthropic:claude-haiku-4-5', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-haiku-4-5', timestamp=datetime.datetime(...), ) ''' ``` 1. See ModelRequest.user_text_prompt for details. Parameters: | Name | Type | Description | Default | | -------------------------- | ------------------------- | ----------------------------- | --------------------------------- | | `model` | \`Model | KnownModelName | str\` | | `messages` | `Sequence[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | \`ModelSettings | None\` | optional model settings | | `model_request_parameters` | \`ModelRequestParameters | None\` | optional model request parameters | | `instrument` | \`InstrumentationSettings | bool | None\` | Returns: | Type | Description | | --------------- | --------------------------------------------------------------- | | `ModelResponse` | The model response and token usage associated with the request. | Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python def model_request_sync( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> messages.ModelResponse: """Make a Synchronous, non-streamed request to a model. This is a convenience method that wraps [`model_request`][pydantic_ai.direct.model_request] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. ```py title="model_request_sync_example.py" from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync model_response = model_request_sync( 'anthropic:claude-haiku-4-5', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! 
) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-haiku-4-5', timestamp=datetime.datetime(...), ) ''' ``` 1. See [`ModelRequest.user_text_prompt`][pydantic_ai.messages.ModelRequest.user_text_prompt] for details. Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: The model response and token usage associated with the request. """ return _get_event_loop().run_until_complete( model_request( model, list(messages), model_settings=model_settings, model_request_parameters=model_request_parameters, instrument=instrument, ) ) ```` ### model_request_stream ```python model_request_stream( model: Model | KnownModelName | str, messages: Sequence[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> AbstractAsyncContextManager[StreamedResponse] ``` Make a streamed async request to a model. model_request_stream_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_stream async def main(): messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] # (1)! async with model_request_stream('openai:gpt-5-mini', messages) as stream: chunks = [] async for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), PartEndEvent( index=0, part=TextPart( content='Albert Einstein was a German-born theoretical physicist.' ), ), ] ''' ``` 1. See ModelRequest.user_text_prompt for details. Parameters: | Name | Type | Description | Default | | -------------------------- | ------------------------- | ----------------------------- | --------------------------------- | | `model` | \`Model | KnownModelName | str\` | | `messages` | `Sequence[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | \`ModelSettings | None\` | optional model settings | | `model_request_parameters` | \`ModelRequestParameters | None\` | optional model request parameters | | `instrument` | \`InstrumentationSettings | bool | None\` | Returns: | Type | Description | | ----------------------------------------------- | ---------------------------------------- | | `AbstractAsyncContextManager[StreamedResponse]` | A stream response async context manager. 
| Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python def model_request_stream( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> AbstractAsyncContextManager[models.StreamedResponse]: """Make a streamed async request to a model. ```py {title="model_request_stream_example.py"} from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_stream async def main(): messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] # (1)! async with model_request_stream('openai:gpt-5-mini', messages) as stream: chunks = [] async for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), PartEndEvent( index=0, part=TextPart( content='Albert Einstein was a German-born theoretical physicist.' ), ), ] ''' ``` 1. See [`ModelRequest.user_text_prompt`][pydantic_ai.messages.ModelRequest.user_text_prompt] for details. Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: A [stream response][pydantic_ai.models.StreamedResponse] async context manager. """ model_instance = _prepare_model(model, instrument) return model_instance.request_stream( list(messages), model_settings, model_request_parameters or models.ModelRequestParameters(), ) ```` ### model_request_stream_sync ```python model_request_stream_sync( model: Model | KnownModelName | str, messages: Sequence[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> StreamedResponseSync ``` Make a streamed synchronous request to a model. This is the synchronous version of model_request_stream. It uses threading to run the asynchronous stream in the background while providing a synchronous iterator interface. model_request_stream_sync_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_stream_sync messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] with model_request_stream_sync('openai:gpt-5-mini', messages) as stream: chunks = [] for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), PartEndEvent( index=0, part=TextPart( content='Albert Einstein was a German-born theoretical physicist.' 
), ), ] ''' ``` Parameters: | Name | Type | Description | Default | | -------------------------- | ------------------------- | ----------------------------- | --------------------------------- | | `model` | \`Model | KnownModelName | str\` | | `messages` | `Sequence[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | \`ModelSettings | None\` | optional model settings | | `model_request_parameters` | \`ModelRequestParameters | None\` | optional model request parameters | | `instrument` | \`InstrumentationSettings | bool | None\` | Returns: | Type | Description | | ---------------------- | --------------------------------------- | | `StreamedResponseSync` | A sync stream response context manager. | Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python def model_request_stream_sync( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> StreamedResponseSync: """Make a streamed synchronous request to a model. This is the synchronous version of [`model_request_stream`][pydantic_ai.direct.model_request_stream]. It uses threading to run the asynchronous stream in the background while providing a synchronous iterator interface. ```py {title="model_request_stream_sync_example.py"} from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_stream_sync messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] with model_request_stream_sync('openai:gpt-5-mini', messages) as stream: chunks = [] for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), PartEndEvent( index=0, part=TextPart( content='Albert Einstein was a German-born theoretical physicist.' ), ), ] ''' ``` Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: A [sync stream response][pydantic_ai.direct.StreamedResponseSync] context manager. """ async_stream_cm = model_request_stream( model=model, messages=list(messages), model_settings=model_settings, model_request_parameters=model_request_parameters, instrument=instrument, ) return StreamedResponseSync(async_stream_cm) ```` ### StreamedResponseSync Synchronous wrapper to async streaming responses by running the async producer in a background thread and providing a synchronous iterator. This class must be used as a context manager with the `with` statement. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python @dataclass class StreamedResponseSync: """Synchronous wrapper to async streaming responses by running the async producer in a background thread and providing a synchronous iterator. This class must be used as a context manager with the `with` statement. 
""" _async_stream_cm: AbstractAsyncContextManager[StreamedResponse] _queue: queue.Queue[messages.ModelResponseStreamEvent | Exception | None] = field( default_factory=queue.Queue[messages.ModelResponseStreamEvent | Exception | None], init=False ) _thread: threading.Thread | None = field(default=None, init=False) _stream_response: StreamedResponse | None = field(default=None, init=False) _exception: Exception | None = field(default=None, init=False) _context_entered: bool = field(default=False, init=False) _stream_ready: threading.Event = field(default_factory=threading.Event, init=False) def __enter__(self) -> StreamedResponseSync: self._context_entered = True self._start_producer() return self def __exit__( self, _exc_type: type[BaseException] | None, _exc_val: BaseException | None, _exc_tb: TracebackType | None, ) -> None: self._cleanup() def __iter__(self) -> Iterator[messages.ModelResponseStreamEvent]: """Stream the response as an iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s.""" self._check_context_manager_usage() while True: item = self._queue.get() if item is None: # End of stream break elif isinstance(item, Exception): raise item else: yield item def __repr__(self) -> str: if self._stream_response: return repr(self._stream_response) else: return f'{self.__class__.__name__}(context_entered={self._context_entered})' __str__ = __repr__ def _check_context_manager_usage(self) -> None: if not self._context_entered: raise RuntimeError( 'StreamedResponseSync must be used as a context manager. ' 'Use: `with model_request_stream_sync(...) as stream:`' ) def _ensure_stream_ready(self) -> StreamedResponse: self._check_context_manager_usage() if self._stream_response is None: # Wait for the background thread to signal that the stream is ready if not self._stream_ready.wait(timeout=STREAM_INITIALIZATION_TIMEOUT): raise RuntimeError('Stream failed to initialize within timeout') if self._stream_response is None: # pragma: no cover raise RuntimeError('Stream failed to initialize') return self._stream_response def _start_producer(self): self._thread = threading.Thread(target=self._async_producer, daemon=True) self._thread.start() def _async_producer(self): async def _consume_async_stream(): try: async with self._async_stream_cm as stream: self._stream_response = stream # Signal that the stream is ready self._stream_ready.set() async for event in stream: self._queue.put(event) except Exception as e: # Signal ready even on error so waiting threads don't hang self._stream_ready.set() self._queue.put(e) finally: self._queue.put(None) # Signal end _get_event_loop().run_until_complete(_consume_async_stream()) def _cleanup(self): if self._thread and self._thread.is_alive(): self._thread.join() # TODO (v2): Drop in favor of `response` property def get(self) -> messages.ModelResponse: """Build a ModelResponse from the data received from the stream so far.""" return self._ensure_stream_ready().get() @property def response(self) -> messages.ModelResponse: """Get the current state of the response.""" return self.get() # TODO (v2): Make this a property def usage(self) -> RequestUsage: """Get the usage of the response so far.""" return self._ensure_stream_ready().usage() @property def model_name(self) -> str: """Get the model name of the response.""" return self._ensure_stream_ready().model_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._ensure_stream_ready().timestamp ``` #### __iter__ ```python __iter__() -> 
Iterator[ModelResponseStreamEvent] ``` Stream the response as an iterable of ModelResponseStreamEvents. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python def __iter__(self) -> Iterator[messages.ModelResponseStreamEvent]: """Stream the response as an iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s.""" self._check_context_manager_usage() while True: item = self._queue.get() if item is None: # End of stream break elif isinstance(item, Exception): raise item else: yield item ``` #### get ```python get() -> ModelResponse ``` Build a ModelResponse from the data received from the stream so far. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python def get(self) -> messages.ModelResponse: """Build a ModelResponse from the data received from the stream so far.""" return self._ensure_stream_ready().get() ``` #### response ```python response: ModelResponse ``` Get the current state of the response. #### usage ```python usage() -> RequestUsage ``` Get the usage of the response so far. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python def usage(self) -> RequestUsage: """Get the usage of the response so far.""" return self._ensure_stream_ready().usage() ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.durable_exec` ### TemporalAgent Bases: `WrapperAgent[AgentDepsT, OutputDataT]` Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python class TemporalAgent(WrapperAgent[AgentDepsT, OutputDataT]): def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, models: Mapping[str, Model] | None = None, provider_factory: TemporalProviderFactory | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, activity_config: ActivityConfig | None = None, model_activity_config: ActivityConfig | None = None, toolset_activity_config: dict[str, ActivityConfig] | None = None, tool_activity_config: dict[str, dict[str, ActivityConfig | Literal[False]]] | None = None, run_context_type: type[TemporalRunContext[AgentDepsT]] = TemporalRunContext[AgentDepsT], temporalize_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], ], AbstractToolset[AgentDepsT], ] = temporalize_toolset, ): """Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities. After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Args: wrapped: The agent to wrap. name: Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's `name` will be used. models: Optional mapping of model instances to register with the agent. Keys define the names that can be referenced at runtime and the values are `Model` instances. Registered model instances can be passed directly to `run(model=...)`. If the wrapped agent doesn't have a model set and none is provided to `run()`, the first model in this mapping will be used as the default. provider_factory: Optional callable used when instantiating models from provider strings (those supplied at runtime). 
The callable receives the provider name and the current run context, allowing custom configuration such as injecting API keys stored on `deps`. Note: This factory is only used inside Temporal workflows. Outside workflows, model strings are resolved using the default provider behavior. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. activity_config: The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used. model_activity_config: The Temporal activity config to use for model request activities. This is merged with the base activity config. toolset_activity_config: The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. tool_activity_config: The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use IO, you can specify `False` to disable using an activity. Note that the tool is required to be defined as an `async` function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities. run_context_type: The `TemporalRunContext` subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute. temporalize_toolset_func: Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a `TemporalWrapperToolset` that moves methods that require IO to Temporal activities. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs and the run context type. """ super().__init__(wrapped) self._name = name self._event_stream_handler = event_stream_handler self.run_context_type = run_context_type if self.name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with Temporal. The name will be used to identify the agent's activities within the workflow." 
) # start_to_close_timeout is required activity_config = activity_config or ActivityConfig(start_to_close_timeout=timedelta(seconds=60)) # `pydantic_ai.exceptions.UserError` and `pydantic.errors.PydanticUserError` are not retryable retry_policy = activity_config.get('retry_policy') or RetryPolicy() retry_policy.non_retryable_error_types = [ *(retry_policy.non_retryable_error_types or []), UserError.__name__, PydanticUserError.__name__, ] activity_config['retry_policy'] = retry_policy self.activity_config = activity_config model_activity_config = model_activity_config or {} toolset_activity_config = toolset_activity_config or {} tool_activity_config = tool_activity_config or {} activity_name_prefix = f'agent__{self.name}' activities: list[Callable[..., Any]] = [] async def event_stream_handler_activity(params: _EventStreamHandlerParams, deps: AgentDepsT) -> None: # We can never get here without an `event_stream_handler`, as `TemporalAgent.run_stream` and `TemporalAgent.iter` raise an error saying to use `TemporalAgent.run` instead, # and that only ends up calling `event_stream_handler` if it is set. assert self.event_stream_handler is not None run_context = self.run_context_type.deserialize_run_context(params.serialized_run_context, deps=deps) async def streamed_response(): yield params.event await self.event_stream_handler(run_context, streamed_response()) # Set type hint explicitly so that Temporal can take care of serialization and deserialization event_stream_handler_activity.__annotations__['deps'] = self.deps_type self.event_stream_handler_activity = activity.defn(name=f'{activity_name_prefix}__event_stream_handler')( event_stream_handler_activity ) activities.append(self.event_stream_handler_activity) # Get wrapped agent's model if it's a Model instance wrapped_model = wrapped.model if isinstance(wrapped.model, Model) else None temporal_model = TemporalModel( wrapped_model, activity_name_prefix=activity_name_prefix, activity_config=activity_config | model_activity_config, deps_type=self.deps_type, run_context_type=self.run_context_type, event_stream_handler=self.event_stream_handler, models=models, provider_factory=provider_factory, ) activities.extend(temporal_model.temporal_activities) self._temporal_model = temporal_model def temporalize_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: id = toolset.id if id is None: raise UserError( "Toolsets that are 'leaves' (i.e. those that implement their own tool listing and calling) need to have a unique `id` in order to be used with Temporal. The ID will be used to identify the toolset's activities within the workflow." ) toolset = temporalize_toolset_func( toolset, activity_name_prefix, activity_config | toolset_activity_config.get(id, {}), tool_activity_config.get(id, {}), self.deps_type, self.run_context_type, ) if isinstance(toolset, TemporalWrapperToolset): activities.extend(toolset.temporal_activities) return toolset temporal_toolsets = [toolset.visit_and_replace(temporalize_toolset) for toolset in wrapped.toolsets] self._toolsets = temporal_toolsets self._temporal_activities = activities self._temporal_overrides_active: ContextVar[bool] = ContextVar('_temporal_overrides_active', default=False) @property def name(self) -> str | None: return self._name or super().name @name.setter def name(self, value: str | None) -> None: # pragma: no cover raise UserError( 'The agent name cannot be changed after creation. If you need to change the name, create a new agent.' 
) @property def model(self) -> Model: return self._temporal_model @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: handler = self._event_stream_handler or super().event_stream_handler if handler is None: return None elif workflow.in_workflow(): return self._call_event_stream_handler_activity else: return handler async def _call_event_stream_handler_activity( self, ctx: RunContext[AgentDepsT], stream: AsyncIterable[_messages.AgentStreamEvent] ) -> None: serialized_run_context = self.run_context_type.serialize_run_context(ctx) async for event in stream: activity_config: ActivityConfig = {'summary': f'handle event: {event.event_kind}', **self.activity_config} await workflow.execute_activity( activity=self.event_stream_handler_activity, args=[ _EventStreamHandlerParams( event=event, serialized_run_context=serialized_run_context, ), ctx.deps, ], **activity_config, ) @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: with self._temporal_overrides(force=True): return super().toolsets @property def temporal_activities(self) -> list[Callable[..., Any]]: return self._temporal_activities @contextmanager def _temporal_overrides( self, *, model: models.Model | models.KnownModelName | str | None = None, force: bool = False ) -> Iterator[None]: """Context manager for workflow-specific overrides. When called outside a workflow, this is a no-op. When called inside a workflow, it overrides the model and toolsets. """ if not workflow.in_workflow() and not force: yield return # We reset tools here as the temporalized function toolset is already in self._toolsets. # Override model and set the model for workflow execution with ( super().override(model=self._temporal_model, toolsets=self._toolsets, tools=[]), self._temporal_model.using_model(model), _utils.disable_threads(), ): temporal_active_token = self._temporal_overrides_active.set(True) try: yield except PydanticSerializationError as e: raise UserError( "The `deps` object failed to be serialized. Temporal requires all objects that are passed to activities to be serializable using Pydantic's `TypeAdapter`." ) from e finally: self._temporal_overrides_active.reset(temporal_active_token) @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... 
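    # Overload: passing a custom `output_type` for a single run narrows the
    # static type of the returned `AgentRunResult` to that output type.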
@overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. Inside workflows, only registered model instances, registered names, or provider strings are valid. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. 
event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow(): if event_stream_handler is not None: raise UserError( 'Event stream handler cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) resolved_model = None else: resolved_model = self._temporal_model.resolve_model(model) with self._temporal_overrides(model=model): return await super().run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=resolved_model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler or self.event_stream_handler, **_deprecated_kwargs, ) @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... 
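    # Outside a Temporal workflow, `run_sync` simply delegates to the wrapped agent;
    # inside a workflow it raises `UserError`, since workflow code must use
    # `await agent.run(...)` instead of blocking on an event loop.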
def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_sync()` cannot be used inside a Temporal workflow. Use `await agent.run()` instead.' 
) return super().run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]: ... @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. 
``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_stream()` cannot be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as result: yield result @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]]: ... 
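    # Overload: a custom `output_type` changes the payload type of the final
    # `AgentRunResultEvent` yielded by the event stream.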
@overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]]: ... def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. 
usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ if workflow.in_workflow(): raise UserError( '`agent.run_stream_events()` cannot be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) return super().run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... 
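    # Inside a Temporal workflow, `iter()` is only reachable via `agent.run()` (while
    # the temporal overrides are active); calling it directly raises `UserError`.
    # Outside a workflow, a model passed at run time is first resolved through the
    # wrapped `TemporalModel`.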
@asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. 
Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow(): if not self._temporal_overrides_active.get(): raise UserError( '`agent.iter()` cannot be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) assert model is None, 'Temporal overrides must set the model before `agent.iter()` is invoked' if toolsets is not None: raise UserError( 'Toolsets cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) resolved_model = None else: resolved_model = self._temporal_model.resolve_model(model) async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=resolved_model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as run: yield run @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if workflow.in_workflow(): if _utils.is_set(model): raise UserError( 'Model cannot be contextually overridden inside a Temporal workflow, it must be set at agent creation time.' ) if _utils.is_set(toolsets): raise UserError( 'Toolsets cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' ) if _utils.is_set(tools): raise UserError( 'Tools cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' 
) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ```` #### __init__ ```python __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, models: Mapping[str, Model] | None = None, provider_factory: TemporalProviderFactory | None = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, activity_config: ActivityConfig | None = None, model_activity_config: ActivityConfig | None = None, toolset_activity_config: ( dict[str, ActivityConfig] | None ) = None, tool_activity_config: ( dict[ str, dict[str, ActivityConfig | Literal[False]] ] | None ) = None, run_context_type: type[ TemporalRunContext[AgentDepsT] ] = TemporalRunContext[AgentDepsT], temporalize_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], ], AbstractToolset[AgentDepsT], ] = temporalize_toolset ) ``` Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities. After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Parameters: | Name | Type | Description | Default | | -------------------------- | ----------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `wrapped` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to wrap. | *required* | | `name` | \`str | None\` | Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's name will be used. | | `models` | \`Mapping[str, Model] | None\` | Optional mapping of model instances to register with the agent. Keys define the names that can be referenced at runtime and the values are Model instances. Registered model instances can be passed directly to run(model=...). If the wrapped agent doesn't have a model set and none is provided to run(), the first model in this mapping will be used as the default. | | `provider_factory` | \`TemporalProviderFactory | None\` | Optional callable used when instantiating models from provider strings (those supplied at runtime). The callable receives the provider name and the current run context, allowing custom configuration such as injecting API keys stored on deps. Note: This factory is only used inside Temporal workflows. 
Outside workflows, model strings are resolved using the default provider behavior. | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional event stream handler to use instead of the one set on the wrapped agent. | | `activity_config` | \`ActivityConfig | None\` | The base Temporal activity config to use for all activities. If no config is provided, a start_to_close_timeout of 60 seconds is used. | | `model_activity_config` | \`ActivityConfig | None\` | The Temporal activity config to use for model request activities. This is merged with the base activity config. | | `toolset_activity_config` | \`dict[str, ActivityConfig] | None\` | The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. | | `tool_activity_config` | \`dict\[str, dict\[str, ActivityConfig | Literal[False]\]\] | None\` | | `run_context_type` | `type[TemporalRunContext[AgentDepsT]]` | The TemporalRunContext subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the deps, run_id, metadata, retries, tool_call_id, tool_name, tool_call_approved, retry, max_retries, run_step, usage, and partial_output attributes will be available. To make another attribute available, create a TemporalRunContext subclass with a custom serialize_run_context class method that returns a dictionary that includes the attribute. | `TemporalRunContext[AgentDepsT]` | | `temporalize_toolset_func` | \`Callable\[\[AbstractToolset[AgentDepsT], str, ActivityConfig, dict\[str, ActivityConfig | Literal[False]\], type[AgentDepsT], type\[TemporalRunContext[AgentDepsT]\]\], AbstractToolset[AgentDepsT]\]\` | Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a TemporalWrapperToolset that moves methods that require IO to Temporal activities. If not provided, only FunctionToolset and MCPServer will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs and the run context type. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ```python def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, models: Mapping[str, Model] | None = None, provider_factory: TemporalProviderFactory | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, activity_config: ActivityConfig | None = None, model_activity_config: ActivityConfig | None = None, toolset_activity_config: dict[str, ActivityConfig] | None = None, tool_activity_config: dict[str, dict[str, ActivityConfig | Literal[False]]] | None = None, run_context_type: type[TemporalRunContext[AgentDepsT]] = TemporalRunContext[AgentDepsT], temporalize_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], ], AbstractToolset[AgentDepsT], ] = temporalize_toolset, ): """Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities. After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. 
Args: wrapped: The agent to wrap. name: Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's `name` will be used. models: Optional mapping of model instances to register with the agent. Keys define the names that can be referenced at runtime and the values are `Model` instances. Registered model instances can be passed directly to `run(model=...)`. If the wrapped agent doesn't have a model set and none is provided to `run()`, the first model in this mapping will be used as the default. provider_factory: Optional callable used when instantiating models from provider strings (those supplied at runtime). The callable receives the provider name and the current run context, allowing custom configuration such as injecting API keys stored on `deps`. Note: This factory is only used inside Temporal workflows. Outside workflows, model strings are resolved using the default provider behavior. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. activity_config: The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used. model_activity_config: The Temporal activity config to use for model request activities. This is merged with the base activity config. toolset_activity_config: The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. tool_activity_config: The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use IO, you can specify `False` to disable using an activity. Note that the tool is required to be defined as an `async` function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities. run_context_type: The `TemporalRunContext` subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute. temporalize_toolset_func: Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a `TemporalWrapperToolset` that moves methods that require IO to Temporal activities. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs and the run context type. """ super().__init__(wrapped) self._name = name self._event_stream_handler = event_stream_handler self.run_context_type = run_context_type if self.name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with Temporal. The name will be used to identify the agent's activities within the workflow." 
) # start_to_close_timeout is required activity_config = activity_config or ActivityConfig(start_to_close_timeout=timedelta(seconds=60)) # `pydantic_ai.exceptions.UserError` and `pydantic.errors.PydanticUserError` are not retryable retry_policy = activity_config.get('retry_policy') or RetryPolicy() retry_policy.non_retryable_error_types = [ *(retry_policy.non_retryable_error_types or []), UserError.__name__, PydanticUserError.__name__, ] activity_config['retry_policy'] = retry_policy self.activity_config = activity_config model_activity_config = model_activity_config or {} toolset_activity_config = toolset_activity_config or {} tool_activity_config = tool_activity_config or {} activity_name_prefix = f'agent__{self.name}' activities: list[Callable[..., Any]] = [] async def event_stream_handler_activity(params: _EventStreamHandlerParams, deps: AgentDepsT) -> None: # We can never get here without an `event_stream_handler`, as `TemporalAgent.run_stream` and `TemporalAgent.iter` raise an error saying to use `TemporalAgent.run` instead, # and that only ends up calling `event_stream_handler` if it is set. assert self.event_stream_handler is not None run_context = self.run_context_type.deserialize_run_context(params.serialized_run_context, deps=deps) async def streamed_response(): yield params.event await self.event_stream_handler(run_context, streamed_response()) # Set type hint explicitly so that Temporal can take care of serialization and deserialization event_stream_handler_activity.__annotations__['deps'] = self.deps_type self.event_stream_handler_activity = activity.defn(name=f'{activity_name_prefix}__event_stream_handler')( event_stream_handler_activity ) activities.append(self.event_stream_handler_activity) # Get wrapped agent's model if it's a Model instance wrapped_model = wrapped.model if isinstance(wrapped.model, Model) else None temporal_model = TemporalModel( wrapped_model, activity_name_prefix=activity_name_prefix, activity_config=activity_config | model_activity_config, deps_type=self.deps_type, run_context_type=self.run_context_type, event_stream_handler=self.event_stream_handler, models=models, provider_factory=provider_factory, ) activities.extend(temporal_model.temporal_activities) self._temporal_model = temporal_model def temporalize_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: id = toolset.id if id is None: raise UserError( "Toolsets that are 'leaves' (i.e. those that implement their own tool listing and calling) need to have a unique `id` in order to be used with Temporal. The ID will be used to identify the toolset's activities within the workflow." 
) toolset = temporalize_toolset_func( toolset, activity_name_prefix, activity_config | toolset_activity_config.get(id, {}), tool_activity_config.get(id, {}), self.deps_type, self.run_context_type, ) if isinstance(toolset, TemporalWrapperToolset): activities.extend(toolset.temporal_activities) return toolset temporal_toolsets = [toolset.visit_and_replace(temporalize_toolset) for toolset in wrapped.toolsets] self._toolsets = temporal_toolsets self._temporal_activities = activities self._temporal_overrides_active: ContextVar[bool] = ContextVar('_temporal_overrides_active', default=False) ``` #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. 
``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional event stream handler to use for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | --------------------- | ---------------------- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. Inside workflows, only registered model instances, registered names, or provider strings are valid. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow(): if event_stream_handler is not None: raise UserError( 'Event stream handler cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' 
) resolved_model = None else: resolved_model = self._temporal_model.resolve_model(model) with self._temporal_overrides(model=model): return await super().run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=resolved_model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler or self.event_stream_handler, **_deprecated_kwargs, ) ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. 
``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional event stream handler to use for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | --------------------- | ---------------------- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_sync()` cannot be used inside a Temporal workflow. Use `await agent.run()` instead.' ) return super().run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: 
bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. 
| | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. | Returns: | Type | Description | | --------------------------------------------------- | ---------------------- | | `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_stream()` cannot be used inside a Temporal workflow. 
' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as result: yield result ```` #### run_stream_events ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[OutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[RunOutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[Any] ] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps self.run and uses the `event_stream_handler` kwarg to get a stream of events from the run. 
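Inside a Temporal workflow this method raises `UserError`; the suggested pattern is to register an `event_stream_handler` and call `run()` instead. Here is a minimal sketch of that pattern (the handler, model string, and agent name are illustrative assumptions, not part of this API):

```python
from collections.abc import AsyncIterable

from pydantic_ai import Agent, AgentStreamEvent, RunContext
from pydantic_ai.durable_exec.temporal import TemporalAgent


async def handle_events(ctx: RunContext[None], events: AsyncIterable[AgentStreamEvent]) -> None:
    # Illustrative handler: receives the run's stream of events; when the agent
    # runs inside a Temporal workflow, this handler is executed as an activity.
    async for event in events:
        print(event)


agent = Agent('openai:gpt-5.2', name='my_agent')
temporal_agent = TemporalAgent(agent, event_stream_handler=handle_events)
# Inside a workflow, use `await temporal_agent.run(...)`; outside a workflow,
# `run_stream_events()` works as shown in the example below.
```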
Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for self.run, except that `event_stream_handler` is now allowed. Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. 
| | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | --------------------------------- | ---------------------------- | | \`AsyncIterator\[AgentStreamEvent | AgentRunResultEvent[Any]\]\` | | \`AsyncIterator\[AgentStreamEvent | AgentRunResultEvent[Any]\]\` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. 
builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ if workflow.in_workflow(): raise UserError( '`agent.run_stream_events()` cannot be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) return super().run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. 
This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. 
| | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | ------------------------------------------ | ---------------------- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. 
model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow(): if not self._temporal_overrides_active.get(): raise UserError( '`agent.iter()` cannot be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) assert model is None, 'Temporal overrides must set the model before `agent.iter()` is invoked' if toolsets is not None: raise UserError( 'Toolsets cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) resolved_model = None else: resolved_model = self._temporal_model.resolve_model(model) async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=resolved_model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as run: yield run ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](https://ai.pydantic.dev/testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | -------------- | ----------------------------------------- | --------------------------------- | ------------------------------------------------------------------------------------------ | | `name` | \`str | Unset\` | The name to use instead of the name passed to the agent constructor and agent run. | | `deps` | \`AgentDepsT | Unset\` | The dependencies to use instead of the dependencies passed to the agent run. | | `model` | \`Model | KnownModelName | str | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | Unset\` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | | `tools` | \`Sequence\[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]\] | Unset\` | | `instructions` | \`Instructions[AgentDepsT] | Unset\` | The instructions to use instead of the instructions registered with the agent. 
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ```python @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if workflow.in_workflow(): if _utils.is_set(model): raise UserError( 'Model cannot be contextually overridden inside a Temporal workflow, it must be set at agent creation time.' ) if _utils.is_set(toolsets): raise UserError( 'Toolsets cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' ) if _utils.is_set(tools): raise UserError( 'Tools cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' ) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ``` ### LogfirePlugin Bases: `SimplePlugin` Temporal client plugin for Logfire. 
Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_logfire.py` ```python class LogfirePlugin(SimplePlugin): """Temporal client plugin for Logfire.""" def __init__(self, setup_logfire: Callable[[], Logfire] = _default_setup_logfire, *, metrics: bool = True): try: import logfire # noqa: F401 # pyright: ignore[reportUnusedImport] from opentelemetry.trace import get_tracer from temporalio.contrib.opentelemetry import TracingInterceptor except ImportError as _import_error: raise ImportError( 'Please install the `logfire` package to use the Logfire plugin, ' 'you can use the `logfire` optional group — `pip install "pydantic-ai-slim[logfire]"`' ) from _import_error self.setup_logfire = setup_logfire self.metrics = metrics super().__init__( # type: ignore[reportUnknownMemberType] name='LogfirePlugin', client_interceptors=[TracingInterceptor(get_tracer('temporalio'))], ) async def connect_service_client( self, config: ConnectConfig, next: Callable[[ConnectConfig], Awaitable[ServiceClient]] ) -> ServiceClient: logfire = self.setup_logfire() if self.metrics: logfire_config = logfire.config token = logfire_config.token if logfire_config.send_to_logfire and token is not None and logfire_config.metrics is not False: base_url = logfire_config.advanced.generate_base_url(token) metrics_url = base_url + '/v1/metrics' headers = {'Authorization': f'Bearer {token}'} config.runtime = Runtime( telemetry=TelemetryConfig(metrics=OpenTelemetryConfig(url=metrics_url, headers=headers)) ) return await next(config) ``` ### TemporalRunContext Bases: `RunContext[AgentDepsT]` The RunContext subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `tool_call_metadata`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to TemporalAgent. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py` ```python class TemporalRunContext(RunContext[AgentDepsT]): """The [`RunContext`][pydantic_ai.tools.RunContext] subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `tool_call_metadata`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to [`TemporalAgent`][pydantic_ai.durable_exec.temporal.TemporalAgent]. """ def __init__(self, deps: AgentDepsT, **kwargs: Any): self.__dict__ = {**kwargs, 'deps': deps} setattr( self, '__dataclass_fields__', {name: field for name, field in RunContext.__dataclass_fields__.items() if name in self.__dict__}, ) def __getattribute__(self, name: str) -> Any: try: return super().__getattribute__(name) except AttributeError as e: # pragma: no cover if name in RunContext.__dataclass_fields__: raise UserError( f'{self.__class__.__name__!r} object has no attribute {name!r}. 
' 'To make the attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to `TemporalAgent`.' ) else: raise e @classmethod def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]: """Serialize the run context to a `dict[str, Any]`.""" return { 'run_id': ctx.run_id, 'metadata': ctx.metadata, 'retries': ctx.retries, 'tool_call_id': ctx.tool_call_id, 'tool_name': ctx.tool_name, 'tool_call_approved': ctx.tool_call_approved, 'tool_call_metadata': ctx.tool_call_metadata, 'retry': ctx.retry, 'max_retries': ctx.max_retries, 'run_step': ctx.run_step, 'partial_output': ctx.partial_output, 'usage': ctx.usage, } @classmethod def deserialize_run_context(cls, ctx: dict[str, Any], deps: Any) -> TemporalRunContext[Any]: """Deserialize the run context from a `dict[str, Any]`.""" return cls(**ctx, deps=deps) ``` #### serialize_run_context ```python serialize_run_context( ctx: RunContext[Any], ) -> dict[str, Any] ``` Serialize the run context to a `dict[str, Any]`. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py` ```python @classmethod def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]: """Serialize the run context to a `dict[str, Any]`.""" return { 'run_id': ctx.run_id, 'metadata': ctx.metadata, 'retries': ctx.retries, 'tool_call_id': ctx.tool_call_id, 'tool_name': ctx.tool_name, 'tool_call_approved': ctx.tool_call_approved, 'tool_call_metadata': ctx.tool_call_metadata, 'retry': ctx.retry, 'max_retries': ctx.max_retries, 'run_step': ctx.run_step, 'partial_output': ctx.partial_output, 'usage': ctx.usage, } ``` #### deserialize_run_context ```python deserialize_run_context( ctx: dict[str, Any], deps: Any ) -> TemporalRunContext[Any] ``` Deserialize the run context from a `dict[str, Any]`. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py` ```python @classmethod def deserialize_run_context(cls, ctx: dict[str, Any], deps: Any) -> TemporalRunContext[Any]: """Deserialize the run context from a `dict[str, Any]`.""" return cls(**ctx, deps=deps) ``` ### TemporalWrapperToolset Bases: `WrapperToolset[AgentDepsT]`, `ABC` Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_toolset.py` ```python class TemporalWrapperToolset(WrapperToolset[AgentDepsT], ABC): @property def id(self) -> str: # An error is raised in `TemporalAgent` if no `id` is set. assert self.wrapped.id is not None return self.wrapped.id @property @abstractmethod def temporal_activities(self) -> list[Callable[..., Any]]: raise NotImplementedError def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: # Temporalized toolsets cannot be swapped out after the fact. 
return self async def __aenter__(self) -> Self: if not workflow.in_workflow(): # pragma: no cover await self.wrapped.__aenter__() return self async def __aexit__(self, *args: Any) -> bool | None: if not workflow.in_workflow(): # pragma: no cover return await self.wrapped.__aexit__(*args) return None async def _wrap_call_tool_result(self, coro: Awaitable[Any]) -> CallToolResult: try: result = await coro return _ToolReturn(result=result) except ApprovalRequired as e: return _ApprovalRequired(metadata=e.metadata) except CallDeferred as e: return _CallDeferred(metadata=e.metadata) except ModelRetry as e: return _ModelRetry(message=e.message) def _unwrap_call_tool_result(self, result: CallToolResult) -> Any: if isinstance(result, _ToolReturn): return result.result elif isinstance(result, _ApprovalRequired): raise ApprovalRequired(metadata=result.metadata) elif isinstance(result, _CallDeferred): raise CallDeferred(metadata=result.metadata) elif isinstance(result, _ModelRetry): raise ModelRetry(result.message) else: assert_never(result) async def _call_tool_in_activity( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> CallToolResult: """Call a tool inside an activity, re-validating args that were deserialized. The tool args will already have been validated into their proper types in the `ToolManager`, but `execute_activity` would have turned them into simple Python types again, so we need to re-validate them. """ args_dict = tool.args_validator.validate_python(tool_args) return await self._wrap_call_tool_result(self.wrapped.call_tool(name, args_dict, ctx, tool)) ``` ### PydanticAIWorkflow Temporal Workflow base class that provides `__pydantic_ai_agents__` for direct agent registration. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_workflow.py` ```python class PydanticAIWorkflow: """Temporal Workflow base class that provides `__pydantic_ai_agents__` for direct agent registration.""" __pydantic_ai_agents__: Sequence[TemporalAgent[Any, Any]] ``` ### PydanticAIPlugin Bases: `SimplePlugin` Temporal client and worker plugin for Pydantic AI. 
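The two pieces fit together as follows: a workflow class that subclasses `PydanticAIWorkflow` lists its agents in `__pydantic_ai_agents__`, and `PydanticAIPlugin.configure_worker` (shown below) registers those agents' Temporal activities on the worker automatically. Here is a minimal sketch under stated assumptions: a local Temporal server, a hypothetical `'pydantic-ai-queue'` task queue and workflow id, and `temporalio` plugin support on `Client.connect`; it also glosses over Temporal sandbox and retry-policy details.

```python
from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import PydanticAIPlugin, PydanticAIWorkflow, TemporalAgent

# The agent needs a `name` and a `model` at creation time to be wrapped for Temporal.
agent = Agent('openai:gpt-5.2', name='assistant')
temporal_agent = TemporalAgent(agent)


@workflow.defn
class AssistantWorkflow(PydanticAIWorkflow):
    # Picked up by PydanticAIPlugin.configure_worker, which extends the worker's
    # activities with this agent's temporal_activities.
    __pydantic_ai_agents__ = [temporal_agent]

    @workflow.run
    async def run(self, prompt: str) -> str:
        result = await temporal_agent.run(prompt)
        return result.output


async def main():
    client = await Client.connect('localhost:7233', plugins=[PydanticAIPlugin()])
    worker = Worker(client, task_queue='pydantic-ai-queue', workflows=[AssistantWorkflow])
    async with worker:
        await client.execute_workflow(
            AssistantWorkflow.run,
            'What is the capital of France?',
            id='assistant-workflow-1',  # hypothetical workflow id
            task_queue='pydantic-ai-queue',
        )
```

If you prefer to register a single agent's activities on the worker explicitly rather than via `__pydantic_ai_agents__`, the `AgentPlugin` documented below wraps `agent.temporal_activities` for that purpose.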
Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/__init__.py` ```python class PydanticAIPlugin(SimplePlugin): """Temporal client and worker plugin for Pydantic AI.""" def __init__(self): super().__init__( # type: ignore[reportUnknownMemberType] name='PydanticAIPlugin', data_converter=_data_converter, workflow_runner=_workflow_runner, workflow_failure_exception_types=[UserError, PydanticUserError], ) def configure_worker(self, config: WorkerConfig) -> WorkerConfig: config = super().configure_worker(config) workflows = list(config.get('workflows', [])) # type: ignore[reportUnknownMemberType] activities = list(config.get('activities', [])) # type: ignore[reportUnknownMemberType] for workflow_class in workflows: # type: ignore[reportUnknownMemberType] agents = getattr(workflow_class, '__pydantic_ai_agents__', None) # type: ignore[reportUnknownMemberType] if agents is None: continue if not isinstance(agents, Sequence): raise TypeError( # pragma: no cover f'__pydantic_ai_agents__ must be a Sequence of TemporalAgent instances, got {type(agents)}' ) for agent in agents: # type: ignore[reportUnknownVariableType] if not isinstance(agent, TemporalAgent): raise TypeError( # pragma: no cover f'__pydantic_ai_agents__ must be a Sequence of TemporalAgent, got {type(agent)}' # type: ignore[reportUnknownVariableType] ) activities.extend(agent.temporal_activities) # type: ignore[reportUnknownMemberType] config['activities'] = activities return config ``` ### AgentPlugin Bases: `SimplePlugin` Temporal worker plugin for a specific Pydantic AI agent. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/__init__.py` ```python class AgentPlugin(SimplePlugin): """Temporal worker plugin for a specific Pydantic AI agent.""" def __init__(self, agent: TemporalAgent[Any, Any]): super().__init__( # type: ignore[reportUnknownMemberType] name='AgentPlugin', activities=agent.temporal_activities, ) ``` ### DBOSAgent Bases: `WrapperAgent[AgentDepsT, OutputDataT]`, `DBOSConfiguredInstance` Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python @DBOS.dbos_class() class DBOSAgent(WrapperAgent[AgentDepsT, OutputDataT], DBOSConfiguredInstance): _parallel_execution_mode: ParallelExecutionMode def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_step_config: StepConfig | None = None, model_step_config: StepConfig | None = None, parallel_execution_mode: DBOSParallelExecutionMode = 'parallel_ordered_events', ): """Wrap an agent to enable it with DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps. After wrapping, the original agent can still be used as normal outside of the DBOS workflow. Args: wrapped: The agent to wrap. name: Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. mcp_step_config: The base DBOS step config to use for MCP server steps. If no config is provided, use the default settings of DBOS. model_step_config: The DBOS step config to use for model request steps. If no config is provided, use the default settings of DBOS. 
parallel_execution_mode: The mode for executing tool calls: - 'parallel_ordered_events' (default): Run tool calls in parallel, but events are emitted in order, after all calls complete. - 'sequential': Run tool calls one at a time in order. """ super().__init__(wrapped) self._name = name or wrapped.name self._event_stream_handler = event_stream_handler self._parallel_execution_mode = cast(ParallelExecutionMode, parallel_execution_mode) if self._name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with DBOS. The name will be used to identify the agent's workflows and steps." ) # Merge the config with the default DBOS config self._mcp_step_config = mcp_step_config or {} self._model_step_config = model_step_config or {} if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with DBOS, it cannot be set at agent run time.' ) dbos_model = DBOSModel( wrapped.model, step_name_prefix=self._name, step_config=self._model_step_config, event_stream_handler=self.event_stream_handler, ) self._model = dbos_model dbosagent_name = self._name def dbosify_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: # Replace MCPServer with DBOSMCPServer try: from pydantic_ai.mcp import MCPServer from ._mcp_server import DBOSMCPServer except ImportError: pass else: if isinstance(toolset, MCPServer): return DBOSMCPServer( wrapped=toolset, step_name_prefix=dbosagent_name, step_config=self._mcp_step_config, ) # Replace FastMCPToolset with DBOSFastMCPToolset try: from pydantic_ai.toolsets.fastmcp import FastMCPToolset from ._fastmcp_toolset import DBOSFastMCPToolset except ImportError: pass else: if isinstance(toolset, FastMCPToolset): return DBOSFastMCPToolset( wrapped=toolset, step_name_prefix=dbosagent_name, step_config=self._mcp_step_config, ) return toolset dbos_toolsets = [toolset.visit_and_replace(dbosify_toolset) for toolset in wrapped.toolsets] self._toolsets = dbos_toolsets DBOSConfiguredInstance.__init__(self, self._name) # Wrap the `run` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run') async def wrapped_run_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return await super(WrapperAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_workflow = wrapped_run_workflow # 
Wrap the `run_sync` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run_sync') def wrapped_run_sync_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, instructions: Instructions[AgentDepsT] = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return super(DBOSAgent, self).run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_sync_workflow = wrapped_run_sync_workflow @property def name(self) -> str | None: return self._name @name.setter def name(self, value: str | None) -> None: # pragma: no cover raise UserError( 'The agent name cannot be changed after creation. If you need to change the name, create a new agent.' ) @property def model(self) -> Model: return self._model @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: handler = self._event_stream_handler or super().event_stream_handler if handler is None: return None elif DBOS.workflow_id is not None and DBOS.step_id is None: # Special case if it's in a DBOS workflow but not a step, we need to iterate through all events and call the handler. return self._call_event_stream_handler_in_workflow else: return handler async def _call_event_stream_handler_in_workflow( self, ctx: RunContext[AgentDepsT], stream: AsyncIterable[_messages.AgentStreamEvent] ) -> None: handler = self._event_stream_handler or super().event_stream_handler assert handler is not None async def streamed_response(event: _messages.AgentStreamEvent): yield event async for event in stream: await handler(ctx, streamed_response(event)) @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: with self._dbos_overrides(): return super().toolsets @contextmanager def _dbos_overrides(self) -> Iterator[None]: # Override with DBOSModel and DBOSMCPServer in the toolsets. # Use the configured parallel execution mode for deterministic event ordering during DBOS replay. 
with ( super().override(model=self._model, toolsets=self._toolsets, tools=[]), self.parallel_tool_call_execution_mode(self._parallel_execution_mode), ): yield @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. 
deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, DBOSModel): raise UserError( 'Non-DBOS model cannot be set at agent run time inside a DBOS workflow, it must be set at agent creation time.' ) return await self.dbos_wrapped_run_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... 
def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, DBOSModel): # pragma: lax no cover raise UserError( 'Non-DBOS model cannot be set at agent run time inside a DBOS workflow, it must be set at agent creation time.' 
) return self.dbos_wrapped_run_sync_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]: ... @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, instructions: Instructions[AgentDepsT] = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. 
``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if DBOS.workflow_id is not None and DBOS.step_id is None: raise UserError( '`agent.run_stream()` cannot be used inside a DBOS workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) as result: yield result @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]]: ... 
@overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]]: ... def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. 
usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ raise UserError( '`agent.run_stream_events()` cannot be used with DBOS. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. 
This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, DBOSModel): # pragma: lax no cover raise UserError( 'Non-DBOS model cannot be set at agent run time inside a DBOS workflow, it must be set at agent creation time.' 
) with self._dbos_overrides(): async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as run: yield run @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if _utils.is_set(model) and not isinstance(model, (DBOSModel)): raise UserError( 'Non-DBOS model cannot be contextually overridden inside a DBOS workflow, it must be set at agent creation time.' ) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ```` #### __init__ ```python __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, mcp_step_config: StepConfig | None = None, model_step_config: StepConfig | None = None, parallel_execution_mode: DBOSParallelExecutionMode = "parallel_ordered_events" ) ``` Wrap an agent to enable it with DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps. After wrapping, the original agent can still be used as normal outside of the DBOS workflow. Parameters: | Name | Type | Description | Default | | ------------------------- | ---------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | | `wrapped` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to wrap. | *required* | | `name` | \`str | None\` | Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's name will be used. | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional event stream handler to use instead of the one set on the wrapped agent. 
| | `mcp_step_config` | \`StepConfig | None\` | The base DBOS step config to use for MCP server steps. If no config is provided, use the default settings of DBOS. | | `model_step_config` | \`StepConfig | None\` | The DBOS step config to use for model request steps. If no config is provided, use the default settings of DBOS. | | `parallel_execution_mode` | `DBOSParallelExecutionMode` | The mode for executing tool calls: - 'parallel_ordered_events' (default): Run tool calls in parallel, but events are emitted in order, after all calls complete. - 'sequential': Run tool calls one at a time in order. | `'parallel_ordered_events'` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ```python def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_step_config: StepConfig | None = None, model_step_config: StepConfig | None = None, parallel_execution_mode: DBOSParallelExecutionMode = 'parallel_ordered_events', ): """Wrap an agent to enable it with DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps. After wrapping, the original agent can still be used as normal outside of the DBOS workflow. Args: wrapped: The agent to wrap. name: Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. mcp_step_config: The base DBOS step config to use for MCP server steps. If no config is provided, use the default settings of DBOS. model_step_config: The DBOS step config to use for model request steps. If no config is provided, use the default settings of DBOS. parallel_execution_mode: The mode for executing tool calls: - 'parallel_ordered_events' (default): Run tool calls in parallel, but events are emitted in order, after all calls complete. - 'sequential': Run tool calls one at a time in order. """ super().__init__(wrapped) self._name = name or wrapped.name self._event_stream_handler = event_stream_handler self._parallel_execution_mode = cast(ParallelExecutionMode, parallel_execution_mode) if self._name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with DBOS. The name will be used to identify the agent's workflows and steps." ) # Merge the config with the default DBOS config self._mcp_step_config = mcp_step_config or {} self._model_step_config = model_step_config or {} if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with DBOS, it cannot be set at agent run time.' 
) dbos_model = DBOSModel( wrapped.model, step_name_prefix=self._name, step_config=self._model_step_config, event_stream_handler=self.event_stream_handler, ) self._model = dbos_model dbosagent_name = self._name def dbosify_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: # Replace MCPServer with DBOSMCPServer try: from pydantic_ai.mcp import MCPServer from ._mcp_server import DBOSMCPServer except ImportError: pass else: if isinstance(toolset, MCPServer): return DBOSMCPServer( wrapped=toolset, step_name_prefix=dbosagent_name, step_config=self._mcp_step_config, ) # Replace FastMCPToolset with DBOSFastMCPToolset try: from pydantic_ai.toolsets.fastmcp import FastMCPToolset from ._fastmcp_toolset import DBOSFastMCPToolset except ImportError: pass else: if isinstance(toolset, FastMCPToolset): return DBOSFastMCPToolset( wrapped=toolset, step_name_prefix=dbosagent_name, step_config=self._mcp_step_config, ) return toolset dbos_toolsets = [toolset.visit_and_replace(dbosify_toolset) for toolset in wrapped.toolsets] self._toolsets = dbos_toolsets DBOSConfiguredInstance.__init__(self, self._name) # Wrap the `run` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run') async def wrapped_run_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return await super(WrapperAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_workflow = wrapped_run_workflow # Wrap the `run_sync` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run_sync') def wrapped_run_sync_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, instructions: Instructions[AgentDepsT] = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = 
None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return super(DBOSAgent, self).run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_sync_workflow = wrapped_run_sync_workflow ``` #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. 
``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional event stream handler to use for this run. | Returns: | Type | Description | | --------------------- | ---------------------- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, DBOSModel): raise UserError( 'Non-DBOS model cannot be set at agent run time inside a DBOS workflow, it must be set at agent creation time.' ) return await self.dbos_wrapped_run_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: 
AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. 
| | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional event stream handler to use for this run. | Returns: | Type | Description | | --------------------- | ---------------------- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, DBOSModel): # pragma: lax no cover raise UserError( 'Non-DBOS model cannot be set at agent run time inside a DBOS workflow, it must be set at agent creation time.' 
) return self.dbos_wrapped_run_sync_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, instructions: Instructions[AgentDepsT] = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. 
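        # Note: when the agent is wrapped in a DBOSAgent, run_stream() cannot be called
        # from inside a DBOS workflow (a UserError is raised, see the source below);
        # set an `event_stream_handler` on the agent and use `agent.run()` instead.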
``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. | Returns: | Type | Description | | --------------------------------------------------- | ---------------------- | | `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run. 
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if DBOS.workflow_id is not None and DBOS.step_id is None: raise UserError( '`agent.run_stream()` cannot be used inside a DBOS workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) as result: yield result ```` #### run_stream_events ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[OutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[RunOutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[Any] ] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps self.run and uses the `event_stream_handler` kwarg to get a stream of events from the run. 
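With a `DBOSAgent`, this method is not available: the wrapped implementation raises `UserError` and directs you to set an `event_stream_handler` on the agent and use `agent.run()` instead (see the source below). The following is a minimal sketch of that pattern; the `pydantic_ai.durable_exec.dbos` import path, the handler and agent names are assumptions for illustration, and DBOS itself is presumed to be already configured and launched.

```python
from collections.abc import AsyncIterable

from pydantic_ai import Agent, AgentStreamEvent, RunContext
from pydantic_ai.durable_exec.dbos import DBOSAgent  # assumed import path


async def handle_events(ctx: RunContext[None], stream: AsyncIterable[AgentStreamEvent]) -> None:
    # Receives the run's events; with DBOS, the stream is consumed inside the
    # model request step and the final response is checkpointed.
    async for event in stream:
        print(event)


agent = Agent('openai:gpt-5.2', name='streaming_agent', event_stream_handler=handle_events)
dbos_agent = DBOSAgent(agent)  # assumed to wrap the agent as described above


async def main():
    # Instead of run_stream_events(), run the agent normally; events are
    # delivered to the handler registered on the agent.
    result = await dbos_agent.run('What is the capital of France?')
    print(result.output)
```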
Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for self.run, except that `event_stream_handler` is now allowed. Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. 
| | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | --------------------------------- | ---------------------------- | | \`AsyncIterator\[AgentStreamEvent | AgentRunResultEvent[Any]\]\` | | \`AsyncIterator\[AgentStreamEvent | AgentRunResultEvent[Any]\]\` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. 
builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ raise UserError( '`agent.run_stream_events()` cannot be used with DBOS. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
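With a `DBOSAgent`, `iter()` delegates to the wrapped agent with the DBOS-wrapped model and toolsets applied, but the `model` argument cannot be overridden with a non-DBOS model at run time (a `UserError` is raised, as the source below shows). Below is a minimal, hedged sketch of iterating the graph's nodes from inside your own DBOS workflow; the `pydantic_ai.durable_exec.dbos` import path, the workflow name, and the agent name are assumptions, and DBOS is presumed to be already configured and launched.

```python
from dbos import DBOS

from pydantic_ai import Agent
from pydantic_ai.durable_exec.dbos import DBOSAgent  # assumed import path

agent = Agent('openai:gpt-5.2', name='node_counter')
dbos_agent = DBOSAgent(agent)


@DBOS.workflow(name='count_nodes')
async def count_nodes(prompt: str) -> int:
    # Model requests (and any tool calls) made while iterating run as DBOS
    # steps via the wrapped DBOSModel and toolsets.
    n = 0
    async with dbos_agent.iter(prompt) as agent_run:
        async for _node in agent_run:
            n += 1
    return n
```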
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | ------------------------------------------ | ---------------------- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. 
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. 
metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, DBOSModel): # pragma: lax no cover raise UserError( 'Non-DBOS model cannot be set at agent run time inside a DBOS workflow, it must be set at agent creation time.' ) with self._dbos_overrides(): async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as run: yield run ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](https://ai.pydantic.dev/testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | -------------- | ----------------------------------------- | --------------------------------- | ------------------------------------------------------------------------------------------ | | `name` | \`str | Unset\` | The name to use instead of the name passed to the agent constructor and agent run. | | `deps` | \`AgentDepsT | Unset\` | The dependencies to use instead of the dependencies passed to the agent run. | | `model` | \`Model | KnownModelName | str | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | Unset\` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | | `tools` | \`Sequence\[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]\] | Unset\` | | `instructions` | \`Instructions[AgentDepsT] | Unset\` | The instructions to use instead of the instructions registered with the agent. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ```python @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. 
deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if _utils.is_set(model) and not isinstance(model, (DBOSModel)): raise UserError( 'Non-DBOS model cannot be contextually overridden inside a DBOS workflow, it must be set at agent creation time.' ) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ``` ### DBOSParallelExecutionMode ```python DBOSParallelExecutionMode = Literal[ "sequential", "parallel_ordered_events" ] ``` The mode for executing tool calls in DBOS durable workflows. This is a subset of the ParallelExecutionMode because 'parallel' cannot guarantee deterministic ordering. ### DBOSMCPServer Bases: `DBOSMCPToolset[AgentDepsT]` A wrapper for MCPServer that integrates with DBOS, turning call_tool and get_tools to DBOS steps. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_mcp_server.py` ```python class DBOSMCPServer(DBOSMCPToolset[AgentDepsT]): """A wrapper for MCPServer that integrates with DBOS, turning call_tool and get_tools to DBOS steps.""" def __init__( self, wrapped: MCPServer, *, step_name_prefix: str, step_config: StepConfig, ): super().__init__( wrapped, step_name_prefix=step_name_prefix, step_config=step_config, ) def tool_for_tool_def(self, tool_def: ToolDefinition) -> ToolsetTool[AgentDepsT]: assert isinstance(self.wrapped, MCPServer) return self.wrapped.tool_for_tool_def(tool_def) ``` ### DBOSModel Bases: `WrapperModel` A wrapper for Model that integrates with DBOS, turning request and request_stream to DBOS steps. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_model.py` ```python class DBOSModel(WrapperModel): """A wrapper for Model that integrates with DBOS, turning request and request_stream to DBOS steps.""" def __init__( self, model: Model, *, step_name_prefix: str, step_config: StepConfig, event_stream_handler: EventStreamHandler[Any] | None = None, ): super().__init__(model) self.step_config = step_config self.event_stream_handler = event_stream_handler self._step_name_prefix = step_name_prefix # Wrap the request in a DBOS step. @DBOS.step( name=f'{self._step_name_prefix}__model.request', **self.step_config, ) async def wrapped_request_step( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: return await super(DBOSModel, self).request(messages, model_settings, model_request_parameters) self._dbos_wrapped_request_step = wrapped_request_step # Wrap the request_stream in a DBOS step. 
@DBOS.step( name=f'{self._step_name_prefix}__model.request_stream', **self.step_config, ) async def wrapped_request_stream_step( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> ModelResponse: async with super(DBOSModel, self).request_stream( messages, model_settings, model_request_parameters, run_context ) as streamed_response: if self.event_stream_handler is not None: assert run_context is not None, ( 'A DBOS model cannot be used with `pydantic_ai.direct.model_request_stream()` as it requires a `run_context`. Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) await self.event_stream_handler(run_context, streamed_response) async for _ in streamed_response: pass return streamed_response.get() self._dbos_wrapped_request_stream_step = wrapped_request_stream_step async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: return await self._dbos_wrapped_request_step(messages, model_settings, model_request_parameters) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: # If not in a workflow (could be in a step), just call the wrapped request_stream method. if DBOS.workflow_id is None or DBOS.step_id is not None: async with super().request_stream( messages, model_settings, model_request_parameters, run_context ) as streamed_response: yield streamed_response return response = await self._dbos_wrapped_request_stream_step( messages, model_settings, model_request_parameters, run_context ) yield DBOSStreamedResponse(model_request_parameters, response) ``` ### StepConfig Bases: `TypedDict` Configuration for a step in the DBOS workflow. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_utils.py` ```python class StepConfig(TypedDict, total=False): """Configuration for a step in the DBOS workflow.""" retries_allowed: bool interval_seconds: float max_attempts: int backoff_rate: float ``` ### PrefectAgent Bases: `WrapperAgent[AgentDepsT, OutputDataT]` Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python class PrefectAgent(WrapperAgent[AgentDepsT, OutputDataT]): def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_task_config: TaskConfig | None = None, model_task_config: TaskConfig | None = None, tool_task_config: TaskConfig | None = None, tool_task_config_by_name: dict[str, TaskConfig | None] | None = None, event_stream_handler_task_config: TaskConfig | None = None, prefectify_toolset_func: Callable[ [AbstractToolset[AgentDepsT], TaskConfig, TaskConfig, dict[str, TaskConfig | None]], AbstractToolset[AgentDepsT], ] = prefectify_toolset, ): """Wrap an agent to enable it with Prefect durable flows, by automatically offloading model requests, tool calls, and MCP server communication to Prefect tasks. After wrapping, the original agent can still be used as normal outside of the Prefect flow. Args: wrapped: The agent to wrap. name: Optional unique agent name to use as the Prefect flow name prefix. If not provided, the agent's `name` will be used. 
event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. mcp_task_config: The base Prefect task config to use for MCP server tasks. If no config is provided, use the default settings of Prefect. model_task_config: The Prefect task config to use for model request tasks. If no config is provided, use the default settings of Prefect. tool_task_config: The default Prefect task config to use for tool calls. If no config is provided, use the default settings of Prefect. tool_task_config_by_name: Per-tool task configuration. Keys are tool names, values are TaskConfig or None (None disables task wrapping for that tool). event_stream_handler_task_config: The Prefect task config to use for the event stream handler task. If no config is provided, use the default settings of Prefect. prefectify_toolset_func: Optional function to use to prepare toolsets for Prefect by wrapping them in a `PrefectWrapperToolset` that moves methods that require IO to Prefect tasks. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Prefect. The function takes the toolset, the task config, the tool-specific task config, and the tool-specific task config by name. """ super().__init__(wrapped) self._name = name or wrapped.name self._event_stream_handler = event_stream_handler if self._name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with Prefect. The name will be used to identify the agent's flows and tasks." ) # Merge the config with the default Prefect config self._mcp_task_config = default_task_config | (mcp_task_config or {}) self._model_task_config = default_task_config | (model_task_config or {}) self._tool_task_config = default_task_config | (tool_task_config or {}) self._tool_task_config_by_name = tool_task_config_by_name or {} self._event_stream_handler_task_config = default_task_config | (event_stream_handler_task_config or {}) if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with Prefect, it cannot be set at agent run time.' ) prefect_model = PrefectModel( wrapped.model, task_config=self._model_task_config, event_stream_handler=self.event_stream_handler, ) self._model = prefect_model def _prefectify_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: """Convert a toolset to its Prefect equivalent.""" return prefectify_toolset_func( toolset, self._mcp_task_config, self._tool_task_config, self._tool_task_config_by_name, ) prefect_toolsets = [toolset.visit_and_replace(_prefectify_toolset) for toolset in wrapped.toolsets] self._toolsets = prefect_toolsets # Context variable to track when we're inside this agent's Prefect flow self._in_prefect_agent_flow: ContextVar[bool] = ContextVar( f'_in_prefect_agent_flow_{self._name}', default=False ) @property def name(self) -> str | None: return self._name @name.setter def name(self, value: str | None) -> None: # pragma: no cover raise UserError( 'The agent name cannot be changed after creation. If you need to change the name, create a new agent.' ) @property def model(self) -> Model: return self._model @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: handler = self._event_stream_handler or super().event_stream_handler if handler is None: return None elif FlowRunContext.get() is not None: # Special case if it's in a Prefect flow, we need to iterate through all events and call the handler. 
return self._call_event_stream_handler_in_flow else: return handler async def _call_event_stream_handler_in_flow( self, ctx: RunContext[AgentDepsT], stream: AsyncIterable[_messages.AgentStreamEvent] ) -> None: handler = self._event_stream_handler or super().event_stream_handler assert handler is not None # Create a task to handle each event @task(name='Handle Stream Event', **self._event_stream_handler_task_config) async def event_stream_handler_task(event: _messages.AgentStreamEvent) -> None: async def streamed_response(): yield event await handler(ctx, streamed_response()) async for event in stream: await event_stream_handler_task(event) @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: with self._prefect_overrides(): return super().toolsets @contextmanager def _prefect_overrides(self) -> Iterator[None]: # Override with PrefectModel and PrefectMCPServer in the toolsets. with super().override(model=self._model, toolsets=self._toolsets, tools=[]): yield @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... 
async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" @flow(name=f'{self._name} Run') async def wrapped_run_flow() -> AgentRunResult[Any]: # Mark that we're inside a PrefectAgent flow token = self._in_prefect_agent_flow.set(True) try: with self._prefect_overrides(): result = await super(WrapperAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, ) return result finally: self._in_prefect_agent_flow.reset(token) return await wrapped_run_flow() @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. 
You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ @flow(name=f'{self._name} Sync Run') def wrapped_run_sync_flow() -> AgentRunResult[Any]: # Mark that we're inside a PrefectAgent flow token = self._in_prefect_agent_flow.set(True) try: with self._prefect_overrides(): # Using `run_coro_as_sync` from Prefect with async `run` to avoid event loop conflicts. result = run_coro_as_sync( super(PrefectAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, ) ) return result finally: self._in_prefect_agent_flow.reset(token) return wrapped_run_sync_flow() @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]: ... 
@overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. 
It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if FlowRunContext.get() is not None: raise UserError( '`agent.run_stream()` cannot be used inside a Prefect flow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as result: yield result @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]]: ... @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]]: ... def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. 
This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ if FlowRunContext.get() is not None: raise UserError( '`agent.run_stream_events()` cannot be used inside a Prefect flow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) return super().run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. instructions: Optional additional instructions to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, PrefectModel): raise UserError( 'Non-Prefect model cannot be set at agent run time inside a Prefect flow, it must be set at agent creation time.' ) with self._prefect_overrides(): async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, ) as run: yield run @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. 
You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if _utils.is_set(model) and not isinstance(model, PrefectModel): raise UserError( 'Non-Prefect model cannot be contextually overridden inside a Prefect flow, it must be set at agent creation time.' ) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions ): yield ```` #### __init__ ```python __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, mcp_task_config: TaskConfig | None = None, model_task_config: TaskConfig | None = None, tool_task_config: TaskConfig | None = None, tool_task_config_by_name: ( dict[str, TaskConfig | None] | None ) = None, event_stream_handler_task_config: ( TaskConfig | None ) = None, prefectify_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], TaskConfig, TaskConfig, dict[str, TaskConfig | None], ], AbstractToolset[AgentDepsT], ] = prefectify_toolset ) ``` Wrap an agent to enable it with Prefect durable flows, by automatically offloading model requests, tool calls, and MCP server communication to Prefect tasks. After wrapping, the original agent can still be used as normal outside of the Prefect flow. Parameters: | Name | Type | Description | Default | | ---------------------------------- | ---------------------------------------------------------------------------------------- | ----------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `wrapped` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to wrap. | *required* | | `name` | \`str | None\` | Optional unique agent name to use as the Prefect flow name prefix. If not provided, the agent's name will be used. | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional event stream handler to use instead of the one set on the wrapped agent. | | `mcp_task_config` | \`TaskConfig | None\` | The base Prefect task config to use for MCP server tasks. If no config is provided, use the default settings of Prefect. | | `model_task_config` | \`TaskConfig | None\` | The Prefect task config to use for model request tasks. If no config is provided, use the default settings of Prefect. | | `tool_task_config` | \`TaskConfig | None\` | The default Prefect task config to use for tool calls. If no config is provided, use the default settings of Prefect. | | `tool_task_config_by_name` | \`dict\[str, TaskConfig | None\] | None\` | | `event_stream_handler_task_config` | \`TaskConfig | None\` | The Prefect task config to use for the event stream handler task. 
If no config is provided, use the default settings of Prefect. | | `prefectify_toolset_func` | \`Callable\[\[AbstractToolset[AgentDepsT], TaskConfig, TaskConfig, dict\[str, TaskConfig | None\]\], AbstractToolset[AgentDepsT]\]\` | Optional function to use to prepare toolsets for Prefect by wrapping them in a PrefectWrapperToolset that moves methods that require IO to Prefect tasks. If not provided, only FunctionToolset and MCPServer will be prepared for Prefect. The function takes the toolset, the task config, the tool-specific task config, and the tool-specific task config by name. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ```python def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_task_config: TaskConfig | None = None, model_task_config: TaskConfig | None = None, tool_task_config: TaskConfig | None = None, tool_task_config_by_name: dict[str, TaskConfig | None] | None = None, event_stream_handler_task_config: TaskConfig | None = None, prefectify_toolset_func: Callable[ [AbstractToolset[AgentDepsT], TaskConfig, TaskConfig, dict[str, TaskConfig | None]], AbstractToolset[AgentDepsT], ] = prefectify_toolset, ): """Wrap an agent to enable it with Prefect durable flows, by automatically offloading model requests, tool calls, and MCP server communication to Prefect tasks. After wrapping, the original agent can still be used as normal outside of the Prefect flow. Args: wrapped: The agent to wrap. name: Optional unique agent name to use as the Prefect flow name prefix. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. mcp_task_config: The base Prefect task config to use for MCP server tasks. If no config is provided, use the default settings of Prefect. model_task_config: The Prefect task config to use for model request tasks. If no config is provided, use the default settings of Prefect. tool_task_config: The default Prefect task config to use for tool calls. If no config is provided, use the default settings of Prefect. tool_task_config_by_name: Per-tool task configuration. Keys are tool names, values are TaskConfig or None (None disables task wrapping for that tool). event_stream_handler_task_config: The Prefect task config to use for the event stream handler task. If no config is provided, use the default settings of Prefect. prefectify_toolset_func: Optional function to use to prepare toolsets for Prefect by wrapping them in a `PrefectWrapperToolset` that moves methods that require IO to Prefect tasks. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Prefect. The function takes the toolset, the task config, the tool-specific task config, and the tool-specific task config by name. """ super().__init__(wrapped) self._name = name or wrapped.name self._event_stream_handler = event_stream_handler if self._name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with Prefect. The name will be used to identify the agent's flows and tasks." 
) # Merge the config with the default Prefect config self._mcp_task_config = default_task_config | (mcp_task_config or {}) self._model_task_config = default_task_config | (model_task_config or {}) self._tool_task_config = default_task_config | (tool_task_config or {}) self._tool_task_config_by_name = tool_task_config_by_name or {} self._event_stream_handler_task_config = default_task_config | (event_stream_handler_task_config or {}) if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with Prefect, it cannot be set at agent run time.' ) prefect_model = PrefectModel( wrapped.model, task_config=self._model_task_config, event_stream_handler=self.event_stream_handler, ) self._model = prefect_model def _prefectify_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: """Convert a toolset to its Prefect equivalent.""" return prefectify_toolset_func( toolset, self._mcp_task_config, self._tool_task_config, self._tool_task_config_by_name, ) prefect_toolsets = [toolset.visit_and_replace(_prefectify_toolset) for toolset in wrapped.toolsets] self._toolsets = prefect_toolsets # Context variable to track when we're inside this agent's Prefect flow self._in_prefect_agent_flow: ContextVar[bool] = ContextVar( f'_in_prefect_agent_flow_{self._name}', default=False ) ``` #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, 
toolsets: (
        Sequence[AbstractToolset[AgentDepsT]] | None
    ) = None,
    builtin_tools: (
        Sequence[
            AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]
        ]
        | None
    ) = None,
    event_stream_handler: (
        EventStreamHandler[AgentDepsT] | None
    ) = None,
    **_deprecated_kwargs: Never
) -> AgentRunResult[Any]
```

Run the agent with a user prompt in async mode.

This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned.

Example:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')

async def main():
    agent_run = await agent.run('What is the capital of France?')
    print(agent_run.output)
    #> The capital of France is Paris.
```

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `user_prompt` | `str \| Sequence[UserContent] \| None` | User input to start/continue the conversation. | `None` |
| `output_type` | `OutputSpec[RunOutputDataT] \| None` | Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` |
| `message_history` | `Sequence[ModelMessage] \| None` | History of the conversation so far. | `None` |
| `deferred_tool_results` | `DeferredToolResults \| None` | Optional results for deferred tool calls in the message history. | `None` |
| `model` | `Model \| KnownModelName \| str \| None` | Optional model to use for this run, required if `model` was not set when creating the agent. | `None` |
| `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` |
| `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` |
| `model_settings` | `ModelSettings \| None` | Optional settings to use for this model's request. | `None` |
| `usage_limits` | `UsageLimits \| None` | Optional limits on model request count or token usage. | `None` |
| `usage` | `RunUsage \| None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` |
| `metadata` | `AgentMetadata[AgentDepsT] \| None` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking `RunContext`; merged with the agent's configured metadata. | `None` |
| `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` |
| `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] \| None` | Optional additional toolsets for this run. | `None` |
| `event_stream_handler` | `EventStreamHandler[AgentDepsT] \| None` | Optional event stream handler to use for this run. | `None` |
| `builtin_tools` | `Sequence[AbstractBuiltinTool \| BuiltinToolFunc[AgentDepsT]] \| None` | Optional additional builtin tools for this run. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `AgentRunResult[Any]` | The result of the run.
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" @flow(name=f'{self._name} Run') async def wrapped_run_flow() -> AgentRunResult[Any]: # Mark that we're inside a PrefectAgent flow token = self._in_prefect_agent_flow.set(True) try: with self._prefect_overrides(): result = await super(WrapperAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, ) return result finally: self._in_prefect_agent_flow.reset(token) return await wrapped_run_flow() ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `event_stream_handler` | \`EventStreamHandler[AgentDepsT] | None\` | Optional event stream handler to use for this run. | | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | --------------------- | ---------------------- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. 
This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ @flow(name=f'{self._name} Sync Run') def wrapped_run_sync_flow() -> AgentRunResult[Any]: # Mark that we're inside a PrefectAgent flow token = self._in_prefect_agent_flow.set(True) try: with self._prefect_overrides(): # Using `run_coro_as_sync` from Prefect with async `run` to avoid event loop conflicts. 
result = run_coro_as_sync( super(PrefectAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, ) ) return result finally: self._in_prefect_agent_flow.reset(token) return wrapped_run_sync_flow() ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. 
```

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `user_prompt` | `str \| Sequence[UserContent] \| None` | User input to start/continue the conversation. | `None` |
| `output_type` | `OutputSpec[RunOutputDataT] \| None` | Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` |
| `message_history` | `Sequence[ModelMessage] \| None` | History of the conversation so far. | `None` |
| `deferred_tool_results` | `DeferredToolResults \| None` | Optional results for deferred tool calls in the message history. | `None` |
| `model` | `Model \| KnownModelName \| str \| None` | Optional model to use for this run, required if `model` was not set when creating the agent. | `None` |
| `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` |
| `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` |
| `model_settings` | `ModelSettings \| None` | Optional settings to use for this model's request. | `None` |
| `usage_limits` | `UsageLimits \| None` | Optional limits on model request count or token usage. | `None` |
| `usage` | `RunUsage \| None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` |
| `metadata` | `AgentMetadata[AgentDepsT] \| None` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking `RunContext`; merged with the agent's configured metadata. | `None` |
| `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` |
| `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] \| None` | Optional additional toolsets for this run. | `None` |
| `builtin_tools` | `Sequence[AbstractBuiltinTool \| BuiltinToolFunc[AgentDepsT]] \| None` | Optional additional builtin tools for this run. | `None` |
| `event_stream_handler` | `EventStreamHandler[AgentDepsT] \| None` | Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run.
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if FlowRunContext.get() is not None: raise UserError( '`agent.run_stream()` cannot be used inside a Prefect flow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as result: yield result ```` #### run_stream_events ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[OutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[RunOutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[Any] ] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps self.run and uses the `event_stream_handler` kwarg to get a stream of events from the run. 
Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for self.run, except that `event_stream_handler` is now allowed. Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. 
| | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | --------------------------------- | ---------------------------- | | \`AsyncIterator\[AgentStreamEvent | AgentRunResultEvent[Any]\]\` | | \`AsyncIterator\[AgentStreamEvent | AgentRunResultEvent[Any]\]\` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-5.2') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), PartEndEvent( index=0, part=TextPart(content='The capital of France is Paris. ') ), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. 
builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ if FlowRunContext.get() is not None: raise UserError( '`agent.run_stream_events()` cannot be used inside a Prefect flow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) return super().run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[ AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT] ] | None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. 
This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `user_prompt` | \`str | Sequence[UserContent] | None\` | | `output_type` | \`OutputSpec[RunOutputDataT] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. 
| | `builtin_tools` | \`Sequence\[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]\] | None\` | Returns: | Type | Description | | ------------------------------------------ | ---------------------- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool | BuiltinToolFunc[AgentDepsT]] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. instructions: Optional additional instructions to use for this run. 
model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, PrefectModel): raise UserError( 'Non-Prefect model cannot be set at agent run time inside a Prefect flow, it must be set at agent creation time.' ) with self._prefect_overrides(): async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, ) as run: yield run ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](https://ai.pydantic.dev/testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | -------------- | ----------------------------------------- | --------------------------------- | ------------------------------------------------------------------------------------------ | | `name` | \`str | Unset\` | The name to use instead of the name passed to the agent constructor and agent run. | | `deps` | \`AgentDepsT | Unset\` | The dependencies to use instead of the dependencies passed to the agent run. | | `model` | \`Model | KnownModelName | str | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | Unset\` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | | `tools` | \`Sequence\[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]\] | Unset\` | | `instructions` | \`Instructions[AgentDepsT] | Unset\` | The instructions to use instead of the instructions registered with the agent. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ```python @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. 
You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if _utils.is_set(model) and not isinstance(model, PrefectModel): raise UserError( 'Non-Prefect model cannot be contextually overridden inside a Prefect flow, it must be set at agent creation time.' ) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions ): yield ``` ### PrefectFunctionToolset Bases: `PrefectWrapperToolset[AgentDepsT]` A wrapper for FunctionToolset that integrates with Prefect, turning tool calls into Prefect tasks. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_function_toolset.py` ```python class PrefectFunctionToolset(PrefectWrapperToolset[AgentDepsT]): """A wrapper for FunctionToolset that integrates with Prefect, turning tool calls into Prefect tasks.""" def __init__( self, wrapped: FunctionToolset[AgentDepsT], *, task_config: TaskConfig, tool_task_config: dict[str, TaskConfig | None], ): super().__init__(wrapped) self._task_config = default_task_config | (task_config or {}) self._tool_task_config = tool_task_config or {} @task async def _call_tool_task( tool_name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any: return await super(PrefectFunctionToolset, self).call_tool(tool_name, tool_args, ctx, tool) self._call_tool_task = _call_tool_task async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any: """Call a tool, wrapped as a Prefect task with a descriptive name.""" # Check if this specific tool has custom config or is disabled tool_specific_config = self._tool_task_config.get(name, default_task_config) if tool_specific_config is None: # None means this tool should not be wrapped as a task return await super().call_tool(name, tool_args, ctx, tool) # Merge tool-specific config with default config merged_config = self._task_config | tool_specific_config return await self._call_tool_task.with_options(name=f'Call Tool: {name}', **merged_config)( name, tool_args, ctx, tool ) ``` #### call_tool ```python call_tool( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any ``` Call a tool, wrapped as a Prefect task with a descriptive name. 
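A per-tool entry in `tool_task_config` takes precedence over the toolset-wide `task_config`, and an explicit `None` entry means that tool is called directly rather than wrapped as a Prefect task. The precedence itself is plain `dict` union, as in this minimal sketch; the keys and values are illustrative, and it assumes `TaskConfig` is importable from `pydantic_ai.durable_exec.prefect`:

```python
from pydantic_ai.durable_exec.prefect import TaskConfig

# Illustrative values only: later mappings win key by key, mirroring how
# `call_tool` merges the default config, the toolset-wide config, and the
# per-tool config before handing the result to `task.with_options()`.
default_config: TaskConfig = {'retries': 0, 'persist_result': True}
toolset_config: TaskConfig = {'retries': 3, 'timeout_seconds': 30.0}
per_tool_config: TaskConfig = {'retries': 5}

merged = default_config | toolset_config | per_tool_config
print(merged)
#> {'retries': 5, 'persist_result': True, 'timeout_seconds': 30.0}
```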
Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_function_toolset.py` ```python async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any: """Call a tool, wrapped as a Prefect task with a descriptive name.""" # Check if this specific tool has custom config or is disabled tool_specific_config = self._tool_task_config.get(name, default_task_config) if tool_specific_config is None: # None means this tool should not be wrapped as a task return await super().call_tool(name, tool_args, ctx, tool) # Merge tool-specific config with default config merged_config = self._task_config | tool_specific_config return await self._call_tool_task.with_options(name=f'Call Tool: {name}', **merged_config)( name, tool_args, ctx, tool ) ``` ### PrefectMCPServer Bases: `PrefectWrapperToolset[AgentDepsT]`, `ABC` A wrapper for MCPServer that integrates with Prefect, turning call_tool and get_tools into Prefect tasks. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_mcp_server.py` ```python class PrefectMCPServer(PrefectWrapperToolset[AgentDepsT], ABC): """A wrapper for MCPServer that integrates with Prefect, turning call_tool and get_tools into Prefect tasks.""" def __init__( self, wrapped: MCPServer, *, task_config: TaskConfig, ): super().__init__(wrapped) self._task_config = default_task_config | (task_config or {}) self._mcp_id = wrapped.id @task async def _call_tool_task( tool_name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult: return await super(PrefectMCPServer, self).call_tool(tool_name, tool_args, ctx, tool) self._call_tool_task = _call_tool_task async def __aenter__(self) -> Self: await self.wrapped.__aenter__() return self async def __aexit__(self, *args: Any) -> bool | None: return await self.wrapped.__aexit__(*args) async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult: """Call an MCP tool, wrapped as a Prefect task with a descriptive name.""" return await self._call_tool_task.with_options(name=f'Call MCP Tool: {name}', **self._task_config)( name, tool_args, ctx, tool ) ``` #### call_tool ```python call_tool( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult ``` Call an MCP tool, wrapped as a Prefect task with a descriptive name. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_mcp_server.py` ```python async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult: """Call an MCP tool, wrapped as a Prefect task with a descriptive name.""" return await self._call_tool_task.with_options(name=f'Call MCP Tool: {name}', **self._task_config)( name, tool_args, ctx, tool ) ``` ### PrefectModel Bases: `WrapperModel` A wrapper for Model that integrates with Prefect, turning request and request_stream into Prefect tasks. 
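When running inside a Prefect flow, `request_stream` consumes the whole stream within a task and returns a non-streaming response, which is why the agent-level streaming methods above raise `UserError` in a flow. The pattern those errors point to is to set an `event_stream_handler` on the agent and call `run()` instead. A minimal sketch under those assumptions (that `event_stream_handler` can be set when constructing the `Agent`, and that `PrefectAgent` wraps it); the model name, flow, and handler body are illustrative:

```python
from prefect import flow

from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent


async def print_events(ctx, event_stream):
    # Receives run events (text deltas, tool calls, ...) as they happen,
    # even though the model request itself runs inside a Prefect task.
    async for event in event_stream:
        print(event)


agent = Agent('openai:gpt-5.2', event_stream_handler=print_events)
prefect_agent = PrefectAgent(agent)


@flow
async def answer_question() -> str:
    result = await prefect_agent.run('What is the capital of the UK?')
    return result.output
```

Outside of a flow, `request_stream` behaves normally, so the same agent can still be used with `run_stream()` there.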
Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_model.py` ```python class PrefectModel(WrapperModel): """A wrapper for Model that integrates with Prefect, turning request and request_stream into Prefect tasks.""" def __init__( self, model: Any, *, task_config: TaskConfig, event_stream_handler: EventStreamHandler[Any] | None = None, ): super().__init__(model) self.task_config = default_task_config | (task_config or {}) self.event_stream_handler = event_stream_handler @task async def wrapped_request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: response = await super(PrefectModel, self).request(messages, model_settings, model_request_parameters) return response self._wrapped_request = wrapped_request @task async def request_stream_task( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ctx: RunContext[Any] | None, ) -> ModelResponse: async with super(PrefectModel, self).request_stream( messages, model_settings, model_request_parameters, ctx ) as streamed_response: if self.event_stream_handler is not None: assert ctx is not None, ( 'A Prefect model cannot be used with `pydantic_ai.direct.model_request_stream()` as it requires a `run_context`. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) await self.event_stream_handler(ctx, streamed_response) # Consume the entire stream async for _ in streamed_response: pass response = streamed_response.get() return response self._wrapped_request_stream = request_stream_task async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a model request, wrapped as a Prefect task when in a flow.""" return await self._wrapped_request.with_options( name=f'Model Request: {self.wrapped.model_name}', **self.task_config )(messages, model_settings, model_request_parameters) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming model request. When inside a Prefect flow, the stream is consumed within a task and a non-streaming response is returned. When not in a flow, behaves normally. """ # Check if we're in a flow context flow_run_context = FlowRunContext.get() # If not in a flow, just call the wrapped request_stream method if flow_run_context is None: async with super().request_stream( messages, model_settings, model_request_parameters, run_context ) as streamed_response: yield streamed_response return # If in a flow, consume the stream in a task and return the final response response = await self._wrapped_request_stream.with_options( name=f'Model Request (Streaming): {self.wrapped.model_name}', **self.task_config )(messages, model_settings, model_request_parameters, run_context) yield PrefectStreamedResponse(model_request_parameters, response) ``` #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a model request, wrapped as a Prefect task when in a flow. 
Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_model.py` ```python async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a model request, wrapped as a Prefect task when in a flow.""" return await self._wrapped_request.with_options( name=f'Model Request: {self.wrapped.model_name}', **self.task_config )(messages, model_settings, model_request_parameters) ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming model request. When inside a Prefect flow, the stream is consumed within a task and a non-streaming response is returned. When not in a flow, behaves normally. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_model.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming model request. When inside a Prefect flow, the stream is consumed within a task and a non-streaming response is returned. When not in a flow, behaves normally. """ # Check if we're in a flow context flow_run_context = FlowRunContext.get() # If not in a flow, just call the wrapped request_stream method if flow_run_context is None: async with super().request_stream( messages, model_settings, model_request_parameters, run_context ) as streamed_response: yield streamed_response return # If in a flow, consume the stream in a task and return the final response response = await self._wrapped_request_stream.with_options( name=f'Model Request (Streaming): {self.wrapped.model_name}', **self.task_config )(messages, model_settings, model_request_parameters, run_context) yield PrefectStreamedResponse(model_request_parameters, response) ``` ### TaskConfig Bases: `TypedDict` Configuration for a task in Prefect. These options are passed to the `@task` decorator. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_types.py` ```python class TaskConfig(TypedDict, total=False): """Configuration for a task in Prefect. These options are passed to the `@task` decorator. """ retries: int """Maximum number of retries for the task.""" retry_delay_seconds: float | list[float] """Delay between retries in seconds. Can be a single value or a list for custom backoff.""" timeout_seconds: float """Maximum time in seconds for the task to complete.""" cache_policy: CachePolicy """Prefect cache policy for the task.""" persist_result: bool """Whether to persist the task result.""" result_storage: ResultStorage """Prefect result storage for the task. Should be a storage block or a block slug like `s3-bucket/my-storage`.""" log_prints: bool """Whether to log print statements from the task.""" ``` #### retries ```python retries: int ``` Maximum number of retries for the task. #### retry_delay_seconds ```python retry_delay_seconds: float | list[float] ``` Delay between retries in seconds. Can be a single value or a list for custom backoff. #### timeout_seconds ```python timeout_seconds: float ``` Maximum time in seconds for the task to complete. #### cache_policy ```python cache_policy: CachePolicy ``` Prefect cache policy for the task. 
#### persist_result ```python persist_result: bool ``` Whether to persist the task result. #### result_storage ```python result_storage: ResultStorage ``` Prefect result storage for the task. Should be a storage block or a block slug like `s3-bucket/my-storage`. #### log_prints ```python log_prints: bool ``` Whether to log print statements from the task. # `pydantic_ai.embeddings` ### EmbeddingModel Bases: `ABC` Abstract base class for embedding models. Implement this class to create a custom embedding model. For most use cases, use one of the built-in implementations: - OpenAIEmbeddingModel - CohereEmbeddingModel - GoogleEmbeddingModel - BedrockEmbeddingModel - SentenceTransformerEmbeddingModel Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python class EmbeddingModel(ABC): """Abstract base class for embedding models. Implement this class to create a custom embedding model. For most use cases, use one of the built-in implementations: - [`OpenAIEmbeddingModel`][pydantic_ai.embeddings.openai.OpenAIEmbeddingModel] - [`CohereEmbeddingModel`][pydantic_ai.embeddings.cohere.CohereEmbeddingModel] - [`GoogleEmbeddingModel`][pydantic_ai.embeddings.google.GoogleEmbeddingModel] - [`BedrockEmbeddingModel`][pydantic_ai.embeddings.bedrock.BedrockEmbeddingModel] - [`SentenceTransformerEmbeddingModel`][pydantic_ai.embeddings.sentence_transformers.SentenceTransformerEmbeddingModel] """ _settings: EmbeddingSettings | None = None def __init__( self, *, settings: EmbeddingSettings | None = None, ) -> None: """Initialize the model with optional settings. Args: settings: Model-specific settings that will be used as defaults for this model. """ self._settings = settings @property def settings(self) -> EmbeddingSettings | None: """Get the default settings for this model.""" return self._settings @property def base_url(self) -> str | None: """The base URL for the provider API, if available.""" return None @property @abstractmethod def model_name(self) -> str: """The name of the embedding model.""" raise NotImplementedError() @property @abstractmethod def system(self) -> str: """The embedding model provider/system identifier (e.g., 'openai', 'cohere').""" raise NotImplementedError() @abstractmethod async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Generate embeddings for the given inputs. Args: inputs: A single string or sequence of strings to embed. input_type: Whether the inputs are queries or documents. settings: Optional settings to override the model's defaults. Returns: An [`EmbeddingResult`][pydantic_ai.embeddings.EmbeddingResult] containing the embeddings and metadata. """ raise NotImplementedError def prepare_embed( self, inputs: str | Sequence[str], settings: EmbeddingSettings | None = None ) -> tuple[list[str], EmbeddingSettings]: """Prepare the inputs and settings for embedding. This method normalizes inputs to a list and merges settings. Subclasses should call this at the start of their `embed()` implementation. Args: inputs: A single string or sequence of strings. settings: Optional settings to merge with defaults. Returns: A tuple of (normalized inputs list, merged settings). """ inputs = [inputs] if isinstance(inputs, str) else list(inputs) settings = merge_embedding_settings(self._settings, settings) or {} return inputs, settings async def max_input_tokens(self) -> int | None: """Get the maximum number of tokens that can be input to the model. 
Returns: The maximum token count, or `None` if unknown. """ return None # pragma: no cover async def count_tokens(self, text: str) -> int: """Count the number of tokens in the given text. Args: text: The text to tokenize and count. Returns: The number of tokens. Raises: NotImplementedError: If the model doesn't support token counting. UserError: If the model or tokenizer is not supported. """ raise NotImplementedError ``` #### __init__ ```python __init__( *, settings: EmbeddingSettings | None = None ) -> None ``` Initialize the model with optional settings. Parameters: | Name | Type | Description | Default | | ---------- | ------------------- | ----------- | --------------------------------------------------------------------- | | `settings` | \`EmbeddingSettings | None\` | Model-specific settings that will be used as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python def __init__( self, *, settings: EmbeddingSettings | None = None, ) -> None: """Initialize the model with optional settings. Args: settings: Model-specific settings that will be used as defaults for this model. """ self._settings = settings ``` #### settings ```python settings: EmbeddingSettings | None ``` Get the default settings for this model. #### base_url ```python base_url: str | None ``` The base URL for the provider API, if available. #### model_name ```python model_name: str ``` The name of the embedding model. #### system ```python system: str ``` The embedding model provider/system identifier (e.g., 'openai', 'cohere'). #### embed ```python embed( inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult ``` Generate embeddings for the given inputs. Parameters: | Name | Type | Description | Default | | ------------ | ------------------- | -------------------------------------------- | --------------------------------------------------- | | `inputs` | \`str | Sequence[str]\` | A single string or sequence of strings to embed. | | `input_type` | `EmbedInputType` | Whether the inputs are queries or documents. | *required* | | `settings` | \`EmbeddingSettings | None\` | Optional settings to override the model's defaults. | Returns: | Type | Description | | ----------------- | ----------------------------- | | `EmbeddingResult` | An EmbeddingResult containing | | `EmbeddingResult` | the embeddings and metadata. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python @abstractmethod async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Generate embeddings for the given inputs. Args: inputs: A single string or sequence of strings to embed. input_type: Whether the inputs are queries or documents. settings: Optional settings to override the model's defaults. Returns: An [`EmbeddingResult`][pydantic_ai.embeddings.EmbeddingResult] containing the embeddings and metadata. """ raise NotImplementedError ``` #### prepare_embed ```python prepare_embed( inputs: str | Sequence[str], settings: EmbeddingSettings | None = None, ) -> tuple[list[str], EmbeddingSettings] ``` Prepare the inputs and settings for embedding. This method normalizes inputs to a list and merges settings. Subclasses should call this at the start of their `embed()` implementation. 
Parameters: | Name | Type | Description | Default | | ---------- | ------------------- | --------------- | ----------------------------------------- | | `inputs` | \`str | Sequence[str]\` | A single string or sequence of strings. | | `settings` | \`EmbeddingSettings | None\` | Optional settings to merge with defaults. | Returns: | Type | Description | | ------------------------------------- | ----------------------------------------------------- | | `tuple[list[str], EmbeddingSettings]` | A tuple of (normalized inputs list, merged settings). | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python def prepare_embed( self, inputs: str | Sequence[str], settings: EmbeddingSettings | None = None ) -> tuple[list[str], EmbeddingSettings]: """Prepare the inputs and settings for embedding. This method normalizes inputs to a list and merges settings. Subclasses should call this at the start of their `embed()` implementation. Args: inputs: A single string or sequence of strings. settings: Optional settings to merge with defaults. Returns: A tuple of (normalized inputs list, merged settings). """ inputs = [inputs] if isinstance(inputs, str) else list(inputs) settings = merge_embedding_settings(self._settings, settings) or {} return inputs, settings ``` #### max_input_tokens ```python max_input_tokens() -> int | None ``` Get the maximum number of tokens that can be input to the model. Returns: | Type | Description | | ----- | ----------- | | \`int | None\` | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python async def max_input_tokens(self) -> int | None: """Get the maximum number of tokens that can be input to the model. Returns: The maximum token count, or `None` if unknown. """ return None # pragma: no cover ``` #### count_tokens ```python count_tokens(text: str) -> int ``` Count the number of tokens in the given text. Parameters: | Name | Type | Description | Default | | ------ | ----- | ------------------------------- | ---------- | | `text` | `str` | The text to tokenize and count. | *required* | Returns: | Type | Description | | ----- | --------------------- | | `int` | The number of tokens. | Raises: | Type | Description | | --------------------- | -------------------------------------------- | | `NotImplementedError` | If the model doesn't support token counting. | | `UserError` | If the model or tokenizer is not supported. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python async def count_tokens(self, text: str) -> int: """Count the number of tokens in the given text. Args: text: The text to tokenize and count. Returns: The number of tokens. Raises: NotImplementedError: If the model doesn't support token counting. UserError: If the model or tokenizer is not supported. """ raise NotImplementedError ``` ### InstrumentedEmbeddingModel Bases: `WrapperEmbeddingModel` Embedding model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/instrumented.py` ```python @dataclass(init=False) class InstrumentedEmbeddingModel(WrapperEmbeddingModel): """Embedding model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. 
""" instrumentation_settings: InstrumentationSettings """Instrumentation settings for this model.""" def __init__( self, wrapped: EmbeddingModel | str, options: InstrumentationSettings | None = None, ) -> None: super().__init__(wrapped) self.instrumentation_settings = options or InstrumentationSettings() async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) with self._instrument(inputs, input_type, settings) as finish: result = await super().embed(inputs, input_type=input_type, settings=settings) finish(result) return result @contextmanager def _instrument( self, inputs: list[str], input_type: EmbedInputType, settings: EmbeddingSettings | None, ) -> Iterator[Callable[[EmbeddingResult], None]]: operation = 'embeddings' span_name = f'{operation} {self.model_name}' inputs_count = len(inputs) attributes: dict[str, AttributeValue] = { 'gen_ai.operation.name': operation, **self.model_attributes(self.wrapped), 'input_type': input_type, 'inputs_count': inputs_count, } if settings: attributes['embedding_settings'] = json.dumps(self.serialize_any(settings)) if self.instrumentation_settings.include_content: attributes['inputs'] = json.dumps(inputs) attributes['logfire.json_schema'] = json.dumps( { 'type': 'object', 'properties': { 'input_type': {'type': 'string'}, 'inputs_count': {'type': 'integer'}, 'embedding_settings': {'type': 'object'}, **( {'inputs': {'type': ['array']}, 'embeddings': {'type': 'array'}} if self.instrumentation_settings.include_content else {} ), }, } ) record_metrics: Callable[[], None] | None = None try: with self.instrumentation_settings.tracer.start_as_current_span(span_name, attributes=attributes) as span: def finish(result: EmbeddingResult): # Prepare metric recording closure first so metrics are recorded # even if the span is not recording. provider_name = attributes[GEN_AI_PROVIDER_NAME_ATTRIBUTE] request_model = attributes[GEN_AI_REQUEST_MODEL_ATTRIBUTE] response_model = result.model_name or request_model price_calculation = None def _record_metrics(): token_attributes = { GEN_AI_PROVIDER_NAME_ATTRIBUTE: provider_name, 'gen_ai.operation.name': operation, GEN_AI_REQUEST_MODEL_ATTRIBUTE: request_model, 'gen_ai.response.model': response_model, 'gen_ai.token.type': 'input', } tokens = result.usage.input_tokens or 0 if tokens: # pragma: no branch self.instrumentation_settings.tokens_histogram.record(tokens, token_attributes) if price_calculation is not None: self.instrumentation_settings.cost_histogram.record( float(getattr(price_calculation, 'input_price', 0.0)), token_attributes, ) nonlocal record_metrics record_metrics = _record_metrics if not span.is_recording(): return # pragma: lax no cover attributes_to_set: dict[str, AttributeValue] = { **result.usage.opentelemetry_attributes(), 'gen_ai.response.model': response_model, } try: price_calculation = result.cost() except LookupError: # The cost of this provider/model is unknown, which is common. 
pass except Exception as e: # pragma: no cover warnings.warn( f'Failed to get cost from response: {type(e).__name__}: {e}', CostCalculationFailedWarning ) else: attributes_to_set['operation.cost'] = float(price_calculation.total_price) embeddings = result.embeddings if embeddings: # pragma: no branch attributes_to_set['gen_ai.embeddings.dimension.count'] = len(embeddings[0]) if self.instrumentation_settings.include_content: attributes['embeddings'] = json.dumps(embeddings) if result.provider_response_id is not None: attributes_to_set['gen_ai.response.id'] = result.provider_response_id span.set_attributes(attributes_to_set) yield finish finally: if record_metrics: # pragma: no branch # Record metrics after the span finishes to avoid duplication. record_metrics() @staticmethod def model_attributes(model: EmbeddingModel) -> dict[str, AttributeValue]: attributes: dict[str, AttributeValue] = { GEN_AI_PROVIDER_NAME_ATTRIBUTE: model.system, GEN_AI_REQUEST_MODEL_ATTRIBUTE: model.model_name, } if base_url := model.base_url: try: parsed = urlparse(base_url) except Exception: # pragma: no cover pass else: if parsed.hostname: # pragma: no branch attributes['server.address'] = parsed.hostname if parsed.port: attributes['server.port'] = parsed.port # pragma: no cover return attributes @staticmethod def serialize_any(value: Any) -> str: try: return ANY_ADAPTER.dump_python(value, mode='json') except Exception: # pragma: no cover try: return str(value) except Exception as e: return f'Unable to serialize: {e}' ``` #### instrumentation_settings ```python instrumentation_settings: InstrumentationSettings = ( options or InstrumentationSettings() ) ``` Instrumentation settings for this model. ### instrument_embedding_model ```python instrument_embedding_model( model: EmbeddingModel, instrument: InstrumentationSettings | bool, ) -> EmbeddingModel ``` Instrument an embedding model with OpenTelemetry/logfire. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/instrumented.py` ```python def instrument_embedding_model(model: EmbeddingModel, instrument: InstrumentationSettings | bool) -> EmbeddingModel: """Instrument an embedding model with OpenTelemetry/logfire.""" if instrument and not isinstance(model, InstrumentedEmbeddingModel): if instrument is True: instrument = InstrumentationSettings() model = InstrumentedEmbeddingModel(model, instrument) return model ``` ### EmbeddingResult The result of an embedding operation. This class contains the generated embeddings along with metadata about the operation, including the original inputs, model information, usage statistics, and timing. Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('What is AI?') # Access embeddings by index print(len(result.embeddings[0])) #> 1536 # Access embeddings by original input text print(result['What is AI?'] == result.embeddings[0]) #> True # Check usage print(f'Tokens used: {result.usage.input_tokens}') #> Tokens used: 3 ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/result.py` ````python @dataclass class EmbeddingResult: """The result of an embedding operation. This class contains the generated embeddings along with metadata about the operation, including the original inputs, model information, usage statistics, and timing. 
Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('What is AI?') # Access embeddings by index print(len(result.embeddings[0])) #> 1536 # Access embeddings by original input text print(result['What is AI?'] == result.embeddings[0]) #> True # Check usage print(f'Tokens used: {result.usage.input_tokens}') #> Tokens used: 3 ``` """ embeddings: Sequence[Sequence[float]] """The computed embedding vectors, one per input text. Each embedding is a sequence of floats representing the text in vector space. """ _: KW_ONLY inputs: Sequence[str] """The original input texts that were embedded.""" input_type: EmbedInputType """Whether the inputs were embedded as queries or documents.""" model_name: str """The name of the model that generated these embeddings.""" provider_name: str """The name of the provider (e.g., 'openai', 'cohere').""" timestamp: datetime = field(default_factory=_now_utc) """When the embedding request was made.""" usage: RequestUsage = field(default_factory=RequestUsage) """Token usage statistics for this request.""" provider_details: dict[str, Any] | None = None """Provider-specific details from the response.""" provider_response_id: str | None = None """Unique identifier for this response from the provider, if available.""" def __getitem__(self, item: int | str) -> Sequence[float]: """Get the embedding for an input by index or by the original input text. Args: item: Either an integer index or the original input string. Returns: The embedding vector for the specified input. Raises: IndexError: If the index is out of range. ValueError: If the string is not found in the inputs. """ if isinstance(item, str): item = self.inputs.index(item) return self.embeddings[item] def cost(self) -> genai_types.PriceCalculation: """Calculate the cost of the embedding request. Uses [`genai-prices`](https://github.com/pydantic/genai-prices) for pricing data. Returns: A price calculation object with `total_price`, `input_price`, and other cost details. Raises: LookupError: If pricing data is not available for this model/provider. """ assert self.model_name, 'Model name is required to calculate price' return calc_price( self.usage, self.model_name, provider_id=self.provider_name, genai_request_timestamp=self.timestamp, ) ```` #### embeddings ```python embeddings: Sequence[Sequence[float]] ``` The computed embedding vectors, one per input text. Each embedding is a sequence of floats representing the text in vector space. #### inputs ```python inputs: Sequence[str] ``` The original input texts that were embedded. #### input_type ```python input_type: EmbedInputType ``` Whether the inputs were embedded as queries or documents. #### model_name ```python model_name: str ``` The name of the model that generated these embeddings. #### provider_name ```python provider_name: str ``` The name of the provider (e.g., 'openai', 'cohere'). #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` When the embedding request was made. #### usage ```python usage: RequestUsage = field(default_factory=RequestUsage) ``` Token usage statistics for this request. #### provider_details ```python provider_details: dict[str, Any] | None = None ``` Provider-specific details from the response. #### provider_response_id ```python provider_response_id: str | None = None ``` Unique identifier for this response from the provider, if available. 
#### __getitem__ ```python __getitem__(item: int | str) -> Sequence[float] ``` Get the embedding for an input by index or by the original input text. Parameters: | Name | Type | Description | Default | | ------ | ----- | ----------- | ----------------------------------------------------- | | `item` | \`int | str\` | Either an integer index or the original input string. | Returns: | Type | Description | | ----------------- | --------------------------------------------- | | `Sequence[float]` | The embedding vector for the specified input. | Raises: | Type | Description | | ------------ | ----------------------------------------- | | `IndexError` | If the index is out of range. | | `ValueError` | If the string is not found in the inputs. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/result.py` ```python def __getitem__(self, item: int | str) -> Sequence[float]: """Get the embedding for an input by index or by the original input text. Args: item: Either an integer index or the original input string. Returns: The embedding vector for the specified input. Raises: IndexError: If the index is out of range. ValueError: If the string is not found in the inputs. """ if isinstance(item, str): item = self.inputs.index(item) return self.embeddings[item] ``` #### cost ```python cost() -> PriceCalculation ``` Calculate the cost of the embedding request. Uses [`genai-prices`](https://github.com/pydantic/genai-prices) for pricing data. Returns: | Type | Description | | ------------------ | --------------------------------------------------------------------------------- | | `PriceCalculation` | A price calculation object with total_price, input_price, and other cost details. | Raises: | Type | Description | | ------------- | --------------------------------------------------------- | | `LookupError` | If pricing data is not available for this model/provider. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/result.py` ```python def cost(self) -> genai_types.PriceCalculation: """Calculate the cost of the embedding request. Uses [`genai-prices`](https://github.com/pydantic/genai-prices) for pricing data. Returns: A price calculation object with `total_price`, `input_price`, and other cost details. Raises: LookupError: If pricing data is not available for this model/provider. """ assert self.model_name, 'Model name is required to calculate price' return calc_price( self.usage, self.model_name, provider_id=self.provider_name, genai_request_timestamp=self.timestamp, ) ``` ### EmbeddingSettings Bases: `TypedDict` Common settings for configuring embedding models. These settings apply across multiple embedding model providers. Not all settings are supported by all models - check the specific model's documentation for details. Provider-specific settings classes (e.g., OpenAIEmbeddingSettings, CohereEmbeddingSettings) extend this with additional provider-prefixed options. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/settings.py` ```python class EmbeddingSettings(TypedDict, total=False): """Common settings for configuring embedding models. These settings apply across multiple embedding model providers. Not all settings are supported by all models - check the specific model's documentation for details. Provider-specific settings classes (e.g., [`OpenAIEmbeddingSettings`][pydantic_ai.embeddings.openai.OpenAIEmbeddingSettings], [`CohereEmbeddingSettings`][pydantic_ai.embeddings.cohere.CohereEmbeddingSettings]) extend this with additional provider-prefixed options. 
""" dimensions: int """The number of dimensions for the output embeddings. Supported by: * OpenAI * Cohere * Google * Sentence Transformers * Bedrock * VoyageAI """ truncate: bool """Whether to truncate inputs that exceed the model's context length. Defaults to `False`. If `True`, inputs that are too long will be truncated. If `False`, an error will be raised for inputs that exceed the context length. For more control over truncation, you can use [`max_input_tokens()`][pydantic_ai.embeddings.Embedder.max_input_tokens] and [`count_tokens()`][pydantic_ai.embeddings.Embedder.count_tokens] to implement your own truncation logic. Provider-specific truncation settings (e.g., `cohere_truncate`, `bedrock_cohere_truncate`) take precedence if specified. Supported by: * Cohere * Bedrock (Cohere and Nova models) * VoyageAI """ extra_headers: dict[str, str] """Extra headers to send to the model. Supported by: * OpenAI * Cohere """ extra_body: object """Extra body to send to the model. Supported by: * OpenAI * Cohere """ ``` #### dimensions ```python dimensions: int ``` The number of dimensions for the output embeddings. Supported by: - OpenAI - Cohere - Google - Sentence Transformers - Bedrock - VoyageAI #### truncate ```python truncate: bool ``` Whether to truncate inputs that exceed the model's context length. Defaults to `False`. If `True`, inputs that are too long will be truncated. If `False`, an error will be raised for inputs that exceed the context length. For more control over truncation, you can use max_input_tokens() and count_tokens() to implement your own truncation logic. Provider-specific truncation settings (e.g., `cohere_truncate`, `bedrock_cohere_truncate`) take precedence if specified. Supported by: - Cohere - Bedrock (Cohere and Nova models) - VoyageAI #### extra_headers ```python extra_headers: dict[str, str] ``` Extra headers to send to the model. Supported by: - OpenAI - Cohere #### extra_body ```python extra_body: object ``` Extra body to send to the model. Supported by: - OpenAI - Cohere ### merge_embedding_settings ```python merge_embedding_settings( base: EmbeddingSettings | None, overrides: EmbeddingSettings | None, ) -> EmbeddingSettings | None ``` Merge two sets of embedding settings, with overrides taking precedence. Parameters: | Name | Type | Description | Default | | ----------- | ------------------- | ----------- | --------------------------------------------------------------------- | | `base` | \`EmbeddingSettings | None\` | Base settings (typically from the embedder or model). | | `overrides` | \`EmbeddingSettings | None\` | Settings that should override the base (typically per-call settings). | Returns: | Type | Description | | ------------------- | ----------- | | \`EmbeddingSettings | None\` | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/settings.py` ```python def merge_embedding_settings( base: EmbeddingSettings | None, overrides: EmbeddingSettings | None ) -> EmbeddingSettings | None: """Merge two sets of embedding settings, with overrides taking precedence. Args: base: Base settings (typically from the embedder or model). overrides: Settings that should override the base (typically per-call settings). Returns: Merged settings, or `None` if both inputs are `None`. """ # Note: we may want merge recursively if/when we add non-primitive values if base and overrides: return base | overrides else: return base or overrides ``` ### TestEmbeddingModel Bases: `EmbeddingModel` A mock embedding model for testing. 
This model returns deterministic embeddings (all 1.0 values) and tracks the settings used in the last call via the `last_settings` attribute. Example: ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import TestEmbeddingModel test_model = TestEmbeddingModel() embedder = Embedder('openai:text-embedding-3-small') async def main(): with embedder.override(model=test_model): await embedder.embed_query('test') assert test_model.last_settings is not None ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/test.py` ````python @dataclass(init=False) class TestEmbeddingModel(EmbeddingModel): """A mock embedding model for testing. This model returns deterministic embeddings (all 1.0 values) and tracks the settings used in the last call via the `last_settings` attribute. Example: ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import TestEmbeddingModel test_model = TestEmbeddingModel() embedder = Embedder('openai:text-embedding-3-small') async def main(): with embedder.override(model=test_model): await embedder.embed_query('test') assert test_model.last_settings is not None ``` """ # NOTE: Avoid test discovery by pytest. __test__ = False _model_name: str """The model name to report in results.""" _provider_name: str """The provider name to report in results.""" _dimensions: int """The number of dimensions for generated embeddings.""" last_settings: EmbeddingSettings | None = None """The settings used in the most recent embed call.""" def __init__( self, model_name: str = 'test', *, provider_name: str = 'test', dimensions: int = 8, settings: EmbeddingSettings | None = None, ): """Initialize the test embedding model. Args: model_name: The model name to report in results. provider_name: The provider name to report in results. dimensions: The number of dimensions for the generated embeddings. settings: Optional default settings for the model. """ self._model_name = model_name self._provider_name = provider_name self._dimensions = dimensions self.last_settings = None super().__init__(settings=settings) @property def model_name(self) -> str: """The embedding model name.""" return self._model_name @property def system(self) -> str: """The embedding model provider.""" return self._provider_name async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) self.last_settings = settings dimensions = settings.get('dimensions') or self._dimensions return EmbeddingResult( embeddings=[[1.0] * dimensions] * len(inputs), inputs=inputs, input_type=input_type, usage=RequestUsage(input_tokens=sum(_estimate_tokens(text) for text in inputs)), model_name=self.model_name, provider_name=self.system, provider_response_id=str(uuid.uuid4()), ) async def max_input_tokens(self) -> int | None: return 1024 async def count_tokens(self, text: str) -> int: return _estimate_tokens(text) ```` #### last_settings ```python last_settings: EmbeddingSettings | None = None ``` The settings used in the most recent embed call. #### __init__ ```python __init__( model_name: str = "test", *, provider_name: str = "test", dimensions: int = 8, settings: EmbeddingSettings | None = None ) ``` Initialize the test embedding model. 
Parameters: | Name | Type | Description | Default | | --------------- | ------------------- | ------------------------------------------------------ | ---------------------------------------- | | `model_name` | `str` | The model name to report in results. | `'test'` | | `provider_name` | `str` | The provider name to report in results. | `'test'` | | `dimensions` | `int` | The number of dimensions for the generated embeddings. | `8` | | `settings` | \`EmbeddingSettings | None\` | Optional default settings for the model. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/test.py` ```python def __init__( self, model_name: str = 'test', *, provider_name: str = 'test', dimensions: int = 8, settings: EmbeddingSettings | None = None, ): """Initialize the test embedding model. Args: model_name: The model name to report in results. provider_name: The provider name to report in results. dimensions: The number of dimensions for the generated embeddings. settings: Optional default settings for the model. """ self._model_name = model_name self._provider_name = provider_name self._dimensions = dimensions self.last_settings = None super().__init__(settings=settings) ``` #### model_name ```python model_name: str ``` The embedding model name. #### system ```python system: str ``` The embedding model provider. ### WrapperEmbeddingModel Bases: `EmbeddingModel` Base class for embedding models that wrap another model. Use this as a base class to create custom embedding model wrappers that modify behavior (e.g., caching, logging, rate limiting) while delegating to an underlying model. By default, all methods are passed through to the wrapped model. Override specific methods to customize behavior. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/wrapper.py` ```python @dataclass(init=False) class WrapperEmbeddingModel(EmbeddingModel): """Base class for embedding models that wrap another model. Use this as a base class to create custom embedding model wrappers that modify behavior (e.g., caching, logging, rate limiting) while delegating to an underlying model. By default, all methods are passed through to the wrapped model. Override specific methods to customize behavior. """ wrapped: EmbeddingModel """The underlying embedding model being wrapped.""" def __init__(self, wrapped: EmbeddingModel | str): """Initialize the wrapper with an embedding model. Args: wrapped: The model to wrap. Can be an [`EmbeddingModel`][pydantic_ai.embeddings.EmbeddingModel] instance or a model name string (e.g., `'openai:text-embedding-3-small'`). """ from . 
import infer_embedding_model super().__init__() self.wrapped = infer_embedding_model(wrapped) if isinstance(wrapped, str) else wrapped async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: return await self.wrapped.embed(inputs, input_type=input_type, settings=settings) async def max_input_tokens(self) -> int | None: return await self.wrapped.max_input_tokens() async def count_tokens(self, text: str) -> int: return await self.wrapped.count_tokens(text) @property def model_name(self) -> str: return self.wrapped.model_name @property def system(self) -> str: return self.wrapped.system @property def settings(self) -> EmbeddingSettings | None: """Get the settings from the wrapped embedding model.""" return self.wrapped.settings @property def base_url(self) -> str | None: return self.wrapped.base_url def __getattr__(self, item: str): return getattr(self.wrapped, item) # pragma: no cover ``` #### wrapped ```python wrapped: EmbeddingModel = ( infer_embedding_model(wrapped) if isinstance(wrapped, str) else wrapped ) ``` The underlying embedding model being wrapped. #### __init__ ```python __init__(wrapped: EmbeddingModel | str) ``` Initialize the wrapper with an embedding model. Parameters: | Name | Type | Description | Default | | --------- | ---------------- | ----------- | -------------------------------------------------------------------------------------------------------------------- | | `wrapped` | \`EmbeddingModel | str\` | The model to wrap. Can be an EmbeddingModel instance or a model name string (e.g., 'openai:text-embedding-3-small'). | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/wrapper.py` ```python def __init__(self, wrapped: EmbeddingModel | str): """Initialize the wrapper with an embedding model. Args: wrapped: The model to wrap. Can be an [`EmbeddingModel`][pydantic_ai.embeddings.EmbeddingModel] instance or a model name string (e.g., `'openai:text-embedding-3-small'`). """ from . import infer_embedding_model super().__init__() self.wrapped = infer_embedding_model(wrapped) if isinstance(wrapped, str) else wrapped ``` #### settings ```python settings: EmbeddingSettings | None ``` Get the settings from the wrapped embedding model. ### KnownEmbeddingModelName ```python KnownEmbeddingModelName = TypeAliasType( "KnownEmbeddingModelName", Literal[ "google-gla:gemini-embedding-001", "google-vertex:gemini-embedding-001", "google-vertex:text-embedding-005", "google-vertex:text-multilingual-embedding-002", "openai:text-embedding-ada-002", "openai:text-embedding-3-small", "openai:text-embedding-3-large", "cohere:embed-v4.0", "cohere:embed-english-v3.0", "cohere:embed-english-light-v3.0", "cohere:embed-multilingual-v3.0", "cohere:embed-multilingual-light-v3.0", "voyageai:voyage-4-large", "voyageai:voyage-4", "voyageai:voyage-4-lite", "voyageai:voyage-3-large", "voyageai:voyage-3.5", "voyageai:voyage-3.5-lite", "voyageai:voyage-code-3", "voyageai:voyage-finance-2", "voyageai:voyage-law-2", "voyageai:voyage-code-2", "bedrock:amazon.titan-embed-text-v1", "bedrock:amazon.titan-embed-text-v2:0", "bedrock:cohere.embed-english-v3", "bedrock:cohere.embed-multilingual-v3", "bedrock:cohere.embed-v4:0", "bedrock:amazon.nova-2-multimodal-embeddings-v1:0", ], ) ``` Known model names that can be used with the `model` parameter of Embedder. `KnownEmbeddingModelName` is provided as a concise way to specify an embedding model. 
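Putting the pieces above together, here is a minimal sketch of a custom wrapper built on `WrapperEmbeddingModel` (the `LoggingEmbeddingModel` name is hypothetical, and the `pydantic_ai.embeddings.wrapper` import path is assumed from the source location shown above):

```python
from collections.abc import Sequence

from pydantic_ai.embeddings import EmbeddingResult, EmbeddingSettings, TestEmbeddingModel
from pydantic_ai.embeddings.wrapper import WrapperEmbeddingModel


class LoggingEmbeddingModel(WrapperEmbeddingModel):
    """Hypothetical wrapper that logs each embed call before delegating to the wrapped model."""

    async def embed(
        self,
        inputs: str | Sequence[str],
        *,
        input_type,
        settings: EmbeddingSettings | None = None,
    ) -> EmbeddingResult:
        count = 1 if isinstance(inputs, str) else len(inputs)
        print(f'embedding {count} input(s) as {input_type!r} with {self.wrapped.model_name}')
        return await self.wrapped.embed(inputs, input_type=input_type, settings=settings)


async def main():
    # Wrap the key-free TestEmbeddingModel; a model name string such as
    # 'openai:text-embedding-3-small' would also be accepted by the constructor.
    model = LoggingEmbeddingModel(TestEmbeddingModel())
    result = await model.embed('hello', input_type='query')
    print(len(result.embeddings[0]))
    #> 8
```

The same pattern extends to caching or rate limiting: override only the methods whose behavior you want to change and let `WrapperEmbeddingModel` pass everything else through to the wrapped model.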
### infer_embedding_model ```python infer_embedding_model( model: EmbeddingModel | KnownEmbeddingModelName | str, *, provider_factory: Callable[ [str], Provider[Any] ] = infer_provider ) -> EmbeddingModel ``` Infer the model from the name. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python def infer_embedding_model( model: EmbeddingModel | KnownEmbeddingModelName | str, *, provider_factory: Callable[[str], Provider[Any]] = infer_provider, ) -> EmbeddingModel: """Infer the model from the name.""" if isinstance(model, EmbeddingModel): return model try: provider_name, model_name = model.split(':', maxsplit=1) except ValueError as e: raise ValueError('You must provide a provider prefix when specifying an embedding model name') from e provider = provider_factory(provider_name) model_kind = provider_name if model_kind.startswith('gateway/'): from ..providers.gateway import normalize_gateway_provider model_kind = normalize_gateway_provider(model_kind) if model_kind in ( 'openai', # For now, we assume that every chat and completions-compatible provider also # supports the embeddings endpoint, as at worst the user would get an `ModelHTTPError`. *get_args(OpenAIChatCompatibleProvider.__value__), *get_args(OpenAIResponsesCompatibleProvider.__value__), ): from .openai import OpenAIEmbeddingModel return OpenAIEmbeddingModel(model_name, provider=provider) elif model_kind == 'cohere': from .cohere import CohereEmbeddingModel return CohereEmbeddingModel(model_name, provider=provider) elif model_kind == 'bedrock': from .bedrock import BedrockEmbeddingModel return BedrockEmbeddingModel(model_name, provider=provider) elif model_kind in ('google-gla', 'google-vertex'): from .google import GoogleEmbeddingModel return GoogleEmbeddingModel(model_name, provider=provider) elif model_kind == 'sentence-transformers': from .sentence_transformers import SentenceTransformerEmbeddingModel return SentenceTransformerEmbeddingModel(model_name) elif model_kind == 'voyageai': from .voyageai import VoyageAIEmbeddingModel return VoyageAIEmbeddingModel(model_name, provider=provider) else: raise UserError(f'Unknown embeddings model: {model}') # pragma: no cover ``` ### Embedder High-level interface for generating text embeddings. The `Embedder` class provides a convenient way to generate vector embeddings from text using various embedding model providers. It handles model inference, settings management, and optional OpenTelemetry instrumentation. Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('What is machine learning?') print(result.embeddings[0][:5]) # First 5 dimensions #> [1.0, 1.0, 1.0, 1.0, 1.0] ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ````python @dataclass(init=False) class Embedder: """High-level interface for generating text embeddings. The `Embedder` class provides a convenient way to generate vector embeddings from text using various embedding model providers. It handles model inference, settings management, and optional OpenTelemetry instrumentation. Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('What is machine learning?') print(result.embeddings[0][:5]) # First 5 dimensions #> [1.0, 1.0, 1.0, 1.0, 1.0] ``` """ instrument: InstrumentationSettings | bool | None """Options to automatically instrument with OpenTelemetry. 
Set to `True` to use default instrumentation settings, which will use Logfire if it's configured. Set to an instance of [`InstrumentationSettings`][pydantic_ai.models.instrumented.InstrumentationSettings] to customize. If this isn't set, then the last value set by [`Embedder.instrument_all()`][pydantic_ai.embeddings.Embedder.instrument_all] will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. """ _instrument_default: ClassVar[InstrumentationSettings | bool] = False def __init__( self, model: EmbeddingModel | KnownEmbeddingModelName | str, *, settings: EmbeddingSettings | None = None, defer_model_check: bool = True, instrument: InstrumentationSettings | bool | None = None, ) -> None: """Initialize an Embedder. Args: model: The embedding model to use. Can be specified as: - A model name string in the format `'provider:model-name'` (e.g., `'openai:text-embedding-3-small'`) - An [`EmbeddingModel`][pydantic_ai.embeddings.EmbeddingModel] instance settings: Optional [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for all embed calls. defer_model_check: Whether to defer model validation until first use. Set to `False` to validate the model immediately on construction. instrument: OpenTelemetry instrumentation settings. Set to `True` to enable with defaults, or pass an [`InstrumentationSettings`][pydantic_ai.models.instrumented.InstrumentationSettings] instance to customize. If `None`, uses the value from [`Embedder.instrument_all()`][pydantic_ai.embeddings.Embedder.instrument_all]. """ self._model = model if defer_model_check else infer_embedding_model(model) self._settings = settings self.instrument = instrument self._override_model: ContextVar[EmbeddingModel | None] = ContextVar('_override_model', default=None) @staticmethod def instrument_all(instrument: InstrumentationSettings | bool = True) -> None: """Set the default instrumentation options for all embedders where `instrument` is not explicitly set. This is useful for enabling instrumentation globally without modifying each embedder individually. Args: instrument: Instrumentation settings to use as the default. Set to `True` for default settings, `False` to disable, or pass an [`InstrumentationSettings`][pydantic_ai.models.instrumented.InstrumentationSettings] instance to customize. """ Embedder._instrument_default = instrument @property def model(self) -> EmbeddingModel | KnownEmbeddingModelName | str: """The embedding model used by this embedder.""" return self._model @contextmanager def override( self, *, model: EmbeddingModel | KnownEmbeddingModelName | str | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override the embedding model. Useful for testing or dynamically switching models. Args: model: The embedding model to use within this context. 
Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): # Temporarily use a different model with embedder.override(model='openai:text-embedding-3-large'): result = await embedder.embed_query('test') print(len(result.embeddings[0])) # 3072 dimensions for large model #> 3072 ``` """ if _utils.is_set(model): model_token = self._override_model.set(infer_embedding_model(model)) else: model_token = None try: yield finally: if model_token is not None: self._override_model.reset(model_token) async def embed_query( self, query: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Embed one or more query texts. Use this method when embedding search queries that will be compared against document embeddings. Some models optimize embeddings differently based on whether the input is a query or document. Args: query: A single query string or sequence of query strings to embed. settings: Optional settings to override the embedder's default settings for this call. Returns: An [`EmbeddingResult`][pydantic_ai.embeddings.EmbeddingResult] containing the embeddings and metadata about the operation. """ return await self.embed(query, input_type='query', settings=settings) async def embed_documents( self, documents: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Embed one or more document texts. Use this method when embedding documents that will be stored and later searched against. Some models optimize embeddings differently based on whether the input is a query or document. Args: documents: A single document string or sequence of document strings to embed. settings: Optional settings to override the embedder's default settings for this call. Returns: An [`EmbeddingResult`][pydantic_ai.embeddings.EmbeddingResult] containing the embeddings and metadata about the operation. """ return await self.embed(documents, input_type='document', settings=settings) async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Embed text inputs with explicit input type specification. This is the low-level embedding method. For most use cases, prefer [`embed_query()`][pydantic_ai.embeddings.Embedder.embed_query] or [`embed_documents()`][pydantic_ai.embeddings.Embedder.embed_documents]. Args: inputs: A single string or sequence of strings to embed. input_type: The type of input, either `'query'` or `'document'`. settings: Optional settings to override the embedder's default settings for this call. Returns: An [`EmbeddingResult`][pydantic_ai.embeddings.EmbeddingResult] containing the embeddings and metadata about the operation. """ model = self._get_model() settings = merge_embedding_settings(self._settings, settings) return await model.embed(inputs, input_type=input_type, settings=settings) async def max_input_tokens(self) -> int | None: """Get the maximum number of tokens the model can accept as input. Returns: The maximum token count, or `None` if the limit is unknown for this model. """ model = self._get_model() return await model.max_input_tokens() async def count_tokens(self, text: str) -> int: """Count the number of tokens in the given text. Args: text: The text to tokenize and count. Returns: The number of tokens in the text. Raises: NotImplementedError: If the model doesn't support token counting. UserError: If the model or tokenizer is not supported. 
""" model = self._get_model() return await model.count_tokens(text) def embed_query_sync( self, query: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Synchronous version of [`embed_query()`][pydantic_ai.embeddings.Embedder.embed_query].""" return _utils.get_event_loop().run_until_complete(self.embed_query(query, settings=settings)) def embed_documents_sync( self, documents: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Synchronous version of [`embed_documents()`][pydantic_ai.embeddings.Embedder.embed_documents].""" return _utils.get_event_loop().run_until_complete(self.embed_documents(documents, settings=settings)) def embed_sync( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Synchronous version of [`embed()`][pydantic_ai.embeddings.Embedder.embed].""" return _utils.get_event_loop().run_until_complete(self.embed(inputs, input_type=input_type, settings=settings)) def max_input_tokens_sync(self) -> int | None: """Synchronous version of [`max_input_tokens()`][pydantic_ai.embeddings.Embedder.max_input_tokens].""" return _utils.get_event_loop().run_until_complete(self.max_input_tokens()) def count_tokens_sync(self, text: str) -> int: """Synchronous version of [`count_tokens()`][pydantic_ai.embeddings.Embedder.count_tokens].""" return _utils.get_event_loop().run_until_complete(self.count_tokens(text)) def _get_model(self) -> EmbeddingModel: """Create a model configured for this embedder. Returns: The embedding model to use, with instrumentation applied if configured. """ model_: EmbeddingModel if some_model := self._override_model.get(): model_ = some_model else: model_ = self._model = infer_embedding_model(self.model) instrument = self.instrument if instrument is None: instrument = self._instrument_default return instrument_embedding_model(model_, instrument) ```` #### __init__ ```python __init__( model: EmbeddingModel | KnownEmbeddingModelName | str, *, settings: EmbeddingSettings | None = None, defer_model_check: bool = True, instrument: InstrumentationSettings | bool | None = None ) -> None ``` Initialize an Embedder. Parameters: | Name | Type | Description | Default | | ------------------- | ------------------------- | ------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------ | | `model` | \`EmbeddingModel | KnownEmbeddingModelName | str\` | | `settings` | \`EmbeddingSettings | None\` | Optional EmbeddingSettings to use as defaults for all embed calls. | | `defer_model_check` | `bool` | Whether to defer model validation until first use. Set to False to validate the model immediately on construction. | `True` | | `instrument` | \`InstrumentationSettings | bool | None\` | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python def __init__( self, model: EmbeddingModel | KnownEmbeddingModelName | str, *, settings: EmbeddingSettings | None = None, defer_model_check: bool = True, instrument: InstrumentationSettings | bool | None = None, ) -> None: """Initialize an Embedder. Args: model: The embedding model to use. 
Can be specified as: - A model name string in the format `'provider:model-name'` (e.g., `'openai:text-embedding-3-small'`) - An [`EmbeddingModel`][pydantic_ai.embeddings.EmbeddingModel] instance settings: Optional [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for all embed calls. defer_model_check: Whether to defer model validation until first use. Set to `False` to validate the model immediately on construction. instrument: OpenTelemetry instrumentation settings. Set to `True` to enable with defaults, or pass an [`InstrumentationSettings`][pydantic_ai.models.instrumented.InstrumentationSettings] instance to customize. If `None`, uses the value from [`Embedder.instrument_all()`][pydantic_ai.embeddings.Embedder.instrument_all]. """ self._model = model if defer_model_check else infer_embedding_model(model) self._settings = settings self.instrument = instrument self._override_model: ContextVar[EmbeddingModel | None] = ContextVar('_override_model', default=None) ``` #### instrument ```python instrument: InstrumentationSettings | bool | None = ( instrument ) ``` Options to automatically instrument with OpenTelemetry. Set to `True` to use default instrumentation settings, which will use Logfire if it's configured. Set to an instance of InstrumentationSettings to customize. If this isn't set, then the last value set by Embedder.instrument_all() will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. #### instrument_all ```python instrument_all( instrument: InstrumentationSettings | bool = True, ) -> None ``` Set the default instrumentation options for all embedders where `instrument` is not explicitly set. This is useful for enabling instrumentation globally without modifying each embedder individually. Parameters: | Name | Type | Description | Default | | ------------ | ------------------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `instrument` | \`InstrumentationSettings | bool\` | Instrumentation settings to use as the default. Set to True for default settings, False to disable, or pass an InstrumentationSettings instance to customize. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python @staticmethod def instrument_all(instrument: InstrumentationSettings | bool = True) -> None: """Set the default instrumentation options for all embedders where `instrument` is not explicitly set. This is useful for enabling instrumentation globally without modifying each embedder individually. Args: instrument: Instrumentation settings to use as the default. Set to `True` for default settings, `False` to disable, or pass an [`InstrumentationSettings`][pydantic_ai.models.instrumented.InstrumentationSettings] instance to customize. """ Embedder._instrument_default = instrument ``` #### model ```python model: EmbeddingModel | KnownEmbeddingModelName | str ``` The embedding model used by this embedder. #### override ```python override( *, model: ( EmbeddingModel | KnownEmbeddingModelName | str | Unset ) = UNSET ) -> Iterator[None] ``` Context manager to temporarily override the embedding model. Useful for testing or dynamically switching models. 
Parameters: | Name | Type | Description | Default | | ------- | ---------------- | ----------------------- | ------- | | `model` | \`EmbeddingModel | KnownEmbeddingModelName | str | Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): # Temporarily use a different model with embedder.override(model='openai:text-embedding-3-large'): result = await embedder.embed_query('test') print(len(result.embeddings[0])) # 3072 dimensions for large model #> 3072 ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ````python @contextmanager def override( self, *, model: EmbeddingModel | KnownEmbeddingModelName | str | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override the embedding model. Useful for testing or dynamically switching models. Args: model: The embedding model to use within this context. Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): # Temporarily use a different model with embedder.override(model='openai:text-embedding-3-large'): result = await embedder.embed_query('test') print(len(result.embeddings[0])) # 3072 dimensions for large model #> 3072 ``` """ if _utils.is_set(model): model_token = self._override_model.set(infer_embedding_model(model)) else: model_token = None try: yield finally: if model_token is not None: self._override_model.reset(model_token) ```` #### embed_query ```python embed_query( query: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult ``` Embed one or more query texts. Use this method when embedding search queries that will be compared against document embeddings. Some models optimize embeddings differently based on whether the input is a query or document. Parameters: | Name | Type | Description | Default | | ---------- | ------------------- | --------------- | ---------------------------------------------------------------------------- | | `query` | \`str | Sequence[str]\` | A single query string or sequence of query strings to embed. | | `settings` | \`EmbeddingSettings | None\` | Optional settings to override the embedder's default settings for this call. | Returns: | Type | Description | | ----------------- | -------------------------------------------- | | `EmbeddingResult` | An EmbeddingResult containing the embeddings | | `EmbeddingResult` | and metadata about the operation. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python async def embed_query( self, query: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Embed one or more query texts. Use this method when embedding search queries that will be compared against document embeddings. Some models optimize embeddings differently based on whether the input is a query or document. Args: query: A single query string or sequence of query strings to embed. settings: Optional settings to override the embedder's default settings for this call. Returns: An [`EmbeddingResult`][pydantic_ai.embeddings.EmbeddingResult] containing the embeddings and metadata about the operation. """ return await self.embed(query, input_type='query', settings=settings) ``` #### embed_documents ```python embed_documents( documents: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult ``` Embed one or more document texts. 
Use this method when embedding documents that will be stored and later searched against. Some models optimize embeddings differently based on whether the input is a query or document. Parameters: | Name | Type | Description | Default | | ----------- | ------------------- | --------------- | ---------------------------------------------------------------------------- | | `documents` | \`str | Sequence[str]\` | A single document string or sequence of document strings to embed. | | `settings` | \`EmbeddingSettings | None\` | Optional settings to override the embedder's default settings for this call. | Returns: | Type | Description | | ----------------- | -------------------------------------------- | | `EmbeddingResult` | An EmbeddingResult containing the embeddings | | `EmbeddingResult` | and metadata about the operation. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python async def embed_documents( self, documents: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Embed one or more document texts. Use this method when embedding documents that will be stored and later searched against. Some models optimize embeddings differently based on whether the input is a query or document. Args: documents: A single document string or sequence of document strings to embed. settings: Optional settings to override the embedder's default settings for this call. Returns: An [`EmbeddingResult`][pydantic_ai.embeddings.EmbeddingResult] containing the embeddings and metadata about the operation. """ return await self.embed(documents, input_type='document', settings=settings) ``` #### embed ```python embed( inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult ``` Embed text inputs with explicit input type specification. This is the low-level embedding method. For most use cases, prefer embed_query() or embed_documents(). Parameters: | Name | Type | Description | Default | | ------------ | ------------------- | ------------------------------------------------ | ---------------------------------------------------------------------------- | | `inputs` | \`str | Sequence[str]\` | A single string or sequence of strings to embed. | | `input_type` | `EmbedInputType` | The type of input, either 'query' or 'document'. | *required* | | `settings` | \`EmbeddingSettings | None\` | Optional settings to override the embedder's default settings for this call. | Returns: | Type | Description | | ----------------- | -------------------------------------------- | | `EmbeddingResult` | An EmbeddingResult containing the embeddings | | `EmbeddingResult` | and metadata about the operation. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Embed text inputs with explicit input type specification. This is the low-level embedding method. For most use cases, prefer [`embed_query()`][pydantic_ai.embeddings.Embedder.embed_query] or [`embed_documents()`][pydantic_ai.embeddings.Embedder.embed_documents]. Args: inputs: A single string or sequence of strings to embed. input_type: The type of input, either `'query'` or `'document'`. settings: Optional settings to override the embedder's default settings for this call. 
Returns: An [`EmbeddingResult`][pydantic_ai.embeddings.EmbeddingResult] containing the embeddings and metadata about the operation. """ model = self._get_model() settings = merge_embedding_settings(self._settings, settings) return await model.embed(inputs, input_type=input_type, settings=settings) ``` #### max_input_tokens ```python max_input_tokens() -> int | None ``` Get the maximum number of tokens the model can accept as input. Returns: | Type | Description | | ----- | ----------- | | \`int | None\` | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python async def max_input_tokens(self) -> int | None: """Get the maximum number of tokens the model can accept as input. Returns: The maximum token count, or `None` if the limit is unknown for this model. """ model = self._get_model() return await model.max_input_tokens() ``` #### count_tokens ```python count_tokens(text: str) -> int ``` Count the number of tokens in the given text. Parameters: | Name | Type | Description | Default | | ------ | ----- | ------------------------------- | ---------- | | `text` | `str` | The text to tokenize and count. | *required* | Returns: | Type | Description | | ----- | --------------------------------- | | `int` | The number of tokens in the text. | Raises: | Type | Description | | --------------------- | -------------------------------------------- | | `NotImplementedError` | If the model doesn't support token counting. | | `UserError` | If the model or tokenizer is not supported. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python async def count_tokens(self, text: str) -> int: """Count the number of tokens in the given text. Args: text: The text to tokenize and count. Returns: The number of tokens in the text. Raises: NotImplementedError: If the model doesn't support token counting. UserError: If the model or tokenizer is not supported. """ model = self._get_model() return await model.count_tokens(text) ``` #### embed_query_sync ```python embed_query_sync( query: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult ``` Synchronous version of embed_query(). Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python def embed_query_sync( self, query: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Synchronous version of [`embed_query()`][pydantic_ai.embeddings.Embedder.embed_query].""" return _utils.get_event_loop().run_until_complete(self.embed_query(query, settings=settings)) ``` #### embed_documents_sync ```python embed_documents_sync( documents: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult ``` Synchronous version of embed_documents(). Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python def embed_documents_sync( self, documents: str | Sequence[str], *, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Synchronous version of [`embed_documents()`][pydantic_ai.embeddings.Embedder.embed_documents].""" return _utils.get_event_loop().run_until_complete(self.embed_documents(documents, settings=settings)) ``` #### embed_sync ```python embed_sync( inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult ``` Synchronous version of embed(). 
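These `*_sync` variants drive an internal event loop themselves, so they can be called from plain synchronous code; a minimal sketch (assuming an OpenAI API key is configured):

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')

# No asyncio required: the sync variant runs the underlying coroutine to completion.
result = embedder.embed_query_sync('What is machine learning?')
print(len(result.embeddings[0]))
#> 1536
```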
Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python def embed_sync( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Synchronous version of [`embed()`][pydantic_ai.embeddings.Embedder.embed].""" return _utils.get_event_loop().run_until_complete(self.embed(inputs, input_type=input_type, settings=settings)) ``` #### max_input_tokens_sync ```python max_input_tokens_sync() -> int | None ``` Synchronous version of max_input_tokens(). Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python def max_input_tokens_sync(self) -> int | None: """Synchronous version of [`max_input_tokens()`][pydantic_ai.embeddings.Embedder.max_input_tokens].""" return _utils.get_event_loop().run_until_complete(self.max_input_tokens()) ``` #### count_tokens_sync ```python count_tokens_sync(text: str) -> int ``` Synchronous version of count_tokens(). Source code in `pydantic_ai_slim/pydantic_ai/embeddings/__init__.py` ```python def count_tokens_sync(self, text: str) -> int: """Synchronous version of [`count_tokens()`][pydantic_ai.embeddings.Embedder.count_tokens].""" return _utils.get_event_loop().run_until_complete(self.count_tokens(text)) ``` ### EmbeddingModel Bases: `ABC` Abstract base class for embedding models. Implement this class to create a custom embedding model. For most use cases, use one of the built-in implementations: - OpenAIEmbeddingModel - CohereEmbeddingModel - GoogleEmbeddingModel - BedrockEmbeddingModel - SentenceTransformerEmbeddingModel Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python class EmbeddingModel(ABC): """Abstract base class for embedding models. Implement this class to create a custom embedding model. For most use cases, use one of the built-in implementations: - [`OpenAIEmbeddingModel`][pydantic_ai.embeddings.openai.OpenAIEmbeddingModel] - [`CohereEmbeddingModel`][pydantic_ai.embeddings.cohere.CohereEmbeddingModel] - [`GoogleEmbeddingModel`][pydantic_ai.embeddings.google.GoogleEmbeddingModel] - [`BedrockEmbeddingModel`][pydantic_ai.embeddings.bedrock.BedrockEmbeddingModel] - [`SentenceTransformerEmbeddingModel`][pydantic_ai.embeddings.sentence_transformers.SentenceTransformerEmbeddingModel] """ _settings: EmbeddingSettings | None = None def __init__( self, *, settings: EmbeddingSettings | None = None, ) -> None: """Initialize the model with optional settings. Args: settings: Model-specific settings that will be used as defaults for this model. """ self._settings = settings @property def settings(self) -> EmbeddingSettings | None: """Get the default settings for this model.""" return self._settings @property def base_url(self) -> str | None: """The base URL for the provider API, if available.""" return None @property @abstractmethod def model_name(self) -> str: """The name of the embedding model.""" raise NotImplementedError() @property @abstractmethod def system(self) -> str: """The embedding model provider/system identifier (e.g., 'openai', 'cohere').""" raise NotImplementedError() @abstractmethod async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Generate embeddings for the given inputs. Args: inputs: A single string or sequence of strings to embed. input_type: Whether the inputs are queries or documents. settings: Optional settings to override the model's defaults. 
Returns: An [`EmbeddingResult`][pydantic_ai.embeddings.EmbeddingResult] containing the embeddings and metadata. """ raise NotImplementedError def prepare_embed( self, inputs: str | Sequence[str], settings: EmbeddingSettings | None = None ) -> tuple[list[str], EmbeddingSettings]: """Prepare the inputs and settings for embedding. This method normalizes inputs to a list and merges settings. Subclasses should call this at the start of their `embed()` implementation. Args: inputs: A single string or sequence of strings. settings: Optional settings to merge with defaults. Returns: A tuple of (normalized inputs list, merged settings). """ inputs = [inputs] if isinstance(inputs, str) else list(inputs) settings = merge_embedding_settings(self._settings, settings) or {} return inputs, settings async def max_input_tokens(self) -> int | None: """Get the maximum number of tokens that can be input to the model. Returns: The maximum token count, or `None` if unknown. """ return None # pragma: no cover async def count_tokens(self, text: str) -> int: """Count the number of tokens in the given text. Args: text: The text to tokenize and count. Returns: The number of tokens. Raises: NotImplementedError: If the model doesn't support token counting. UserError: If the model or tokenizer is not supported. """ raise NotImplementedError ``` #### __init__ ```python __init__( *, settings: EmbeddingSettings | None = None ) -> None ``` Initialize the model with optional settings. Parameters: | Name | Type | Description | Default | | ---------- | ------------------- | ----------- | --------------------------------------------------------------------- | | `settings` | \`EmbeddingSettings | None\` | Model-specific settings that will be used as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python def __init__( self, *, settings: EmbeddingSettings | None = None, ) -> None: """Initialize the model with optional settings. Args: settings: Model-specific settings that will be used as defaults for this model. """ self._settings = settings ``` #### settings ```python settings: EmbeddingSettings | None ``` Get the default settings for this model. #### base_url ```python base_url: str | None ``` The base URL for the provider API, if available. #### model_name ```python model_name: str ``` The name of the embedding model. #### system ```python system: str ``` The embedding model provider/system identifier (e.g., 'openai', 'cohere'). #### embed ```python embed( inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult ``` Generate embeddings for the given inputs. Parameters: | Name | Type | Description | Default | | ------------ | ------------------- | -------------------------------------------- | --------------------------------------------------- | | `inputs` | \`str | Sequence[str]\` | A single string or sequence of strings to embed. | | `input_type` | `EmbedInputType` | Whether the inputs are queries or documents. | *required* | | `settings` | \`EmbeddingSettings | None\` | Optional settings to override the model's defaults. | Returns: | Type | Description | | ----------------- | ----------------------------- | | `EmbeddingResult` | An EmbeddingResult containing | | `EmbeddingResult` | the embeddings and metadata. 
| Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python @abstractmethod async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: """Generate embeddings for the given inputs. Args: inputs: A single string or sequence of strings to embed. input_type: Whether the inputs are queries or documents. settings: Optional settings to override the model's defaults. Returns: An [`EmbeddingResult`][pydantic_ai.embeddings.EmbeddingResult] containing the embeddings and metadata. """ raise NotImplementedError ``` #### prepare_embed ```python prepare_embed( inputs: str | Sequence[str], settings: EmbeddingSettings | None = None, ) -> tuple[list[str], EmbeddingSettings] ``` Prepare the inputs and settings for embedding. This method normalizes inputs to a list and merges settings. Subclasses should call this at the start of their `embed()` implementation. Parameters: | Name | Type | Description | Default | | ---------- | ------------------- | --------------- | ----------------------------------------- | | `inputs` | \`str | Sequence[str]\` | A single string or sequence of strings. | | `settings` | \`EmbeddingSettings | None\` | Optional settings to merge with defaults. | Returns: | Type | Description | | ------------------------------------- | ----------------------------------------------------- | | `tuple[list[str], EmbeddingSettings]` | A tuple of (normalized inputs list, merged settings). | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python def prepare_embed( self, inputs: str | Sequence[str], settings: EmbeddingSettings | None = None ) -> tuple[list[str], EmbeddingSettings]: """Prepare the inputs and settings for embedding. This method normalizes inputs to a list and merges settings. Subclasses should call this at the start of their `embed()` implementation. Args: inputs: A single string or sequence of strings. settings: Optional settings to merge with defaults. Returns: A tuple of (normalized inputs list, merged settings). """ inputs = [inputs] if isinstance(inputs, str) else list(inputs) settings = merge_embedding_settings(self._settings, settings) or {} return inputs, settings ``` #### max_input_tokens ```python max_input_tokens() -> int | None ``` Get the maximum number of tokens that can be input to the model. Returns: | Type | Description | | ----- | ----------- | | \`int | None\` | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python async def max_input_tokens(self) -> int | None: """Get the maximum number of tokens that can be input to the model. Returns: The maximum token count, or `None` if unknown. """ return None # pragma: no cover ``` #### count_tokens ```python count_tokens(text: str) -> int ``` Count the number of tokens in the given text. Parameters: | Name | Type | Description | Default | | ------ | ----- | ------------------------------- | ---------- | | `text` | `str` | The text to tokenize and count. | *required* | Returns: | Type | Description | | ----- | --------------------- | | `int` | The number of tokens. | Raises: | Type | Description | | --------------------- | -------------------------------------------- | | `NotImplementedError` | If the model doesn't support token counting. | | `UserError` | If the model or tokenizer is not supported. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/base.py` ```python async def count_tokens(self, text: str) -> int: """Count the number of tokens in the given text. 
Args: text: The text to tokenize and count. Returns: The number of tokens. Raises: NotImplementedError: If the model doesn't support token counting. UserError: If the model or tokenizer is not supported. """ raise NotImplementedError ``` ### EmbedInputType ```python EmbedInputType = Literal['query', 'document'] ``` The type of input to the embedding model. - `'query'`: Text that will be used as a search query - `'document'`: Text that will be stored and searched against Some embedding models optimize differently for queries vs documents. ### EmbeddingResult The result of an embedding operation. This class contains the generated embeddings along with metadata about the operation, including the original inputs, model information, usage statistics, and timing. Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('What is AI?') # Access embeddings by index print(len(result.embeddings[0])) #> 1536 # Access embeddings by original input text print(result['What is AI?'] == result.embeddings[0]) #> True # Check usage print(f'Tokens used: {result.usage.input_tokens}') #> Tokens used: 3 ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/result.py` ````python @dataclass class EmbeddingResult: """The result of an embedding operation. This class contains the generated embeddings along with metadata about the operation, including the original inputs, model information, usage statistics, and timing. Example: ```python from pydantic_ai import Embedder embedder = Embedder('openai:text-embedding-3-small') async def main(): result = await embedder.embed_query('What is AI?') # Access embeddings by index print(len(result.embeddings[0])) #> 1536 # Access embeddings by original input text print(result['What is AI?'] == result.embeddings[0]) #> True # Check usage print(f'Tokens used: {result.usage.input_tokens}') #> Tokens used: 3 ``` """ embeddings: Sequence[Sequence[float]] """The computed embedding vectors, one per input text. Each embedding is a sequence of floats representing the text in vector space. """ _: KW_ONLY inputs: Sequence[str] """The original input texts that were embedded.""" input_type: EmbedInputType """Whether the inputs were embedded as queries or documents.""" model_name: str """The name of the model that generated these embeddings.""" provider_name: str """The name of the provider (e.g., 'openai', 'cohere').""" timestamp: datetime = field(default_factory=_now_utc) """When the embedding request was made.""" usage: RequestUsage = field(default_factory=RequestUsage) """Token usage statistics for this request.""" provider_details: dict[str, Any] | None = None """Provider-specific details from the response.""" provider_response_id: str | None = None """Unique identifier for this response from the provider, if available.""" def __getitem__(self, item: int | str) -> Sequence[float]: """Get the embedding for an input by index or by the original input text. Args: item: Either an integer index or the original input string. Returns: The embedding vector for the specified input. Raises: IndexError: If the index is out of range. ValueError: If the string is not found in the inputs. """ if isinstance(item, str): item = self.inputs.index(item) return self.embeddings[item] def cost(self) -> genai_types.PriceCalculation: """Calculate the cost of the embedding request. Uses [`genai-prices`](https://github.com/pydantic/genai-prices) for pricing data. 
Returns: A price calculation object with `total_price`, `input_price`, and other cost details. Raises: LookupError: If pricing data is not available for this model/provider. """ assert self.model_name, 'Model name is required to calculate price' return calc_price( self.usage, self.model_name, provider_id=self.provider_name, genai_request_timestamp=self.timestamp, ) ```` #### embeddings ```python embeddings: Sequence[Sequence[float]] ``` The computed embedding vectors, one per input text. Each embedding is a sequence of floats representing the text in vector space. #### inputs ```python inputs: Sequence[str] ``` The original input texts that were embedded. #### input_type ```python input_type: EmbedInputType ``` Whether the inputs were embedded as queries or documents. #### model_name ```python model_name: str ``` The name of the model that generated these embeddings. #### provider_name ```python provider_name: str ``` The name of the provider (e.g., 'openai', 'cohere'). #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` When the embedding request was made. #### usage ```python usage: RequestUsage = field(default_factory=RequestUsage) ``` Token usage statistics for this request. #### provider_details ```python provider_details: dict[str, Any] | None = None ``` Provider-specific details from the response. #### provider_response_id ```python provider_response_id: str | None = None ``` Unique identifier for this response from the provider, if available. #### __getitem__ ```python __getitem__(item: int | str) -> Sequence[float] ``` Get the embedding for an input by index or by the original input text. Parameters: | Name | Type | Description | Default | | ------ | ----- | ----------- | ----------------------------------------------------- | | `item` | \`int | str\` | Either an integer index or the original input string. | Returns: | Type | Description | | ----------------- | --------------------------------------------- | | `Sequence[float]` | The embedding vector for the specified input. | Raises: | Type | Description | | ------------ | ----------------------------------------- | | `IndexError` | If the index is out of range. | | `ValueError` | If the string is not found in the inputs. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/result.py` ```python def __getitem__(self, item: int | str) -> Sequence[float]: """Get the embedding for an input by index or by the original input text. Args: item: Either an integer index or the original input string. Returns: The embedding vector for the specified input. Raises: IndexError: If the index is out of range. ValueError: If the string is not found in the inputs. """ if isinstance(item, str): item = self.inputs.index(item) return self.embeddings[item] ``` #### cost ```python cost() -> PriceCalculation ``` Calculate the cost of the embedding request. Uses [`genai-prices`](https://github.com/pydantic/genai-prices) for pricing data. Returns: | Type | Description | | ------------------ | --------------------------------------------------------------------------------- | | `PriceCalculation` | A price calculation object with total_price, input_price, and other cost details. | Raises: | Type | Description | | ------------- | --------------------------------------------------------- | | `LookupError` | If pricing data is not available for this model/provider. 
| Source code in `pydantic_ai_slim/pydantic_ai/embeddings/result.py` ```python def cost(self) -> genai_types.PriceCalculation: """Calculate the cost of the embedding request. Uses [`genai-prices`](https://github.com/pydantic/genai-prices) for pricing data. Returns: A price calculation object with `total_price`, `input_price`, and other cost details. Raises: LookupError: If pricing data is not available for this model/provider. """ assert self.model_name, 'Model name is required to calculate price' return calc_price( self.usage, self.model_name, provider_id=self.provider_name, genai_request_timestamp=self.timestamp, ) ``` ### EmbeddingSettings Bases: `TypedDict` Common settings for configuring embedding models. These settings apply across multiple embedding model providers. Not all settings are supported by all models - check the specific model's documentation for details. Provider-specific settings classes (e.g., OpenAIEmbeddingSettings, CohereEmbeddingSettings) extend this with additional provider-prefixed options. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/settings.py` ```python class EmbeddingSettings(TypedDict, total=False): """Common settings for configuring embedding models. These settings apply across multiple embedding model providers. Not all settings are supported by all models - check the specific model's documentation for details. Provider-specific settings classes (e.g., [`OpenAIEmbeddingSettings`][pydantic_ai.embeddings.openai.OpenAIEmbeddingSettings], [`CohereEmbeddingSettings`][pydantic_ai.embeddings.cohere.CohereEmbeddingSettings]) extend this with additional provider-prefixed options. """ dimensions: int """The number of dimensions for the output embeddings. Supported by: * OpenAI * Cohere * Google * Sentence Transformers * Bedrock * VoyageAI """ truncate: bool """Whether to truncate inputs that exceed the model's context length. Defaults to `False`. If `True`, inputs that are too long will be truncated. If `False`, an error will be raised for inputs that exceed the context length. For more control over truncation, you can use [`max_input_tokens()`][pydantic_ai.embeddings.Embedder.max_input_tokens] and [`count_tokens()`][pydantic_ai.embeddings.Embedder.count_tokens] to implement your own truncation logic. Provider-specific truncation settings (e.g., `cohere_truncate`, `bedrock_cohere_truncate`) take precedence if specified. Supported by: * Cohere * Bedrock (Cohere and Nova models) * VoyageAI """ extra_headers: dict[str, str] """Extra headers to send to the model. Supported by: * OpenAI * Cohere """ extra_body: object """Extra body to send to the model. Supported by: * OpenAI * Cohere """ ``` #### dimensions ```python dimensions: int ``` The number of dimensions for the output embeddings. Supported by: - OpenAI - Cohere - Google - Sentence Transformers - Bedrock - VoyageAI #### truncate ```python truncate: bool ``` Whether to truncate inputs that exceed the model's context length. Defaults to `False`. If `True`, inputs that are too long will be truncated. If `False`, an error will be raised for inputs that exceed the context length. For more control over truncation, you can use max_input_tokens() and count_tokens() to implement your own truncation logic. Provider-specific truncation settings (e.g., `cohere_truncate`, `bedrock_cohere_truncate`) take precedence if specified. Supported by: - Cohere - Bedrock (Cohere and Nova models) - VoyageAI #### extra_headers ```python extra_headers: dict[str, str] ``` Extra headers to send to the model. 
Supported by: - OpenAI - Cohere #### extra_body ```python extra_body: object ``` Extra body to send to the model. Supported by: - OpenAI - Cohere ### merge_embedding_settings ```python merge_embedding_settings( base: EmbeddingSettings | None, overrides: EmbeddingSettings | None, ) -> EmbeddingSettings | None ``` Merge two sets of embedding settings, with overrides taking precedence. Parameters: | Name | Type | Description | Default | | ----------- | ------------------- | ----------- | --------------------------------------------------------------------- | | `base` | \`EmbeddingSettings | None\` | Base settings (typically from the embedder or model). | | `overrides` | \`EmbeddingSettings | None\` | Settings that should override the base (typically per-call settings). | Returns: | Type | Description | | ------------------- | ----------- | | \`EmbeddingSettings | None\` | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/settings.py` ```python def merge_embedding_settings( base: EmbeddingSettings | None, overrides: EmbeddingSettings | None ) -> EmbeddingSettings | None: """Merge two sets of embedding settings, with overrides taking precedence. Args: base: Base settings (typically from the embedder or model). overrides: Settings that should override the base (typically per-call settings). Returns: Merged settings, or `None` if both inputs are `None`. """ # Note: we may want merge recursively if/when we add non-primitive values if base and overrides: return base | overrides else: return base or overrides ``` ### OpenAIEmbeddingModelName ```python OpenAIEmbeddingModelName = str | EmbeddingModel ``` Possible OpenAI embeddings model names. See the [OpenAI embeddings documentation](https://platform.openai.com/docs/guides/embeddings) for available models. ### OpenAIEmbeddingSettings Bases: `EmbeddingSettings` Settings used for an OpenAI embedding model request. All fields from EmbeddingSettings are supported. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/openai.py` ```python class OpenAIEmbeddingSettings(EmbeddingSettings, total=False): """Settings used for an OpenAI embedding model request. All fields from [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] are supported. """ ``` ### OpenAIEmbeddingModel Bases: `EmbeddingModel` OpenAI embedding model implementation. This model works with OpenAI's embeddings API and any [OpenAI-compatible providers](https://ai.pydantic.dev/models/openai/#openai-compatible-models). Example: ```python from pydantic_ai.embeddings.openai import OpenAIEmbeddingModel from pydantic_ai.providers.openai import OpenAIProvider # Using OpenAI directly model = OpenAIEmbeddingModel('text-embedding-3-small') # Using an OpenAI-compatible provider model = OpenAIEmbeddingModel( 'text-embedding-3-small', provider=OpenAIProvider(base_url='https://my-provider.com/v1'), ) ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/openai.py` ````python @dataclass(init=False) class OpenAIEmbeddingModel(EmbeddingModel): """OpenAI embedding model implementation. This model works with OpenAI's embeddings API and any [OpenAI-compatible providers](../models/openai.md#openai-compatible-models). 
Example: ```python from pydantic_ai.embeddings.openai import OpenAIEmbeddingModel from pydantic_ai.providers.openai import OpenAIProvider # Using OpenAI directly model = OpenAIEmbeddingModel('text-embedding-3-small') # Using an OpenAI-compatible provider model = OpenAIEmbeddingModel( 'text-embedding-3-small', provider=OpenAIProvider(base_url='https://my-provider.com/v1'), ) ``` """ _model_name: OpenAIEmbeddingModelName = field(repr=False) _provider: Provider[AsyncOpenAI] = field(repr=False) def __init__( self, model_name: OpenAIEmbeddingModelName, *, provider: OpenAIEmbeddingsCompatibleProvider | Literal['openai'] | Provider[AsyncOpenAI] = 'openai', settings: EmbeddingSettings | None = None, ): """Initialize an OpenAI embedding model. Args: model_name: The name of the OpenAI model to use. See [OpenAI's embedding models](https://platform.openai.com/docs/guides/embeddings) for available options. provider: The provider to use for authentication and API access. Can be: - `'openai'` (default): Uses the standard OpenAI API - A provider name string (e.g., `'azure'`, `'deepseek'`) - A [`Provider`][pydantic_ai.providers.Provider] instance for custom configuration See [OpenAI-compatible providers](../models/openai.md#openai-compatible-models) for a list of supported providers. settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self._client = provider.client super().__init__(settings=settings) @property def base_url(self) -> str: return str(self._client.base_url) @property def model_name(self) -> OpenAIEmbeddingModelName: """The embedding model name.""" return self._model_name @property def system(self) -> str: """The embedding model provider.""" return self._provider.name async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) settings = cast(OpenAIEmbeddingSettings, settings) try: response = await self._client.embeddings.create( input=inputs, model=self.model_name, dimensions=settings.get('dimensions') or OMIT, extra_headers=settings.get('extra_headers'), extra_body=settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover except APIConnectionError as e: # pragma: no cover raise ModelAPIError(model_name=self.model_name, message=e.message) from e embeddings = [item.embedding for item in response.data] return EmbeddingResult( embeddings=embeddings, inputs=inputs, input_type=input_type, usage=_map_usage(response.usage, self.system, self.base_url, response.model), model_name=response.model, provider_name=self.system, ) async def max_input_tokens(self) -> int | None: if self.system != 'openai': return None # https://platform.openai.com/docs/guides/embeddings#embedding-models return 8192 async def count_tokens(self, text: str) -> int: if self.system != 'openai': raise UserError( 'Counting tokens is not supported for non-OpenAI embedding models', ) try: encoding = await _utils.run_in_executor(tiktoken.encoding_for_model, self.model_name) except KeyError as e: # pragma: no cover raise ValueError( f'The embedding model {self.model_name!r} is not supported by tiktoken', ) from e return 
len(encoding.encode(text)) ```` #### __init__ ```python __init__( model_name: OpenAIEmbeddingModelName, *, provider: ( OpenAIEmbeddingsCompatibleProvider | Literal["openai"] | Provider[AsyncOpenAI] ) = "openai", settings: EmbeddingSettings | None = None ) ``` Initialize an OpenAI embedding model. Parameters: | Name | Type | Description | Default | | ------------ | ------------------------------------ | ----------------------------------------------------------------------------------------- | ------------------------------------------------------------------- | | `model_name` | `OpenAIEmbeddingModelName` | The name of the OpenAI model to use. See OpenAI's embedding models for available options. | *required* | | `provider` | \`OpenAIEmbeddingsCompatibleProvider | Literal['openai'] | Provider[AsyncOpenAI]\` | | `settings` | \`EmbeddingSettings | None\` | Model-specific EmbeddingSettings to use as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/openai.py` ```python def __init__( self, model_name: OpenAIEmbeddingModelName, *, provider: OpenAIEmbeddingsCompatibleProvider | Literal['openai'] | Provider[AsyncOpenAI] = 'openai', settings: EmbeddingSettings | None = None, ): """Initialize an OpenAI embedding model. Args: model_name: The name of the OpenAI model to use. See [OpenAI's embedding models](https://platform.openai.com/docs/guides/embeddings) for available options. provider: The provider to use for authentication and API access. Can be: - `'openai'` (default): Uses the standard OpenAI API - A provider name string (e.g., `'azure'`, `'deepseek'`) - A [`Provider`][pydantic_ai.providers.Provider] instance for custom configuration See [OpenAI-compatible providers](../models/openai.md#openai-compatible-models) for a list of supported providers. settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self._client = provider.client super().__init__(settings=settings) ``` #### model_name ```python model_name: OpenAIEmbeddingModelName ``` The embedding model name. #### system ```python system: str ``` The embedding model provider. ### LatestCohereEmbeddingModelNames ```python LatestCohereEmbeddingModelNames = Literal[ "embed-v4.0", "embed-english-v3.0", "embed-english-light-v3.0", "embed-multilingual-v3.0", "embed-multilingual-light-v3.0", ] ``` Latest Cohere embeddings models. See the [Cohere Embed documentation](https://docs.cohere.com/docs/cohere-embed) for available models and their capabilities. ### CohereEmbeddingModelName ```python CohereEmbeddingModelName = ( str | LatestCohereEmbeddingModelNames ) ``` Possible Cohere embeddings model names. ### CohereEmbeddingSettings Bases: `EmbeddingSettings` Settings used for a Cohere embedding model request. All fields from EmbeddingSettings are supported, plus Cohere-specific settings prefixed with `cohere_`. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/cohere.py` ```python class CohereEmbeddingSettings(EmbeddingSettings, total=False): """Settings used for a Cohere embedding model request. All fields from [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] are supported, plus Cohere-specific settings prefixed with `cohere_`. """ # ALL FIELDS MUST BE `cohere_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. 
cohere_max_tokens: int """The maximum number of tokens to embed.""" cohere_input_type: CohereEmbedInputType """The Cohere-specific input type for the embedding. Overrides the standard `input_type` argument. Options include: `'search_query'`, `'search_document'`, `'classification'`, `'clustering'`, and `'image'`. """ cohere_truncate: V2EmbedRequestTruncate """The truncation strategy to use: - `'NONE'` (default): Raise an error if input exceeds max tokens. - `'END'`: Truncate the end of the input text. - `'START'`: Truncate the start of the input text. Note: This setting overrides the standard `truncate` boolean setting when specified. """ ``` #### cohere_max_tokens ```python cohere_max_tokens: int ``` The maximum number of tokens to embed. #### cohere_input_type ```python cohere_input_type: EmbedInputType ``` The Cohere-specific input type for the embedding. Overrides the standard `input_type` argument. Options include: `'search_query'`, `'search_document'`, `'classification'`, `'clustering'`, and `'image'`. #### cohere_truncate ```python cohere_truncate: V2EmbedRequestTruncate ``` The truncation strategy to use: - `'NONE'` (default): Raise an error if input exceeds max tokens. - `'END'`: Truncate the end of the input text. - `'START'`: Truncate the start of the input text. Note: This setting overrides the standard `truncate` boolean setting when specified. ### CohereEmbeddingModel Bases: `EmbeddingModel` Cohere embedding model implementation. This model works with Cohere's embeddings API, which offers multilingual support and various model sizes. Example: ```python from pydantic_ai.embeddings.cohere import CohereEmbeddingModel model = CohereEmbeddingModel('embed-v4.0') ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/cohere.py` ````python @dataclass(init=False) class CohereEmbeddingModel(EmbeddingModel): """Cohere embedding model implementation. This model works with Cohere's embeddings API, which offers multilingual support and various model sizes. Example: ```python from pydantic_ai.embeddings.cohere import CohereEmbeddingModel model = CohereEmbeddingModel('embed-v4.0') ``` """ _model_name: CohereEmbeddingModelName = field(repr=False) _provider: Provider[AsyncClientV2] = field(repr=False) def __init__( self, model_name: CohereEmbeddingModelName, *, provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere', settings: EmbeddingSettings | None = None, ): """Initialize a Cohere embedding model. Args: model_name: The name of the Cohere model to use. See [Cohere Embed documentation](https://docs.cohere.com/docs/cohere-embed) for available models. provider: The provider to use for authentication and API access. Can be: - `'cohere'` (default): Uses the standard Cohere API - A [`CohereProvider`][pydantic_ai.providers.cohere.CohereProvider] instance for custom configuration settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self._client = provider.client self._v1_client = provider.v1_client if isinstance(provider, CohereProvider) else None super().__init__(settings=settings) @property def base_url(self) -> str: """The base URL for the provider API, if available.""" return self._provider.base_url @property def model_name(self) -> CohereEmbeddingModelName: """The embedding model name.""" return self._model_name @property def system(self) -> str: """The embedding model provider.""" return self._provider.name async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) settings = cast(CohereEmbeddingSettings, settings) cohere_input_type = settings.get( 'cohere_input_type', 'search_document' if input_type == 'document' else 'search_query' ) request_options: RequestOptions = {} if extra_headers := settings.get('extra_headers'): # pragma: no cover request_options['additional_headers'] = extra_headers if extra_body := settings.get('extra_body'): # pragma: no cover request_options['additional_body_parameters'] = cast(dict[str, Any], extra_body) # Determine truncation strategy: cohere_truncate takes precedence over truncate if 'cohere_truncate' in settings: truncate = settings['cohere_truncate'] elif settings.get('truncate'): truncate = 'END' else: truncate = 'NONE' try: response = await self._client.embed( model=self.model_name, texts=inputs, output_dimension=settings.get('dimensions'), input_type=cohere_input_type, max_tokens=settings.get('cohere_max_tokens'), truncate=truncate, request_options=request_options, ) except ApiError as e: if (status_code := e.status_code) and status_code >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise ModelAPIError(model_name=self.model_name, message=str(e)) from e # pragma: no cover embeddings = response.embeddings.float_ if embeddings is None: raise UnexpectedModelBehavior( # pragma: no cover 'The Cohere embeddings response did not have an `embeddings` field holding a list of floats', str(response), ) return EmbeddingResult( embeddings=embeddings, inputs=inputs, input_type=input_type, usage=_map_usage(response, self.system, self.base_url, self.model_name), model_name=self.model_name, provider_name=self.system, provider_response_id=response.id, ) async def max_input_tokens(self) -> int | None: return _MAX_INPUT_TOKENS.get(self.model_name) async def count_tokens(self, text: str) -> int: if self._v1_client is None: raise NotImplementedError('Counting tokens requires the Cohere v1 client') try: result = await self._v1_client.tokenize( model=self.model_name, text=text, # Has a max length of 65536 characters offline=False, ) except ApiError as e: # pragma: no cover if (status_code := e.status_code) and status_code >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise ModelAPIError(model_name=self.model_name, message=str(e)) from e return len(result.tokens) ```` #### __init__ ```python __init__( model_name: CohereEmbeddingModelName, *, provider: ( Literal["cohere"] | Provider[AsyncClientV2] ) = "cohere", settings: EmbeddingSettings | None = None ) ``` Initialize a Cohere embedding model. 
Parameters: | Name | Type | Description | Default | | ------------ | -------------------------- | ----------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `model_name` | `CohereEmbeddingModelName` | The name of the Cohere model to use. See Cohere Embed documentation for available models. | *required* | | `provider` | \`Literal['cohere'] | Provider[AsyncClientV2]\` | The provider to use for authentication and API access. Can be: 'cohere' (default): Uses the standard Cohere API A CohereProvider instance for custom configuration | | `settings` | \`EmbeddingSettings | None\` | Model-specific EmbeddingSettings to use as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/cohere.py` ```python def __init__( self, model_name: CohereEmbeddingModelName, *, provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere', settings: EmbeddingSettings | None = None, ): """Initialize a Cohere embedding model. Args: model_name: The name of the Cohere model to use. See [Cohere Embed documentation](https://docs.cohere.com/docs/cohere-embed) for available models. provider: The provider to use for authentication and API access. Can be: - `'cohere'` (default): Uses the standard Cohere API - A [`CohereProvider`][pydantic_ai.providers.cohere.CohereProvider] instance for custom configuration settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self._client = provider.client self._v1_client = provider.v1_client if isinstance(provider, CohereProvider) else None super().__init__(settings=settings) ``` #### base_url ```python base_url: str ``` The base URL for the provider API, if available. #### model_name ```python model_name: CohereEmbeddingModelName ``` The embedding model name. #### system ```python system: str ``` The embedding model provider. ### LatestGoogleGLAEmbeddingModelNames ```python LatestGoogleGLAEmbeddingModelNames = Literal[ "gemini-embedding-001" ] ``` Latest Google Gemini API (GLA) embedding models. See the [Google Embeddings documentation](https://ai.google.dev/gemini-api/docs/embeddings) for available models and their capabilities. ### LatestGoogleVertexEmbeddingModelNames ```python LatestGoogleVertexEmbeddingModelNames = Literal[ "gemini-embedding-001", "text-embedding-005", "text-multilingual-embedding-002", ] ``` Latest Google Vertex AI embedding models. See the [Vertex AI Embeddings documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings) for available models and their capabilities. ### LatestGoogleEmbeddingModelNames ```python LatestGoogleEmbeddingModelNames = ( LatestGoogleGLAEmbeddingModelNames | LatestGoogleVertexEmbeddingModelNames ) ``` All latest Google embedding models (union of GLA and Vertex AI models). ### GoogleEmbeddingModelName ```python GoogleEmbeddingModelName = ( str | LatestGoogleEmbeddingModelNames ) ``` Possible Google embeddings model names. ### GoogleEmbeddingSettings Bases: `EmbeddingSettings` Settings used for a Google embedding model request. All fields from EmbeddingSettings are supported, plus Google-specific settings prefixed with `google_`. 
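As an illustration (not part of the generated reference), here's a minimal sketch of setting Google-specific defaults on a model and overriding one of them per call; the task type value is one of the options from Google's task type documentation, and actually running `embed()` assumes Gemini API credentials (e.g. a `GOOGLE_API_KEY` environment variable) are configured:

```python
from pydantic_ai.embeddings.google import GoogleEmbeddingModel, GoogleEmbeddingSettings

# Defaults applied to every request made with this model.
settings = GoogleEmbeddingSettings(
    dimensions=768,  # sent to the API as output_dimensionality
    google_task_type='SEMANTIC_SIMILARITY',  # overrides the task type derived from input_type
)
model = GoogleEmbeddingModel('gemini-embedding-001', settings=settings)


async def main():
    # Per-call settings are merged over the model defaults and take precedence.
    result = await model.embed('How similar are these?', input_type='query', settings={'dimensions': 1536})
    print(len(result.embeddings[0]))
```
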
Source code in `pydantic_ai_slim/pydantic_ai/embeddings/google.py` ```python class GoogleEmbeddingSettings(EmbeddingSettings, total=False): """Settings used for a Google embedding model request. All fields from [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] are supported, plus Google-specific settings prefixed with `google_`. """ # ALL FIELDS MUST BE `google_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. google_task_type: str """The task type for the embedding. Overrides the automatic task type selection based on `input_type`. See [Google's task type documentation](https://ai.google.dev/gemini-api/docs/embeddings#task-types) for available options. """ google_title: str """Optional title for the content being embedded. Only applicable when task_type is `RETRIEVAL_DOCUMENT`. """ ``` #### google_task_type ```python google_task_type: str ``` The task type for the embedding. Overrides the automatic task type selection based on `input_type`. See [Google's task type documentation](https://ai.google.dev/gemini-api/docs/embeddings#task-types) for available options. #### google_title ```python google_title: str ``` Optional title for the content being embedded. Only applicable when task_type is `RETRIEVAL_DOCUMENT`. ### GoogleEmbeddingModel Bases: `EmbeddingModel` Google embedding model implementation. This model works with Google's embeddings API via the `google-genai` SDK, supporting both the Gemini API (Google AI Studio) and Vertex AI. Example: ```python from pydantic_ai.embeddings.google import GoogleEmbeddingModel from pydantic_ai.providers.google import GoogleProvider # Using Gemini API (requires GOOGLE_API_KEY env var) model = GoogleEmbeddingModel('gemini-embedding-001') # Using Vertex AI model = GoogleEmbeddingModel( 'gemini-embedding-001', provider=GoogleProvider(vertexai=True, project='my-project', location='us-central1'), ) ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/google.py` ````python @dataclass(init=False) class GoogleEmbeddingModel(EmbeddingModel): """Google embedding model implementation. This model works with Google's embeddings API via the `google-genai` SDK, supporting both the Gemini API (Google AI Studio) and Vertex AI. Example: ```python from pydantic_ai.embeddings.google import GoogleEmbeddingModel from pydantic_ai.providers.google import GoogleProvider # Using Gemini API (requires GOOGLE_API_KEY env var) model = GoogleEmbeddingModel('gemini-embedding-001') # Using Vertex AI model = GoogleEmbeddingModel( 'gemini-embedding-001', provider=GoogleProvider(vertexai=True, project='my-project', location='us-central1'), ) ``` """ _model_name: GoogleEmbeddingModelName = field(repr=False) _provider: Provider[Client] = field(repr=False) def __init__( self, model_name: GoogleEmbeddingModelName, *, provider: Literal['google-gla', 'google-vertex'] | Provider[Client] = 'google-gla', settings: EmbeddingSettings | None = None, ): """Initialize a Google embedding model. Args: model_name: The name of the Google model to use. See [Google Embeddings documentation](https://ai.google.dev/gemini-api/docs/embeddings) for available models. provider: The provider to use for authentication and API access. Can be: - `'google-gla'` (default): Uses the Gemini API (Google AI Studio) - `'google-vertex'`: Uses Vertex AI - A [`GoogleProvider`][pydantic_ai.providers.google.GoogleProvider] instance for custom configuration settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self._client = provider.client super().__init__(settings=settings) @property def base_url(self) -> str: return self._provider.base_url @property def model_name(self) -> GoogleEmbeddingModelName: """The embedding model name.""" return self._model_name @property def system(self) -> str: """The embedding model provider.""" return self._provider.name async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) settings = cast(GoogleEmbeddingSettings, settings) google_task_type = settings.get('google_task_type') if google_task_type is None: google_task_type = 'RETRIEVAL_DOCUMENT' if input_type == 'document' else 'RETRIEVAL_QUERY' config = EmbedContentConfig( task_type=google_task_type, output_dimensionality=settings.get('dimensions'), title=settings.get('google_title'), ) try: response = await self._client.aio.models.embed_content( model=self._model_name, contents=cast(ContentListUnion, inputs), config=config, ) except errors.APIError as e: if (status_code := e.code) >= 400: raise ModelHTTPError( status_code=status_code, model_name=self._model_name, body=cast(object, e.details), # pyright: ignore[reportUnknownMemberType] ) from e raise # pragma: no cover embeddings: list[list[float]] = [emb.values for emb in (response.embeddings or []) if emb.values is not None] return EmbeddingResult( embeddings=embeddings, inputs=inputs, input_type=input_type, usage=_map_usage(response, self.system, self.base_url, self._model_name), model_name=self._model_name, provider_name=self.system, ) async def max_input_tokens(self) -> int | None: return _MAX_INPUT_TOKENS.get(self._model_name) async def count_tokens(self, text: str) -> int: try: response = await self._client.aio.models.count_tokens( model=self._model_name, contents=text, ) except errors.APIError as e: if (status_code := e.code) >= 400: raise ModelHTTPError( status_code=status_code, model_name=self._model_name, body=cast(object, e.details), # pyright: ignore[reportUnknownMemberType] ) from e raise # pragma: no cover if response.total_tokens is None: raise UnexpectedModelBehavior('Token counting returned no result') # pragma: no cover return response.total_tokens ```` #### __init__ ```python __init__( model_name: GoogleEmbeddingModelName, *, provider: ( Literal["google-gla", "google-vertex"] | Provider[Client] ) = "google-gla", settings: EmbeddingSettings | None = None ) ``` Initialize a Google embedding model. Parameters: | Name | Type | Description | Default | | ------------ | ---------------------------------------- | ---------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model_name` | `GoogleEmbeddingModelName` | The name of the Google model to use. See Google Embeddings documentation for available models. | *required* | | `provider` | \`Literal['google-gla', 'google-vertex'] | Provider[Client]\` | The provider to use for authentication and API access. 
Can be: 'google-gla' (default): Uses the Gemini API (Google AI Studio) 'google-vertex': Uses Vertex AI A GoogleProvider instance for custom configuration | | `settings` | \`EmbeddingSettings | None\` | Model-specific EmbeddingSettings to use as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/google.py` ```python def __init__( self, model_name: GoogleEmbeddingModelName, *, provider: Literal['google-gla', 'google-vertex'] | Provider[Client] = 'google-gla', settings: EmbeddingSettings | None = None, ): """Initialize a Google embedding model. Args: model_name: The name of the Google model to use. See [Google Embeddings documentation](https://ai.google.dev/gemini-api/docs/embeddings) for available models. provider: The provider to use for authentication and API access. Can be: - `'google-gla'` (default): Uses the Gemini API (Google AI Studio) - `'google-vertex'`: Uses Vertex AI - A [`GoogleProvider`][pydantic_ai.providers.google.GoogleProvider] instance for custom configuration settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self._client = provider.client super().__init__(settings=settings) ``` #### model_name ```python model_name: GoogleEmbeddingModelName ``` The embedding model name. #### system ```python system: str ``` The embedding model provider. ### LatestBedrockEmbeddingModelNames ```python LatestBedrockEmbeddingModelNames = Literal[ "amazon.titan-embed-text-v1", "amazon.titan-embed-text-v2:0", "cohere.embed-english-v3", "cohere.embed-multilingual-v3", "cohere.embed-v4:0", "amazon.nova-2-multimodal-embeddings-v1:0", ] ``` Latest Bedrock embedding model names. See [the Bedrock docs](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for available embedding models. ### BedrockEmbeddingModelName ```python BedrockEmbeddingModelName = ( str | LatestBedrockEmbeddingModelNames ) ``` Possible Bedrock embedding model names. ### BedrockEmbeddingSettings Bases: `EmbeddingSettings` Settings used for a Bedrock embedding model request. All fields from EmbeddingSettings are supported, plus Bedrock-specific settings prefixed with `bedrock_`. All settings are optional - if not specified, model defaults are used. **Note on `dimensions` parameter support:** - **Titan v1** (`amazon.titan-embed-text-v1`): Not supported (fixed: 1536) - **Titan v2** (`amazon.titan-embed-text-v2:0`): Supported (default: 1024, accepts 256/384/1024) - **Cohere v3** (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`): Not supported (fixed: 1024) - **Cohere v4** (`cohere.embed-v4:0`): Supported (default: 1536, accepts 256/512/1024/1536) - **Nova** (`amazon.nova-2-multimodal-embeddings-v1:0`): Supported (default: 3072, accepts 256/384/1024/3072) Unsupported settings are silently ignored. **Note on `truncate` parameter support:** - **Titan models** (`amazon.titan-embed-text-v1`, `amazon.titan-embed-text-v2:0`): Not supported - **Cohere models** (all versions): Supported (default: `False`, maps to `'END'` when `True`) - **Nova** (`amazon.nova-2-multimodal-embeddings-v1:0`): Supported (default: `False`, maps to `'END'` when `True`) For fine-grained truncation control, use model-specific settings: `bedrock_cohere_truncate` or `bedrock_nova_truncate`. 
Example ```python from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings # Use model defaults settings = BedrockEmbeddingSettings() # Customize specific settings for Titan v2:0 settings = BedrockEmbeddingSettings( dimensions=512, bedrock_titan_normalize=True, ) # Customize specific settings for Cohere v4 settings = BedrockEmbeddingSettings( dimensions=512, bedrock_cohere_max_tokens=1000, ) ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/bedrock.py` ````python class BedrockEmbeddingSettings(EmbeddingSettings, total=False): """Settings used for a Bedrock embedding model request. All fields from [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] are supported, plus Bedrock-specific settings prefixed with `bedrock_`. All settings are optional - if not specified, model defaults are used. **Note on `dimensions` parameter support:** - **Titan v1** (`amazon.titan-embed-text-v1`): Not supported (fixed: 1536) - **Titan v2** (`amazon.titan-embed-text-v2:0`): Supported (default: 1024, accepts 256/384/1024) - **Cohere v3** (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`): Not supported (fixed: 1024) - **Cohere v4** (`cohere.embed-v4:0`): Supported (default: 1536, accepts 256/512/1024/1536) - **Nova** (`amazon.nova-2-multimodal-embeddings-v1:0`): Supported (default: 3072, accepts 256/384/1024/3072) Unsupported settings are silently ignored. **Note on `truncate` parameter support:** - **Titan models** (`amazon.titan-embed-text-v1`, `amazon.titan-embed-text-v2:0`): Not supported - **Cohere models** (all versions): Supported (default: `False`, maps to `'END'` when `True`) - **Nova** (`amazon.nova-2-multimodal-embeddings-v1:0`): Supported (default: `False`, maps to `'END'` when `True`) For fine-grained truncation control, use model-specific settings: `bedrock_cohere_truncate` or `bedrock_nova_truncate`. Example: ```python from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings # Use model defaults settings = BedrockEmbeddingSettings() # Customize specific settings for Titan v2:0 settings = BedrockEmbeddingSettings( dimensions=512, bedrock_titan_normalize=True, ) # Customize specific settings for Cohere v4 settings = BedrockEmbeddingSettings( dimensions=512, bedrock_cohere_max_tokens=1000, ) ``` """ # ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. # ==================== Amazon Titan Settings ==================== bedrock_titan_normalize: bool """Whether to normalize embedding vectors for Titan models. **Supported by:** `amazon.titan-embed-text-v2:0` (default: `True`) **Not supported by:** `amazon.titan-embed-text-v1` (silently ignored) When enabled, vectors are normalized for direct cosine similarity calculations. """ # ==================== Cohere Settings ==================== bedrock_cohere_max_tokens: int """The maximum number of tokens to embed for Cohere models. **Supported by:** `cohere.embed-v4:0` (default: 128000) **Not supported by:** `cohere.embed-english-v3`, `cohere.embed-multilingual-v3` (silently ignored) """ bedrock_cohere_input_type: Literal['search_document', 'search_query', 'classification', 'clustering'] """The input type for Cohere models. **Supported by:** All Cohere models (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, `cohere.embed-v4:0`) By default, `embed_query()` uses `'search_query'` and `embed_documents()` uses `'search_document'`. Also accepts `'classification'` or `'clustering'`. 
""" bedrock_cohere_truncate: Literal['NONE', 'START', 'END'] """The truncation strategy for Cohere models. Overrides base `truncate` setting. **Supported by:** All Cohere models (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, `cohere.embed-v4:0`) Default: `'NONE'` - `'NONE'`: Raise an error if input exceeds max tokens. - `'START'`: Truncate the start of the input. - `'END'`: Truncate the end of the input. """ # ==================== Amazon Nova Settings ==================== bedrock_nova_truncate: Literal['NONE', 'START', 'END'] """The truncation strategy for Nova models. Overrides base `truncate` setting. **Supported by:** `amazon.nova-2-multimodal-embeddings-v1:0` Default: `'NONE'` - `'NONE'`: Raise an error if input exceeds max tokens. - `'START'`: Truncate the start of the input. - `'END'`: Truncate the end of the input. """ bedrock_nova_embedding_purpose: Literal[ 'GENERIC_INDEX', 'GENERIC_RETRIEVAL', 'TEXT_RETRIEVAL', 'CLASSIFICATION', 'CLUSTERING', ] """The embedding purpose for Nova models. **Supported by:** `amazon.nova-2-multimodal-embeddings-v1:0` By default, `embed_query()` uses `'GENERIC_RETRIEVAL'` and `embed_documents()` uses `'GENERIC_INDEX'`. Also accepts `'TEXT_RETRIEVAL'`, `'CLASSIFICATION'`, or `'CLUSTERING'`. Note: Multimodal-specific purposes (`'IMAGE_RETRIEVAL'`, `'VIDEO_RETRIEVAL'`, `'DOCUMENT_RETRIEVAL'`, `'AUDIO_RETRIEVAL'`) are not supported as this embedding client only accepts text input. """ # ==================== Concurrency Settings ==================== bedrock_max_concurrency: int """Maximum number of concurrent requests for models that don't support batch embedding. **Applies to:** `amazon.titan-embed-text-v1`, `amazon.titan-embed-text-v2:0`, `amazon.nova-2-multimodal-embeddings-v1:0` When embedding multiple texts with models that only support single-text requests, this controls how many requests run in parallel. Defaults to 5. """ ```` #### bedrock_titan_normalize ```python bedrock_titan_normalize: bool ``` Whether to normalize embedding vectors for Titan models. **Supported by:** `amazon.titan-embed-text-v2:0` (default: `True`) **Not supported by:** `amazon.titan-embed-text-v1` (silently ignored) When enabled, vectors are normalized for direct cosine similarity calculations. #### bedrock_cohere_max_tokens ```python bedrock_cohere_max_tokens: int ``` The maximum number of tokens to embed for Cohere models. **Supported by:** `cohere.embed-v4:0` (default: 128000) **Not supported by:** `cohere.embed-english-v3`, `cohere.embed-multilingual-v3` (silently ignored) #### bedrock_cohere_input_type ```python bedrock_cohere_input_type: Literal[ "search_document", "search_query", "classification", "clustering", ] ``` The input type for Cohere models. **Supported by:** All Cohere models (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, `cohere.embed-v4:0`) By default, `embed_query()` uses `'search_query'` and `embed_documents()` uses `'search_document'`. Also accepts `'classification'` or `'clustering'`. #### bedrock_cohere_truncate ```python bedrock_cohere_truncate: Literal['NONE', 'START', 'END'] ``` The truncation strategy for Cohere models. Overrides base `truncate` setting. **Supported by:** All Cohere models (`cohere.embed-english-v3`, `cohere.embed-multilingual-v3`, `cohere.embed-v4:0`) Default: `'NONE'` - `'NONE'`: Raise an error if input exceeds max tokens. - `'START'`: Truncate the start of the input. - `'END'`: Truncate the end of the input. 
#### bedrock_nova_truncate ```python bedrock_nova_truncate: Literal['NONE', 'START', 'END'] ``` The truncation strategy for Nova models. Overrides base `truncate` setting. **Supported by:** `amazon.nova-2-multimodal-embeddings-v1:0` Default: `'NONE'` - `'NONE'`: Raise an error if input exceeds max tokens. - `'START'`: Truncate the start of the input. - `'END'`: Truncate the end of the input. #### bedrock_nova_embedding_purpose ```python bedrock_nova_embedding_purpose: Literal[ "GENERIC_INDEX", "GENERIC_RETRIEVAL", "TEXT_RETRIEVAL", "CLASSIFICATION", "CLUSTERING", ] ``` The embedding purpose for Nova models. **Supported by:** `amazon.nova-2-multimodal-embeddings-v1:0` By default, `embed_query()` uses `'GENERIC_RETRIEVAL'` and `embed_documents()` uses `'GENERIC_INDEX'`. Also accepts `'TEXT_RETRIEVAL'`, `'CLASSIFICATION'`, or `'CLUSTERING'`. Note: Multimodal-specific purposes (`'IMAGE_RETRIEVAL'`, `'VIDEO_RETRIEVAL'`, `'DOCUMENT_RETRIEVAL'`, `'AUDIO_RETRIEVAL'`) are not supported as this embedding client only accepts text input. #### bedrock_max_concurrency ```python bedrock_max_concurrency: int ``` Maximum number of concurrent requests for models that don't support batch embedding. **Applies to:** `amazon.titan-embed-text-v1`, `amazon.titan-embed-text-v2:0`, `amazon.nova-2-multimodal-embeddings-v1:0` When embedding multiple texts with models that only support single-text requests, this controls how many requests run in parallel. Defaults to 5. ### BedrockEmbeddingModel Bases: `EmbeddingModel` Bedrock embedding model implementation. This model works with AWS Bedrock's embedding models including Amazon Titan Embeddings and Cohere Embed models. Example: ```python from pydantic_ai.embeddings.bedrock import BedrockEmbeddingModel from pydantic_ai.providers.bedrock import BedrockProvider # Using default AWS credentials model = BedrockEmbeddingModel('amazon.titan-embed-text-v2:0') # Using explicit credentials model = BedrockEmbeddingModel( 'cohere.embed-english-v3', provider=BedrockProvider( region_name='us-east-1', aws_access_key_id='...', aws_secret_access_key='...', ), ) ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/bedrock.py` ````python @dataclass(init=False) class BedrockEmbeddingModel(EmbeddingModel): """Bedrock embedding model implementation. This model works with AWS Bedrock's embedding models including Amazon Titan Embeddings and Cohere Embed models. Example: ```python from pydantic_ai.embeddings.bedrock import BedrockEmbeddingModel from pydantic_ai.providers.bedrock import BedrockProvider # Using default AWS credentials model = BedrockEmbeddingModel('amazon.titan-embed-text-v2:0') # Using explicit credentials model = BedrockEmbeddingModel( 'cohere.embed-english-v3', provider=BedrockProvider( region_name='us-east-1', aws_access_key_id='...', aws_secret_access_key='...', ), ) ``` """ client: BedrockRuntimeClient _model_name: BedrockEmbeddingModelName = field(repr=False) _provider: Provider[BaseClient] = field(repr=False) _handler: _BedrockEmbeddingHandler = field(repr=False) def __init__( self, model_name: BedrockEmbeddingModelName, *, provider: Literal['bedrock'] | Provider[BaseClient] = 'bedrock', settings: EmbeddingSettings | None = None, ): """Initialize a Bedrock embedding model. Args: model_name: The name of the Bedrock embedding model to use. See [Bedrock embedding models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for available options. provider: The provider to use for authentication and API access. 
Can be: - `'bedrock'` (default): Uses default AWS credentials - A [`BedrockProvider`][pydantic_ai.providers.bedrock.BedrockProvider] instance for custom configuration settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = cast('BedrockRuntimeClient', provider.client) self._handler = _get_handler_for_model(model_name) super().__init__(settings=settings) @property def base_url(self) -> str: """The base URL for the provider API.""" return str(self.client.meta.endpoint_url) @property def model_name(self) -> BedrockEmbeddingModelName: """The embedding model name.""" return self._model_name @property def system(self) -> str: """The embedding model provider.""" return self._provider.name async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: inputs_list, settings_dict = self.prepare_embed(inputs, settings) settings_typed = cast(BedrockEmbeddingSettings, settings_dict) if self._handler.supports_batch: # Models like Cohere support batch requests return await self._embed_batch(inputs_list, input_type, settings_typed) else: # Models like Titan require individual requests return await self._embed_concurrent(inputs_list, input_type, settings_typed) async def _embed_batch( self, inputs: list[str], input_type: EmbedInputType, settings: BedrockEmbeddingSettings, ) -> EmbeddingResult: """Embed all inputs in a single batch request.""" body = self._handler.prepare_request(inputs, input_type, settings) response, input_tokens = await self._invoke_model(body) embeddings, response_id = self._handler.parse_response(response) return EmbeddingResult( embeddings=embeddings, inputs=inputs, input_type=input_type, usage=RequestUsage(input_tokens=input_tokens), model_name=self.model_name, provider_name=self.system, provider_response_id=response_id, ) async def _embed_concurrent( self, inputs: list[str], input_type: EmbedInputType, settings: BedrockEmbeddingSettings, ) -> EmbeddingResult: """Embed inputs concurrently with controlled parallelism and combine results.""" max_concurrency = settings.get('bedrock_max_concurrency', 5) semaphore = anyio.Semaphore(max_concurrency) results: list[tuple[Sequence[float], int]] = [None] * len(inputs) # type: ignore[list-item] async def embed_single(index: int, text: str) -> None: async with semaphore: body = self._handler.prepare_request([text], input_type, settings) response, input_tokens = await self._invoke_model(body) embeddings, _ = self._handler.parse_response(response) results[index] = (embeddings[0], input_tokens) async with anyio.create_task_group() as tg: for i, text in enumerate(inputs): tg.start_soon(embed_single, i, text) all_embeddings = [embedding for embedding, _ in results] total_input_tokens = sum(tokens for _, tokens in results) return EmbeddingResult( embeddings=all_embeddings, inputs=inputs, input_type=input_type, usage=RequestUsage(input_tokens=total_input_tokens), model_name=self.model_name, provider_name=self.system, ) async def _invoke_model(self, body: dict[str, Any]) -> tuple[dict[str, Any], int]: """Invoke the Bedrock model and return parsed response with token count. Returns: A tuple of (response_body, input_token_count). 
""" try: response: InvokeModelResponseTypeDef = await anyio.to_thread.run_sync( functools.partial( self.client.invoke_model, modelId=self._model_name, body=json.dumps(body), contentType='application/json', accept='application/json', ) ) except ClientError as e: status_code = e.response.get('ResponseMetadata', {}).get('HTTPStatusCode') if isinstance(status_code, int): raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.response) from e raise ModelAPIError(model_name=self.model_name, message=str(e)) from e # Extract input token count from HTTP headers input_tokens = int( response.get('ResponseMetadata', {}).get('HTTPHeaders', {}).get('x-amzn-bedrock-input-token-count', '0') ) response_body = json.loads(response['body'].read()) return response_body, input_tokens async def max_input_tokens(self) -> int | None: """Get the maximum number of tokens that can be input to the model.""" return _MAX_INPUT_TOKENS.get(self._handler.model_name, None) ```` #### __init__ ```python __init__( model_name: BedrockEmbeddingModelName, *, provider: ( Literal["bedrock"] | Provider[BaseClient] ) = "bedrock", settings: EmbeddingSettings | None = None ) ``` Initialize a Bedrock embedding model. Parameters: | Name | Type | Description | Default | | ------------ | --------------------------- | --------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model_name` | `BedrockEmbeddingModelName` | The name of the Bedrock embedding model to use. See Bedrock embedding models for available options. | *required* | | `provider` | \`Literal['bedrock'] | Provider[BaseClient]\` | The provider to use for authentication and API access. Can be: 'bedrock' (default): Uses default AWS credentials A BedrockProvider instance for custom configuration | | `settings` | \`EmbeddingSettings | None\` | Model-specific EmbeddingSettings to use as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/bedrock.py` ```python def __init__( self, model_name: BedrockEmbeddingModelName, *, provider: Literal['bedrock'] | Provider[BaseClient] = 'bedrock', settings: EmbeddingSettings | None = None, ): """Initialize a Bedrock embedding model. Args: model_name: The name of the Bedrock embedding model to use. See [Bedrock embedding models](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for available options. provider: The provider to use for authentication and API access. Can be: - `'bedrock'` (default): Uses default AWS credentials - A [`BedrockProvider`][pydantic_ai.providers.bedrock.BedrockProvider] instance for custom configuration settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = cast('BedrockRuntimeClient', provider.client) self._handler = _get_handler_for_model(model_name) super().__init__(settings=settings) ``` #### base_url ```python base_url: str ``` The base URL for the provider API. #### model_name ```python model_name: BedrockEmbeddingModelName ``` The embedding model name. #### system ```python system: str ``` The embedding model provider. 
#### max_input_tokens ```python max_input_tokens() -> int | None ``` Get the maximum number of tokens that can be input to the model. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/bedrock.py` ```python async def max_input_tokens(self) -> int | None: """Get the maximum number of tokens that can be input to the model.""" return _MAX_INPUT_TOKENS.get(self._handler.model_name, None) ``` ### LatestVoyageAIEmbeddingModelNames ```python LatestVoyageAIEmbeddingModelNames = Literal[ "voyage-4-large", "voyage-4", "voyage-4-lite", "voyage-3-large", "voyage-3.5", "voyage-3.5-lite", "voyage-code-3", "voyage-finance-2", "voyage-law-2", "voyage-code-2", ] ``` Latest VoyageAI embedding models. See [VoyageAI Embeddings](https://docs.voyageai.com/docs/embeddings) for available models and their capabilities. ### VoyageAIEmbeddingModelName ```python VoyageAIEmbeddingModelName = ( str | LatestVoyageAIEmbeddingModelNames ) ``` Possible VoyageAI embedding model names. ### VoyageAIEmbedInputType ```python VoyageAIEmbedInputType = Literal[ "query", "document", "none" ] ``` VoyageAI embedding input types. - `'query'`: For search queries; prepends retrieval-optimized prefix. - `'document'`: For documents; prepends document retrieval prefix. - `'none'`: Direct embedding without any prefix. ### VoyageAIEmbeddingSettings Bases: `EmbeddingSettings` Settings used for a VoyageAI embedding model request. All fields from EmbeddingSettings are supported, plus VoyageAI-specific settings prefixed with `voyageai_`. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/voyageai.py` ```python class VoyageAIEmbeddingSettings(EmbeddingSettings, total=False): """Settings used for a VoyageAI embedding model request. All fields from [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] are supported, plus VoyageAI-specific settings prefixed with `voyageai_`. """ # ALL FIELDS MUST BE `voyageai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. voyageai_input_type: VoyageAIEmbedInputType """The VoyageAI-specific input type for the embedding. Overrides the standard `input_type` argument. Options include: `'query'`, `'document'`, or `'none'` for direct embedding without prefix. """ ``` #### voyageai_input_type ```python voyageai_input_type: VoyageAIEmbedInputType ``` The VoyageAI-specific input type for the embedding. Overrides the standard `input_type` argument. Options include: `'query'`, `'document'`, or `'none'` for direct embedding without prefix. ### VoyageAIEmbeddingModel Bases: `EmbeddingModel` VoyageAI embedding model implementation. VoyageAI provides state-of-the-art embedding models optimized for retrieval, with specialized models for code, finance, and legal domains. Example: ```python from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingModel model = VoyageAIEmbeddingModel('voyage-3.5') ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/voyageai.py` ````python @dataclass(init=False) class VoyageAIEmbeddingModel(EmbeddingModel): """VoyageAI embedding model implementation. VoyageAI provides state-of-the-art embedding models optimized for retrieval, with specialized models for code, finance, and legal domains. 
Example: ```python from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingModel model = VoyageAIEmbeddingModel('voyage-3.5') ``` """ _model_name: VoyageAIEmbeddingModelName = field(repr=False) _provider: Provider[AsyncClient] = field(repr=False) def __init__( self, model_name: VoyageAIEmbeddingModelName, *, provider: Literal['voyageai'] | Provider[AsyncClient] = 'voyageai', settings: EmbeddingSettings | None = None, ): """Initialize a VoyageAI embedding model. Args: model_name: The name of the VoyageAI model to use. See [VoyageAI models](https://docs.voyageai.com/docs/embeddings) for available options. provider: The provider to use for authentication and API access. Can be: - `'voyageai'` (default): Uses the standard VoyageAI API - A [`VoyageAIProvider`][pydantic_ai.providers.voyageai.VoyageAIProvider] instance for custom configuration settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider super().__init__(settings=settings) @property def base_url(self) -> str: """The base URL for the provider API.""" return self._provider.base_url @property def model_name(self) -> VoyageAIEmbeddingModelName: """The embedding model name.""" return self._model_name @property def system(self) -> str: """The embedding model provider.""" return self._provider.name async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None, ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) settings = cast(VoyageAIEmbeddingSettings, settings) voyageai_input_type: VoyageAIEmbedInputType = settings.get( 'voyageai_input_type', 'document' if input_type == 'document' else 'query' ) # Convert 'none' string to None for the API api_input_type = None if voyageai_input_type == 'none' else voyageai_input_type try: response = await self._provider.client.embed( texts=list(inputs), model=self.model_name, input_type=api_input_type, truncation=settings.get('truncate', False), output_dimension=settings.get('dimensions'), ) except VoyageError as e: raise ModelAPIError(model_name=self.model_name, message=str(e)) from e return EmbeddingResult( embeddings=response.embeddings, inputs=inputs, input_type=input_type, usage=_map_usage(response.total_tokens), model_name=self.model_name, provider_name=self.system, ) async def max_input_tokens(self) -> int | None: return _MAX_INPUT_TOKENS.get(self.model_name) ```` #### __init__ ```python __init__( model_name: VoyageAIEmbeddingModelName, *, provider: ( Literal["voyageai"] | Provider[AsyncClient] ) = "voyageai", settings: EmbeddingSettings | None = None ) ``` Initialize a VoyageAI embedding model. Parameters: | Name | Type | Description | Default | | ------------ | ---------------------------- | --------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `model_name` | `VoyageAIEmbeddingModelName` | The name of the VoyageAI model to use. See VoyageAI models for available options. | *required* | | `provider` | \`Literal['voyageai'] | Provider[AsyncClient]\` | The provider to use for authentication and API access. 
Can be: 'voyageai' (default): Uses the standard VoyageAI API A VoyageAIProvider instance for custom configuration | | `settings` | \`EmbeddingSettings | None\` | Model-specific EmbeddingSettings to use as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/voyageai.py` ```python def __init__( self, model_name: VoyageAIEmbeddingModelName, *, provider: Literal['voyageai'] | Provider[AsyncClient] = 'voyageai', settings: EmbeddingSettings | None = None, ): """Initialize a VoyageAI embedding model. Args: model_name: The name of the VoyageAI model to use. See [VoyageAI models](https://docs.voyageai.com/docs/embeddings) for available options. provider: The provider to use for authentication and API access. Can be: - `'voyageai'` (default): Uses the standard VoyageAI API - A [`VoyageAIProvider`][pydantic_ai.providers.voyageai.VoyageAIProvider] instance for custom configuration settings: Model-specific [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] to use as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider super().__init__(settings=settings) ``` #### base_url ```python base_url: str ``` The base URL for the provider API. #### model_name ```python model_name: VoyageAIEmbeddingModelName ``` The embedding model name. #### system ```python system: str ``` The embedding model provider. ### SentenceTransformersEmbeddingSettings Bases: `EmbeddingSettings` Settings used for a Sentence-Transformers embedding model request. All fields from EmbeddingSettings are supported, plus Sentence-Transformers-specific settings prefixed with `sentence_transformers_`. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/sentence_transformers.py` ```python class SentenceTransformersEmbeddingSettings(EmbeddingSettings, total=False): """Settings used for a Sentence-Transformers embedding model request. All fields from [`EmbeddingSettings`][pydantic_ai.embeddings.EmbeddingSettings] are supported, plus Sentence-Transformers-specific settings prefixed with `sentence_transformers_`. """ sentence_transformers_device: str """Device to run inference on. Examples: `'cpu'`, `'cuda'`, `'cuda:0'`, `'mps'` (Apple Silicon). """ sentence_transformers_normalize_embeddings: bool """Whether to L2-normalize embeddings. When `True`, all embeddings will have unit length, which is useful for cosine similarity calculations. """ sentence_transformers_batch_size: int """Batch size to use during encoding. Larger batches may be faster but require more memory. """ ``` #### sentence_transformers_device ```python sentence_transformers_device: str ``` Device to run inference on. Examples: `'cpu'`, `'cuda'`, `'cuda:0'`, `'mps'` (Apple Silicon). #### sentence_transformers_normalize_embeddings ```python sentence_transformers_normalize_embeddings: bool ``` Whether to L2-normalize embeddings. When `True`, all embeddings will have unit length, which is useful for cosine similarity calculations. #### sentence_transformers_batch_size ```python sentence_transformers_batch_size: int ``` Batch size to use during encoding. Larger batches may be faster but require more memory. ### SentenceTransformerEmbeddingModel Bases: `EmbeddingModel` Local embedding model using the `sentence-transformers` library. 
This model runs embeddings locally on your machine, which is useful for: - Privacy-sensitive applications where data shouldn't leave your infrastructure - Reducing API costs for high-volume embedding workloads - Offline or air-gapped environments Models are downloaded from Hugging Face on first use. See the [Sentence-Transformers documentation](https://www.sbert.net/docs/sentence_transformer/pretrained_models.html) for available models. Example: ```python from sentence_transformers import SentenceTransformer from pydantic_ai.embeddings.sentence_transformers import ( SentenceTransformerEmbeddingModel, ) # Using a model name (downloads from Hugging Face) model = SentenceTransformerEmbeddingModel('all-MiniLM-L6-v2') # Using an existing SentenceTransformer instance st_model = SentenceTransformer('all-MiniLM-L6-v2') model = SentenceTransformerEmbeddingModel(st_model) ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/sentence_transformers.py` ````python @dataclass(init=False) class SentenceTransformerEmbeddingModel(EmbeddingModel): """Local embedding model using the `sentence-transformers` library. This model runs embeddings locally on your machine, which is useful for: - Privacy-sensitive applications where data shouldn't leave your infrastructure - Reducing API costs for high-volume embedding workloads - Offline or air-gapped environments Models are downloaded from Hugging Face on first use. See the [Sentence-Transformers documentation](https://www.sbert.net/docs/sentence_transformer/pretrained_models.html) for available models. Example: ```python from sentence_transformers import SentenceTransformer from pydantic_ai.embeddings.sentence_transformers import ( SentenceTransformerEmbeddingModel, ) # Using a model name (downloads from Hugging Face) model = SentenceTransformerEmbeddingModel('all-MiniLM-L6-v2') # Using an existing SentenceTransformer instance st_model = SentenceTransformer('all-MiniLM-L6-v2') model = SentenceTransformerEmbeddingModel(st_model) ``` """ _model_name: str = field(repr=False) _model: SentenceTransformer | None = field(repr=False, default=None) def __init__(self, model: SentenceTransformer | str, *, settings: EmbeddingSettings | None = None) -> None: """Initialize a Sentence-Transformers embedding model. Args: model: The model to use. Can be: - A model name from Hugging Face (e.g., `'all-MiniLM-L6-v2'`) - A local path to a saved model - An existing `SentenceTransformer` instance settings: Model-specific [`SentenceTransformersEmbeddingSettings`][pydantic_ai.embeddings.sentence_transformers.SentenceTransformersEmbeddingSettings] to use as defaults for this model. 
""" if isinstance(model, str): self._model_name = model else: self._model = deepcopy(model) self._model_name = model.model_card_data.model_id or model.model_card_data.base_model or 'unknown' super().__init__(settings=settings) @property def base_url(self) -> str | None: """No base URL — runs locally.""" return None @property def model_name(self) -> str: """The embedding model name.""" return self._model_name @property def system(self) -> str: """The embedding model provider/system identifier.""" return 'sentence-transformers' async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) settings = cast(SentenceTransformersEmbeddingSettings, settings) device = settings.get('sentence_transformers_device', None) normalize = settings.get('sentence_transformers_normalize_embeddings', False) batch_size = settings.get('sentence_transformers_batch_size', None) dimensions = settings.get('dimensions', None) model = await self._get_model() encode_func = model.encode_query if input_type == 'query' else model.encode_document # type: ignore[reportUnknownReturnType] np_embeddings: np.ndarray[Any, float] = await _utils.run_in_executor( # type: ignore[reportAssignmentType] encode_func, # type: ignore[reportArgumentType] inputs, show_progress_bar=False, convert_to_numpy=True, convert_to_tensor=False, device=device, normalize_embeddings=normalize, truncate_dim=dimensions, **{'batch_size': batch_size} if batch_size is not None else {}, # type: ignore[reportArgumentType] ) embeddings = np_embeddings.tolist() return EmbeddingResult( embeddings=embeddings, inputs=inputs, input_type=input_type, model_name=self.model_name, provider_name=self.system, ) async def max_input_tokens(self) -> int | None: model = await self._get_model() return model.get_max_seq_length() async def count_tokens(self, text: str) -> int: model = await self._get_model() result: dict[str, torch.Tensor] = await _utils.run_in_executor( model.tokenize, # type: ignore[reportArgumentType] [text], ) if 'input_ids' not in result or not isinstance(result['input_ids'], torch.Tensor): # pragma: no cover raise UnexpectedModelBehavior( 'The SentenceTransformers tokenizer output did not have an `input_ids` field holding a tensor', str(result), ) return len(result['input_ids'][0]) async def _get_model(self) -> SentenceTransformer: if self._model is None: # This may download the model from Hugging Face, so we do it in a thread self._model = await _utils.run_in_executor(SentenceTransformer, self.model_name) # pragma: no cover return self._model ```` #### __init__ ```python __init__( model: SentenceTransformer | str, *, settings: EmbeddingSettings | None = None ) -> None ``` Initialize a Sentence-Transformers embedding model. Parameters: | Name | Type | Description | Default | | ---------- | --------------------- | ----------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model` | \`SentenceTransformer | str\` | The model to use. Can be: A model name from Hugging Face (e.g., 'all-MiniLM-L6-v2') A local path to a saved model An existing SentenceTransformer instance | | `settings` | \`EmbeddingSettings | None\` | Model-specific SentenceTransformersEmbeddingSettings to use as defaults for this model. 
| Source code in `pydantic_ai_slim/pydantic_ai/embeddings/sentence_transformers.py` ```python def __init__(self, model: SentenceTransformer | str, *, settings: EmbeddingSettings | None = None) -> None: """Initialize a Sentence-Transformers embedding model. Args: model: The model to use. Can be: - A model name from Hugging Face (e.g., `'all-MiniLM-L6-v2'`) - A local path to a saved model - An existing `SentenceTransformer` instance settings: Model-specific [`SentenceTransformersEmbeddingSettings`][pydantic_ai.embeddings.sentence_transformers.SentenceTransformersEmbeddingSettings] to use as defaults for this model. """ if isinstance(model, str): self._model_name = model else: self._model = deepcopy(model) self._model_name = model.model_card_data.model_id or model.model_card_data.base_model or 'unknown' super().__init__(settings=settings) ``` #### base_url ```python base_url: str | None ``` No base URL — runs locally. #### model_name ```python model_name: str ``` The embedding model name. #### system ```python system: str ``` The embedding model provider/system identifier. ### TestEmbeddingModel Bases: `EmbeddingModel` A mock embedding model for testing. This model returns deterministic embeddings (all 1.0 values) and tracks the settings used in the last call via the `last_settings` attribute. Example: ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import TestEmbeddingModel test_model = TestEmbeddingModel() embedder = Embedder('openai:text-embedding-3-small') async def main(): with embedder.override(model=test_model): await embedder.embed_query('test') assert test_model.last_settings is not None ``` Source code in `pydantic_ai_slim/pydantic_ai/embeddings/test.py` ````python @dataclass(init=False) class TestEmbeddingModel(EmbeddingModel): """A mock embedding model for testing. This model returns deterministic embeddings (all 1.0 values) and tracks the settings used in the last call via the `last_settings` attribute. Example: ```python from pydantic_ai import Embedder from pydantic_ai.embeddings import TestEmbeddingModel test_model = TestEmbeddingModel() embedder = Embedder('openai:text-embedding-3-small') async def main(): with embedder.override(model=test_model): await embedder.embed_query('test') assert test_model.last_settings is not None ``` """ # NOTE: Avoid test discovery by pytest. __test__ = False _model_name: str """The model name to report in results.""" _provider_name: str """The provider name to report in results.""" _dimensions: int """The number of dimensions for generated embeddings.""" last_settings: EmbeddingSettings | None = None """The settings used in the most recent embed call.""" def __init__( self, model_name: str = 'test', *, provider_name: str = 'test', dimensions: int = 8, settings: EmbeddingSettings | None = None, ): """Initialize the test embedding model. Args: model_name: The model name to report in results. provider_name: The provider name to report in results. dimensions: The number of dimensions for the generated embeddings. settings: Optional default settings for the model. 
""" self._model_name = model_name self._provider_name = provider_name self._dimensions = dimensions self.last_settings = None super().__init__(settings=settings) @property def model_name(self) -> str: """The embedding model name.""" return self._model_name @property def system(self) -> str: """The embedding model provider.""" return self._provider_name async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) self.last_settings = settings dimensions = settings.get('dimensions') or self._dimensions return EmbeddingResult( embeddings=[[1.0] * dimensions] * len(inputs), inputs=inputs, input_type=input_type, usage=RequestUsage(input_tokens=sum(_estimate_tokens(text) for text in inputs)), model_name=self.model_name, provider_name=self.system, provider_response_id=str(uuid.uuid4()), ) async def max_input_tokens(self) -> int | None: return 1024 async def count_tokens(self, text: str) -> int: return _estimate_tokens(text) ```` #### __init__ ```python __init__( model_name: str = "test", *, provider_name: str = "test", dimensions: int = 8, settings: EmbeddingSettings | None = None ) ``` Initialize the test embedding model. Parameters: | Name | Type | Description | Default | | --------------- | ------------------- | ------------------------------------------------------ | ---------------------------------------- | | `model_name` | `str` | The model name to report in results. | `'test'` | | `provider_name` | `str` | The provider name to report in results. | `'test'` | | `dimensions` | `int` | The number of dimensions for the generated embeddings. | `8` | | `settings` | \`EmbeddingSettings | None\` | Optional default settings for the model. | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/test.py` ```python def __init__( self, model_name: str = 'test', *, provider_name: str = 'test', dimensions: int = 8, settings: EmbeddingSettings | None = None, ): """Initialize the test embedding model. Args: model_name: The model name to report in results. provider_name: The provider name to report in results. dimensions: The number of dimensions for the generated embeddings. settings: Optional default settings for the model. """ self._model_name = model_name self._provider_name = provider_name self._dimensions = dimensions self.last_settings = None super().__init__(settings=settings) ``` #### last_settings ```python last_settings: EmbeddingSettings | None = None ``` The settings used in the most recent embed call. #### model_name ```python model_name: str ``` The embedding model name. #### system ```python system: str ``` The embedding model provider. ### WrapperEmbeddingModel Bases: `EmbeddingModel` Base class for embedding models that wrap another model. Use this as a base class to create custom embedding model wrappers that modify behavior (e.g., caching, logging, rate limiting) while delegating to an underlying model. By default, all methods are passed through to the wrapped model. Override specific methods to customize behavior. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/wrapper.py` ```python @dataclass(init=False) class WrapperEmbeddingModel(EmbeddingModel): """Base class for embedding models that wrap another model. Use this as a base class to create custom embedding model wrappers that modify behavior (e.g., caching, logging, rate limiting) while delegating to an underlying model. 
By default, all methods are passed through to the wrapped model. Override specific methods to customize behavior. """ wrapped: EmbeddingModel """The underlying embedding model being wrapped.""" def __init__(self, wrapped: EmbeddingModel | str): """Initialize the wrapper with an embedding model. Args: wrapped: The model to wrap. Can be an [`EmbeddingModel`][pydantic_ai.embeddings.EmbeddingModel] instance or a model name string (e.g., `'openai:text-embedding-3-small'`). """ from . import infer_embedding_model super().__init__() self.wrapped = infer_embedding_model(wrapped) if isinstance(wrapped, str) else wrapped async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: return await self.wrapped.embed(inputs, input_type=input_type, settings=settings) async def max_input_tokens(self) -> int | None: return await self.wrapped.max_input_tokens() async def count_tokens(self, text: str) -> int: return await self.wrapped.count_tokens(text) @property def model_name(self) -> str: return self.wrapped.model_name @property def system(self) -> str: return self.wrapped.system @property def settings(self) -> EmbeddingSettings | None: """Get the settings from the wrapped embedding model.""" return self.wrapped.settings @property def base_url(self) -> str | None: return self.wrapped.base_url def __getattr__(self, item: str): return getattr(self.wrapped, item) # pragma: no cover ``` #### __init__ ```python __init__(wrapped: EmbeddingModel | str) ``` Initialize the wrapper with an embedding model. Parameters: | Name | Type | Description | Default | | --------- | ---------------- | ----------- | -------------------------------------------------------------------------------------------------------------------- | | `wrapped` | \`EmbeddingModel | str\` | The model to wrap. Can be an EmbeddingModel instance or a model name string (e.g., 'openai:text-embedding-3-small'). | Source code in `pydantic_ai_slim/pydantic_ai/embeddings/wrapper.py` ```python def __init__(self, wrapped: EmbeddingModel | str): """Initialize the wrapper with an embedding model. Args: wrapped: The model to wrap. Can be an [`EmbeddingModel`][pydantic_ai.embeddings.EmbeddingModel] instance or a model name string (e.g., `'openai:text-embedding-3-small'`). """ from . import infer_embedding_model super().__init__() self.wrapped = infer_embedding_model(wrapped) if isinstance(wrapped, str) else wrapped ``` #### wrapped ```python wrapped: EmbeddingModel = ( infer_embedding_model(wrapped) if isinstance(wrapped, str) else wrapped ) ``` The underlying embedding model being wrapped. #### settings ```python settings: EmbeddingSettings | None ``` Get the settings from the wrapped embedding model. ### instrument_embedding_model ```python instrument_embedding_model( model: EmbeddingModel, instrument: InstrumentationSettings | bool, ) -> EmbeddingModel ``` Instrument an embedding model with OpenTelemetry/logfire. 
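A rough usage sketch, assuming `infer_embedding_model` is exported from `pydantic_ai.embeddings` and this helper from `pydantic_ai.embeddings.instrumented`, as the source paths suggest:

```python
from pydantic_ai.embeddings import infer_embedding_model
from pydantic_ai.embeddings.instrumented import instrument_embedding_model

model = infer_embedding_model('openai:text-embedding-3-small')

# `True` applies default InstrumentationSettings; a falsy value returns the model unchanged.
instrumented = instrument_embedding_model(model, True)
```

If the model is already an `InstrumentedEmbeddingModel`, it is returned as-is rather than being wrapped a second time.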
Source code in `pydantic_ai_slim/pydantic_ai/embeddings/instrumented.py` ```python def instrument_embedding_model(model: EmbeddingModel, instrument: InstrumentationSettings | bool) -> EmbeddingModel: """Instrument an embedding model with OpenTelemetry/logfire.""" if instrument and not isinstance(model, InstrumentedEmbeddingModel): if instrument is True: instrument = InstrumentationSettings() model = InstrumentedEmbeddingModel(model, instrument) return model ``` ### InstrumentedEmbeddingModel Bases: `WrapperEmbeddingModel` Embedding model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. Source code in `pydantic_ai_slim/pydantic_ai/embeddings/instrumented.py` ```python @dataclass(init=False) class InstrumentedEmbeddingModel(WrapperEmbeddingModel): """Embedding model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. """ instrumentation_settings: InstrumentationSettings """Instrumentation settings for this model.""" def __init__( self, wrapped: EmbeddingModel | str, options: InstrumentationSettings | None = None, ) -> None: super().__init__(wrapped) self.instrumentation_settings = options or InstrumentationSettings() async def embed( self, inputs: str | Sequence[str], *, input_type: EmbedInputType, settings: EmbeddingSettings | None = None ) -> EmbeddingResult: inputs, settings = self.prepare_embed(inputs, settings) with self._instrument(inputs, input_type, settings) as finish: result = await super().embed(inputs, input_type=input_type, settings=settings) finish(result) return result @contextmanager def _instrument( self, inputs: list[str], input_type: EmbedInputType, settings: EmbeddingSettings | None, ) -> Iterator[Callable[[EmbeddingResult], None]]: operation = 'embeddings' span_name = f'{operation} {self.model_name}' inputs_count = len(inputs) attributes: dict[str, AttributeValue] = { 'gen_ai.operation.name': operation, **self.model_attributes(self.wrapped), 'input_type': input_type, 'inputs_count': inputs_count, } if settings: attributes['embedding_settings'] = json.dumps(self.serialize_any(settings)) if self.instrumentation_settings.include_content: attributes['inputs'] = json.dumps(inputs) attributes['logfire.json_schema'] = json.dumps( { 'type': 'object', 'properties': { 'input_type': {'type': 'string'}, 'inputs_count': {'type': 'integer'}, 'embedding_settings': {'type': 'object'}, **( {'inputs': {'type': ['array']}, 'embeddings': {'type': 'array'}} if self.instrumentation_settings.include_content else {} ), }, } ) record_metrics: Callable[[], None] | None = None try: with self.instrumentation_settings.tracer.start_as_current_span(span_name, attributes=attributes) as span: def finish(result: EmbeddingResult): # Prepare metric recording closure first so metrics are recorded # even if the span is not recording. 
provider_name = attributes[GEN_AI_PROVIDER_NAME_ATTRIBUTE] request_model = attributes[GEN_AI_REQUEST_MODEL_ATTRIBUTE] response_model = result.model_name or request_model price_calculation = None def _record_metrics(): token_attributes = { GEN_AI_PROVIDER_NAME_ATTRIBUTE: provider_name, 'gen_ai.operation.name': operation, GEN_AI_REQUEST_MODEL_ATTRIBUTE: request_model, 'gen_ai.response.model': response_model, 'gen_ai.token.type': 'input', } tokens = result.usage.input_tokens or 0 if tokens: # pragma: no branch self.instrumentation_settings.tokens_histogram.record(tokens, token_attributes) if price_calculation is not None: self.instrumentation_settings.cost_histogram.record( float(getattr(price_calculation, 'input_price', 0.0)), token_attributes, ) nonlocal record_metrics record_metrics = _record_metrics if not span.is_recording(): return # pragma: lax no cover attributes_to_set: dict[str, AttributeValue] = { **result.usage.opentelemetry_attributes(), 'gen_ai.response.model': response_model, } try: price_calculation = result.cost() except LookupError: # The cost of this provider/model is unknown, which is common. pass except Exception as e: # pragma: no cover warnings.warn( f'Failed to get cost from response: {type(e).__name__}: {e}', CostCalculationFailedWarning ) else: attributes_to_set['operation.cost'] = float(price_calculation.total_price) embeddings = result.embeddings if embeddings: # pragma: no branch attributes_to_set['gen_ai.embeddings.dimension.count'] = len(embeddings[0]) if self.instrumentation_settings.include_content: attributes['embeddings'] = json.dumps(embeddings) if result.provider_response_id is not None: attributes_to_set['gen_ai.response.id'] = result.provider_response_id span.set_attributes(attributes_to_set) yield finish finally: if record_metrics: # pragma: no branch # Record metrics after the span finishes to avoid duplication. record_metrics() @staticmethod def model_attributes(model: EmbeddingModel) -> dict[str, AttributeValue]: attributes: dict[str, AttributeValue] = { GEN_AI_PROVIDER_NAME_ATTRIBUTE: model.system, GEN_AI_REQUEST_MODEL_ATTRIBUTE: model.model_name, } if base_url := model.base_url: try: parsed = urlparse(base_url) except Exception: # pragma: no cover pass else: if parsed.hostname: # pragma: no branch attributes['server.address'] = parsed.hostname if parsed.port: attributes['server.port'] = parsed.port # pragma: no cover return attributes @staticmethod def serialize_any(value: Any) -> str: try: return ANY_ADAPTER.dump_python(value, mode='json') except Exception: # pragma: no cover try: return str(value) except Exception as e: return f'Unable to serialize: {e}' ``` #### instrumentation_settings ```python instrumentation_settings: InstrumentationSettings = ( options or InstrumentationSettings() ) ``` Instrumentation settings for this model. # `pydantic_ai.exceptions` ### ModelRetry Bases: `Exception` Exception to raise when a tool function should be retried. The agent will return the message to the model and ask it to try calling the function/tool again. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ModelRetry(Exception): """Exception to raise when a tool function should be retried. The agent will return the message to the model and ask it to try calling the function/tool again. 
""" message: str """The message to return to the model.""" def __init__(self, message: str): self.message = message super().__init__(message) def __eq__(self, other: Any) -> bool: return isinstance(other, self.__class__) and other.message == self.message def __hash__(self) -> int: return hash((self.__class__, self.message)) @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> core_schema.CoreSchema: """Pydantic core schema to allow `ModelRetry` to be (de)serialized.""" schema = core_schema.typed_dict_schema( { 'message': core_schema.typed_dict_field(core_schema.str_schema()), 'kind': core_schema.typed_dict_field(core_schema.literal_schema(['model-retry'])), } ) return core_schema.no_info_after_validator_function( lambda dct: ModelRetry(dct['message']), schema, serialization=core_schema.plain_serializer_function_ser_schema( lambda x: {'message': x.message, 'kind': 'model-retry'}, return_schema=schema, ), ) ``` #### message ```python message: str = message ``` The message to return to the model. #### __get_pydantic_core_schema__ ```python __get_pydantic_core_schema__(_: Any, __: Any) -> CoreSchema ``` Pydantic core schema to allow `ModelRetry` to be (de)serialized. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> core_schema.CoreSchema: """Pydantic core schema to allow `ModelRetry` to be (de)serialized.""" schema = core_schema.typed_dict_schema( { 'message': core_schema.typed_dict_field(core_schema.str_schema()), 'kind': core_schema.typed_dict_field(core_schema.literal_schema(['model-retry'])), } ) return core_schema.no_info_after_validator_function( lambda dct: ModelRetry(dct['message']), schema, serialization=core_schema.plain_serializer_function_ser_schema( lambda x: {'message': x.message, 'kind': 'model-retry'}, return_schema=schema, ), ) ``` ### CallDeferred Bases: `Exception` Exception to raise when a tool call should be deferred. See [tools docs](https://ai.pydantic.dev/deferred-tools/#deferred-tools) for more information. Parameters: | Name | Type | Description | Default | | ---------- | ---------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `metadata` | \`dict[str, Any] | None\` | Optional dictionary of metadata to attach to the deferred tool call. This metadata will be available in DeferredToolRequests.metadata keyed by tool_call_id. | Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class CallDeferred(Exception): """Exception to raise when a tool call should be deferred. See [tools docs](../deferred-tools.md#deferred-tools) for more information. Args: metadata: Optional dictionary of metadata to attach to the deferred tool call. This metadata will be available in `DeferredToolRequests.metadata` keyed by `tool_call_id`. """ def __init__(self, metadata: dict[str, Any] | None = None): self.metadata = metadata super().__init__() ``` ### ApprovalRequired Bases: `Exception` Exception to raise when a tool call requires human-in-the-loop approval. See [tools docs](https://ai.pydantic.dev/deferred-tools/#human-in-the-loop-tool-approval) for more information. 
Parameters: | Name | Type | Description | Default | | ---------- | ---------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `metadata` | \`dict[str, Any] | None\` | Optional dictionary of metadata to attach to the deferred tool call. This metadata will be available in DeferredToolRequests.metadata keyed by tool_call_id. | Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ApprovalRequired(Exception): """Exception to raise when a tool call requires human-in-the-loop approval. See [tools docs](../deferred-tools.md#human-in-the-loop-tool-approval) for more information. Args: metadata: Optional dictionary of metadata to attach to the deferred tool call. This metadata will be available in `DeferredToolRequests.metadata` keyed by `tool_call_id`. """ def __init__(self, metadata: dict[str, Any] | None = None): self.metadata = metadata super().__init__() ``` ### UserError Bases: `RuntimeError` Error caused by a usage mistake by the application developer — You! Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class UserError(RuntimeError): """Error caused by a usage mistake by the application developer — You!""" message: str """Description of the mistake.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` Description of the mistake. ### AgentRunError Bases: `RuntimeError` Base class for errors occurring during an agent run. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class AgentRunError(RuntimeError): """Base class for errors occurring during an agent run.""" message: str """The error message.""" def __init__(self, message: str): self.message = message super().__init__(message) def __str__(self) -> str: return self.message ``` #### message ```python message: str = message ``` The error message. ### UsageLimitExceeded Bases: `AgentRunError` Error raised when a Model's usage exceeds the specified limits. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class UsageLimitExceeded(AgentRunError): """Error raised when a Model's usage exceeds the specified limits.""" ``` ### ConcurrencyLimitExceeded Bases: `AgentRunError` Error raised when the concurrency queue depth exceeds max_queued. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ConcurrencyLimitExceeded(AgentRunError): """Error raised when the concurrency queue depth exceeds max_queued.""" ``` ### UnexpectedModelBehavior Bases: `AgentRunError` Error caused by unexpected Model behavior, e.g. an unexpected response code. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class UnexpectedModelBehavior(AgentRunError): """Error caused by unexpected Model behavior, e.g. an unexpected response code.""" message: str """Description of the unexpected behavior.""" body: str | None """The body of the response, if available.""" def __init__(self, message: str, body: str | None = None): self.message = message if body is None: self.body: str | None = None else: try: self.body = json.dumps(json.loads(body), indent=2) except ValueError: self.body = body super().__init__(message) def __str__(self) -> str: if self.body: return f'{self.message}, body:\n{self.body}' else: return self.message ``` #### message ```python message: str = message ``` Description of the unexpected behavior. 
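As a hedged sketch of how these errors surface in practice (the model string is only an illustrative choice), an agent run can be wrapped like this:

```python
from pydantic_ai import Agent
from pydantic_ai.exceptions import ModelHTTPError, UnexpectedModelBehavior

agent = Agent('openai:gpt-4o', instructions='Be concise.')

try:
    result = agent.run_sync('Hello!')
    print(result.output)
except ModelHTTPError as e:
    # The provider returned a 4xx/5xx response; status code and body are preserved.
    print(f'API error {e.status_code} from {e.model_name}: {e.body}')
except UnexpectedModelBehavior as e:
    # The model responded, but not in a way Pydantic AI could handle.
    print(e.message, e.body)
```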
#### body ```python body: str | None = dumps(loads(body), indent=2) ``` The body of the response, if available. ### ContentFilterError Bases: `UnexpectedModelBehavior` Raised when content filtering is triggered by the model provider resulting in an empty response. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ContentFilterError(UnexpectedModelBehavior): """Raised when content filtering is triggered by the model provider resulting in an empty response.""" ``` ### ModelAPIError Bases: `AgentRunError` Raised when a model provider API request fails. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ModelAPIError(AgentRunError): """Raised when a model provider API request fails.""" model_name: str """The name of the model associated with the error.""" def __init__(self, model_name: str, message: str): self.model_name = model_name super().__init__(message) ``` #### model_name ```python model_name: str = model_name ``` The name of the model associated with the error. ### ModelHTTPError Bases: `ModelAPIError` Raised when a model provider response has a status code of 4xx or 5xx. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ModelHTTPError(ModelAPIError): """Raised when a model provider response has a status code of 4xx or 5xx.""" status_code: int """The HTTP status code returned by the API.""" body: object | None """The body of the response, if available.""" def __init__(self, status_code: int, model_name: str, body: object | None = None): self.status_code = status_code self.body = body message = f'status_code: {status_code}, model_name: {model_name}, body: {body}' super().__init__(model_name=model_name, message=message) ``` #### status_code ```python status_code: int = status_code ``` The HTTP status code returned by the API. #### body ```python body: object | None = body ``` The body of the response, if available. ### FallbackExceptionGroup Bases: `ExceptionGroup[Any]` A group of exceptions that can be raised when all fallback models fail. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class FallbackExceptionGroup(ExceptionGroup[Any]): """A group of exceptions that can be raised when all fallback models fail.""" ``` ### ToolRetryError Bases: `Exception` Exception used to signal a `ToolRetry` message should be returned to the LLM. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ToolRetryError(Exception): """Exception used to signal a `ToolRetry` message should be returned to the LLM.""" def __init__(self, tool_retry: RetryPromptPart): self.tool_retry = tool_retry message = ( tool_retry.content if isinstance(tool_retry.content, str) else self._format_error_details(tool_retry.content, tool_retry.tool_name) ) super().__init__(message) @staticmethod def _format_error_details(errors: list[pydantic_core.ErrorDetails], tool_name: str | None) -> str: """Format ErrorDetails as a human-readable message. We format manually rather than using ValidationError.from_exception_data because some error types (value_error, assertion_error, etc.) require an 'error' key in ctx, but when ErrorDetails are serialized, exception objects are stripped from ctx. The 'msg' field already contains the human-readable message, so we use that directly.
""" error_count = len(errors) lines = [ f'{error_count} validation error{"" if error_count == 1 else "s"}{f" for {tool_name!r}" if tool_name else ""}' ] for e in errors: loc = '.'.join(str(x) for x in e['loc']) if e['loc'] else '__root__' lines.append(loc) lines.append(f' {e["msg"]} [type={e["type"]}, input_value={e["input"]!r}]') return '\n'.join(lines) ``` ### IncompleteToolCall Bases: `UnexpectedModelBehavior` Error raised when a model stops due to token limit while emitting a tool call. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class IncompleteToolCall(UnexpectedModelBehavior): """Error raised when a model stops due to token limit while emitting a tool call.""" ``` # `pydantic_ai.ext` ### tool_from_langchain ```python tool_from_langchain(langchain_tool: LangChainTool) -> Tool ``` Creates a Pydantic AI tool proxy from a LangChain tool. Parameters: | Name | Type | Description | Default | | ---------------- | --------------- | --------------------------- | ---------- | | `langchain_tool` | `LangChainTool` | The LangChain tool to wrap. | *required* | Returns: | Type | Description | | ------ | ---------------------------------------------------------- | | `Tool` | A Pydantic AI tool that corresponds to the LangChain tool. | Source code in `pydantic_ai_slim/pydantic_ai/ext/langchain.py` ```python def tool_from_langchain(langchain_tool: LangChainTool) -> Tool: """Creates a Pydantic AI tool proxy from a LangChain tool. Args: langchain_tool: The LangChain tool to wrap. Returns: A Pydantic AI tool that corresponds to the LangChain tool. """ function_name = langchain_tool.name function_description = langchain_tool.description inputs = langchain_tool.args.copy() required = sorted({name for name, detail in inputs.items() if 'default' not in detail}) schema: JsonSchemaValue = langchain_tool.get_input_jsonschema() if 'additionalProperties' not in schema: schema['additionalProperties'] = False if required: schema['required'] = required defaults = {name: detail['default'] for name, detail in inputs.items() if 'default' in detail} # restructures the arguments to match langchain tool run def proxy(*args: Any, **kwargs: Any) -> str: assert not args, 'This should always be called with kwargs' kwargs = defaults | kwargs return langchain_tool.run(kwargs) return Tool.from_schema( function=proxy, name=function_name, description=function_description, json_schema=schema, ) ``` ### LangChainToolset Bases: `FunctionToolset` A toolset that wraps LangChain tools. Source code in `pydantic_ai_slim/pydantic_ai/ext/langchain.py` ```python class LangChainToolset(FunctionToolset): """A toolset that wraps LangChain tools.""" def __init__(self, tools: list[LangChainTool], *, id: str | None = None): super().__init__([tool_from_langchain(tool) for tool in tools], id=id) ``` ### tool_from_aci ```python tool_from_aci( aci_function: str, linked_account_owner_id: str ) -> Tool ``` Creates a Pydantic AI tool proxy from an ACI.dev function. Parameters: | Name | Type | Description | Default | | ------------------------- | ----- | ----------------------------------------------------- | ---------- | | `aci_function` | `str` | The ACI.dev function to wrap. | *required* | | `linked_account_owner_id` | `str` | The ACI user ID to execute the function on behalf of. | *required* | Returns: | Type | Description | | ------ | -------------------------------------------------------- | | `Tool` | A Pydantic AI tool that corresponds to the ACI.dev tool. 
| Source code in `pydantic_ai_slim/pydantic_ai/ext/aci.py` ```python def tool_from_aci(aci_function: str, linked_account_owner_id: str) -> Tool: """Creates a Pydantic AI tool proxy from an ACI.dev function. Args: aci_function: The ACI.dev function to wrap. linked_account_owner_id: The ACI user ID to execute the function on behalf of. Returns: A Pydantic AI tool that corresponds to the ACI.dev tool. """ aci = ACI() function_definition = aci.functions.get_definition(aci_function) function_name = function_definition['function']['name'] function_description = function_definition['function']['description'] inputs = function_definition['function']['parameters'] json_schema = { 'additionalProperties': inputs.get('additionalProperties', False), 'properties': inputs.get('properties', {}), 'required': inputs.get('required', []), # Default to 'object' if not specified 'type': inputs.get('type', 'object'), } # Clean the schema json_schema = _clean_schema(json_schema) def implementation(*args: Any, **kwargs: Any) -> str: if args: raise TypeError('Positional arguments are not allowed') return aci.handle_function_call( function_name, kwargs, linked_account_owner_id=linked_account_owner_id, allowed_apps_only=True, ) return Tool.from_schema( function=implementation, name=function_name, description=function_description, json_schema=json_schema, ) ``` ### ACIToolset Bases: `FunctionToolset` A toolset that wraps ACI.dev tools. Source code in `pydantic_ai_slim/pydantic_ai/ext/aci.py` ```python class ACIToolset(FunctionToolset): """A toolset that wraps ACI.dev tools.""" def __init__(self, aci_functions: Sequence[str], linked_account_owner_id: str, *, id: str | None = None): super().__init__( [tool_from_aci(aci_function, linked_account_owner_id) for aci_function in aci_functions], id=id ) ``` # `fasta2a` ### FastA2A Bases: `Starlette` The main class for the FastA2A library. Source code in `.venv/lib/python3.12/site-packages/fasta2a/applications.py` ```python class FastA2A(Starlette): """The main class for the FastA2A library.""" def __init__( self, *, storage: Storage, broker: Broker, # Agent card name: str | None = None, url: str = 'http://localhost:8000', version: str = '1.0.0', description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, # Starlette debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: dict[Any, ExceptionHandler] | None = None, lifespan: Lifespan[FastA2A] | None = None, ): if lifespan is None: lifespan = _default_lifespan super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, lifespan=lifespan, ) self.name = name or 'My Agent' self.url = url self.version = version self.description = description self.provider = provider self.skills = skills or [] # NOTE: For now, I don't think there's any reason to support any other input/output modes. 
self.default_input_modes = ['application/json'] self.default_output_modes = ['application/json'] self.task_manager = TaskManager(broker=broker, storage=storage) # Setup self._agent_card_json_schema: bytes | None = None self.router.add_route( '/.well-known/agent-card.json', self._agent_card_endpoint, methods=['HEAD', 'GET', 'OPTIONS'] ) self.router.add_route('/', self._agent_run_endpoint, methods=['POST']) self.router.add_route('/docs', self._docs_endpoint, methods=['GET']) async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None: if scope['type'] == 'http' and not self.task_manager.is_running: raise RuntimeError('TaskManager was not properly initialized.') await super().__call__(scope, receive, send) async def _agent_card_endpoint(self, request: Request) -> Response: if self._agent_card_json_schema is None: agent_card = AgentCard( name=self.name, description=self.description or 'An AI agent exposed as an A2A agent.', url=self.url, version=self.version, protocol_version='0.3.0', skills=self.skills, default_input_modes=self.default_input_modes, default_output_modes=self.default_output_modes, capabilities=AgentCapabilities( streaming=False, push_notifications=False, state_transition_history=False ), ) if self.provider is not None: agent_card['provider'] = self.provider self._agent_card_json_schema = agent_card_ta.dump_json(agent_card, by_alias=True) return Response(content=self._agent_card_json_schema, media_type='application/json') async def _docs_endpoint(self, request: Request) -> Response: """Serve the documentation interface.""" docs_path = Path(__file__).parent / 'static' / 'docs.html' return FileResponse(docs_path, media_type='text/html') async def _agent_run_endpoint(self, request: Request) -> Response: """This is the main endpoint for the A2A server. Although the specification allows freedom of choice and implementation, I'm pretty sure about some decisions. 1. The server will always either send a "submitted" or a "failed" on `message/send`. Never a "completed" on the first message. 2. There are three possible ends for the task: 2.1. The task was "completed" successfully. 2.2. The task was "canceled". 2.3. The task "failed". 3. The server will send a "working" on the first chunk on `tasks/pushNotification/get`. """ data = await request.body() a2a_request = a2a_request_ta.validate_json(data) if a2a_request['method'] == 'message/send': jsonrpc_response = await self.task_manager.send_message(a2a_request) elif a2a_request['method'] == 'tasks/get': jsonrpc_response = await self.task_manager.get_task(a2a_request) elif a2a_request['method'] == 'tasks/cancel': jsonrpc_response = await self.task_manager.cancel_task(a2a_request) else: raise NotImplementedError(f'Method {a2a_request["method"]} not implemented.') return Response( content=a2a_response_ta.dump_json(jsonrpc_response, by_alias=True), media_type='application/json' ) ``` ### Broker Bases: `ABC` The broker class is in charge of scheduling the tasks. The HTTP server uses the broker to schedule tasks. The simple implementation is the `InMemoryBroker`, which is the broker that runs the tasks in the same process as the HTTP server. That said, this class can be extended to support remote workers. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @dataclass class Broker(ABC): """The broker class is in charge of scheduling the tasks. The HTTP server uses the broker to schedule tasks. 
The simple implementation is the `InMemoryBroker`, which is the broker that runs the tasks in the same process as the HTTP server. That said, this class can be extended to support remote workers. """ @abstractmethod async def run_task(self, params: TaskSendParams) -> None: """Send a task to be executed by the worker.""" raise NotImplementedError('send_run_task is not implemented yet.') @abstractmethod async def cancel_task(self, params: TaskIdParams) -> None: """Cancel a task.""" raise NotImplementedError('send_cancel_task is not implemented yet.') @abstractmethod async def __aenter__(self) -> Self: ... @abstractmethod async def __aexit__(self, exc_type: Any, exc_value: Any, traceback: Any): ... @abstractmethod def receive_task_operations(self) -> AsyncIterator[TaskOperation]: """Receive task operations from the broker. On a multi-worker setup, the broker will need to round-robin the task operations between the workers. """ ``` #### run_task ```python run_task(params: TaskSendParams) -> None ``` Send a task to be executed by the worker. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @abstractmethod async def run_task(self, params: TaskSendParams) -> None: """Send a task to be executed by the worker.""" raise NotImplementedError('send_run_task is not implemented yet.') ``` #### cancel_task ```python cancel_task(params: TaskIdParams) -> None ``` Cancel a task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @abstractmethod async def cancel_task(self, params: TaskIdParams) -> None: """Cancel a task.""" raise NotImplementedError('send_cancel_task is not implemented yet.') ``` #### receive_task_operations ```python receive_task_operations() -> AsyncIterator[TaskOperation] ``` Receive task operations from the broker. On a multi-worker setup, the broker will need to round-robin the task operations between the workers. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @abstractmethod def receive_task_operations(self) -> AsyncIterator[TaskOperation]: """Receive task operations from the broker. On a multi-worker setup, the broker will need to round-robin the task operations between the workers. """ ``` ### Skill Bases: `TypedDict` Skills are a unit of capability that an agent can perform. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Skill(TypedDict): """Skills are a unit of capability that an agent can perform.""" id: str """A unique identifier for the skill.""" name: str """Human readable name of the skill.""" description: str """A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. """ tags: list[str] """Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". """ examples: NotRequired[list[str]] """The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") """ input_modes: list[str] """Supported mime types for input data.""" output_modes: list[str] """Supported mime types for output data.""" ``` #### id ```python id: str ``` A unique identifier for the skill. #### name ```python name: str ``` Human readable name of the skill. #### description ```python description: str ``` A human-readable description of the skill. 
It will be used by the client or a human as a hint to understand the skill. #### tags ```python tags: list[str] ``` Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". #### examples ```python examples: NotRequired[list[str]] ``` The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") #### input_modes ```python input_modes: list[str] ``` Supported mime types for input data. #### output_modes ```python output_modes: list[str] ``` Supported mime types for output data. ### Storage Bases: `ABC`, `Generic[ContextT]` A storage to retrieve and save tasks, as well as retrieve and save context. The storage serves two purposes: 1. Task storage: Stores tasks in A2A protocol format with their status, artifacts, and message history 1. Context storage: Stores conversation context in a format optimized for the specific agent implementation Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python class Storage(ABC, Generic[ContextT]): """A storage to retrieve and save tasks, as well as retrieve and save context. The storage serves two purposes: 1. Task storage: Stores tasks in A2A protocol format with their status, artifacts, and message history 2. Context storage: Stores conversation context in a format optimized for the specific agent implementation """ @abstractmethod async def load_task(self, task_id: str, history_length: int | None = None) -> Task | None: """Load a task from storage. If the task is not found, return None. """ @abstractmethod async def submit_task(self, context_id: str, message: Message) -> Task: """Submit a task to storage.""" @abstractmethod async def update_task( self, task_id: str, state: TaskState, new_artifacts: list[Artifact] | None = None, new_messages: list[Message] | None = None, ) -> Task: """Update the state of a task. Appends artifacts and messages, if specified.""" @abstractmethod async def load_context(self, context_id: str) -> ContextT | None: """Retrieve the stored context given the `context_id`.""" @abstractmethod async def update_context(self, context_id: str, context: ContextT) -> None: """Updates the context for a `context_id`. Implementing agent can decide what to store in context. """ ``` #### load_task ```python load_task( task_id: str, history_length: int | None = None ) -> Task | None ``` Load a task from storage. If the task is not found, return None. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def load_task(self, task_id: str, history_length: int | None = None) -> Task | None: """Load a task from storage. If the task is not found, return None. """ ``` #### submit_task ```python submit_task(context_id: str, message: Message) -> Task ``` Submit a task to storage. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def submit_task(self, context_id: str, message: Message) -> Task: """Submit a task to storage.""" ``` #### update_task ```python update_task( task_id: str, state: TaskState, new_artifacts: list[Artifact] | None = None, new_messages: list[Message] | None = None, ) -> Task ``` Update the state of a task. Appends artifacts and messages, if specified. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def update_task( self, task_id: str, state: TaskState, new_artifacts: list[Artifact] | None = None, new_messages: list[Message] | None = None, ) -> Task: """Update the state of a task. Appends artifacts and messages, if specified.""" ``` #### load_context ```python load_context(context_id: str) -> ContextT | None ``` Retrieve the stored context given the `context_id`. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def load_context(self, context_id: str) -> ContextT | None: """Retrieve the stored context given the `context_id`.""" ``` #### update_context ```python update_context(context_id: str, context: ContextT) -> None ``` Updates the context for a `context_id`. Implementing agent can decide what to store in context. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def update_context(self, context_id: str, context: ContextT) -> None: """Updates the context for a `context_id`. Implementing agent can decide what to store in context. """ ``` ### Worker Bases: `ABC`, `Generic[ContextT]` A worker is responsible for executing tasks. Source code in `.venv/lib/python3.12/site-packages/fasta2a/worker.py` ```python @dataclass class Worker(ABC, Generic[ContextT]): """A worker is responsible for executing tasks.""" broker: Broker storage: Storage[ContextT] @asynccontextmanager async def run(self) -> AsyncIterator[None]: """Run the worker. It connects to the broker, and it makes itself available to receive commands. """ async with anyio.create_task_group() as tg: tg.start_soon(self._loop) yield tg.cancel_scope.cancel() async def _loop(self) -> None: async for task_operation in self.broker.receive_task_operations(): await self._handle_task_operation(task_operation) async def _handle_task_operation(self, task_operation: TaskOperation) -> None: try: with use_span(task_operation['_current_span']): with tracer.start_as_current_span( f'{task_operation["operation"]} task', attributes={'logfire.tags': ['fasta2a']} ): if task_operation['operation'] == 'run': await self.run_task(task_operation['params']) elif task_operation['operation'] == 'cancel': await self.cancel_task(task_operation['params']) else: assert_never(task_operation) except Exception: await self.storage.update_task(task_operation['params']['id'], state='failed') @abstractmethod async def run_task(self, params: TaskSendParams) -> None: ... @abstractmethod async def cancel_task(self, params: TaskIdParams) -> None: ... @abstractmethod def build_message_history(self, history: list[Message]) -> list[Any]: ... @abstractmethod def build_artifacts(self, result: Any) -> list[Artifact]: ... ``` #### run ```python run() -> AsyncIterator[None] ``` Run the worker. It connects to the broker, and it makes itself available to receive commands. Source code in `.venv/lib/python3.12/site-packages/fasta2a/worker.py` ```python @asynccontextmanager async def run(self) -> AsyncIterator[None]: """Run the worker. It connects to the broker, and it makes itself available to receive commands. """ async with anyio.create_task_group() as tg: tg.start_soon(self._loop) yield tg.cancel_scope.cancel() ``` This module contains the schema for the agent card. ### AgentCard Bases: `TypedDict` The card that describes an agent. 
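To make the card's shape concrete, here is a hedged sketch of building one by hand, assuming `AgentCard`, `AgentCapabilities`, and `Skill` are all importable from `fasta2a.schema`; in normal use, `FastA2A` assembles this card for you from its constructor arguments.

```python
from fasta2a.schema import AgentCapabilities, AgentCard, Skill

card = AgentCard(
    name='Recipe Agent',
    description='Agent that helps users with recipes and cooking.',
    url='http://localhost:8000',
    version='1.0.0',
    protocol_version='0.3.0',
    capabilities=AgentCapabilities(streaming=False, push_notifications=False),
    default_input_modes=['application/json'],
    default_output_modes=['application/json'],
    skills=[
        Skill(
            id='find-recipe',
            name='Find recipe',
            description='Suggests a recipe based on available ingredients.',
            tags=['cooking'],
            input_modes=['application/json'],
            output_modes=['application/json'],
        )
    ],
)
```

When serialized, field names are camelCased (e.g. `protocolVersion`), since the schema uses a `to_camel` alias generator and is dumped with `by_alias=True`.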
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentCard(TypedDict): """The card that describes an agent.""" name: str """Human readable name of the agent e.g. "Recipe Agent".""" description: str """A human-readable description of the agent. Used to assist users and other agents in understanding what the agent can do. (e.g. "Agent that helps users with recipes and cooking.") """ url: str """A URL to the address the agent is hosted at.""" version: str """The version of the agent - format is up to the provider. (e.g. "1.0.0")""" protocol_version: str """The version of the A2A protocol this agent supports.""" provider: NotRequired[AgentProvider] """The service provider of the agent.""" documentation_url: NotRequired[str] """A URL to documentation for the agent.""" icon_url: NotRequired[str] """A URL to an icon for the agent.""" preferred_transport: NotRequired[str] """The transport of the preferred endpoint. If empty, defaults to JSONRPC.""" additional_interfaces: NotRequired[list[AgentInterface]] """Announcement of additional supported transports.""" capabilities: AgentCapabilities """The capabilities of the agent.""" security: NotRequired[list[dict[str, list[str]]]] """Security requirements for contacting the agent.""" security_schemes: NotRequired[dict[str, SecurityScheme]] """Security scheme definitions.""" default_input_modes: list[str] """Supported mime types for input data.""" default_output_modes: list[str] """Supported mime types for output data.""" skills: list[Skill] """The set of skills, or distinct capabilities, that the agent can perform.""" ``` #### name ```python name: str ``` Human readable name of the agent e.g. "Recipe Agent". #### description ```python description: str ``` A human-readable description of the agent. Used to assist users and other agents in understanding what the agent can do. (e.g. "Agent that helps users with recipes and cooking.") #### url ```python url: str ``` A URL to the address the agent is hosted at. #### version ```python version: str ``` The version of the agent - format is up to the provider. (e.g. "1.0.0") #### protocol_version ```python protocol_version: str ``` The version of the A2A protocol this agent supports. #### provider ```python provider: NotRequired[AgentProvider] ``` The service provider of the agent. #### documentation_url ```python documentation_url: NotRequired[str] ``` A URL to documentation for the agent. #### icon_url ```python icon_url: NotRequired[str] ``` A URL to an icon for the agent. #### preferred_transport ```python preferred_transport: NotRequired[str] ``` The transport of the preferred endpoint. If empty, defaults to JSONRPC. #### additional_interfaces ```python additional_interfaces: NotRequired[list[AgentInterface]] ``` Announcement of additional supported transports. #### capabilities ```python capabilities: AgentCapabilities ``` The capabilities of the agent. #### security ```python security: NotRequired[list[dict[str, list[str]]]] ``` Security requirements for contacting the agent. #### security_schemes ```python security_schemes: NotRequired[dict[str, SecurityScheme]] ``` Security scheme definitions. #### default_input_modes ```python default_input_modes: list[str] ``` Supported mime types for input data. #### default_output_modes ```python default_output_modes: list[str] ``` Supported mime types for output data. 
#### skills ```python skills: list[Skill] ``` The set of skills, or distinct capabilities, that the agent can perform. ### AgentProvider Bases: `TypedDict` The service provider of the agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class AgentProvider(TypedDict): """The service provider of the agent.""" organization: str """The name of the agent provider's organization.""" url: str """A URL for the agent provider's website or relevant documentation.""" ``` #### organization ```python organization: str ``` The name of the agent provider's organization. #### url ```python url: str ``` A URL for the agent provider's website or relevant documentation. ### AgentCapabilities Bases: `TypedDict` The capabilities of the agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentCapabilities(TypedDict): """The capabilities of the agent.""" streaming: NotRequired[bool] """Whether the agent supports streaming.""" push_notifications: NotRequired[bool] """Whether the agent can notify updates to client.""" state_transition_history: NotRequired[bool] """Whether the agent exposes status change history for tasks.""" ``` #### streaming ```python streaming: NotRequired[bool] ``` Whether the agent supports streaming. #### push_notifications ```python push_notifications: NotRequired[bool] ``` Whether the agent can notify updates to client. #### state_transition_history ```python state_transition_history: NotRequired[bool] ``` Whether the agent exposes status change history for tasks. ### HttpSecurityScheme Bases: `TypedDict` HTTP security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class HttpSecurityScheme(TypedDict): """HTTP security scheme.""" type: Literal['http'] """The type of the security scheme. Must be 'http'.""" scheme: str """The name of the HTTP Authorization scheme.""" bearer_format: NotRequired[str] """A hint to the client to identify how the bearer token is formatted.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### type ```python type: Literal['http'] ``` The type of the security scheme. Must be 'http'. #### scheme ```python scheme: str ``` The name of the HTTP Authorization scheme. #### bearer_format ```python bearer_format: NotRequired[str] ``` A hint to the client to identify how the bearer token is formatted. #### description ```python description: NotRequired[str] ``` Description of this security scheme. ### ApiKeySecurityScheme Bases: `TypedDict` API Key security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class ApiKeySecurityScheme(TypedDict): """API Key security scheme.""" type: Literal['apiKey'] """The type of the security scheme. Must be 'apiKey'.""" name: str """The name of the header, query or cookie parameter to be used.""" in_: Literal['query', 'header', 'cookie'] """The location of the API key.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### type ```python type: Literal['apiKey'] ``` The type of the security scheme. Must be 'apiKey'. #### name ```python name: str ``` The name of the header, query or cookie parameter to be used. #### in\_ ```python in_: Literal['query', 'header', 'cookie'] ``` The location of the API key. 
#### description ```python description: NotRequired[str] ``` Description of this security scheme. ### OAuth2SecurityScheme Bases: `TypedDict` OAuth2 security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class OAuth2SecurityScheme(TypedDict): """OAuth2 security scheme.""" type: Literal['oauth2'] """The type of the security scheme. Must be 'oauth2'.""" flows: dict[str, Any] """An object containing configuration information for the flow types supported.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### type ```python type: Literal['oauth2'] ``` The type of the security scheme. Must be 'oauth2'. #### flows ```python flows: dict[str, Any] ``` An object containing configuration information for the flow types supported. #### description ```python description: NotRequired[str] ``` Description of this security scheme. ### OpenIdConnectSecurityScheme Bases: `TypedDict` OpenID Connect security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class OpenIdConnectSecurityScheme(TypedDict): """OpenID Connect security scheme.""" type: Literal['openIdConnect'] """The type of the security scheme. Must be 'openIdConnect'.""" open_id_connect_url: str """OpenId Connect URL to discover OAuth2 configuration values.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### type ```python type: Literal['openIdConnect'] ``` The type of the security scheme. Must be 'openIdConnect'. #### open_id_connect_url ```python open_id_connect_url: str ``` OpenId Connect URL to discover OAuth2 configuration values. #### description ```python description: NotRequired[str] ``` Description of this security scheme. ### SecurityScheme ```python SecurityScheme = Annotated[ Union[ HttpSecurityScheme, ApiKeySecurityScheme, OAuth2SecurityScheme, OpenIdConnectSecurityScheme, ], Field(discriminator="type"), ] ``` A security scheme for authentication. ### AgentInterface Bases: `TypedDict` An interface that the agent supports. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentInterface(TypedDict): """An interface that the agent supports.""" transport: str """The transport protocol (e.g., 'jsonrpc', 'websocket').""" url: str """The URL endpoint for this transport.""" description: NotRequired[str] """Description of this interface.""" ``` #### transport ```python transport: str ``` The transport protocol (e.g., 'jsonrpc', 'websocket'). #### url ```python url: str ``` The URL endpoint for this transport. #### description ```python description: NotRequired[str] ``` Description of this interface. ### AgentExtension Bases: `TypedDict` A declaration of an extension supported by an Agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentExtension(TypedDict): """A declaration of an extension supported by an Agent.""" uri: str """The URI of the extension.""" description: NotRequired[str] """A description of how this agent uses this extension.""" required: NotRequired[bool] """Whether the client must follow specific requirements of the extension.""" params: NotRequired[dict[str, Any]] """Optional configuration for the extension.""" ``` #### uri ```python uri: str ``` The URI of the extension. 
#### description ```python description: NotRequired[str] ``` A description of how this agent uses this extension. #### required ```python required: NotRequired[bool] ``` Whether the client must follow specific requirements of the extension. #### params ```python params: NotRequired[dict[str, Any]] ``` Optional configuration for the extension. ### Skill Bases: `TypedDict` Skills are a unit of capability that an agent can perform. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Skill(TypedDict): """Skills are a unit of capability that an agent can perform.""" id: str """A unique identifier for the skill.""" name: str """Human readable name of the skill.""" description: str """A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. """ tags: list[str] """Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". """ examples: NotRequired[list[str]] """The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") """ input_modes: list[str] """Supported mime types for input data.""" output_modes: list[str] """Supported mime types for output data.""" ``` #### id ```python id: str ``` A unique identifier for the skill. #### name ```python name: str ``` Human readable name of the skill. #### description ```python description: str ``` A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. #### tags ```python tags: list[str] ``` Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". #### examples ```python examples: NotRequired[list[str]] ``` The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") #### input_modes ```python input_modes: list[str] ``` Supported mime types for input data. #### output_modes ```python output_modes: list[str] ``` Supported mime types for output data. ### Artifact Bases: `TypedDict` Agents generate Artifacts as an end result of a Task. Artifacts are immutable, can be named, and can have multiple parts. A streaming response can append parts to existing Artifacts. A single Task can generate many Artifacts. For example, "create a webpage" could create separate HTML and image Artifacts. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Artifact(TypedDict): """Agents generate Artifacts as an end result of a Task. Artifacts are immutable, can be named, and can have multiple parts. A streaming response can append parts to existing Artifacts. A single Task can generate many Artifacts. For example, "create a webpage" could create separate HTML and image Artifacts. 
""" artifact_id: str """Unique identifier for the artifact.""" name: NotRequired[str] """The name of the artifact.""" description: NotRequired[str] """A description of the artifact.""" parts: list[Part] """The parts that make up the artifact.""" metadata: NotRequired[dict[str, Any]] """Metadata about the artifact.""" extensions: NotRequired[list[str]] """Array of extensions.""" append: NotRequired[bool] """Whether to append this artifact to an existing one.""" last_chunk: NotRequired[bool] """Whether this is the last chunk of the artifact.""" ``` #### artifact_id ```python artifact_id: str ``` Unique identifier for the artifact. #### name ```python name: NotRequired[str] ``` The name of the artifact. #### description ```python description: NotRequired[str] ``` A description of the artifact. #### parts ```python parts: list[Part] ``` The parts that make up the artifact. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Metadata about the artifact. #### extensions ```python extensions: NotRequired[list[str]] ``` Array of extensions. #### append ```python append: NotRequired[bool] ``` Whether to append this artifact to an existing one. #### last_chunk ```python last_chunk: NotRequired[bool] ``` Whether this is the last chunk of the artifact. ### PushNotificationConfig Bases: `TypedDict` Configuration for push notifications. A2A supports a secure notification mechanism whereby an agent can notify a client of an update outside a connected session via a PushNotificationService. Within and across enterprises, it is critical that the agent verifies the identity of the notification service, authenticates itself with the service, and presents an identifier that ties the notification to the executing Task. The target server of the PushNotificationService should be considered a separate service, and is not guaranteed (or even expected) to be the client directly. This PushNotificationService is responsible for authenticating and authorizing the agent and for proxying the verified notification to the appropriate endpoint (which could be anything from a pub/sub queue, to an email inbox or other service, etc.). For contrived scenarios with isolated client-agent pairs (e.g. local service mesh in a contained VPC, etc.) or isolated environments without enterprise security concerns, the client may choose to simply open a port and act as its own PushNotificationService. Any enterprise implementation will likely have a centralized service that authenticates the remote agents with trusted notification credentials and can handle online/offline scenarios. (This should be thought of similarly to a mobile Push Notification Service). Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class PushNotificationConfig(TypedDict): """Configuration for push notifications. A2A supports a secure notification mechanism whereby an agent can notify a client of an update outside a connected session via a PushNotificationService. Within and across enterprises, it is critical that the agent verifies the identity of the notification service, authenticates itself with the service, and presents an identifier that ties the notification to the executing Task. The target server of the PushNotificationService should be considered a separate service, and is not guaranteed (or even expected) to be the client directly. 
This PushNotificationService is responsible for authenticating and authorizing the agent and for proxying the verified notification to the appropriate endpoint (which could be anything from a pub/sub queue, to an email inbox or other service, etc.). For contrived scenarios with isolated client-agent pairs (e.g. local service mesh in a contained VPC, etc.) or isolated environments without enterprise security concerns, the client may choose to simply open a port and act as its own PushNotificationService. Any enterprise implementation will likely have a centralized service that authenticates the remote agents with trusted notification credentials and can handle online/offline scenarios. (This should be thought of similarly to a mobile Push Notification Service). """ id: NotRequired[str] """Server-assigned identifier.""" url: str """The URL to send push notifications to.""" token: NotRequired[str] """Token unique to this task/session.""" authentication: NotRequired[SecurityScheme] """Authentication details for push notifications.""" ``` #### id ```python id: NotRequired[str] ``` Server-assigned identifier. #### url ```python url: str ``` The URL to send push notifications to. #### token ```python token: NotRequired[str] ``` Token unique to this task/session. #### authentication ```python authentication: NotRequired[SecurityScheme] ``` Authentication details for push notifications. ### TaskPushNotificationConfig Bases: `TypedDict` Configuration for task push notifications. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskPushNotificationConfig(TypedDict): """Configuration for task push notifications.""" id: str """The task id.""" push_notification_config: PushNotificationConfig """The push notification configuration.""" ``` #### id ```python id: str ``` The task id. #### push_notification_config ```python push_notification_config: PushNotificationConfig ``` The push notification configuration. ### Message Bases: `TypedDict` A Message contains any content that is not an Artifact. This can include things like agent thoughts, user context, instructions, errors, status, or metadata. All content from a client comes in the form of a Message. Agents send Messages to communicate status or to provide instructions (whereas generated results are sent as Artifacts). A Message can have multiple parts to denote different pieces of content. For example, a user request could include a textual description from a user and then multiple files used as context from the client. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Message(TypedDict): """A Message contains any content that is not an Artifact. This can include things like agent thoughts, user context, instructions, errors, status, or metadata. All content from a client comes in the form of a Message. Agents send Messages to communicate status or to provide instructions (whereas generated results are sent as Artifacts). A Message can have multiple parts to denote different pieces of content. For example, a user request could include a textual description from a user and then multiple files used as context from the client. 
""" role: Literal['user', 'agent'] """The role of the message.""" parts: list[Part] """The parts of the message.""" kind: Literal['message'] """Event type.""" metadata: NotRequired[dict[str, Any]] """Metadata about the message.""" # Additional fields message_id: str """Identifier created by the message creator.""" context_id: NotRequired[str] """The context the message is associated with.""" task_id: NotRequired[str] """Identifier of task the message is related to.""" reference_task_ids: NotRequired[list[str]] """Array of task IDs this message references.""" extensions: NotRequired[list[str]] """Array of extensions.""" ``` #### role ```python role: Literal['user', 'agent'] ``` The role of the message. #### parts ```python parts: list[Part] ``` The parts of the message. #### kind ```python kind: Literal['message'] ``` Event type. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Metadata about the message. #### message_id ```python message_id: str ``` Identifier created by the message creator. #### context_id ```python context_id: NotRequired[str] ``` The context the message is associated with. #### task_id ```python task_id: NotRequired[str] ``` Identifier of task the message is related to. #### reference_task_ids ```python reference_task_ids: NotRequired[list[str]] ``` Array of task IDs this message references. #### extensions ```python extensions: NotRequired[list[str]] ``` Array of extensions. ### TextPart Bases: `_BasePart` A part that contains text. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TextPart(_BasePart): """A part that contains text.""" kind: Literal['text'] """The kind of the part.""" text: str """The text of the part.""" ``` #### kind ```python kind: Literal['text'] ``` The kind of the part. #### text ```python text: str ``` The text of the part. ### FileWithBytes Bases: `TypedDict` File with base64 encoded data. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class FileWithBytes(TypedDict): """File with base64 encoded data.""" bytes: str """The base64 encoded content of the file.""" mime_type: NotRequired[str] """Optional mime type for the file.""" ``` #### bytes ```python bytes: str ``` The base64 encoded content of the file. #### mime_type ```python mime_type: NotRequired[str] ``` Optional mime type for the file. ### FileWithUri Bases: `TypedDict` File with URI reference. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class FileWithUri(TypedDict): """File with URI reference.""" uri: str """The URI of the file.""" mime_type: NotRequired[str] """The mime type of the file.""" ``` #### uri ```python uri: str ``` The URI of the file. #### mime_type ```python mime_type: NotRequired[str] ``` The mime type of the file. ### FilePart Bases: `_BasePart` A part that contains a file. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class FilePart(_BasePart): """A part that contains a file.""" kind: Literal['file'] """The kind of the part.""" file: FileWithBytes | FileWithUri """The file content - either bytes or URI.""" ``` #### kind ```python kind: Literal['file'] ``` The kind of the part. #### file ```python file: FileWithBytes | FileWithUri ``` The file content - either bytes or URI. 
### DataPart Bases: `_BasePart` A part that contains structured data. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class DataPart(_BasePart): """A part that contains structured data.""" kind: Literal['data'] """The kind of the part.""" data: dict[str, Any] """The data of the part.""" ``` #### kind ```python kind: Literal['data'] ``` The kind of the part. #### data ```python data: dict[str, Any] ``` The data of the part. ### Part ```python Part = Annotated[ Union[TextPart, FilePart, DataPart], Field(discriminator="kind"), ] ``` A fully formed piece of content exchanged between a client and a remote agent as part of a Message or an Artifact. Each Part has its own content type and metadata. ### TaskState ```python TaskState: TypeAlias = Literal[ "submitted", "working", "input-required", "completed", "canceled", "failed", "rejected", "auth-required", "unknown", ] ``` The possible states of a task. ### TaskStatus Bases: `TypedDict` Status and accompanying message for a task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskStatus(TypedDict): """Status and accompanying message for a task.""" state: TaskState """The current state of the task.""" message: NotRequired[Message] """Additional status updates for client.""" timestamp: NotRequired[str] """ISO datetime value of when the status was updated.""" ``` #### state ```python state: TaskState ``` The current state of the task. #### message ```python message: NotRequired[Message] ``` Additional status updates for client. #### timestamp ```python timestamp: NotRequired[str] ``` ISO datetime value of when the status was updated. ### Task Bases: `TypedDict` A Task is a stateful entity that allows Clients and Remote Agents to achieve a specific outcome. Clients and Remote Agents exchange Messages within a Task. Remote Agents generate results as Artifacts. A Task is always created by a Client and the status is always determined by the Remote Agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Task(TypedDict): """A Task is a stateful entity that allows Clients and Remote Agents to achieve a specific outcome. Clients and Remote Agents exchange Messages within a Task. Remote Agents generate results as Artifacts. A Task is always created by a Client and the status is always determined by the Remote Agent. """ id: str """Unique identifier for the task.""" context_id: str """The context the task is associated with.""" kind: Literal['task'] """Event type.""" status: TaskStatus """Current status of the task.""" history: NotRequired[list[Message]] """Optional history of messages.""" artifacts: NotRequired[list[Artifact]] """Collection of artifacts created by the agent.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` Unique identifier for the task. #### context_id ```python context_id: str ``` The context the task is associated with. #### kind ```python kind: Literal['task'] ``` Event type. #### status ```python status: TaskStatus ``` Current status of the task. #### history ```python history: NotRequired[list[Message]] ``` Optional history of messages. #### artifacts ```python artifacts: NotRequired[list[Artifact]] ``` Collection of artifacts created by the agent. 
#### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskStatusUpdateEvent Bases: `TypedDict` Sent by server during message/stream requests. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskStatusUpdateEvent(TypedDict): """Sent by server during message/stream requests.""" task_id: str """The id of the task.""" context_id: str """The context the task is associated with.""" kind: Literal['status-update'] """Event type.""" status: TaskStatus """The status of the task.""" final: bool """Indicates the end of the event stream.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### task_id ```python task_id: str ``` The id of the task. #### context_id ```python context_id: str ``` The context the task is associated with. #### kind ```python kind: Literal['status-update'] ``` Event type. #### status ```python status: TaskStatus ``` The status of the task. #### final ```python final: bool ``` Indicates the end of the event stream. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskArtifactUpdateEvent Bases: `TypedDict` Sent by server during message/stream requests. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskArtifactUpdateEvent(TypedDict): """Sent by server during message/stream requests.""" task_id: str """The id of the task.""" context_id: str """The context the task is associated with.""" kind: Literal['artifact-update'] """Event type identification.""" artifact: Artifact """The artifact that was updated.""" append: NotRequired[bool] """Whether to append to existing artifact (true) or replace (false).""" last_chunk: NotRequired[bool] """Indicates this is the final chunk of the artifact.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### task_id ```python task_id: str ``` The id of the task. #### context_id ```python context_id: str ``` The context the task is associated with. #### kind ```python kind: Literal['artifact-update'] ``` Event type identification. #### artifact ```python artifact: Artifact ``` The artifact that was updated. #### append ```python append: NotRequired[bool] ``` Whether to append to existing artifact (true) or replace (false). #### last_chunk ```python last_chunk: NotRequired[bool] ``` Indicates this is the final chunk of the artifact. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskIdParams Bases: `TypedDict` Parameters for a task id. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskIdParams(TypedDict): """Parameters for a task id.""" id: str """The unique identifier for the task.""" metadata: NotRequired[dict[str, Any]] """Optional metadata associated with the request.""" ``` #### id ```python id: str ``` The unique identifier for the task. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Optional metadata associated with the request. ### TaskQueryParams Bases: `TaskIdParams` Query parameters for a task. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskQueryParams(TaskIdParams): """Query parameters for a task.""" history_length: NotRequired[int] """Number of recent messages to be retrieved.""" ``` #### history_length ```python history_length: NotRequired[int] ``` Number of recent messages to be retrieved. ### MessageSendConfiguration Bases: `TypedDict` Configuration for the send message request. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class MessageSendConfiguration(TypedDict): """Configuration for the send message request.""" accepted_output_modes: list[str] """Accepted output modalities by the client.""" blocking: NotRequired[bool] """If the server should treat the client as a blocking request.""" history_length: NotRequired[int] """Number of recent messages to be retrieved.""" push_notification_config: NotRequired[PushNotificationConfig] """Where the server should send notifications when disconnected.""" ``` #### accepted_output_modes ```python accepted_output_modes: list[str] ``` Accepted output modalities by the client. #### blocking ```python blocking: NotRequired[bool] ``` If the server should treat the client as a blocking request. #### history_length ```python history_length: NotRequired[int] ``` Number of recent messages to be retrieved. #### push_notification_config ```python push_notification_config: NotRequired[ PushNotificationConfig ] ``` Where the server should send notifications when disconnected. ### MessageSendParams Bases: `TypedDict` Parameters for message/send method. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class MessageSendParams(TypedDict): """Parameters for message/send method.""" configuration: NotRequired[MessageSendConfiguration] """Send message configuration.""" message: Message """The message being sent to the server.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### configuration ```python configuration: NotRequired[MessageSendConfiguration] ``` Send message configuration. #### message ```python message: Message ``` The message being sent to the server. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskSendParams Bases: `TypedDict` Internal parameters for task execution within the framework. Note: This is not part of the A2A protocol - it's used internally for broker/worker communication. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskSendParams(TypedDict): """Internal parameters for task execution within the framework. Note: This is not part of the A2A protocol - it's used internally for broker/worker communication. """ id: str """The id of the task.""" context_id: str """The context id for the task.""" message: Message """The message to process.""" history_length: NotRequired[int] """Number of recent messages to be retrieved.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` The id of the task. #### context_id ```python context_id: str ``` The context id for the task. #### message ```python message: Message ``` The message to process. #### history_length ```python history_length: NotRequired[int] ``` Number of recent messages to be retrieved. 
#### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### ListTaskPushNotificationConfigParams Bases: `TypedDict` Parameters for getting list of pushNotificationConfigurations associated with a Task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class ListTaskPushNotificationConfigParams(TypedDict): """Parameters for getting list of pushNotificationConfigurations associated with a Task.""" id: str """Task id.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` Task id. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### DeleteTaskPushNotificationConfigParams Bases: `TypedDict` Parameters for removing pushNotificationConfiguration associated with a Task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class DeleteTaskPushNotificationConfigParams(TypedDict): """Parameters for removing pushNotificationConfiguration associated with a Task.""" id: str """Task id.""" push_notification_config_id: str """The push notification config id to delete.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` Task id. #### push_notification_config_id ```python push_notification_config_id: str ``` The push notification config id to delete. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### JSONRPCMessage Bases: `TypedDict` A JSON RPC message. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCMessage(TypedDict): """A JSON RPC message.""" jsonrpc: Literal['2.0'] """The JSON RPC version.""" id: int | str | None """The request id.""" ``` #### jsonrpc ```python jsonrpc: Literal['2.0'] ``` The JSON RPC version. #### id ```python id: int | str | None ``` The request id. ### JSONRPCRequest Bases: `JSONRPCMessage`, `Generic[Method, Params]` A JSON RPC request. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCRequest(JSONRPCMessage, Generic[Method, Params]): """A JSON RPC request.""" method: Method """The method to call.""" params: Params """The parameters to pass to the method.""" ``` #### method ```python method: Method ``` The method to call. #### params ```python params: Params ``` The parameters to pass to the method. ### JSONRPCError Bases: `TypedDict`, `Generic[CodeT, MessageT]` A JSON RPC error. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCError(TypedDict, Generic[CodeT, MessageT]): """A JSON RPC error.""" code: CodeT """A number that indicates the error type that occurred.""" message: MessageT """A string providing a short description of the error.""" data: NotRequired[Any] """A primitive or structured value containing additional information about the error.""" ``` #### code ```python code: CodeT ``` A number that indicates the error type that occurred. #### message ```python message: MessageT ``` A string providing a short description of the error. #### data ```python data: NotRequired[Any] ``` A primitive or structured value containing additional information about the error. ### JSONRPCResponse Bases: `JSONRPCMessage`, `Generic[ResultT, ErrorT]` A JSON RPC response. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCResponse(JSONRPCMessage, Generic[ResultT, ErrorT]): """A JSON RPC response.""" result: NotRequired[ResultT] error: NotRequired[ErrorT] ``` ### JSONParseError ```python JSONParseError = JSONRPCError[ Literal[-32700], Literal["Invalid JSON payload"] ] ``` A JSON RPC error for a parse error. ### InvalidRequestError ```python InvalidRequestError = JSONRPCError[ Literal[-32600], Literal["Request payload validation error"], ] ``` A JSON RPC error for an invalid request. ### MethodNotFoundError ```python MethodNotFoundError = JSONRPCError[ Literal[-32601], Literal["Method not found"] ] ``` A JSON RPC error for a method not found. ### InvalidParamsError ```python InvalidParamsError = JSONRPCError[ Literal[-32602], Literal["Invalid parameters"] ] ``` A JSON RPC error for invalid parameters. ### InternalError ```python InternalError = JSONRPCError[ Literal[-32603], Literal["Internal error"] ] ``` A JSON RPC error for an internal error. ### TaskNotFoundError ```python TaskNotFoundError = JSONRPCError[ Literal[-32001], Literal["Task not found"] ] ``` A JSON RPC error for a task not found. ### TaskNotCancelableError ```python TaskNotCancelableError = JSONRPCError[ Literal[-32002], Literal["Task not cancelable"] ] ``` A JSON RPC error for a task not cancelable. ### PushNotificationNotSupportedError ```python PushNotificationNotSupportedError = JSONRPCError[ Literal[-32003], Literal["Push notification not supported"], ] ``` A JSON RPC error for a push notification not supported. ### UnsupportedOperationError ```python UnsupportedOperationError = JSONRPCError[ Literal[-32004], Literal["This operation is not supported"], ] ``` A JSON RPC error for an unsupported operation. ### ContentTypeNotSupportedError ```python ContentTypeNotSupportedError = JSONRPCError[ Literal[-32005], Literal["Incompatible content types"] ] ``` A JSON RPC error for incompatible content types. ### InvalidAgentResponseError ```python InvalidAgentResponseError = JSONRPCError[ Literal[-32006], Literal["Invalid agent response"] ] ``` A JSON RPC error for invalid agent response. ### SendMessageRequest ```python SendMessageRequest = JSONRPCRequest[ Literal["message/send"], MessageSendParams ] ``` A JSON RPC request to send a message. ### SendMessageResponse ```python SendMessageResponse = JSONRPCResponse[ Union[Task, Message], JSONRPCError[Any, Any] ] ``` A JSON RPC response to send a message. ### StreamMessageRequest ```python StreamMessageRequest = JSONRPCRequest[ Literal["message/stream"], MessageSendParams ] ``` A JSON RPC request to stream a message. ### StreamMessageResponse ```python StreamMessageResponse = JSONRPCResponse[ Union[ Task, Message, TaskStatusUpdateEvent, TaskArtifactUpdateEvent, ], JSONRPCError[Any, Any], ] ``` A JSON RPC response to a StreamMessageRequest. ### GetTaskRequest ```python GetTaskRequest = JSONRPCRequest[ Literal["tasks/get"], TaskQueryParams ] ``` A JSON RPC request to get a task. ### GetTaskResponse ```python GetTaskResponse = JSONRPCResponse[Task, TaskNotFoundError] ``` A JSON RPC response to get a task. ### CancelTaskRequest ```python CancelTaskRequest = JSONRPCRequest[ Literal["tasks/cancel"], TaskIdParams ] ``` A JSON RPC request to cancel a task. ### CancelTaskResponse ```python CancelTaskResponse = JSONRPCResponse[ Task, Union[TaskNotCancelableError, TaskNotFoundError] ] ``` A JSON RPC response to cancel a task. 
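These request and response aliases are thin parameterizations of the generic `JSONRPCRequest` and `JSONRPCResponse` TypedDicts above, so each one is just a JSON-RPC 2.0 envelope. As a rough sketch, this is how a `tasks/get` request can be assembled; it mirrors what `A2AClient.get_task`, documented below, builds internally, and the request and task ids are illustrative:

```python
from fasta2a.schema import GetTaskRequest, TaskQueryParams

# A JSON-RPC 2.0 envelope asking the server for a task,
# limited to its ten most recent history messages.
request = GetTaskRequest(
    jsonrpc='2.0',
    id='req-1',
    method='tasks/get',
    params=TaskQueryParams(id='task-123', history_length=10),
)
```

The matching `GetTaskResponse` is the same envelope coming back with either a `result` holding the `Task` or an `error` such as `TaskNotFoundError`.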
### SetTaskPushNotificationRequest ```python SetTaskPushNotificationRequest = JSONRPCRequest[ Literal["tasks/pushNotification/set"], TaskPushNotificationConfig, ] ``` A JSON RPC request to set a task push notification. ### SetTaskPushNotificationResponse ```python SetTaskPushNotificationResponse = JSONRPCResponse[ TaskPushNotificationConfig, PushNotificationNotSupportedError, ] ``` A JSON RPC response to set a task push notification. ### GetTaskPushNotificationRequest ```python GetTaskPushNotificationRequest = JSONRPCRequest[ Literal["tasks/pushNotification/get"], TaskIdParams ] ``` A JSON RPC request to get a task push notification. ### GetTaskPushNotificationResponse ```python GetTaskPushNotificationResponse = JSONRPCResponse[ TaskPushNotificationConfig, PushNotificationNotSupportedError, ] ``` A JSON RPC response to get a task push notification. ### ResubscribeTaskRequest ```python ResubscribeTaskRequest = JSONRPCRequest[ Literal["tasks/resubscribe"], TaskIdParams ] ``` A JSON RPC request to resubscribe to a task. ### ListTaskPushNotificationConfigRequest ```python ListTaskPushNotificationConfigRequest = JSONRPCRequest[ Literal["tasks/pushNotificationConfig/list"], ListTaskPushNotificationConfigParams, ] ``` A JSON RPC request to list task push notification configs. ### DeleteTaskPushNotificationConfigRequest ```python DeleteTaskPushNotificationConfigRequest = JSONRPCRequest[ Literal["tasks/pushNotificationConfig/delete"], DeleteTaskPushNotificationConfigParams, ] ``` A JSON RPC request to delete a task push notification config. ### A2ARequest ```python A2ARequest = Annotated[ Union[ SendMessageRequest, StreamMessageRequest, GetTaskRequest, CancelTaskRequest, SetTaskPushNotificationRequest, GetTaskPushNotificationRequest, ResubscribeTaskRequest, ListTaskPushNotificationConfigRequest, DeleteTaskPushNotificationConfigRequest, ], Discriminator("method"), ] ``` A JSON RPC request to the A2A server. ### A2AResponse ```python A2AResponse: TypeAlias = Union[ SendMessageResponse, StreamMessageResponse, GetTaskResponse, CancelTaskResponse, SetTaskPushNotificationResponse, GetTaskPushNotificationResponse, ] ``` A JSON RPC response from the A2A server. ### A2AClient A client for the A2A protocol. Source code in `.venv/lib/python3.12/site-packages/fasta2a/client.py` ```python class A2AClient: """A client for the A2A protocol.""" def __init__(self, base_url: str = 'http://localhost:8000', http_client: httpx.AsyncClient | None = None) -> None: if http_client is None: self.http_client = httpx.AsyncClient(base_url=base_url) else: self.http_client = http_client self.http_client.base_url = base_url async def send_message( self, message: Message, *, metadata: dict[str, Any] | None = None, configuration: MessageSendConfiguration | None = None, ) -> SendMessageResponse: """Send a message using the A2A protocol. Returns a JSON-RPC response containing either a result (Task) or an error. 
""" params = MessageSendParams(message=message) if metadata is not None: params['metadata'] = metadata if configuration is not None: params['configuration'] = configuration request_id = str(uuid.uuid4()) payload = SendMessageRequest(jsonrpc='2.0', id=request_id, method='message/send', params=params) content = send_message_request_ta.dump_json(payload, by_alias=True) response = await self.http_client.post('/', content=content, headers={'Content-Type': 'application/json'}) self._raise_for_status(response) return send_message_response_ta.validate_json(response.content) async def get_task(self, task_id: str) -> GetTaskResponse: payload = GetTaskRequest(jsonrpc='2.0', id=None, method='tasks/get', params={'id': task_id}) content = a2a_request_ta.dump_json(payload, by_alias=True) response = await self.http_client.post('/', content=content, headers={'Content-Type': 'application/json'}) self._raise_for_status(response) return get_task_response_ta.validate_json(response.content) def _raise_for_status(self, response: httpx.Response) -> None: if response.status_code >= 400: raise UnexpectedResponseError(response.status_code, response.text) ``` #### send_message ```python send_message( message: Message, *, metadata: dict[str, Any] | None = None, configuration: MessageSendConfiguration | None = None ) -> SendMessageResponse ``` Send a message using the A2A protocol. Returns a JSON-RPC response containing either a result (Task) or an error. Source code in `.venv/lib/python3.12/site-packages/fasta2a/client.py` ```python async def send_message( self, message: Message, *, metadata: dict[str, Any] | None = None, configuration: MessageSendConfiguration | None = None, ) -> SendMessageResponse: """Send a message using the A2A protocol. Returns a JSON-RPC response containing either a result (Task) or an error. """ params = MessageSendParams(message=message) if metadata is not None: params['metadata'] = metadata if configuration is not None: params['configuration'] = configuration request_id = str(uuid.uuid4()) payload = SendMessageRequest(jsonrpc='2.0', id=request_id, method='message/send', params=params) content = send_message_request_ta.dump_json(payload, by_alias=True) response = await self.http_client.post('/', content=content, headers={'Content-Type': 'application/json'}) self._raise_for_status(response) return send_message_response_ta.validate_json(response.content) ``` ### UnexpectedResponseError Bases: `Exception` An error raised when an unexpected response is received from the server. Source code in `.venv/lib/python3.12/site-packages/fasta2a/client.py` ```python class UnexpectedResponseError(Exception): """An error raised when an unexpected response is received from the server.""" def __init__(self, status_code: int, content: str) -> None: self.status_code = status_code self.content = content ``` # `pydantic_ai.format_prompt` ### format_as_xml ```python format_as_xml( obj: Any, root_tag: str | None = None, item_tag: str = "item", none_str: str = "null", indent: str | None = " ", include_field_info: Literal["once"] | bool = False, ) -> str ``` Format a Python object as XML. This is useful since LLMs often find it easier to read semi-structured data (e.g. examples) as XML, rather than JSON etc. Supports: `str`, `bytes`, `bytearray`, `bool`, `int`, `float`, `Decimal`, `date`, `datetime`, `time`, `timedelta`, `UUID`, `Enum`, `Mapping`, `Iterable`, `dataclass`, and `BaseModel`. 
Parameters: | Name | Type | Description | Default | | -------------------- | --------------------------- | ------------------------------------------------------------------------------------------------------------------------------ | ---------- | | `obj` | `Any` | Python Object to serialize to XML. | *required* | | `root_tag` | `str \| None` | Outer tag to wrap the XML in, use None to omit the outer tag. | `None` | | `item_tag` | `str` | Tag to use for each item in an iterable (e.g. list), this is overridden by the class name for dataclasses and Pydantic models. | `'item'` | | `none_str` | `str` | String to use for None values. | `'null'` | | `indent` | `str \| None` | Indentation string to use for pretty printing. | `' '` | | `include_field_info` | `Literal['once'] \| bool` | Whether to include attributes like Pydantic Field attributes and dataclasses field() metadata as XML attributes. In both cases the allowed Field attributes and field() metadata keys are title and description. If a field is repeated in the data (e.g. in a list), setting once includes the attributes only in the first occurrence of an XML element relative to the same field. | `False` | Returns: | Type | Description | | ----- | --------------------------------- | | `str` | XML representation of the object. | Example: format_as_xml_example.py ```python from pydantic_ai import format_as_xml print(format_as_xml({'name': 'John', 'height': 6, 'weight': 200}, root_tag='user')) ''' <user> <name>John</name> <height>6</height> <weight>200</weight> </user> ''' ``` Source code in `pydantic_ai_slim/pydantic_ai/format_prompt.py` ````python def format_as_xml( obj: Any, root_tag: str | None = None, item_tag: str = 'item', none_str: str = 'null', indent: str | None = ' ', include_field_info: Literal['once'] | bool = False, ) -> str: """Format a Python object as XML. This is useful since LLMs often find it easier to read semi-structured data (e.g. examples) as XML, rather than JSON etc. Supports: `str`, `bytes`, `bytearray`, `bool`, `int`, `float`, `Decimal`, `date`, `datetime`, `time`, `timedelta`, `UUID`, `Enum`, `Mapping`, `Iterable`, `dataclass`, and `BaseModel`. Args: obj: Python Object to serialize to XML. root_tag: Outer tag to wrap the XML in, use `None` to omit the outer tag. item_tag: Tag to use for each item in an iterable (e.g. list), this is overridden by the class name for dataclasses and Pydantic models. none_str: String to use for `None` values. indent: Indentation string to use for pretty printing. include_field_info: Whether to include attributes like Pydantic `Field` attributes and dataclasses `field()` `metadata` as XML attributes. In both cases the allowed `Field` attributes and `field()` metadata keys are `title` and `description`. If a field is repeated in the data (e.g. in a list) by setting `once` the attributes are included only in the first occurrence of an XML element relative to the same field. Returns: XML representation of the object.
Example: ```python {title="format_as_xml_example.py" lint="skip"} from pydantic_ai import format_as_xml print(format_as_xml({'name': 'John', 'height': 6, 'weight': 200}, root_tag='user')) ''' John 6 200 ''' ``` """ el = _ToXml( data=obj, item_tag=item_tag, none_str=none_str, include_field_info=include_field_info, ).to_xml(root_tag) if root_tag is None and el.text is None: join = '' if indent is None else '\n' return join.join(_rootless_xml_elements(el, indent)) else: if indent is not None: ElementTree.indent(el, space=indent) return ElementTree.tostring(el, encoding='unicode') ```` # `pydantic_ai.mcp` ### MCPError Bases: `RuntimeError` Raised when an MCP server returns an error response. This exception wraps error responses from MCP servers, following the ErrorData schema from the MCP specification. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python class MCPError(RuntimeError): """Raised when an MCP server returns an error response. This exception wraps error responses from MCP servers, following the ErrorData schema from the MCP specification. """ message: str """The error message.""" code: int """The error code returned by the server.""" data: dict[str, Any] | None """Additional information about the error, if provided by the server.""" def __init__(self, message: str, code: int, data: dict[str, Any] | None = None): self.message = message self.code = code self.data = data super().__init__(message) @classmethod def from_mcp_sdk(cls, error: mcp_exceptions.McpError) -> MCPError: """Create an MCPError from an MCP SDK McpError. Args: error: An McpError from the MCP SDK. """ # Extract error data from the McpError.error attribute error_data = error.error return cls(message=error_data.message, code=error_data.code, data=error_data.data) def __str__(self) -> str: if self.data: return f'{self.message} (code: {self.code}, data: {self.data})' return f'{self.message} (code: {self.code})' ``` #### message ```python message: str = message ``` The error message. #### code ```python code: int = code ``` The error code returned by the server. #### data ```python data: dict[str, Any] | None = data ``` Additional information about the error, if provided by the server. #### from_mcp_sdk ```python from_mcp_sdk(error: McpError) -> MCPError ``` Create an MCPError from an MCP SDK McpError. Parameters: | Name | Type | Description | Default | | ------- | ---------- | ----------------------------- | ---------- | | `error` | `McpError` | An McpError from the MCP SDK. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @classmethod def from_mcp_sdk(cls, error: mcp_exceptions.McpError) -> MCPError: """Create an MCPError from an MCP SDK McpError. Args: error: An McpError from the MCP SDK. """ # Extract error data from the McpError.error attribute error_data = error.error return cls(message=error_data.message, code=error_data.code, data=error_data.data) ``` ### ResourceAnnotations Additional properties describing MCP entities. See the [resource annotations in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#annotations). Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @dataclass(repr=False, kw_only=True) class ResourceAnnotations: """Additional properties describing MCP entities. See the [resource annotations in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#annotations). 
""" audience: list[mcp_types.Role] | None = None """Intended audience for this entity.""" priority: Annotated[float, Field(ge=0.0, le=1.0)] | None = None """Priority level for this entity, ranging from 0.0 to 1.0.""" __repr__ = _utils.dataclasses_no_defaults_repr @classmethod def from_mcp_sdk(cls, mcp_annotations: mcp_types.Annotations) -> ResourceAnnotations: """Convert from MCP SDK Annotations to ResourceAnnotations. Args: mcp_annotations: The MCP SDK annotations object. """ return cls(audience=mcp_annotations.audience, priority=mcp_annotations.priority) ``` #### audience ```python audience: list[Role] | None = None ``` Intended audience for this entity. #### priority ```python priority: Annotated[float, Field(ge=0.0, le=1.0)] | None = ( None ) ``` Priority level for this entity, ranging from 0.0 to 1.0. #### from_mcp_sdk ```python from_mcp_sdk( mcp_annotations: Annotations, ) -> ResourceAnnotations ``` Convert from MCP SDK Annotations to ResourceAnnotations. Parameters: | Name | Type | Description | Default | | ----------------- | ------------- | ------------------------------- | ---------- | | `mcp_annotations` | `Annotations` | The MCP SDK annotations object. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @classmethod def from_mcp_sdk(cls, mcp_annotations: mcp_types.Annotations) -> ResourceAnnotations: """Convert from MCP SDK Annotations to ResourceAnnotations. Args: mcp_annotations: The MCP SDK annotations object. """ return cls(audience=mcp_annotations.audience, priority=mcp_annotations.priority) ``` ### BaseResource Bases: `ABC` Base class for MCP resources. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @dataclass(repr=False, kw_only=True) class BaseResource(ABC): """Base class for MCP resources.""" name: str """The programmatic name of the resource.""" title: str | None = None """Human-readable title for UI contexts.""" description: str | None = None """A description of what this resource represents.""" mime_type: str | None = None """The MIME type of the resource, if known.""" annotations: ResourceAnnotations | None = None """Optional annotations for the resource.""" metadata: dict[str, Any] | None = None """Optional metadata for the resource.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### name ```python name: str ``` The programmatic name of the resource. #### title ```python title: str | None = None ``` Human-readable title for UI contexts. #### description ```python description: str | None = None ``` A description of what this resource represents. #### mime_type ```python mime_type: str | None = None ``` The MIME type of the resource, if known. #### annotations ```python annotations: ResourceAnnotations | None = None ``` Optional annotations for the resource. #### metadata ```python metadata: dict[str, Any] | None = None ``` Optional metadata for the resource. ### Resource Bases: `BaseResource` A resource that can be read from an MCP server. See the [resources in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources). Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @dataclass(repr=False, kw_only=True) class Resource(BaseResource): """A resource that can be read from an MCP server. See the [resources in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources). 
""" uri: str """The URI of the resource.""" size: int | None = None """The size of the raw resource content in bytes (before base64 encoding), if known.""" @classmethod def from_mcp_sdk(cls, mcp_resource: mcp_types.Resource) -> Resource: """Convert from MCP SDK Resource to PydanticAI Resource. Args: mcp_resource: The MCP SDK Resource object. """ return cls( uri=str(mcp_resource.uri), name=mcp_resource.name, title=mcp_resource.title, description=mcp_resource.description, mime_type=mcp_resource.mimeType, size=mcp_resource.size, annotations=ResourceAnnotations.from_mcp_sdk(mcp_resource.annotations) if mcp_resource.annotations else None, metadata=mcp_resource.meta, ) ``` #### uri ```python uri: str ``` The URI of the resource. #### size ```python size: int | None = None ``` The size of the raw resource content in bytes (before base64 encoding), if known. #### from_mcp_sdk ```python from_mcp_sdk(mcp_resource: Resource) -> Resource ``` Convert from MCP SDK Resource to PydanticAI Resource. Parameters: | Name | Type | Description | Default | | -------------- | ---------- | ---------------------------- | ---------- | | `mcp_resource` | `Resource` | The MCP SDK Resource object. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @classmethod def from_mcp_sdk(cls, mcp_resource: mcp_types.Resource) -> Resource: """Convert from MCP SDK Resource to PydanticAI Resource. Args: mcp_resource: The MCP SDK Resource object. """ return cls( uri=str(mcp_resource.uri), name=mcp_resource.name, title=mcp_resource.title, description=mcp_resource.description, mime_type=mcp_resource.mimeType, size=mcp_resource.size, annotations=ResourceAnnotations.from_mcp_sdk(mcp_resource.annotations) if mcp_resource.annotations else None, metadata=mcp_resource.meta, ) ``` ### ResourceTemplate Bases: `BaseResource` A template for parameterized resources on an MCP server. See the [resource templates in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#resource-templates). Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @dataclass(repr=False, kw_only=True) class ResourceTemplate(BaseResource): """A template for parameterized resources on an MCP server. See the [resource templates in the MCP specification](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#resource-templates). """ uri_template: str """URI template (RFC 6570) for constructing resource URIs.""" @classmethod def from_mcp_sdk(cls, mcp_template: mcp_types.ResourceTemplate) -> ResourceTemplate: """Convert from MCP SDK ResourceTemplate to PydanticAI ResourceTemplate. Args: mcp_template: The MCP SDK ResourceTemplate object. """ return cls( uri_template=mcp_template.uriTemplate, name=mcp_template.name, title=mcp_template.title, description=mcp_template.description, mime_type=mcp_template.mimeType, annotations=ResourceAnnotations.from_mcp_sdk(mcp_template.annotations) if mcp_template.annotations else None, metadata=mcp_template.meta, ) ``` #### uri_template ```python uri_template: str ``` URI template (RFC 6570) for constructing resource URIs. #### from_mcp_sdk ```python from_mcp_sdk( mcp_template: ResourceTemplate, ) -> ResourceTemplate ``` Convert from MCP SDK ResourceTemplate to PydanticAI ResourceTemplate. Parameters: | Name | Type | Description | Default | | -------------- | ------------------ | ------------------------------------ | ---------- | | `mcp_template` | `ResourceTemplate` | The MCP SDK ResourceTemplate object. 
| *required* | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @classmethod def from_mcp_sdk(cls, mcp_template: mcp_types.ResourceTemplate) -> ResourceTemplate: """Convert from MCP SDK ResourceTemplate to PydanticAI ResourceTemplate. Args: mcp_template: The MCP SDK ResourceTemplate object. """ return cls( uri_template=mcp_template.uriTemplate, name=mcp_template.name, title=mcp_template.title, description=mcp_template.description, mime_type=mcp_template.mimeType, annotations=ResourceAnnotations.from_mcp_sdk(mcp_template.annotations) if mcp_template.annotations else None, metadata=mcp_template.meta, ) ``` ### ServerCapabilities Capabilities that an MCP server supports. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @dataclass(repr=False, kw_only=True) class ServerCapabilities: """Capabilities that an MCP server supports.""" experimental: list[str] | None = None """Experimental, non-standard capabilities that the server supports.""" logging: bool = False """Whether the server supports sending log messages to the client.""" prompts: bool = False """Whether the server offers any prompt templates.""" prompts_list_changed: bool = False """Whether the server will emit notifications when the list of prompts changes.""" resources: bool = False """Whether the server offers any resources to read.""" resources_list_changed: bool = False """Whether the server will emit notifications when the list of resources changes.""" tools: bool = False """Whether the server offers any tools to call.""" tools_list_changed: bool = False """Whether the server will emit notifications when the list of tools changes.""" completions: bool = False """Whether the server offers autocompletion suggestions for prompts and resources.""" __repr__ = _utils.dataclasses_no_defaults_repr @classmethod def from_mcp_sdk(cls, mcp_capabilities: mcp_types.ServerCapabilities) -> ServerCapabilities: """Convert from MCP SDK ServerCapabilities to PydanticAI ServerCapabilities. Args: mcp_capabilities: The MCP SDK ServerCapabilities object. """ prompts_cap = mcp_capabilities.prompts resources_cap = mcp_capabilities.resources tools_cap = mcp_capabilities.tools return cls( experimental=list(mcp_capabilities.experimental.keys()) if mcp_capabilities.experimental else None, logging=mcp_capabilities.logging is not None, prompts=prompts_cap is not None, prompts_list_changed=bool(prompts_cap.listChanged) if prompts_cap else False, resources=resources_cap is not None, resources_list_changed=bool(resources_cap.listChanged) if resources_cap else False, tools=tools_cap is not None, tools_list_changed=bool(tools_cap.listChanged) if tools_cap else False, completions=mcp_capabilities.completions is not None, ) ``` #### experimental ```python experimental: list[str] | None = None ``` Experimental, non-standard capabilities that the server supports. #### logging ```python logging: bool = False ``` Whether the server supports sending log messages to the client. #### prompts ```python prompts: bool = False ``` Whether the server offers any prompt templates. #### prompts_list_changed ```python prompts_list_changed: bool = False ``` Whether the server will emit notifications when the list of prompts changes. #### resources ```python resources: bool = False ``` Whether the server offers any resources to read. #### resources_list_changed ```python resources_list_changed: bool = False ``` Whether the server will emit notifications when the list of resources changes. 
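As a rough illustration of how these capability flags can be checked in practice, here is a minimal sketch (not taken from the official docs; the `uv run my-mcp-server stdio` command is a placeholder for whatever MCP server you actually run):

```python
import asyncio

from pydantic_ai.mcp import MCPServerStdio

# Placeholder command: substitute the MCP server you actually want to launch.
server = MCPServerStdio('uv', args=['run', 'my-mcp-server', 'stdio'])


async def main():
    async with server:  # entering the context initializes the connection
        caps = server.capabilities
        if caps.resources:
            print('Server exposes resources.')
            print('Sends resource list-change notifications:', caps.resources_list_changed)
        if caps.logging:
            print('Server supports sending log messages.')


asyncio.run(main())
```

Since the flags are plain booleans, client code can branch on them before calling methods like `list_resources()` without touching MCP SDK types directly.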
#### tools ```python tools: bool = False ``` Whether the server offers any tools to call. #### tools_list_changed ```python tools_list_changed: bool = False ``` Whether the server will emit notifications when the list of tools changes. #### completions ```python completions: bool = False ``` Whether the server offers autocompletion suggestions for prompts and resources. #### from_mcp_sdk ```python from_mcp_sdk( mcp_capabilities: ServerCapabilities, ) -> ServerCapabilities ``` Convert from MCP SDK ServerCapabilities to PydanticAI ServerCapabilities. Parameters: | Name | Type | Description | Default | | ------------------ | -------------------- | -------------------------------------- | ---------- | | `mcp_capabilities` | `ServerCapabilities` | The MCP SDK ServerCapabilities object. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @classmethod def from_mcp_sdk(cls, mcp_capabilities: mcp_types.ServerCapabilities) -> ServerCapabilities: """Convert from MCP SDK ServerCapabilities to PydanticAI ServerCapabilities. Args: mcp_capabilities: The MCP SDK ServerCapabilities object. """ prompts_cap = mcp_capabilities.prompts resources_cap = mcp_capabilities.resources tools_cap = mcp_capabilities.tools return cls( experimental=list(mcp_capabilities.experimental.keys()) if mcp_capabilities.experimental else None, logging=mcp_capabilities.logging is not None, prompts=prompts_cap is not None, prompts_list_changed=bool(prompts_cap.listChanged) if prompts_cap else False, resources=resources_cap is not None, resources_list_changed=bool(resources_cap.listChanged) if resources_cap else False, tools=tools_cap is not None, tools_list_changed=bool(tools_cap.listChanged) if tools_cap else False, completions=mcp_capabilities.completions is not None, ) ``` ### MCPServer Bases: `AbstractToolset[Any]`, `ABC` Base class for attaching agents to MCP servers. See for more information. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python class MCPServer(AbstractToolset[Any], ABC): """Base class for attaching agents to MCP servers. See for more information. """ tool_prefix: str | None """A prefix to add to all tools that are registered with the server. If not empty, will include a trailing underscore(`_`). e.g. if `tool_prefix='foo'`, then a tool named `bar` will be registered as `foo_bar` """ log_level: mcp_types.LoggingLevel | None """The log level to set when connecting to the server, if any. See for more details. If `None`, no log level will be set. """ log_handler: LoggingFnT | None """A handler for logging messages from the server.""" timeout: float """The timeout in seconds to wait for the client to initialize.""" read_timeout: float """Maximum time in seconds to wait for new messages before timing out. This timeout applies to the long-lived connection after it's established. If no new messages are received within this time, the connection will be considered stale and may be closed. Defaults to 5 minutes (300 seconds). """ process_tool_call: ProcessToolCallback | None """Hook to customize tool calling and optionally pass extra metadata.""" allow_sampling: bool """Whether to allow MCP sampling through this client.""" sampling_model: models.Model | None """The model to use for sampling.""" max_retries: int """The maximum number of times to retry a tool call.""" elicitation_callback: ElicitationFnT | None = None """Callback function to handle elicitation requests from the server.""" cache_tools: bool """Whether to cache the list of tools. 
When enabled (default), tools are fetched once and cached until either: - The server sends a `notifications/tools/list_changed` notification - The connection is closed Set to `False` for servers that change tools dynamically without sending notifications. """ cache_resources: bool """Whether to cache the list of resources. When enabled (default), resources are fetched once and cached until either: - The server sends a `notifications/resources/list_changed` notification - The connection is closed Set to `False` for servers that change resources dynamically without sending notifications. """ _id: str | None _enter_lock: Lock = field(compare=False) _running_count: int _exit_stack: AsyncExitStack | None _client: ClientSession _read_stream: MemoryObjectReceiveStream[SessionMessage | Exception] _write_stream: MemoryObjectSendStream[SessionMessage] _server_info: mcp_types.Implementation _server_capabilities: ServerCapabilities _instructions: str | None _cached_tools: list[mcp_types.Tool] | None _cached_resources: list[Resource] | None # TODO (v2): enforce the arguments to be passed as keyword arguments only def __init__( self, tool_prefix: str | None = None, log_level: mcp_types.LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: models.Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, cache_tools: bool = True, cache_resources: bool = True, *, id: str | None = None, client_info: mcp_types.Implementation | None = None, ): self.tool_prefix = tool_prefix self.log_level = log_level self.log_handler = log_handler self.timeout = timeout self.read_timeout = read_timeout self.process_tool_call = process_tool_call self.allow_sampling = allow_sampling self.sampling_model = sampling_model self.max_retries = max_retries self.elicitation_callback = elicitation_callback self.cache_tools = cache_tools self.cache_resources = cache_resources self.client_info = client_info self._id = id or tool_prefix self.__post_init__() def __post_init__(self): self._enter_lock = Lock() self._running_count = 0 self._exit_stack = None self._cached_tools = None self._cached_resources = None @abstractmethod @asynccontextmanager async def client_streams( self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: """Create the streams for the MCP server.""" raise NotImplementedError('MCP Server subclasses must implement this method.') yield @property def id(self) -> str | None: return self._id @id.setter def id(self, value: str | None): self._id = value @property def label(self) -> str: if self.id: return super().label # pragma: no cover else: return repr(self) @property def tool_name_conflict_hint(self) -> str: return 'Set the `tool_prefix` attribute to avoid name conflicts.' @property def server_info(self) -> mcp_types.Implementation: """Access the information send by the MCP server during initialization.""" if getattr(self, '_server_info', None) is None: raise AttributeError( f'The `{self.__class__.__name__}.server_info` is only instantiated after initialization.' 
) return self._server_info @property def capabilities(self) -> ServerCapabilities: """Access the capabilities advertised by the MCP server during initialization.""" if getattr(self, '_server_capabilities', None) is None: raise AttributeError( f'The `{self.__class__.__name__}.capabilities` is only instantiated after initialization.' ) return self._server_capabilities @property def instructions(self) -> str | None: """Access the instructions sent by the MCP server during initialization.""" if not hasattr(self, '_instructions'): raise AttributeError( f'The `{self.__class__.__name__}.instructions` is only available after initialization.' ) return self._instructions async def list_tools(self) -> list[mcp_types.Tool]: """Retrieve tools that are currently active on the server. Tools are cached by default, with cache invalidation on: - `notifications/tools/list_changed` notifications from the server - Connection close (cache is cleared in `__aexit__`) Set `cache_tools=False` for servers that change tools without sending notifications. """ async with self: if self.cache_tools: if self._cached_tools is not None: return self._cached_tools result = await self._client.list_tools() self._cached_tools = result.tools return result.tools else: result = await self._client.list_tools() return result.tools async def direct_call_tool( self, name: str, args: dict[str, Any], metadata: dict[str, Any] | None = None, ) -> ToolResult: """Call a tool on the server. Args: name: The name of the tool to call. args: The arguments to pass to the tool. metadata: Request-level metadata (optional) Returns: The result of the tool call. Raises: ModelRetry: If the tool call fails. """ async with self: # Ensure server is running try: result = await self._client.send_request( mcp_types.ClientRequest( mcp_types.CallToolRequest( method='tools/call', params=mcp_types.CallToolRequestParams( name=name, arguments=args, _meta=mcp_types.RequestParams.Meta(**metadata) if metadata else None, ), ) ), mcp_types.CallToolResult, ) except mcp_exceptions.McpError as e: raise exceptions.ModelRetry(e.error.message) if result.isError: message: str | None = None if result.content: # pragma: no branch text_parts = [part.text for part in result.content if isinstance(part, mcp_types.TextContent)] message = '\n'.join(text_parts) raise exceptions.ModelRetry(message or 'MCP tool call failed') # Prefer structured content if there are only text parts, which per the docs would contain the JSON-encoded structured content for backward compatibility. # See https://github.com/modelcontextprotocol/python-sdk#structured-output if (structured := result.structuredContent) and not any( not isinstance(part, mcp_types.TextContent) for part in result.content ): # The MCP SDK wraps primitives and generic types like list in a `result` key, but we want to use the raw value returned by the tool function. 
# See https://github.com/modelcontextprotocol/python-sdk#structured-output if isinstance(structured, dict) and len(structured) == 1 and 'result' in structured: return structured['result'] return structured mapped = [await self._map_tool_result_part(part) for part in result.content] return mapped[0] if len(mapped) == 1 else mapped async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[Any], tool: ToolsetTool[Any], ) -> ToolResult: if self.tool_prefix: name = name.removeprefix(f'{self.tool_prefix}_') ctx = replace(ctx, tool_name=name) if self.process_tool_call is not None: return await self.process_tool_call(ctx, self.direct_call_tool, name, tool_args) else: return await self.direct_call_tool(name, tool_args) async def get_tools(self, ctx: RunContext[Any]) -> dict[str, ToolsetTool[Any]]: return { name: self.tool_for_tool_def( ToolDefinition( name=name, description=mcp_tool.description, parameters_json_schema=mcp_tool.inputSchema, metadata={ 'meta': mcp_tool.meta, 'annotations': mcp_tool.annotations.model_dump() if mcp_tool.annotations else None, 'output_schema': mcp_tool.outputSchema or None, }, ), ) for mcp_tool in await self.list_tools() if (name := f'{self.tool_prefix}_{mcp_tool.name}' if self.tool_prefix else mcp_tool.name) } def tool_for_tool_def(self, tool_def: ToolDefinition) -> ToolsetTool[Any]: return ToolsetTool( toolset=self, tool_def=tool_def, max_retries=self.max_retries, args_validator=TOOL_SCHEMA_VALIDATOR, ) async def list_resources(self) -> list[Resource]: """Retrieve resources that are currently present on the server. Resources are cached by default, with cache invalidation on: - `notifications/resources/list_changed` notifications from the server - Connection close (cache is cleared in `__aexit__`) Set `cache_resources=False` for servers that change resources without sending notifications. Raises: MCPError: If the server returns an error. """ async with self: if not self.capabilities.resources: return [] try: if self.cache_resources: if self._cached_resources is not None: return self._cached_resources result = await self._client.list_resources() resources = [Resource.from_mcp_sdk(r) for r in result.resources] self._cached_resources = resources return resources else: result = await self._client.list_resources() return [Resource.from_mcp_sdk(r) for r in result.resources] except mcp_exceptions.McpError as e: raise MCPError.from_mcp_sdk(e) from e async def list_resource_templates(self) -> list[ResourceTemplate]: """Retrieve resource templates that are currently present on the server. Raises: MCPError: If the server returns an error. """ async with self: # Ensure server is running if not self.capabilities.resources: return [] try: result = await self._client.list_resource_templates() except mcp_exceptions.McpError as e: raise MCPError.from_mcp_sdk(e) from e return [ResourceTemplate.from_mcp_sdk(t) for t in result.resourceTemplates] @overload async def read_resource(self, uri: str) -> str | messages.BinaryContent | list[str | messages.BinaryContent]: ... @overload async def read_resource( self, uri: Resource ) -> str | messages.BinaryContent | list[str | messages.BinaryContent]: ... async def read_resource( self, uri: str | Resource ) -> str | messages.BinaryContent | list[str | messages.BinaryContent]: """Read the contents of a specific resource by URI. Args: uri: The URI of the resource to read, or a Resource object. Returns: The resource contents. If the resource has a single content item, returns that item directly. 
If the resource has multiple content items, returns a list of items. Raises: MCPError: If the server returns an error. """ resource_uri = uri if isinstance(uri, str) else uri.uri async with self: # Ensure server is running try: result = await self._client.read_resource(AnyUrl(resource_uri)) except mcp_exceptions.McpError as e: raise MCPError.from_mcp_sdk(e) from e return ( self._get_content(result.contents[0]) if len(result.contents) == 1 else [self._get_content(resource) for resource in result.contents] ) async def __aenter__(self) -> Self: """Enter the MCP server context. This will initialize the connection to the server. If this server is an [`MCPServerStdio`][pydantic_ai.mcp.MCPServerStdio], the server will first be started as a subprocess. This is a no-op if the MCP server has already been entered. """ async with self._enter_lock: if self._running_count == 0: async with AsyncExitStack() as exit_stack: self._read_stream, self._write_stream = await exit_stack.enter_async_context(self.client_streams()) client = ClientSession( read_stream=self._read_stream, write_stream=self._write_stream, sampling_callback=self._sampling_callback if self.allow_sampling else None, elicitation_callback=self.elicitation_callback, logging_callback=self.log_handler, read_timeout_seconds=timedelta(seconds=self.read_timeout), message_handler=self._handle_notification, client_info=self.client_info, ) self._client = await exit_stack.enter_async_context(client) with anyio.fail_after(self.timeout): result = await self._client.initialize() self._server_info = result.serverInfo self._server_capabilities = ServerCapabilities.from_mcp_sdk(result.capabilities) self._instructions = result.instructions if log_level := self.log_level: await self._client.set_logging_level(log_level) self._exit_stack = exit_stack.pop_all() self._running_count += 1 return self async def __aexit__(self, *args: Any) -> bool | None: if self._running_count == 0: raise ValueError('MCPServer.__aexit__ called more times than __aenter__') async with self._enter_lock: self._running_count -= 1 if self._running_count == 0 and self._exit_stack is not None: await self._exit_stack.aclose() self._exit_stack = None self._cached_tools = None self._cached_resources = None @property def is_running(self) -> bool: """Check if the MCP server is running.""" return bool(self._running_count) async def _sampling_callback( self, context: RequestContext[ClientSession, Any], params: mcp_types.CreateMessageRequestParams ) -> mcp_types.CreateMessageResult | mcp_types.ErrorData: """MCP sampling callback.""" if self.sampling_model is None: raise ValueError('Sampling model is not set') # pragma: no cover pai_messages = _mcp.map_from_mcp_params(params) model_settings = models.ModelSettings() if max_tokens := params.maxTokens: # pragma: no branch model_settings['max_tokens'] = max_tokens if temperature := params.temperature: # pragma: no branch model_settings['temperature'] = temperature if stop_sequences := params.stopSequences: # pragma: no branch model_settings['stop_sequences'] = stop_sequences model_response = await model_request(self.sampling_model, pai_messages, model_settings=model_settings) return mcp_types.CreateMessageResult( role='assistant', content=_mcp.map_from_model_response(model_response), model=self.sampling_model.model_name, ) async def _handle_notification( self, message: RequestResponder[mcp_types.ServerRequest, mcp_types.ClientResult] | mcp_types.ServerNotification | Exception, ) -> None: """Handle notifications from the MCP server, invalidating caches as 
needed.""" if isinstance(message, mcp_types.ServerNotification): # pragma: no branch if isinstance(message.root, mcp_types.ToolListChangedNotification): self._cached_tools = None elif isinstance(message.root, mcp_types.ResourceListChangedNotification): self._cached_resources = None async def _map_tool_result_part( self, part: mcp_types.ContentBlock ) -> str | messages.BinaryContent | dict[str, Any] | list[Any]: # See https://github.com/jlowin/fastmcp/blob/main/docs/servers/tools.mdx#return-values if isinstance(part, mcp_types.TextContent): text = part.text if text.startswith(('[', '{')): try: return pydantic_core.from_json(text) except ValueError: pass return text elif isinstance(part, mcp_types.ImageContent): return messages.BinaryImage(data=base64.b64decode(part.data), media_type=part.mimeType) elif isinstance(part, mcp_types.AudioContent): # NOTE: The FastMCP server doesn't support audio content. # See for more details. return messages.BinaryContent( data=base64.b64decode(part.data), media_type=part.mimeType ) # pragma: no cover elif isinstance(part, mcp_types.EmbeddedResource): resource = part.resource return self._get_content(resource) elif isinstance(part, mcp_types.ResourceLink): return await self.read_resource(str(part.uri)) else: assert_never(part) def _get_content( self, resource: mcp_types.TextResourceContents | mcp_types.BlobResourceContents ) -> str | messages.BinaryContent: if isinstance(resource, mcp_types.TextResourceContents): return resource.text elif isinstance(resource, mcp_types.BlobResourceContents): return messages.BinaryContent.narrow_type( messages.BinaryContent( data=base64.b64decode(resource.blob), media_type=resource.mimeType or 'application/octet-stream' ) ) else: assert_never(resource) def __eq__(self, value: object, /) -> bool: return isinstance(value, MCPServer) and self.id == value.id and self.tool_prefix == value.tool_prefix ``` #### tool_prefix ```python tool_prefix: str | None = tool_prefix ``` A prefix to add to all tools that are registered with the server. If not empty, will include a trailing underscore(`_`). e.g. if `tool_prefix='foo'`, then a tool named `bar` will be registered as `foo_bar` #### log_level ```python log_level: LoggingLevel | None = log_level ``` The log level to set when connecting to the server, if any. See for more details. If `None`, no log level will be set. #### log_handler ```python log_handler: LoggingFnT | None = log_handler ``` A handler for logging messages from the server. #### timeout ```python timeout: float = timeout ``` The timeout in seconds to wait for the client to initialize. #### read_timeout ```python read_timeout: float = read_timeout ``` Maximum time in seconds to wait for new messages before timing out. This timeout applies to the long-lived connection after it's established. If no new messages are received within this time, the connection will be considered stale and may be closed. Defaults to 5 minutes (300 seconds). #### process_tool_call ```python process_tool_call: ProcessToolCallback | None = ( process_tool_call ) ``` Hook to customize tool calling and optionally pass extra metadata. #### allow_sampling ```python allow_sampling: bool = allow_sampling ``` Whether to allow MCP sampling through this client. #### sampling_model ```python sampling_model: Model | None = sampling_model ``` The model to use for sampling. #### max_retries ```python max_retries: int = max_retries ``` The maximum number of times to retry a tool call. 
#### elicitation_callback ```python elicitation_callback: ElicitationFnT | None = ( elicitation_callback ) ``` Callback function to handle elicitation requests from the server. #### cache_tools ```python cache_tools: bool = cache_tools ``` Whether to cache the list of tools. When enabled (default), tools are fetched once and cached until either: - The server sends a `notifications/tools/list_changed` notification - The connection is closed Set to `False` for servers that change tools dynamically without sending notifications. #### cache_resources ```python cache_resources: bool = cache_resources ``` Whether to cache the list of resources. When enabled (default), resources are fetched once and cached until either: - The server sends a `notifications/resources/list_changed` notification - The connection is closed Set to `False` for servers that change resources dynamically without sending notifications. #### client_streams ```python client_streams() -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[ SessionMessage | Exception ], MemoryObjectSendStream[SessionMessage], ] ] ``` Create the streams for the MCP server. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @abstractmethod @asynccontextmanager async def client_streams( self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: """Create the streams for the MCP server.""" raise NotImplementedError('MCP Server subclasses must implement this method.') yield ``` #### server_info ```python server_info: Implementation ``` Access the information send by the MCP server during initialization. #### capabilities ```python capabilities: ServerCapabilities ``` Access the capabilities advertised by the MCP server during initialization. #### instructions ```python instructions: str | None ``` Access the instructions sent by the MCP server during initialization. #### list_tools ```python list_tools() -> list[Tool] ``` Retrieve tools that are currently active on the server. Tools are cached by default, with cache invalidation on: - `notifications/tools/list_changed` notifications from the server - Connection close (cache is cleared in `__aexit__`) Set `cache_tools=False` for servers that change tools without sending notifications. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def list_tools(self) -> list[mcp_types.Tool]: """Retrieve tools that are currently active on the server. Tools are cached by default, with cache invalidation on: - `notifications/tools/list_changed` notifications from the server - Connection close (cache is cleared in `__aexit__`) Set `cache_tools=False` for servers that change tools without sending notifications. """ async with self: if self.cache_tools: if self._cached_tools is not None: return self._cached_tools result = await self._client.list_tools() self._cached_tools = result.tools return result.tools else: result = await self._client.list_tools() return result.tools ``` #### direct_call_tool ```python direct_call_tool( name: str, args: dict[str, Any], metadata: dict[str, Any] | None = None, ) -> ToolResult ``` Call a tool on the server. Parameters: | Name | Type | Description | Default | | ---------- | ---------------- | ---------------------------------- | --------------------------------- | | `name` | `str` | The name of the tool to call. | *required* | | `args` | `dict[str, Any]` | The arguments to pass to the tool. 
| *required* | | `metadata` | \`dict[str, Any] | None\` | Request-level metadata (optional) | Returns: | Type | Description | | ------------ | ---------------------------- | | `ToolResult` | The result of the tool call. | Raises: | Type | Description | | ------------ | ----------------------- | | `ModelRetry` | If the tool call fails. | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def direct_call_tool( self, name: str, args: dict[str, Any], metadata: dict[str, Any] | None = None, ) -> ToolResult: """Call a tool on the server. Args: name: The name of the tool to call. args: The arguments to pass to the tool. metadata: Request-level metadata (optional) Returns: The result of the tool call. Raises: ModelRetry: If the tool call fails. """ async with self: # Ensure server is running try: result = await self._client.send_request( mcp_types.ClientRequest( mcp_types.CallToolRequest( method='tools/call', params=mcp_types.CallToolRequestParams( name=name, arguments=args, _meta=mcp_types.RequestParams.Meta(**metadata) if metadata else None, ), ) ), mcp_types.CallToolResult, ) except mcp_exceptions.McpError as e: raise exceptions.ModelRetry(e.error.message) if result.isError: message: str | None = None if result.content: # pragma: no branch text_parts = [part.text for part in result.content if isinstance(part, mcp_types.TextContent)] message = '\n'.join(text_parts) raise exceptions.ModelRetry(message or 'MCP tool call failed') # Prefer structured content if there are only text parts, which per the docs would contain the JSON-encoded structured content for backward compatibility. # See https://github.com/modelcontextprotocol/python-sdk#structured-output if (structured := result.structuredContent) and not any( not isinstance(part, mcp_types.TextContent) for part in result.content ): # The MCP SDK wraps primitives and generic types like list in a `result` key, but we want to use the raw value returned by the tool function. # See https://github.com/modelcontextprotocol/python-sdk#structured-output if isinstance(structured, dict) and len(structured) == 1 and 'result' in structured: return structured['result'] return structured mapped = [await self._map_tool_result_part(part) for part in result.content] return mapped[0] if len(mapped) == 1 else mapped ``` #### list_resources ```python list_resources() -> list[Resource] ``` Retrieve resources that are currently present on the server. Resources are cached by default, with cache invalidation on: - `notifications/resources/list_changed` notifications from the server - Connection close (cache is cleared in `__aexit__`) Set `cache_resources=False` for servers that change resources without sending notifications. Raises: | Type | Description | | ---------- | ------------------------------- | | `MCPError` | If the server returns an error. | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def list_resources(self) -> list[Resource]: """Retrieve resources that are currently present on the server. Resources are cached by default, with cache invalidation on: - `notifications/resources/list_changed` notifications from the server - Connection close (cache is cleared in `__aexit__`) Set `cache_resources=False` for servers that change resources without sending notifications. Raises: MCPError: If the server returns an error. 
""" async with self: if not self.capabilities.resources: return [] try: if self.cache_resources: if self._cached_resources is not None: return self._cached_resources result = await self._client.list_resources() resources = [Resource.from_mcp_sdk(r) for r in result.resources] self._cached_resources = resources return resources else: result = await self._client.list_resources() return [Resource.from_mcp_sdk(r) for r in result.resources] except mcp_exceptions.McpError as e: raise MCPError.from_mcp_sdk(e) from e ``` #### list_resource_templates ```python list_resource_templates() -> list[ResourceTemplate] ``` Retrieve resource templates that are currently present on the server. Raises: | Type | Description | | ---------- | ------------------------------- | | `MCPError` | If the server returns an error. | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def list_resource_templates(self) -> list[ResourceTemplate]: """Retrieve resource templates that are currently present on the server. Raises: MCPError: If the server returns an error. """ async with self: # Ensure server is running if not self.capabilities.resources: return [] try: result = await self._client.list_resource_templates() except mcp_exceptions.McpError as e: raise MCPError.from_mcp_sdk(e) from e return [ResourceTemplate.from_mcp_sdk(t) for t in result.resourceTemplates] ``` #### read_resource ```python read_resource( uri: str, ) -> str | BinaryContent | list[str | BinaryContent] ``` ```python read_resource( uri: Resource, ) -> str | BinaryContent | list[str | BinaryContent] ``` ```python read_resource( uri: str | Resource, ) -> str | BinaryContent | list[str | BinaryContent] ``` Read the contents of a specific resource by URI. Parameters: | Name | Type | Description | Default | | ----- | ----- | ----------- | ------------------------------------------------------ | | `uri` | \`str | Resource\` | The URI of the resource to read, or a Resource object. | Returns: | Type | Description | | ----- | ------------- | | \`str | BinaryContent | | \`str | BinaryContent | Raises: | Type | Description | | ---------- | ------------------------------- | | `MCPError` | If the server returns an error. | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def read_resource( self, uri: str | Resource ) -> str | messages.BinaryContent | list[str | messages.BinaryContent]: """Read the contents of a specific resource by URI. Args: uri: The URI of the resource to read, or a Resource object. Returns: The resource contents. If the resource has a single content item, returns that item directly. If the resource has multiple content items, returns a list of items. Raises: MCPError: If the server returns an error. """ resource_uri = uri if isinstance(uri, str) else uri.uri async with self: # Ensure server is running try: result = await self._client.read_resource(AnyUrl(resource_uri)) except mcp_exceptions.McpError as e: raise MCPError.from_mcp_sdk(e) from e return ( self._get_content(result.contents[0]) if len(result.contents) == 1 else [self._get_content(resource) for resource in result.contents] ) ``` #### __aenter__ ```python __aenter__() -> Self ``` Enter the MCP server context. This will initialize the connection to the server. If this server is an MCPServerStdio, the server will first be started as a subprocess. This is a no-op if the MCP server has already been entered. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def __aenter__(self) -> Self: """Enter the MCP server context. 
This will initialize the connection to the server. If this server is an [`MCPServerStdio`][pydantic_ai.mcp.MCPServerStdio], the server will first be started as a subprocess. This is a no-op if the MCP server has already been entered. """ async with self._enter_lock: if self._running_count == 0: async with AsyncExitStack() as exit_stack: self._read_stream, self._write_stream = await exit_stack.enter_async_context(self.client_streams()) client = ClientSession( read_stream=self._read_stream, write_stream=self._write_stream, sampling_callback=self._sampling_callback if self.allow_sampling else None, elicitation_callback=self.elicitation_callback, logging_callback=self.log_handler, read_timeout_seconds=timedelta(seconds=self.read_timeout), message_handler=self._handle_notification, client_info=self.client_info, ) self._client = await exit_stack.enter_async_context(client) with anyio.fail_after(self.timeout): result = await self._client.initialize() self._server_info = result.serverInfo self._server_capabilities = ServerCapabilities.from_mcp_sdk(result.capabilities) self._instructions = result.instructions if log_level := self.log_level: await self._client.set_logging_level(log_level) self._exit_stack = exit_stack.pop_all() self._running_count += 1 return self ``` #### is_running ```python is_running: bool ``` Check if the MCP server is running. ### MCPServerStdio Bases: `MCPServer` Runs an MCP server in a subprocess and communicates with it over stdin/stdout. This class implements the stdio transport from the MCP specification. See for more information. Note Using this class as an async context manager will start the server as a subprocess when entering the context, and stop it when exiting the context. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( # (1)! 'uv', args=['run', 'mcp-run-python', 'stdio'], timeout=10 ) agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` 1. See [MCP Run Python](https://github.com/pydantic/mcp-run-python) for more information. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python class MCPServerStdio(MCPServer): """Runs an MCP server in a subprocess and communicates with it over stdin/stdout. This class implements the stdio transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will start the server as a subprocess when entering the context, and stop it when exiting the context. Example: ```python {py="3.10"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( # (1)! 'uv', args=['run', 'mcp-run-python', 'stdio'], timeout=10 ) agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` 1. See [MCP Run Python](https://github.com/pydantic/mcp-run-python) for more information. """ command: str """The command to run.""" args: Sequence[str] """The arguments to pass to the command.""" env: dict[str, str] | None """The environment variables the CLI server will have access to. By default the subprocess will not inherit any environment variables from the parent process. If you want to inherit the environment variables from the parent process, use `env=os.environ`. 
""" cwd: str | Path | None """The working directory to use when spawning the process.""" # last fields are re-defined from the parent class so they appear as fields tool_prefix: str | None log_level: mcp_types.LoggingLevel | None log_handler: LoggingFnT | None timeout: float read_timeout: float process_tool_call: ProcessToolCallback | None allow_sampling: bool sampling_model: models.Model | None max_retries: int elicitation_callback: ElicitationFnT | None = None cache_tools: bool cache_resources: bool def __init__( self, command: str, args: Sequence[str], *, env: dict[str, str] | None = None, cwd: str | Path | None = None, tool_prefix: str | None = None, log_level: mcp_types.LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: models.Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, cache_tools: bool = True, cache_resources: bool = True, id: str | None = None, client_info: mcp_types.Implementation | None = None, ): """Build a new MCP server. Args: command: The command to run. args: The arguments to pass to the command. env: The environment variables to set in the subprocess. cwd: The working directory to use when spawning the process. tool_prefix: A prefix to add to all tools that are registered with the server. log_level: The log level to set when connecting to the server, if any. log_handler: A handler for logging messages from the server. timeout: The timeout in seconds to wait for the client to initialize. read_timeout: Maximum time in seconds to wait for new messages before timing out. process_tool_call: Hook to customize tool calling and optionally pass extra metadata. allow_sampling: Whether to allow MCP sampling through this client. sampling_model: The model to use for sampling. max_retries: The maximum number of times to retry a tool call. elicitation_callback: Callback function to handle elicitation requests from the server. cache_tools: Whether to cache the list of tools. See [`MCPServer.cache_tools`][pydantic_ai.mcp.MCPServer.cache_tools]. cache_resources: Whether to cache the list of resources. See [`MCPServer.cache_resources`][pydantic_ai.mcp.MCPServer.cache_resources]. id: An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow. client_info: Information describing the MCP client implementation. 
""" self.command = command self.args = args self.env = env self.cwd = cwd super().__init__( tool_prefix, log_level, log_handler, timeout, read_timeout, process_tool_call, allow_sampling, sampling_model, max_retries, elicitation_callback, cache_tools, cache_resources, id=id, client_info=client_info, ) @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> CoreSchema: return core_schema.no_info_after_validator_function( lambda dct: MCPServerStdio(**dct), core_schema.typed_dict_schema( { 'command': core_schema.typed_dict_field(core_schema.str_schema()), 'args': core_schema.typed_dict_field(core_schema.list_schema(core_schema.str_schema())), 'env': core_schema.typed_dict_field( core_schema.dict_schema(core_schema.str_schema(), core_schema.str_schema()), required=False, ), } ), ) @asynccontextmanager async def client_streams( self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: server = StdioServerParameters(command=self.command, args=list(self.args), env=self.env, cwd=self.cwd) async with stdio_client(server=server) as (read_stream, write_stream): yield read_stream, write_stream def __repr__(self) -> str: repr_args = [ f'command={self.command!r}', f'args={self.args!r}', ] if self.id: repr_args.append(f'id={self.id!r}') # pragma: lax no cover return f'{self.__class__.__name__}({", ".join(repr_args)})' def __eq__(self, value: object, /) -> bool: return ( super().__eq__(value) and isinstance(value, MCPServerStdio) and self.command == value.command and self.args == value.args and self.env == value.env and self.cwd == value.cwd ) ```` #### __init__ ```python __init__( command: str, args: Sequence[str], *, env: dict[str, str] | None = None, cwd: str | Path | None = None, tool_prefix: str | None = None, log_level: LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, cache_tools: bool = True, cache_resources: bool = True, id: str | None = None, client_info: Implementation | None = None ) ``` Build a new MCP server. Parameters: | Name | Type | Description | Default | | ---------------------- | --------------------- | ---------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `command` | `str` | The command to run. | *required* | | `args` | `Sequence[str]` | The arguments to pass to the command. | *required* | | `env` | \`dict[str, str] | None\` | The environment variables to set in the subprocess. | | `cwd` | \`str | Path | None\` | | `tool_prefix` | \`str | None\` | A prefix to add to all tools that are registered with the server. | | `log_level` | \`LoggingLevel | None\` | The log level to set when connecting to the server, if any. | | `log_handler` | \`LoggingFnT | None\` | A handler for logging messages from the server. | | `timeout` | `float` | The timeout in seconds to wait for the client to initialize. | `5` | | `read_timeout` | `float` | Maximum time in seconds to wait for new messages before timing out. 
| `5 * 60` | | `process_tool_call` | \`ProcessToolCallback | None\` | Hook to customize tool calling and optionally pass extra metadata. | | `allow_sampling` | `bool` | Whether to allow MCP sampling through this client. | `True` | | `sampling_model` | \`Model | None\` | The model to use for sampling. | | `max_retries` | `int` | The maximum number of times to retry a tool call. | `1` | | `elicitation_callback` | \`ElicitationFnT | None\` | Callback function to handle elicitation requests from the server. | | `cache_tools` | `bool` | Whether to cache the list of tools. See MCPServer.cache_tools. | `True` | | `cache_resources` | `bool` | Whether to cache the list of resources. See MCPServer.cache_resources. | `True` | | `id` | \`str | None\` | An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow. | | `client_info` | \`Implementation | None\` | Information describing the MCP client implementation. | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python def __init__( self, command: str, args: Sequence[str], *, env: dict[str, str] | None = None, cwd: str | Path | None = None, tool_prefix: str | None = None, log_level: mcp_types.LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: models.Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, cache_tools: bool = True, cache_resources: bool = True, id: str | None = None, client_info: mcp_types.Implementation | None = None, ): """Build a new MCP server. Args: command: The command to run. args: The arguments to pass to the command. env: The environment variables to set in the subprocess. cwd: The working directory to use when spawning the process. tool_prefix: A prefix to add to all tools that are registered with the server. log_level: The log level to set when connecting to the server, if any. log_handler: A handler for logging messages from the server. timeout: The timeout in seconds to wait for the client to initialize. read_timeout: Maximum time in seconds to wait for new messages before timing out. process_tool_call: Hook to customize tool calling and optionally pass extra metadata. allow_sampling: Whether to allow MCP sampling through this client. sampling_model: The model to use for sampling. max_retries: The maximum number of times to retry a tool call. elicitation_callback: Callback function to handle elicitation requests from the server. cache_tools: Whether to cache the list of tools. See [`MCPServer.cache_tools`][pydantic_ai.mcp.MCPServer.cache_tools]. cache_resources: Whether to cache the list of resources. See [`MCPServer.cache_resources`][pydantic_ai.mcp.MCPServer.cache_resources]. id: An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow. client_info: Information describing the MCP client implementation. 
""" self.command = command self.args = args self.env = env self.cwd = cwd super().__init__( tool_prefix, log_level, log_handler, timeout, read_timeout, process_tool_call, allow_sampling, sampling_model, max_retries, elicitation_callback, cache_tools, cache_resources, id=id, client_info=client_info, ) ``` #### command ```python command: str = command ``` The command to run. #### args ```python args: Sequence[str] = args ``` The arguments to pass to the command. #### env ```python env: dict[str, str] | None = env ``` The environment variables the CLI server will have access to. By default the subprocess will not inherit any environment variables from the parent process. If you want to inherit the environment variables from the parent process, use `env=os.environ`. #### cwd ```python cwd: str | Path | None = cwd ``` The working directory to use when spawning the process. ### MCPServerSSE Bases: `_MCPServerHTTP` An MCP server that connects over streamable HTTP connections. This class implements the SSE transport from the MCP specification. See for more information. Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python class MCPServerSSE(_MCPServerHTTP): """An MCP server that connects over streamable HTTP connections. This class implements the SSE transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python {py="3.10"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` """ @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> CoreSchema: return core_schema.no_info_after_validator_function( lambda dct: MCPServerSSE(**dct), core_schema.typed_dict_schema( { 'url': core_schema.typed_dict_field(core_schema.str_schema()), 'headers': core_schema.typed_dict_field( core_schema.dict_schema(core_schema.str_schema(), core_schema.str_schema()), required=False ), } ), ) # sse_client has a hang bug (https://github.com/modelcontextprotocol/python-sdk/issues/1811) # that prevents testing SSE transport in CI. # TODO: Remove pragma and add a test # once https://github.com/modelcontextprotocol/python-sdk/pull/1838 is released. 
@asynccontextmanager async def client_streams( # pragma: no cover self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: if self.http_client and self.headers: raise ValueError('`http_client` is mutually exclusive with `headers`.') if self.http_client is not None: def httpx_client_factory( headers: dict[str, str] | None = None, timeout: httpx.Timeout | None = None, auth: httpx.Auth | None = None, ) -> httpx.AsyncClient: assert self.http_client is not None return self.http_client async with sse_client( url=self.url, timeout=self.timeout, sse_read_timeout=self.read_timeout, httpx_client_factory=httpx_client_factory, ) as (read_stream, write_stream, *_): yield read_stream, write_stream else: async with sse_client( url=self.url, timeout=self.timeout, sse_read_timeout=self.read_timeout, headers=self.headers, ) as (read_stream, write_stream, *_): yield read_stream, write_stream def __eq__(self, value: object, /) -> bool: return super().__eq__(value) and isinstance(value, MCPServerSSE) and self.url == value.url ```` ### MCPServerHTTP Bases: `MCPServerSSE` Deprecated The `MCPServerHTTP` class is deprecated, use `MCPServerSSE` instead. An MCP server that connects over HTTP using the old SSE transport. This class implements the SSE transport from the MCP specification. See for more information. Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerHTTP server = MCPServerHTTP('http://localhost:3001/sse') agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python @deprecated('The `MCPServerHTTP` class is deprecated, use `MCPServerSSE` instead.') class MCPServerHTTP(MCPServerSSE): """An MCP server that connects over HTTP using the old SSE transport. This class implements the SSE transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python {py="3.10" test="skip"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerHTTP server = MCPServerHTTP('http://localhost:3001/sse') agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` """ ```` ### MCPServerStreamableHTTP Bases: `_MCPServerHTTP` An MCP server that connects over HTTP using the Streamable HTTP transport. This class implements the Streamable HTTP transport from the MCP specification. See for more information. Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python class MCPServerStreamableHTTP(_MCPServerHTTP): """An MCP server that connects over HTTP using the Streamable HTTP transport. This class implements the Streamable HTTP transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. 
Example: ```python {py="3.10"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') agent = Agent('openai:gpt-5.2', toolsets=[server]) ``` """ @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> CoreSchema: return core_schema.no_info_after_validator_function( lambda dct: MCPServerStreamableHTTP(**dct), core_schema.typed_dict_schema( { 'url': core_schema.typed_dict_field(core_schema.str_schema()), 'headers': core_schema.typed_dict_field( core_schema.dict_schema(core_schema.str_schema(), core_schema.str_schema()), required=False ), } ), ) @asynccontextmanager async def client_streams( self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: if self.http_client and self.headers: raise ValueError('`http_client` is mutually exclusive with `headers`.') aexit_stack = AsyncExitStack() http_client = self.http_client or await aexit_stack.enter_async_context( httpx.AsyncClient(timeout=httpx.Timeout(self.timeout, read=self.read_timeout), headers=self.headers) ) read_stream, write_stream, *_ = await aexit_stack.enter_async_context( streamable_http_client(self.url, http_client=http_client) ) try: yield read_stream, write_stream finally: await aexit_stack.aclose() def __eq__(self, value: object, /) -> bool: return super().__eq__(value) and isinstance(value, MCPServerStreamableHTTP) and self.url == value.url ```` ### ToolResult ```python ToolResult = ( str | BinaryContent | dict[str, Any] | list[Any] | Sequence[ str | BinaryContent | dict[str, Any] | list[Any] ] ) ``` The result type of an MCP tool call. ### CallToolFunc ```python CallToolFunc = Callable[ [str, dict[str, Any], dict[str, Any] | None], Awaitable[ToolResult], ] ``` A function type that represents a tool call. ### ProcessToolCallback ```python ProcessToolCallback = Callable[ [RunContext[Any], CallToolFunc, str, dict[str, Any]], Awaitable[ToolResult], ] ``` A process tool callback. It accepts a run context, the original tool call function, a tool name, and arguments. Allows wrapping an MCP server tool call to customize it, including adding extra request metadata. ### MCPServerConfig Bases: `BaseModel` Configuration for MCP servers. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python class MCPServerConfig(BaseModel): """Configuration for MCP servers.""" mcp_servers: Annotated[ dict[ str, Annotated[ Annotated[MCPServerStdio, Tag('stdio')] | Annotated[MCPServerStreamableHTTP, Tag('streamable-http')] | Annotated[MCPServerSSE, Tag('sse')], Discriminator(_mcp_server_discriminator), ], ], Field(alias='mcpServers'), ] ``` ### load_mcp_servers ```python load_mcp_servers( config_path: str | Path, ) -> list[ MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE ] ``` Load MCP servers from a configuration file. Environment variables can be referenced in the configuration file using: - `${VAR_NAME}` syntax - expands to the value of VAR_NAME, raises error if not defined - `${VAR_NAME:-default}` syntax - expands to VAR_NAME if set, otherwise uses the default value Parameters: | Name | Type | Description | Default | | ------------- | ----- | ----------- | ----------------------------------- | | `config_path` | \`str | Path\` | The path to the configuration file. 
| Returns: | Type | Description | | ---------------------- | ----------------------- | | \`list\[MCPServerStdio | MCPServerStreamableHTTP | Raises: | Type | Description | | ------------------- | ----------------------------------------------------------------------------------------------------------- | | `FileNotFoundError` | If the configuration file does not exist. | | `ValidationError` | If the configuration file does not match the schema. | | `ValueError` | If an environment variable referenced in the configuration is not defined and no default value is provided. | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python def load_mcp_servers(config_path: str | Path) -> list[MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE]: """Load MCP servers from a configuration file. Environment variables can be referenced in the configuration file using: - `${VAR_NAME}` syntax - expands to the value of VAR_NAME, raises error if not defined - `${VAR_NAME:-default}` syntax - expands to VAR_NAME if set, otherwise uses the default value Args: config_path: The path to the configuration file. Returns: A list of MCP servers. Raises: FileNotFoundError: If the configuration file does not exist. ValidationError: If the configuration file does not match the schema. ValueError: If an environment variable referenced in the configuration is not defined and no default value is provided. """ config_path = Path(config_path) if not config_path.exists(): raise FileNotFoundError(f'Config file {config_path} not found') config_data = pydantic_core.from_json(config_path.read_bytes()) expanded_config_data = _expand_env_vars(config_data) config = MCPServerConfig.model_validate(expanded_config_data) servers: list[MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE] = [] for name, server in config.mcp_servers.items(): server.id = name server.tool_prefix = name servers.append(server) return servers ``` # `pydantic_ai.messages` The structure of ModelMessage can be shown as a graph: ``` graph RL SystemPromptPart(SystemPromptPart) --- ModelRequestPart UserPromptPart(UserPromptPart) --- ModelRequestPart ToolReturnPart(ToolReturnPart) --- ModelRequestPart RetryPromptPart(RetryPromptPart) --- ModelRequestPart TextPart(TextPart) --- ModelResponsePart ToolCallPart(ToolCallPart) --- ModelResponsePart ThinkingPart(ThinkingPart) --- ModelResponsePart ModelRequestPart("ModelRequestPart
(Union)") --- ModelRequest ModelRequest("ModelRequest(parts=list[...])") --- ModelMessage ModelResponsePart("ModelResponsePart
(Union)") --- ModelResponse ModelResponse("ModelResponse(parts=list[...])") --- ModelMessage("ModelMessage
(Union)") ``` ### FinishReason ```python FinishReason: TypeAlias = Literal[ "stop", "length", "content_filter", "tool_call", "error" ] ``` Reason the model finished generating the response, normalized to OpenTelemetry values. ### ForceDownloadMode ```python ForceDownloadMode: TypeAlias = bool | Literal["allow-local"] ``` Type for the force_download parameter on FileUrl subclasses. - `False`: The URL is sent directly to providers that support it. For providers that don't, the file is downloaded with SSRF protection (blocks private IPs and cloud metadata). - `True`: The file is always downloaded with SSRF protection (blocks private IPs and cloud metadata). - `'allow-local'`: The file is always downloaded, allowing private IPs but still blocking cloud metadata. ### ProviderDetailsDelta ```python ProviderDetailsDelta: TypeAlias = ( dict[str, Any] | Callable[[dict[str, Any] | None], dict[str, Any]] | None ) ``` Type for provider_details input: can be a static dict, a callback to update existing details, or None. ### SystemPromptPart A system prompt, generally written by the application developer. This gives the model context and guidance on how to respond. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class SystemPromptPart: """A system prompt, generally written by the application developer. This gives the model context and guidance on how to respond. """ content: str """The content of the prompt.""" _: KW_ONLY timestamp: datetime = field(default_factory=_now_utc) """The timestamp of the prompt.""" dynamic_ref: str | None = None """The ref of the dynamic system prompt function that generated this part. Only set if system prompt is dynamic, see [`system_prompt`][pydantic_ai.agent.Agent.system_prompt] for more information. """ part_kind: Literal['system-prompt'] = 'system-prompt' """Part type identifier, this is available on all parts as a discriminator.""" def otel_event(self, settings: InstrumentationSettings) -> LogRecord: return LogRecord( attributes={'event.name': 'gen_ai.system.message'}, body={'role': 'system', **({'content': self.content} if settings.include_content else {})}, ) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: return [_otel_messages.TextPart(type='text', **{'content': self.content} if settings.include_content else {})] __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str ``` The content of the prompt. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp of the prompt. #### dynamic_ref ```python dynamic_ref: str | None = None ``` The ref of the dynamic system prompt function that generated this part. Only set if system prompt is dynamic, see system_prompt for more information. #### part_kind ```python part_kind: Literal['system-prompt'] = 'system-prompt' ``` Part type identifier, this is available on all parts as a discriminator. ### FileUrl Bases: `ABC` Abstract base class for any URL-based file. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @pydantic_dataclass(repr=False, config=pydantic.ConfigDict(validate_by_name=True)) class FileUrl(ABC): """Abstract base class for any URL-based file.""" url: str """The URL of the file.""" _: KW_ONLY force_download: ForceDownloadMode = False """Controls whether the file is downloaded and how SSRF protection is applied: * If `False`, the URL is sent directly to providers that support it. 
For providers that don't, the file is downloaded with SSRF protection (blocks private IPs and cloud metadata). * If `True`, the file is always downloaded with SSRF protection (blocks private IPs and cloud metadata). * If `'allow-local'`, the file is always downloaded, allowing private IPs but still blocking cloud metadata. """ vendor_metadata: dict[str, Any] | None = None """Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `VideoUrl.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing - `OpenAIChatModel`, `OpenAIResponsesModel`: `ImageUrl.vendor_metadata['detail']` is used as `detail` setting for images - `XaiModel`: `ImageUrl.vendor_metadata['detail']` is used as `detail` setting for images """ _media_type: Annotated[str | None, pydantic.Field(alias='media_type', default=None, exclude=True)] = field( compare=False, default=None ) _identifier: Annotated[str | None, pydantic.Field(alias='identifier', default=None, exclude=True)] = field( compare=False, default=None ) # `pydantic_dataclass` replaces `__init__` so this method is never used. # The signature is kept so that pyright/IDE hints recognize the `media_type` and `identifier` aliases. def __init__( self, url: str, *, media_type: str | None = None, identifier: str | None = None, force_download: ForceDownloadMode = False, vendor_metadata: dict[str, Any] | None = None, # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, _identifier: str | None = None, ) -> None: ... # pragma: no cover @pydantic.computed_field @property def media_type(self) -> str: """Return the media type of the file, based on the URL or the provided `media_type`.""" return self._media_type or self._infer_media_type() @pydantic.computed_field @property def identifier(self) -> str: """The identifier of the file, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `FileUrl`. This identifier is only automatically passed to the model when the `FileUrl` is returned by a tool. If you're passing the `FileUrl` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `FileUrl`. It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files. """ return self._identifier or _multi_modal_content_identifier(self.url) @abstractmethod def _infer_media_type(self) -> str: """Infer the media type of the file based on the URL.""" raise NotImplementedError @property @abstractmethod def format(self) -> str: """The file format.""" raise NotImplementedError __repr__ = _utils.dataclasses_no_defaults_repr ``` #### url ```python url: str ``` The URL of the file. #### force_download ```python force_download: ForceDownloadMode = False ``` Controls whether the file is downloaded and how SSRF protection is applied: - If `False`, the URL is sent directly to providers that support it. For providers that don't, the file is downloaded with SSRF protection (blocks private IPs and cloud metadata). - If `True`, the file is always downloaded with SSRF protection (blocks private IPs and cloud metadata). 
- If `'allow-local'`, the file is always downloaded, allowing private IPs but still blocking cloud metadata. #### vendor_metadata ```python vendor_metadata: dict[str, Any] | None = None ``` Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `VideoUrl.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing - `OpenAIChatModel`, `OpenAIResponsesModel`: `ImageUrl.vendor_metadata['detail']` is used as `detail` setting for images - `XaiModel`: `ImageUrl.vendor_metadata['detail']` is used as `detail` setting for images #### media_type ```python media_type: str ``` Return the media type of the file, based on the URL or the provided `media_type`. #### identifier ```python identifier: str ``` The identifier of the file, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `FileUrl`. This identifier is only automatically passed to the model when the `FileUrl` is returned by a tool. If you're passing the `FileUrl` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `FileUrl`. It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files. #### format ```python format: str ``` The file format. ### VideoUrl Bases: `FileUrl` A URL to a video. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @pydantic_dataclass(repr=False, config=pydantic.ConfigDict(validate_by_name=True)) class VideoUrl(FileUrl): """A URL to a video.""" url: str """The URL of the video.""" _: KW_ONLY kind: Literal['video-url'] = 'video-url' """Type identifier, this is available on all parts as a discriminator.""" # `pydantic_dataclass` replaces `__init__` so this method is never used. # The signature is kept so that pyright/IDE hints recognize the aliases for the `_media_type` and `_identifier` fields. def __init__( self, url: str, *, media_type: str | None = None, identifier: str | None = None, force_download: ForceDownloadMode = False, vendor_metadata: dict[str, Any] | None = None, kind: Literal['video-url'] = 'video-url', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, _identifier: str | None = None, ) -> None: ... # pragma: no cover def _infer_media_type(self) -> str: """Return the media type of the video, based on the url.""" # Assume that YouTube videos are mp4 because there would be no extension # to infer from. This should not be a problem, as Gemini disregards media # type for YouTube URLs. if self.is_youtube: return 'video/mp4' mime_type, _ = _mime_types.guess_type(self.url) if mime_type is None: raise ValueError( f'Could not infer media type from video URL: {self.url}. Explicitly provide a `media_type` instead.' ) return mime_type @property def is_youtube(self) -> bool: """True if the URL has a YouTube domain.""" parsed = urlparse(self.url) hostname = parsed.hostname return hostname in ('youtu.be', 'youtube.com', 'www.youtube.com') @property def format(self) -> VideoFormat: """The file format of the video. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. 
""" return _video_format_lookup[self.media_type] ``` #### url ```python url: str ``` The URL of the video. #### kind ```python kind: Literal['video-url'] = 'video-url' ``` Type identifier, this is available on all parts as a discriminator. #### is_youtube ```python is_youtube: bool ``` True if the URL has a YouTube domain. #### format ```python format: VideoFormat ``` The file format of the video. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. ### AudioUrl Bases: `FileUrl` A URL to an audio file. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @pydantic_dataclass(repr=False, config=pydantic.ConfigDict(validate_by_name=True)) class AudioUrl(FileUrl): """A URL to an audio file.""" url: str """The URL of the audio file.""" _: KW_ONLY kind: Literal['audio-url'] = 'audio-url' """Type identifier, this is available on all parts as a discriminator.""" # `pydantic_dataclass` replaces `__init__` so this method is never used. # The signature is kept so that pyright/IDE hints recognize the aliases for the `_media_type` and `_identifier` fields. def __init__( self, url: str, *, media_type: str | None = None, identifier: str | None = None, force_download: ForceDownloadMode = False, vendor_metadata: dict[str, Any] | None = None, kind: Literal['audio-url'] = 'audio-url', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, _identifier: str | None = None, ) -> None: ... # pragma: no cover def _infer_media_type(self) -> str: """Return the media type of the audio file, based on the url. References: - Gemini: https://ai.google.dev/gemini-api/docs/audio#supported-formats """ mime_type, _ = _mime_types.guess_type(self.url) if mime_type is None: raise ValueError( f'Could not infer media type from audio URL: {self.url}. Explicitly provide a `media_type` instead.' ) return mime_type @property def format(self) -> AudioFormat: """The file format of the audio file.""" return _audio_format_lookup[self.media_type] ``` #### url ```python url: str ``` The URL of the audio file. #### kind ```python kind: Literal['audio-url'] = 'audio-url' ``` Type identifier, this is available on all parts as a discriminator. #### format ```python format: AudioFormat ``` The file format of the audio file. ### ImageUrl Bases: `FileUrl` A URL to an image. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @pydantic_dataclass(repr=False, config=pydantic.ConfigDict(validate_by_name=True)) class ImageUrl(FileUrl): """A URL to an image.""" url: str """The URL of the image.""" _: KW_ONLY kind: Literal['image-url'] = 'image-url' """Type identifier, this is available on all parts as a discriminator.""" # `pydantic_dataclass` replaces `__init__` so this method is never used. # The signature is kept so that pyright/IDE hints recognize the aliases for the `_media_type` and `_identifier` fields. def __init__( self, url: str, *, media_type: str | None = None, identifier: str | None = None, force_download: ForceDownloadMode = False, vendor_metadata: dict[str, Any] | None = None, kind: Literal['image-url'] = 'image-url', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, _identifier: str | None = None, ) -> None: ... 
# pragma: no cover def _infer_media_type(self) -> str: """Return the media type of the image, based on the url.""" mime_type, _ = _mime_types.guess_type(self.url) if mime_type is None: raise ValueError( f'Could not infer media type from image URL: {self.url}. Explicitly provide a `media_type` instead.' ) return mime_type @property def format(self) -> ImageFormat: """The file format of the image. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. """ return _image_format_lookup[self.media_type] ``` #### url ```python url: str ``` The URL of the image. #### kind ```python kind: Literal['image-url'] = 'image-url' ``` Type identifier, this is available on all parts as a discriminator. #### format ```python format: ImageFormat ``` The file format of the image. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. ### DocumentUrl Bases: `FileUrl` The URL of the document. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @pydantic_dataclass(repr=False, config=pydantic.ConfigDict(validate_by_name=True)) class DocumentUrl(FileUrl): """The URL of the document.""" url: str """The URL of the document.""" _: KW_ONLY kind: Literal['document-url'] = 'document-url' """Type identifier, this is available on all parts as a discriminator.""" # `pydantic_dataclass` replaces `__init__` so this method is never used. # The signature is kept so that pyright/IDE hints recognize the aliases for the `_media_type` and `_identifier` fields. def __init__( self, url: str, *, media_type: str | None = None, identifier: str | None = None, force_download: ForceDownloadMode = False, vendor_metadata: dict[str, Any] | None = None, kind: Literal['document-url'] = 'document-url', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, _identifier: str | None = None, ) -> None: ... # pragma: no cover def _infer_media_type(self) -> str: """Return the media type of the document, based on the url.""" mime_type, _ = _mime_types.guess_type(self.url) if mime_type is None: raise ValueError( f'Could not infer media type from document URL: {self.url}. Explicitly provide a `media_type` instead.' ) return mime_type @property def format(self) -> DocumentFormat: """The file format of the document. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. """ media_type = self.media_type try: return _document_format_lookup[media_type] except KeyError as e: raise ValueError(f'Unknown document media type: {media_type}') from e ``` #### url ```python url: str ``` The URL of the document. #### kind ```python kind: Literal['document-url'] = 'document-url' ``` Type identifier, this is available on all parts as a discriminator. #### format ```python format: DocumentFormat ``` The file format of the document. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. ### BinaryContent Binary content, e.g. an audio or image file. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @pydantic_dataclass( repr=False, config=pydantic.ConfigDict( ser_json_bytes='base64', val_json_bytes='base64', ), ) class BinaryContent: """Binary content, e.g. an audio or image file.""" data: bytes """The binary file data. Use `.base64` to get the base64-encoded string. 
""" _: KW_ONLY media_type: AudioMediaType | ImageMediaType | DocumentMediaType | str """The media type of the binary data.""" vendor_metadata: dict[str, Any] | None = None """Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `BinaryContent.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing - `OpenAIChatModel`, `OpenAIResponsesModel`: `BinaryContent.vendor_metadata['detail']` is used as `detail` setting for images - `XaiModel`: `BinaryContent.vendor_metadata['detail']` is used as `detail` setting for images """ _identifier: Annotated[str | None, pydantic.Field(alias='identifier', default=None, exclude=True)] = field( compare=False, default=None ) kind: Literal['binary'] = 'binary' """Type identifier, this is available on all parts as a discriminator.""" # `pydantic_dataclass` replaces `__init__` so this method is never used. # The signature is kept so that pyright/IDE hints recognize the `identifier` alias for the `_identifier` field. def __init__( self, data: bytes, *, media_type: AudioMediaType | ImageMediaType | DocumentMediaType | str, identifier: str | None = None, vendor_metadata: dict[str, Any] | None = None, kind: Literal['binary'] = 'binary', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _identifier: str | None = None, ) -> None: ... # pragma: no cover @staticmethod def narrow_type(bc: BinaryContent) -> BinaryContent | BinaryImage: """Narrow the type of the `BinaryContent` to `BinaryImage` if it's an image.""" if bc.is_image: return BinaryImage( data=bc.data, media_type=bc.media_type, identifier=bc.identifier, vendor_metadata=bc.vendor_metadata, ) else: return bc @classmethod def from_data_uri(cls, data_uri: str) -> BinaryContent: """Create a `BinaryContent` from a data URI.""" prefix = 'data:' if not data_uri.startswith(prefix): raise ValueError('Data URI must start with "data:"') media_type, data = data_uri[len(prefix) :].split(';base64,', 1) return cls.narrow_type(cls(data=base64.b64decode(data), media_type=media_type)) @classmethod def from_path(cls, path: PathLike[str]) -> BinaryContent: """Create a `BinaryContent` from a path. Defaults to 'application/octet-stream' if the media type cannot be inferred. Raises: FileNotFoundError: if the file does not exist. PermissionError: if the file cannot be read. """ path = Path(path) if not path.exists(): raise FileNotFoundError(f'File not found: {path}') media_type, _ = _mime_types.guess_type(path) if media_type is None: media_type = 'application/octet-stream' return cls.narrow_type(cls(data=path.read_bytes(), media_type=media_type)) @pydantic.computed_field @property def identifier(self) -> str: """Identifier for the binary content, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `BinaryContent`. This identifier is only automatically passed to the model when the `BinaryContent` is returned by a tool. If you're passing the `BinaryContent` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `BinaryContent`. It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files. 
""" return self._identifier or _multi_modal_content_identifier(self.data) @property def data_uri(self) -> str: """Convert the `BinaryContent` to a data URI.""" return f'data:{self.media_type};base64,{self.base64}' @property def base64(self) -> str: """Return the binary data as a base64-encoded string. Default encoding is UTF-8.""" return base64.b64encode(self.data).decode() @property def is_audio(self) -> bool: """Return `True` if the media type is an audio type.""" return self.media_type.startswith('audio/') @property def is_image(self) -> bool: """Return `True` if the media type is an image type.""" return self.media_type.startswith('image/') @property def is_video(self) -> bool: """Return `True` if the media type is a video type.""" return self.media_type.startswith('video/') @property def is_document(self) -> bool: """Return `True` if the media type is a document type.""" return self.media_type in _document_format_lookup @property def format(self) -> str: """The file format of the binary content.""" try: if self.is_audio: return _audio_format_lookup[self.media_type] elif self.is_image: return _image_format_lookup[self.media_type] elif self.is_video: return _video_format_lookup[self.media_type] else: return _document_format_lookup[self.media_type] except KeyError as e: raise ValueError(f'Unknown media type: {self.media_type}') from e __repr__ = _utils.dataclasses_no_defaults_repr ``` #### data ```python data: bytes ``` The binary file data. Use `.base64` to get the base64-encoded string. #### media_type ```python media_type: ( AudioMediaType | ImageMediaType | DocumentMediaType | str ) ``` The media type of the binary data. #### vendor_metadata ```python vendor_metadata: dict[str, Any] | None = None ``` Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `BinaryContent.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing - `OpenAIChatModel`, `OpenAIResponsesModel`: `BinaryContent.vendor_metadata['detail']` is used as `detail` setting for images - `XaiModel`: `BinaryContent.vendor_metadata['detail']` is used as `detail` setting for images #### kind ```python kind: Literal['binary'] = 'binary' ``` Type identifier, this is available on all parts as a discriminator. #### narrow_type ```python narrow_type( bc: BinaryContent, ) -> BinaryContent | BinaryImage ``` Narrow the type of the `BinaryContent` to `BinaryImage` if it's an image. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @staticmethod def narrow_type(bc: BinaryContent) -> BinaryContent | BinaryImage: """Narrow the type of the `BinaryContent` to `BinaryImage` if it's an image.""" if bc.is_image: return BinaryImage( data=bc.data, media_type=bc.media_type, identifier=bc.identifier, vendor_metadata=bc.vendor_metadata, ) else: return bc ``` #### from_data_uri ```python from_data_uri(data_uri: str) -> BinaryContent ``` Create a `BinaryContent` from a data URI. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @classmethod def from_data_uri(cls, data_uri: str) -> BinaryContent: """Create a `BinaryContent` from a data URI.""" prefix = 'data:' if not data_uri.startswith(prefix): raise ValueError('Data URI must start with "data:"') media_type, data = data_uri[len(prefix) :].split(';base64,', 1) return cls.narrow_type(cls(data=base64.b64decode(data), media_type=media_type)) ``` #### from_path ```python from_path(path: PathLike[str]) -> BinaryContent ``` Create a `BinaryContent` from a path. 
Defaults to 'application/octet-stream' if the media type cannot be inferred. Raises: | Type | Description | | ------------------- | --------------------------- | | `FileNotFoundError` | if the file does not exist. | | `PermissionError` | if the file cannot be read. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @classmethod def from_path(cls, path: PathLike[str]) -> BinaryContent: """Create a `BinaryContent` from a path. Defaults to 'application/octet-stream' if the media type cannot be inferred. Raises: FileNotFoundError: if the file does not exist. PermissionError: if the file cannot be read. """ path = Path(path) if not path.exists(): raise FileNotFoundError(f'File not found: {path}') media_type, _ = _mime_types.guess_type(path) if media_type is None: media_type = 'application/octet-stream' return cls.narrow_type(cls(data=path.read_bytes(), media_type=media_type)) ``` #### identifier ```python identifier: str ``` Identifier for the binary content, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `BinaryContent`. This identifier is only automatically passed to the model when the `BinaryContent` is returned by a tool. If you're passing the `BinaryContent` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `BinaryContent`. It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files. #### data_uri ```python data_uri: str ``` Convert the `BinaryContent` to a data URI. #### base64 ```python base64: str ``` Return the binary data as a base64-encoded string. Default encoding is UTF-8. #### is_audio ```python is_audio: bool ``` Return `True` if the media type is an audio type. #### is_image ```python is_image: bool ``` Return `True` if the media type is an image type. #### is_video ```python is_video: bool ``` Return `True` if the media type is a video type. #### is_document ```python is_document: bool ``` Return `True` if the media type is a document type. #### format ```python format: str ``` The file format of the binary content. ### BinaryImage Bases: `BinaryContent` Binary content that's guaranteed to be an image. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @pydantic_dataclass( repr=False, config=pydantic.ConfigDict( ser_json_bytes='base64', val_json_bytes='base64', ), ) class BinaryImage(BinaryContent): """Binary content that's guaranteed to be an image.""" # `pydantic_dataclass` replaces `__init__` so this method is never used. # The signature is kept so that pyright/IDE hints recognize the `identifier` alias for the `_identifier` field. def __init__( self, data: bytes, *, media_type: ImageMediaType | str, identifier: str | None = None, vendor_metadata: dict[str, Any] | None = None, kind: Literal['binary'] = 'binary', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _identifier: str | None = None, ) -> None: ... # pragma: no cover def __post_init__(self): if not self.is_image: raise ValueError('`BinaryImage` must have a media type that starts with "image/"') ``` ### CachePoint A cache point marker for prompt caching. Can be inserted into UserPromptPart.content to mark cache boundaries. 
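For illustration, here is a minimal sketch of placing a cache boundary inside a user prompt; `reference_text` is a hypothetical placeholder for a large block of context that is reused across runs:

```python
from pydantic_ai.messages import CachePoint, UserPromptPart

reference_text = '...'  # hypothetical large, reused context block

part = UserPromptPart(
    content=[
        reference_text,
        CachePoint(ttl='1h'),  # everything before this marker is eligible for caching
        'Summarize the key points above.',
    ]
)
```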
Models that don't support caching will filter these out. Supported by: - Anthropic - Amazon Bedrock (Converse API) Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass class CachePoint: """A cache point marker for prompt caching. Can be inserted into UserPromptPart.content to mark cache boundaries. Models that don't support caching will filter these out. Supported by: - Anthropic - Amazon Bedrock (Converse API) """ kind: Literal['cache-point'] = 'cache-point' """Type identifier, this is available on all parts as a discriminator.""" ttl: Literal['5m', '1h'] = '5m' """The cache time-to-live, either "5m" (5 minutes) or "1h" (1 hour). Supported by: * Anthropic (automatically omitted for Bedrock, as it does not support explicit TTL). See https://docs.claude.com/en/docs/build-with-claude/prompt-caching#1-hour-cache-duration for more information.""" ``` #### kind ```python kind: Literal['cache-point'] = 'cache-point' ``` Type identifier, this is available on all parts as a discriminator. #### ttl ```python ttl: Literal['5m', '1h'] = '5m' ``` The cache time-to-live, either "5m" (5 minutes) or "1h" (1 hour). Supported by: - Anthropic (automatically omitted for Bedrock, as it does not support explicit TTL). See https://docs.claude.com/en/docs/build-with-claude/prompt-caching#1-hour-cache-duration for more information. ### MULTI_MODAL_CONTENT_TYPES ```python MULTI_MODAL_CONTENT_TYPES = ( ImageUrl, AudioUrl, DocumentUrl, VideoUrl, BinaryContent, ) ``` Tuple of multi-modal content types for use with isinstance() checks. ### MultiModalContent ```python MultiModalContent = Annotated[ ImageUrl | AudioUrl | DocumentUrl | VideoUrl | BinaryContent, Discriminator("kind"), ] ``` Union of all multi-modal content types with a discriminator for Pydantic validation. ### ToolReturn A structured return value for tools that need to provide both a return value and custom content to the model. This class allows tools to return complex responses that include: - A return value for actual tool return - Custom content (including multi-modal content) to be sent to the model as a UserPromptPart - Optional metadata for application use Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ToolReturn: """A structured return value for tools that need to provide both a return value and custom content to the model. This class allows tools to return complex responses that include: - A return value for actual tool return - Custom content (including multi-modal content) to be sent to the model as a UserPromptPart - Optional metadata for application use """ return_value: ToolReturnContent """The return value to be used in the tool response.""" _: KW_ONLY content: str | Sequence[UserContent] | None = None """The content to be sent to the model as a UserPromptPart.""" metadata: Any = None """Additional data that can be accessed programmatically by the application but is not sent to the LLM.""" kind: Literal['tool-return'] = 'tool-return' __repr__ = _utils.dataclasses_no_defaults_repr ``` #### return_value ```python return_value: ToolReturnContent ``` The return value to be used in the tool response. #### content ```python content: str | Sequence[UserContent] | None = None ``` The content to be sent to the model as a UserPromptPart. #### metadata ```python metadata: Any = None ``` Additional data that can be accessed programmatically by the application but is not sent to the LLM. ### UserPromptPart A user prompt, generally written by the end user. 
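Content can be a plain string or a sequence mixing text with multi-modal parts; a minimal sketch (the URL is a hypothetical placeholder):

```python
from pydantic_ai.messages import ImageUrl, UserPromptPart

part = UserPromptPart(
    content=[
        'What is shown in this image?',
        ImageUrl(url='https://example.com/photo.jpg'),  # hypothetical image URL
    ]
)
```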
Content comes from the `user_prompt` parameter of Agent.run, Agent.run_sync, and Agent.run_stream. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class UserPromptPart: """A user prompt, generally written by the end user. Content comes from the `user_prompt` parameter of [`Agent.run`][pydantic_ai.agent.AbstractAgent.run], [`Agent.run_sync`][pydantic_ai.agent.AbstractAgent.run_sync], and [`Agent.run_stream`][pydantic_ai.agent.AbstractAgent.run_stream]. """ content: str | Sequence[UserContent] """The content of the prompt.""" _: KW_ONLY timestamp: datetime = field(default_factory=_now_utc) """The timestamp of the prompt.""" part_kind: Literal['user-prompt'] = 'user-prompt' """Part type identifier, this is available on all parts as a discriminator.""" def otel_event(self, settings: InstrumentationSettings) -> LogRecord: content: Any = [{'kind': part.pop('type'), **part} for part in self.otel_message_parts(settings)] for part in content: if part['kind'] == 'binary' and 'content' in part: part['binary_content'] = part.pop('content') content = [ part['content'] if part == {'kind': 'text', 'content': part.get('content')} else part for part in content ] if content in ([{'kind': 'text'}], [self.content]): content = content[0] return LogRecord(attributes={'event.name': 'gen_ai.user.message'}, body={'content': content, 'role': 'user'}) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: parts: list[_otel_messages.MessagePart] = [] content: Sequence[UserContent] = [self.content] if isinstance(self.content, str) else self.content for part in content: if isinstance(part, str): parts.append( _otel_messages.TextPart(type='text', **({'content': part} if settings.include_content else {})) ) elif isinstance(part, ImageUrl | AudioUrl | DocumentUrl | VideoUrl): parts.append( _otel_messages.MediaUrlPart( type=part.kind, **{'url': part.url} if settings.include_content else {}, ) ) elif isinstance(part, BinaryContent): converted_part = _otel_messages.BinaryDataPart(type='binary', media_type=part.media_type) if settings.include_content and settings.include_binary_content: converted_part['content'] = part.base64 parts.append(converted_part) elif isinstance(part, CachePoint): # CachePoint is a marker, not actual content - skip it for otel pass else: parts.append({'type': part.kind}) # pragma: no cover return parts __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str | Sequence[UserContent] ``` The content of the prompt. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp of the prompt. #### part_kind ```python part_kind: Literal['user-prompt'] = 'user-prompt' ``` Part type identifier, this is available on all parts as a discriminator. ### BaseToolReturnPart Base class for tool return parts. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BaseToolReturnPart: """Base class for tool return parts.""" tool_name: str """The name of the "tool" was called.""" content: ToolReturnContent """The return value.""" tool_call_id: str = field(default_factory=_generate_tool_call_id) """The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. 
""" _: KW_ONLY metadata: Any = None """Additional data that can be accessed programmatically by the application but is not sent to the LLM.""" timestamp: datetime = field(default_factory=_now_utc) """The timestamp, when the tool returned.""" def model_response_str(self) -> str: """Return a string representation of the content for the model.""" if isinstance(self.content, str): return self.content else: return tool_return_ta.dump_json(self.content).decode() def model_response_object(self) -> dict[str, Any]: """Return a dictionary representation of the content, wrapping non-dict types appropriately.""" # gemini supports JSON dict return values, but no other JSON types, hence we wrap anything else in a dict json_content = tool_return_ta.dump_python(self.content, mode='json') if isinstance(json_content, dict): return json_content # type: ignore[reportUnknownReturn] else: return {'return_value': json_content} def otel_event(self, settings: InstrumentationSettings) -> LogRecord: body: AnyValue = { 'role': 'tool', 'id': self.tool_call_id, 'name': self.tool_name, } if settings.include_content: body['content'] = self.content # pyright: ignore[reportArgumentType] return LogRecord( body=body, attributes={'event.name': 'gen_ai.tool.message'}, ) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: from .models.instrumented import InstrumentedModel part = _otel_messages.ToolCallResponsePart( type='tool_call_response', id=self.tool_call_id, name=self.tool_name, ) if settings.include_content and self.content is not None: part['result'] = InstrumentedModel.serialize_any(self.content) return [part] def has_content(self) -> bool: """Return `True` if the tool return has content.""" return self.content is not None # pragma: no cover __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name ```python tool_name: str ``` The name of the "tool" was called. #### content ```python content: ToolReturnContent ``` The return value. #### tool_call_id ```python tool_call_id: str = field( default_factory=generate_tool_call_id ) ``` The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. #### metadata ```python metadata: Any = None ``` Additional data that can be accessed programmatically by the application but is not sent to the LLM. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp, when the tool returned. #### model_response_str ```python model_response_str() -> str ``` Return a string representation of the content for the model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def model_response_str(self) -> str: """Return a string representation of the content for the model.""" if isinstance(self.content, str): return self.content else: return tool_return_ta.dump_json(self.content).decode() ``` #### model_response_object ```python model_response_object() -> dict[str, Any] ``` Return a dictionary representation of the content, wrapping non-dict types appropriately. 
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def model_response_object(self) -> dict[str, Any]: """Return a dictionary representation of the content, wrapping non-dict types appropriately.""" # gemini supports JSON dict return values, but no other JSON types, hence we wrap anything else in a dict json_content = tool_return_ta.dump_python(self.content, mode='json') if isinstance(json_content, dict): return json_content # type: ignore[reportUnknownReturn] else: return {'return_value': json_content} ``` #### has_content ```python has_content() -> bool ``` Return `True` if the tool return has content. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the tool return has content.""" return self.content is not None # pragma: no cover ``` ### ToolReturnPart Bases: `BaseToolReturnPart` A tool return message, this encodes the result of running a tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ToolReturnPart(BaseToolReturnPart): """A tool return message, this encodes the result of running a tool.""" _: KW_ONLY part_kind: Literal['tool-return'] = 'tool-return' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### part_kind ```python part_kind: Literal['tool-return'] = 'tool-return' ``` Part type identifier, this is available on all parts as a discriminator. ### BuiltinToolReturnPart Bases: `BaseToolReturnPart` A tool return message from a built-in tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BuiltinToolReturnPart(BaseToolReturnPart): """A tool return message from a built-in tool.""" _: KW_ONLY provider_name: str | None = None """The name of the provider that generated the response. Required to be set when `provider_details` is set. """ provider_details: dict[str, Any] | None = None """Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data.""" part_kind: Literal['builtin-tool-return'] = 'builtin-tool-return' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. Required to be set when `provider_details` is set. #### provider_details ```python provider_details: dict[str, Any] | None = None ``` Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. #### part_kind ```python part_kind: Literal["builtin-tool-return"] = ( "builtin-tool-return" ) ``` Part type identifier, this is available on all parts as a discriminator. ### RetryPromptPart A message back to a model asking it to try again. 
This can be sent for a number of reasons: - Pydantic validation of tool arguments failed, here content is derived from a Pydantic ValidationError - a tool raised a ModelRetry exception - no tool was found for the tool name - the model returned plain text when a structured response was expected - Pydantic validation of a structured response failed, here content is derived from a Pydantic ValidationError - an output validator raised a ModelRetry exception Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ````python @dataclass(repr=False) class RetryPromptPart: """A message back to a model asking it to try again. This can be sent for a number of reasons: * Pydantic validation of tool arguments failed, here content is derived from a Pydantic [`ValidationError`][pydantic_core.ValidationError] * a tool raised a [`ModelRetry`][pydantic_ai.exceptions.ModelRetry] exception * no tool was found for the tool name * the model returned plain text when a structured response was expected * Pydantic validation of a structured response failed, here content is derived from a Pydantic [`ValidationError`][pydantic_core.ValidationError] * an output validator raised a [`ModelRetry`][pydantic_ai.exceptions.ModelRetry] exception """ content: list[pydantic_core.ErrorDetails] | str """Details of why and how the model should retry. If the retry was triggered by a [`ValidationError`][pydantic_core.ValidationError], this will be a list of error details. """ _: KW_ONLY tool_name: str | None = None """The name of the tool that was called, if any.""" tool_call_id: str = field(default_factory=_generate_tool_call_id) """The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. """ timestamp: datetime = field(default_factory=_now_utc) """The timestamp, when the retry was triggered.""" part_kind: Literal['retry-prompt'] = 'retry-prompt' """Part type identifier, this is available on all parts as a discriminator.""" def model_response(self) -> str: """Return a string message describing why the retry is requested.""" if isinstance(self.content, str): if self.tool_name is None: description = f'Validation feedback:\n{self.content}' else: description = self.content else: json_errors = error_details_ta.dump_json(self.content, exclude={'__all__': {'ctx'}}, indent=2) plural = isinstance(self.content, list) and len(self.content) != 1 description = ( f'{len(self.content)} validation error{"s" if plural else ""}:\n```json\n{json_errors.decode()}\n```' ) return f'{description}\n\nFix the errors and try again.' 
def otel_event(self, settings: InstrumentationSettings) -> LogRecord: if self.tool_name is None: return LogRecord( attributes={'event.name': 'gen_ai.user.message'}, body={'content': self.model_response(), 'role': 'user'}, ) else: return LogRecord( attributes={'event.name': 'gen_ai.tool.message'}, body={ **({'content': self.model_response()} if settings.include_content else {}), 'role': 'tool', 'id': self.tool_call_id, 'name': self.tool_name, }, ) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: if self.tool_name is None: return [_otel_messages.TextPart(type='text', content=self.model_response())] else: part = _otel_messages.ToolCallResponsePart( type='tool_call_response', id=self.tool_call_id, name=self.tool_name, ) if settings.include_content: part['result'] = self.model_response() return [part] __repr__ = _utils.dataclasses_no_defaults_repr ```` #### content ```python content: list[ErrorDetails] | str ``` Details of why and how the model should retry. If the retry was triggered by a ValidationError, this will be a list of error details. #### tool_name ```python tool_name: str | None = None ``` The name of the tool that was called, if any. #### tool_call_id ```python tool_call_id: str = field( default_factory=generate_tool_call_id ) ``` The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp, when the retry was triggered. #### part_kind ```python part_kind: Literal['retry-prompt'] = 'retry-prompt' ``` Part type identifier, this is available on all parts as a discriminator. #### model_response ```python model_response() -> str ``` Return a string message describing why the retry is requested. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ````python def model_response(self) -> str: """Return a string message describing why the retry is requested.""" if isinstance(self.content, str): if self.tool_name is None: description = f'Validation feedback:\n{self.content}' else: description = self.content else: json_errors = error_details_ta.dump_json(self.content, exclude={'__all__': {'ctx'}}, indent=2) plural = isinstance(self.content, list) and len(self.content) != 1 description = ( f'{len(self.content)} validation error{"s" if plural else ""}:\n```json\n{json_errors.decode()}\n```' ) return f'{description}\n\nFix the errors and try again.' ```` ### ModelRequestPart ```python ModelRequestPart = Annotated[ SystemPromptPart | UserPromptPart | ToolReturnPart | RetryPromptPart, Discriminator("part_kind"), ] ``` A message part sent by Pydantic AI to a model. ### ModelRequest A request generated by Pydantic AI and sent to a model, e.g. a message from the Pydantic AI app to the model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ModelRequest: """A request generated by Pydantic AI and sent to a model, e.g. a message from the Pydantic AI app to the model.""" parts: Sequence[ModelRequestPart] """The parts of the user message.""" _: KW_ONLY # Default is None for backwards compatibility with old serialized messages that don't have this field. # Using a default_factory would incorrectly fill in the current time for deserialized historical messages. 
timestamp: datetime | None = None """The timestamp when the request was sent to the model.""" instructions: str | None = None """The instructions for the model.""" kind: Literal['request'] = 'request' """Message type identifier, this is available on all parts as a discriminator.""" run_id: str | None = None """The unique identifier of the agent run in which this message originated.""" metadata: dict[str, Any] | None = None """Additional data that can be accessed programmatically by the application but is not sent to the LLM.""" @classmethod def user_text_prompt(cls, user_prompt: str, *, instructions: str | None = None) -> ModelRequest: """Create a `ModelRequest` with a single user prompt as text.""" return cls(parts=[UserPromptPart(user_prompt)], instructions=instructions) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### parts ```python parts: Sequence[ModelRequestPart] ``` The parts of the user message. #### timestamp ```python timestamp: datetime | None = None ``` The timestamp when the request was sent to the model. #### instructions ```python instructions: str | None = None ``` The instructions for the model. #### kind ```python kind: Literal['request'] = 'request' ``` Message type identifier, this is available on all parts as a discriminator. #### run_id ```python run_id: str | None = None ``` The unique identifier of the agent run in which this message originated. #### metadata ```python metadata: dict[str, Any] | None = None ``` Additional data that can be accessed programmatically by the application but is not sent to the LLM. #### user_text_prompt ```python user_text_prompt( user_prompt: str, *, instructions: str | None = None ) -> ModelRequest ``` Create a `ModelRequest` with a single user prompt as text. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @classmethod def user_text_prompt(cls, user_prompt: str, *, instructions: str | None = None) -> ModelRequest: """Create a `ModelRequest` with a single user prompt as text.""" return cls(parts=[UserPromptPart(user_prompt)], instructions=instructions) ``` ### TextPart A plain text response from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class TextPart: """A plain text response from a model.""" content: str """The text content of the response.""" _: KW_ONLY id: str | None = None """An optional identifier of the text part. When this field is set, `provider_name` is required to identify the provider that generated this data. """ provider_name: str | None = None """The name of the provider that generated the response. Required to be set when `provider_details` or `id` is set. """ provider_details: dict[str, Any] | None = None """Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. """ part_kind: Literal['text'] = 'text' """Part type identifier, this is available on all parts as a discriminator.""" def has_content(self) -> bool: """Return `True` if the text content is non-empty.""" return bool(self.content) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str ``` The text content of the response. #### id ```python id: str | None = None ``` An optional identifier of the text part. 
When this field is set, `provider_name` is required to identify the provider that generated this data. #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. Required to be set when `provider_details` or `id` is set. #### provider_details ```python provider_details: dict[str, Any] | None = None ``` Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. #### part_kind ```python part_kind: Literal['text'] = 'text' ``` Part type identifier, this is available on all parts as a discriminator. #### has_content ```python has_content() -> bool ``` Return `True` if the text content is non-empty. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the text content is non-empty.""" return bool(self.content) ``` ### ThinkingPart A thinking response from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ThinkingPart: """A thinking response from a model.""" content: str """The thinking content of the response.""" _: KW_ONLY id: str | None = None """The identifier of the thinking part. When this field is set, `provider_name` is required to identify the provider that generated this data. """ signature: str | None = None """The signature of the thinking. Supported by: * Anthropic (corresponds to the `signature` field) * Bedrock (corresponds to the `signature` field) * Google (corresponds to the `thought_signature` field) * OpenAI (corresponds to the `encrypted_content` field) When this field is set, `provider_name` is required to identify the provider that generated this data. """ provider_name: str | None = None """The name of the provider that generated the response. Signatures are only sent back to the same provider. Required to be set when `provider_details`, `id` or `signature` is set. """ provider_details: dict[str, Any] | None = None """Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. """ part_kind: Literal['thinking'] = 'thinking' """Part type identifier, this is available on all parts as a discriminator.""" def has_content(self) -> bool: """Return `True` if the thinking content is non-empty.""" return bool(self.content) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str ``` The thinking content of the response. #### id ```python id: str | None = None ``` The identifier of the thinking part. When this field is set, `provider_name` is required to identify the provider that generated this data. #### signature ```python signature: str | None = None ``` The signature of the thinking. Supported by: - Anthropic (corresponds to the `signature` field) - Bedrock (corresponds to the `signature` field) - Google (corresponds to the `thought_signature` field) - OpenAI (corresponds to the `encrypted_content` field) When this field is set, `provider_name` is required to identify the provider that generated this data. 
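A minimal sketch of a thinking part carrying a signature (the values are hypothetical; in practice both come back from the provider rather than being hand-written):

```python
from pydantic_ai.messages import ThinkingPart

part = ThinkingPart(
    content='Reasoning about the request...',  # hypothetical thinking text
    signature='sig_abc123',                     # hypothetical opaque signature from the provider
    provider_name='anthropic',                  # required whenever `signature` is set
)
```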
#### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. Signatures are only sent back to the same provider. Required to be set when `provider_details`, `id` or `signature` is set. #### provider_details ```python provider_details: dict[str, Any] | None = None ``` Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. #### part_kind ```python part_kind: Literal['thinking'] = 'thinking' ``` Part type identifier, this is available on all parts as a discriminator. #### has_content ```python has_content() -> bool ``` Return `True` if the thinking content is non-empty. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the thinking content is non-empty.""" return bool(self.content) ``` ### FilePart A file response from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class FilePart: """A file response from a model.""" content: Annotated[BinaryContent, pydantic.AfterValidator(BinaryImage.narrow_type)] """The file content of the response.""" _: KW_ONLY id: str | None = None """The identifier of the file part. When this field is set, `provider_name` is required to identify the provider that generated this data. """ provider_name: str | None = None """The name of the provider that generated the response. Required to be set when `provider_details` or `id` is set. """ provider_details: dict[str, Any] | None = None """Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. """ part_kind: Literal['file'] = 'file' """Part type identifier, this is available on all parts as a discriminator.""" def has_content(self) -> bool: """Return `True` if the file content is non-empty.""" return bool(self.content.data) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: Annotated[ BinaryContent, AfterValidator(narrow_type) ] ``` The file content of the response. #### id ```python id: str | None = None ``` The identifier of the file part. When this field is set, `provider_name` is required to identify the provider that generated this data. #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. Required to be set when `provider_details` or `id` is set. #### provider_details ```python provider_details: dict[str, Any] | None = None ``` Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. #### part_kind ```python part_kind: Literal['file'] = 'file' ``` Part type identifier, this is available on all parts as a discriminator. #### has_content ```python has_content() -> bool ``` Return `True` if the file content is non-empty. 
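A minimal sketch of constructing a file part and checking it for content (the bytes are a hypothetical, truncated placeholder):

```python
from pydantic_ai.messages import BinaryImage, FilePart

image = BinaryImage(data=b'\x89PNG...', media_type='image/png')  # hypothetical image bytes
part = FilePart(content=image)
print(part.has_content())  # True, since the image data is non-empty
```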
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the file content is non-empty.""" return bool(self.content.data) ``` ### BaseToolCallPart A tool call from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BaseToolCallPart: """A tool call from a model.""" tool_name: str """The name of the tool to call.""" args: str | dict[str, Any] | None = None """The arguments to pass to the tool. This is stored either as a JSON string or a Python dictionary depending on how data was received. """ tool_call_id: str = field(default_factory=_generate_tool_call_id) """The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. """ _: KW_ONLY id: str | None = None """An optional identifier of the tool call part, separate from the tool call ID. This is used by some APIs like OpenAI Responses. When this field is set, `provider_name` is required to identify the provider that generated this data. """ provider_name: str | None = None """The name of the provider that generated the response. Builtin tool calls are only sent back to the same provider. Required to be set when `provider_details` or `id` is set. """ provider_details: dict[str, Any] | None = None """Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. """ def args_as_dict(self) -> dict[str, Any]: """Return the arguments as a Python dictionary. This is just for convenience with models that require dicts as input. """ if not self.args: return {} if isinstance(self.args, dict): return self.args args = pydantic_core.from_json(self.args) assert isinstance(args, dict), 'args should be a dict' return cast(dict[str, Any], args) def args_as_json_str(self) -> str: """Return the arguments as a JSON string. This is just for convenience with models that require JSON strings as input. """ if not self.args: return '{}' if isinstance(self.args, str): return self.args return pydantic_core.to_json(self.args).decode() def has_content(self) -> bool: """Return `True` if the arguments contain any data.""" if isinstance(self.args, dict): # TODO: This should probably return True if you have the value False, or 0, etc. # It makes sense to me to ignore empty strings, but not sure about empty lists or dicts return any(self.args.values()) else: return bool(self.args) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name ```python tool_name: str ``` The name of the tool to call. #### args ```python args: str | dict[str, Any] | None = None ``` The arguments to pass to the tool. This is stored either as a JSON string or a Python dictionary depending on how data was received. #### tool_call_id ```python tool_call_id: str = field( default_factory=generate_tool_call_id ) ``` The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. #### id ```python id: str | None = None ``` An optional identifier of the tool call part, separate from the tool call ID. This is used by some APIs like OpenAI Responses. 
When this field is set, `provider_name` is required to identify the provider that generated this data. #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. Builtin tool calls are only sent back to the same provider. Required to be set when `provider_details` or `id` is set. #### provider_details ```python provider_details: dict[str, Any] | None = None ``` Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. #### args_as_dict ```python args_as_dict() -> dict[str, Any] ``` Return the arguments as a Python dictionary. This is just for convenience with models that require dicts as input. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def args_as_dict(self) -> dict[str, Any]: """Return the arguments as a Python dictionary. This is just for convenience with models that require dicts as input. """ if not self.args: return {} if isinstance(self.args, dict): return self.args args = pydantic_core.from_json(self.args) assert isinstance(args, dict), 'args should be a dict' return cast(dict[str, Any], args) ``` #### args_as_json_str ```python args_as_json_str() -> str ``` Return the arguments as a JSON string. This is just for convenience with models that require JSON strings as input. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def args_as_json_str(self) -> str: """Return the arguments as a JSON string. This is just for convenience with models that require JSON strings as input. """ if not self.args: return '{}' if isinstance(self.args, str): return self.args return pydantic_core.to_json(self.args).decode() ``` #### has_content ```python has_content() -> bool ``` Return `True` if the arguments contain any data. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the arguments contain any data.""" if isinstance(self.args, dict): # TODO: This should probably return True if you have the value False, or 0, etc. # It makes sense to me to ignore empty strings, but not sure about empty lists or dicts return any(self.args.values()) else: return bool(self.args) ``` ### ToolCallPart Bases: `BaseToolCallPart` A tool call from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ToolCallPart(BaseToolCallPart): """A tool call from a model.""" _: KW_ONLY part_kind: Literal['tool-call'] = 'tool-call' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### part_kind ```python part_kind: Literal['tool-call'] = 'tool-call' ``` Part type identifier, this is available on all parts as a discriminator. ### BuiltinToolCallPart Bases: `BaseToolCallPart` A tool call to a built-in tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BuiltinToolCallPart(BaseToolCallPart): """A tool call to a built-in tool.""" _: KW_ONLY part_kind: Literal['builtin-tool-call'] = 'builtin-tool-call' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### part_kind ```python part_kind: Literal["builtin-tool-call"] = ( "builtin-tool-call" ) ``` Part type identifier, this is available on all parts as a discriminator. 
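To make the relationship between `args`, `args_as_dict()`, and `args_as_json_str()` concrete, here is a minimal sketch (not part of the generated reference) that constructs a `ToolCallPart` by hand; the tool name and arguments are illustrative, and in real runs these parts arrive inside a model response rather than being built manually:

```python
from pydantic_ai.messages import ToolCallPart

# Providers may return tool arguments either as a JSON string or as a dict;
# the part stores whichever form was received, and the helpers convert between them.
part = ToolCallPart('get_weather', '{"city": "Paris"}')  # hypothetical tool name/args

print(part.args_as_dict())      # {'city': 'Paris'}
print(part.args_as_json_str())  # '{"city": "Paris"}'
print(part.has_content())       # True
print(part.part_kind)           # 'tool-call'
```

A `tool_call_id` is generated automatically here because none was supplied, mirroring the behaviour described above for models that don't provide one.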
### ModelResponsePart ```python ModelResponsePart = Annotated[ TextPart | ToolCallPart | BuiltinToolCallPart | BuiltinToolReturnPart | ThinkingPart | FilePart, Discriminator("part_kind"), ] ``` A message part returned by a model. ### ModelResponse A response from a model, e.g. a message from the model to the Pydantic AI app. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ModelResponse: """A response from a model, e.g. a message from the model to the Pydantic AI app.""" parts: Sequence[ModelResponsePart] """The parts of the model message.""" _: KW_ONLY usage: RequestUsage = field(default_factory=RequestUsage) """Usage information for the request. This has a default to make tests easier, and to support loading old messages where usage will be missing. """ model_name: str | None = None """The name of the model that generated the response.""" timestamp: datetime = field(default_factory=_now_utc) """The timestamp when the response was received locally. This is always a high-precision local datetime. Provider-specific timestamps (if available) are stored in `provider_details['timestamp']`. """ kind: Literal['response'] = 'response' """Message type identifier, this is available on all parts as a discriminator.""" provider_name: str | None = None """The name of the LLM provider that generated the response.""" provider_url: str | None = None """The base URL of the LLM provider that generated the response.""" provider_details: Annotated[ dict[str, Any] | None, # `vendor_details` is deprecated, but we still want to support deserializing model responses stored in a DB before the name was changed pydantic.Field(validation_alias=pydantic.AliasChoices('provider_details', 'vendor_details')), ] = None """Additional data returned by the provider that can't be mapped to standard fields.""" provider_response_id: Annotated[ str | None, # `vendor_id` is deprecated, but we still want to support deserializing model responses stored in a DB before the name was changed pydantic.Field(validation_alias=pydantic.AliasChoices('provider_response_id', 'vendor_id')), ] = None """request ID as specified by the model provider. 
This can be used to track the specific request to the model.""" finish_reason: FinishReason | None = None """Reason the model finished generating the response, normalized to OpenTelemetry values.""" run_id: str | None = None """The unique identifier of the agent run in which this message originated.""" metadata: dict[str, Any] | None = None """Additional data that can be accessed programmatically by the application but is not sent to the LLM.""" @property def text(self) -> str | None: """Get the text in the response.""" texts: list[str] = [] last_part: ModelResponsePart | None = None for part in self.parts: if isinstance(part, TextPart): # Adjacent text parts should be joined together, but if there are parts in between # (like built-in tool calls) they should have newlines between them if isinstance(last_part, TextPart): texts[-1] += part.content else: texts.append(part.content) last_part = part if not texts: return None return '\n\n'.join(texts) @property def thinking(self) -> str | None: """Get the thinking in the response.""" thinking_parts = [part.content for part in self.parts if isinstance(part, ThinkingPart)] if not thinking_parts: return None return '\n\n'.join(thinking_parts) @property def files(self) -> list[BinaryContent]: """Get the files in the response.""" return [part.content for part in self.parts if isinstance(part, FilePart)] @property def images(self) -> list[BinaryImage]: """Get the images in the response.""" return [file for file in self.files if isinstance(file, BinaryImage)] @property def tool_calls(self) -> list[ToolCallPart]: """Get the tool calls in the response.""" return [part for part in self.parts if isinstance(part, ToolCallPart)] @property def builtin_tool_calls(self) -> list[tuple[BuiltinToolCallPart, BuiltinToolReturnPart]]: """Get the builtin tool calls and results in the response.""" calls = [part for part in self.parts if isinstance(part, BuiltinToolCallPart)] if not calls: return [] returns_by_id = {part.tool_call_id: part for part in self.parts if isinstance(part, BuiltinToolReturnPart)} return [ (call_part, returns_by_id[call_part.tool_call_id]) for call_part in calls if call_part.tool_call_id in returns_by_id ] @deprecated('`price` is deprecated, use `cost` instead') def price(self) -> genai_types.PriceCalculation: # pragma: no cover return self.cost() def cost(self) -> genai_types.PriceCalculation: """Calculate the cost of the usage. Uses [`genai-prices`](https://github.com/pydantic/genai-prices). """ assert self.model_name, 'Model name is required to calculate price' # Try matching on provider_api_url first as this is more specific, then fall back to provider_id. 
if self.provider_url: try: return calc_price( self.usage, self.model_name, provider_api_url=self.provider_url, genai_request_timestamp=self.timestamp, ) except LookupError: pass return calc_price( self.usage, self.model_name, provider_id=self.provider_name, genai_request_timestamp=self.timestamp, ) def otel_events(self, settings: InstrumentationSettings) -> list[LogRecord]: """Return OpenTelemetry events for the response.""" result: list[LogRecord] = [] def new_event_body(): new_body: dict[str, Any] = {'role': 'assistant'} ev = LogRecord(attributes={'event.name': 'gen_ai.assistant.message'}, body=new_body) result.append(ev) return new_body body = new_event_body() for part in self.parts: if isinstance(part, ToolCallPart): body.setdefault('tool_calls', []).append( { 'id': part.tool_call_id, 'type': 'function', 'function': { 'name': part.tool_name, **({'arguments': part.args} if settings.include_content else {}), }, } ) elif isinstance(part, TextPart | ThinkingPart): kind = part.part_kind body.setdefault('content', []).append( {'kind': kind, **({'text': part.content} if settings.include_content else {})} ) elif isinstance(part, FilePart): body.setdefault('content', []).append( { 'kind': 'binary', 'media_type': part.content.media_type, **( {'binary_content': part.content.base64} if settings.include_content and settings.include_binary_content else {} ), } ) if content := body.get('content'): text_content = content[0].get('text') if content == [{'kind': 'text', 'text': text_content}]: body['content'] = text_content return result def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: parts: list[_otel_messages.MessagePart] = [] for part in self.parts: if isinstance(part, TextPart): parts.append( _otel_messages.TextPart( type='text', **({'content': part.content} if settings.include_content else {}), ) ) elif isinstance(part, ThinkingPart): parts.append( _otel_messages.ThinkingPart( type='thinking', **({'content': part.content} if settings.include_content else {}), ) ) elif isinstance(part, FilePart): converted_part = _otel_messages.BinaryDataPart(type='binary', media_type=part.content.media_type) if settings.include_content and settings.include_binary_content: converted_part['content'] = part.content.base64 parts.append(converted_part) elif isinstance(part, BaseToolCallPart): call_part = _otel_messages.ToolCallPart(type='tool_call', id=part.tool_call_id, name=part.tool_name) if isinstance(part, BuiltinToolCallPart): call_part['builtin'] = True if settings.include_content and part.args is not None: from .models.instrumented import InstrumentedModel if isinstance(part.args, str): call_part['arguments'] = part.args else: call_part['arguments'] = {k: InstrumentedModel.serialize_any(v) for k, v in part.args.items()} parts.append(call_part) elif isinstance(part, BuiltinToolReturnPart): return_part = _otel_messages.ToolCallResponsePart( type='tool_call_response', id=part.tool_call_id, name=part.tool_name, builtin=True, ) if settings.include_content and part.content is not None: # pragma: no branch from .models.instrumented import InstrumentedModel return_part['result'] = InstrumentedModel.serialize_any(part.content) parts.append(return_part) return parts @property @deprecated('`vendor_details` is deprecated, use `provider_details` instead') def vendor_details(self) -> dict[str, Any] | None: return self.provider_details @property @deprecated('`vendor_id` is deprecated, use `provider_response_id` instead') def vendor_id(self) -> str | None: return 
self.provider_response_id @property @deprecated('`provider_request_id` is deprecated, use `provider_response_id` instead') def provider_request_id(self) -> str | None: return self.provider_response_id __repr__ = _utils.dataclasses_no_defaults_repr ``` #### parts ```python parts: Sequence[ModelResponsePart] ``` The parts of the model message. #### usage ```python usage: RequestUsage = field(default_factory=RequestUsage) ``` Usage information for the request. This has a default to make tests easier, and to support loading old messages where usage will be missing. #### model_name ```python model_name: str | None = None ``` The name of the model that generated the response. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp when the response was received locally. This is always a high-precision local datetime. Provider-specific timestamps (if available) are stored in `provider_details['timestamp']`. #### kind ```python kind: Literal['response'] = 'response' ``` Message type identifier, this is available on all parts as a discriminator. #### provider_name ```python provider_name: str | None = None ``` The name of the LLM provider that generated the response. #### provider_url ```python provider_url: str | None = None ``` The base URL of the LLM provider that generated the response. #### provider_details ```python provider_details: Annotated[ dict[str, Any] | None, Field( validation_alias=AliasChoices( provider_details, vendor_details ) ), ] = None ``` Additional data returned by the provider that can't be mapped to standard fields. #### provider_response_id ```python provider_response_id: Annotated[ str | None, Field( validation_alias=AliasChoices( provider_response_id, vendor_id ) ), ] = None ``` request ID as specified by the model provider. This can be used to track the specific request to the model. #### finish_reason ```python finish_reason: FinishReason | None = None ``` Reason the model finished generating the response, normalized to OpenTelemetry values. #### run_id ```python run_id: str | None = None ``` The unique identifier of the agent run in which this message originated. #### metadata ```python metadata: dict[str, Any] | None = None ``` Additional data that can be accessed programmatically by the application but is not sent to the LLM. #### text ```python text: str | None ``` Get the text in the response. #### thinking ```python thinking: str | None ``` Get the thinking in the response. #### files ```python files: list[BinaryContent] ``` Get the files in the response. #### images ```python images: list[BinaryImage] ``` Get the images in the response. #### tool_calls ```python tool_calls: list[ToolCallPart] ``` Get the tool calls in the response. #### builtin_tool_calls ```python builtin_tool_calls: list[ tuple[BuiltinToolCallPart, BuiltinToolReturnPart] ] ``` Get the builtin tool calls and results in the response. #### price ```python price() -> PriceCalculation ``` Deprecated `price` is deprecated, use `cost` instead Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @deprecated('`price` is deprecated, use `cost` instead') def price(self) -> genai_types.PriceCalculation: # pragma: no cover return self.cost() ``` #### cost ```python cost() -> PriceCalculation ``` Calculate the cost of the usage. Uses [`genai-prices`](https://github.com/pydantic/genai-prices). Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def cost(self) -> genai_types.PriceCalculation: """Calculate the cost of the usage. 
Uses [`genai-prices`](https://github.com/pydantic/genai-prices). """ assert self.model_name, 'Model name is required to calculate price' # Try matching on provider_api_url first as this is more specific, then fall back to provider_id. if self.provider_url: try: return calc_price( self.usage, self.model_name, provider_api_url=self.provider_url, genai_request_timestamp=self.timestamp, ) except LookupError: pass return calc_price( self.usage, self.model_name, provider_id=self.provider_name, genai_request_timestamp=self.timestamp, ) ``` #### otel_events ```python otel_events( settings: InstrumentationSettings, ) -> list[LogRecord] ``` Return OpenTelemetry events for the response. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def otel_events(self, settings: InstrumentationSettings) -> list[LogRecord]: """Return OpenTelemetry events for the response.""" result: list[LogRecord] = [] def new_event_body(): new_body: dict[str, Any] = {'role': 'assistant'} ev = LogRecord(attributes={'event.name': 'gen_ai.assistant.message'}, body=new_body) result.append(ev) return new_body body = new_event_body() for part in self.parts: if isinstance(part, ToolCallPart): body.setdefault('tool_calls', []).append( { 'id': part.tool_call_id, 'type': 'function', 'function': { 'name': part.tool_name, **({'arguments': part.args} if settings.include_content else {}), }, } ) elif isinstance(part, TextPart | ThinkingPart): kind = part.part_kind body.setdefault('content', []).append( {'kind': kind, **({'text': part.content} if settings.include_content else {})} ) elif isinstance(part, FilePart): body.setdefault('content', []).append( { 'kind': 'binary', 'media_type': part.content.media_type, **( {'binary_content': part.content.base64} if settings.include_content and settings.include_binary_content else {} ), } ) if content := body.get('content'): text_content = content[0].get('text') if content == [{'kind': 'text', 'text': text_content}]: body['content'] = text_content return result ``` ### ModelMessage ```python ModelMessage = Annotated[ ModelRequest | ModelResponse, Discriminator("kind") ] ``` Any message sent to or returned by a model. ### ModelMessagesTypeAdapter ```python ModelMessagesTypeAdapter = TypeAdapter( list[ModelMessage], config=ConfigDict( defer_build=True, ser_json_bytes="base64", val_json_bytes="base64", ), ) ``` Pydantic TypeAdapter for (de)serializing messages. ### TextPartDelta A partial update (delta) for a `TextPart` to append new text content. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class TextPartDelta: """A partial update (delta) for a `TextPart` to append new text content.""" content_delta: str """The incremental text content to add to the existing `TextPart` content.""" _: KW_ONLY provider_name: str | None = None """The name of the provider that generated the response. This is required to be set when `provider_details` is set and the initial TextPart does not have a `provider_name` or it has changed. """ provider_details: dict[str, Any] | None = None """Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. 
""" part_delta_kind: Literal['text'] = 'text' """Part delta type identifier, used as a discriminator.""" def apply(self, part: ModelResponsePart) -> TextPart: """Apply this text delta to an existing `TextPart`. Args: part: The existing model response part, which must be a `TextPart`. Returns: A new `TextPart` with updated text content. Raises: ValueError: If `part` is not a `TextPart`. """ if not isinstance(part, TextPart): raise ValueError('Cannot apply TextPartDeltas to non-TextParts') # pragma: no cover return replace( part, content=part.content + self.content_delta, provider_name=self.provider_name or part.provider_name, provider_details={**(part.provider_details or {}), **(self.provider_details or {})} or None, ) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content_delta ```python content_delta: str ``` The incremental text content to add to the existing `TextPart` content. #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. This is required to be set when `provider_details` is set and the initial TextPart does not have a `provider_name` or it has changed. #### provider_details ```python provider_details: dict[str, Any] | None = None ``` Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. #### part_delta_kind ```python part_delta_kind: Literal['text'] = 'text' ``` Part delta type identifier, used as a discriminator. #### apply ```python apply(part: ModelResponsePart) -> TextPart ``` Apply this text delta to an existing `TextPart`. Parameters: | Name | Type | Description | Default | | ------ | ------------------- | ----------------------------------------------------------- | ---------- | | `part` | `ModelResponsePart` | The existing model response part, which must be a TextPart. | *required* | Returns: | Type | Description | | ---------- | ----------------------------------------- | | `TextPart` | A new TextPart with updated text content. | Raises: | Type | Description | | ------------ | -------------------------- | | `ValueError` | If part is not a TextPart. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def apply(self, part: ModelResponsePart) -> TextPart: """Apply this text delta to an existing `TextPart`. Args: part: The existing model response part, which must be a `TextPart`. Returns: A new `TextPart` with updated text content. Raises: ValueError: If `part` is not a `TextPart`. """ if not isinstance(part, TextPart): raise ValueError('Cannot apply TextPartDeltas to non-TextParts') # pragma: no cover return replace( part, content=part.content + self.content_delta, provider_name=self.provider_name or part.provider_name, provider_details={**(part.provider_details or {}), **(self.provider_details or {})} or None, ) ``` ### ThinkingPartDelta A partial update (delta) for a `ThinkingPart` to append new thinking content. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class ThinkingPartDelta: """A partial update (delta) for a `ThinkingPart` to append new thinking content.""" content_delta: str | None = None """The incremental thinking content to add to the existing `ThinkingPart` content.""" signature_delta: str | None = None """Optional signature delta. 
Note this is never treated as a delta — it can replace None. """ provider_name: str | None = None """Optional provider name for the thinking part. Signatures are only sent back to the same provider. Required to be set when `provider_details` is set and the initial ThinkingPart does not have a `provider_name` or it has changed. """ provider_details: ProviderDetailsDelta = None """Additional data returned by the provider that can't be mapped to standard fields. Can be a dict to merge with existing details, or a callable that takes the existing details and returns updated details. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data.""" part_delta_kind: Literal['thinking'] = 'thinking' """Part delta type identifier, used as a discriminator.""" @overload def apply(self, part: ModelResponsePart) -> ThinkingPart: ... @overload def apply(self, part: ModelResponsePart | ThinkingPartDelta) -> ThinkingPart | ThinkingPartDelta: ... def apply(self, part: ModelResponsePart | ThinkingPartDelta) -> ThinkingPart | ThinkingPartDelta: """Apply this thinking delta to an existing `ThinkingPart`. Args: part: The existing model response part, which must be a `ThinkingPart`. Returns: A new `ThinkingPart` with updated thinking content. Raises: ValueError: If `part` is not a `ThinkingPart`. """ if isinstance(part, ThinkingPart): new_content = part.content + self.content_delta if self.content_delta else part.content new_signature = self.signature_delta if self.signature_delta is not None else part.signature new_provider_name = self.provider_name if self.provider_name is not None else part.provider_name # Resolve callable provider_details if needed resolved_details = ( self.provider_details(part.provider_details) if callable(self.provider_details) else self.provider_details ) new_provider_details = {**(part.provider_details or {}), **(resolved_details or {})} or None return replace( part, content=new_content, signature=new_signature, provider_name=new_provider_name, provider_details=new_provider_details, ) elif isinstance(part, ThinkingPartDelta): if self.content_delta is None and self.signature_delta is None: raise ValueError('Cannot apply ThinkingPartDelta with no content or signature') if self.content_delta is not None: part = replace(part, content_delta=(part.content_delta or '') + self.content_delta) if self.signature_delta is not None: part = replace(part, signature_delta=self.signature_delta) if self.provider_name is not None: part = replace(part, provider_name=self.provider_name) if self.provider_details is not None: if callable(self.provider_details): if callable(part.provider_details): existing_fn = part.provider_details new_fn = self.provider_details def chained_both(d: dict[str, Any] | None) -> dict[str, Any]: return new_fn(existing_fn(d)) part = replace(part, provider_details=chained_both) else: part = replace(part, provider_details=self.provider_details) # pragma: no cover elif callable(part.provider_details): existing_fn = part.provider_details new_dict = self.provider_details def chained_dict(d: dict[str, Any] | None) -> dict[str, Any]: return {**existing_fn(d), **new_dict} part = replace(part, provider_details=chained_dict) else: existing = part.provider_details if isinstance(part.provider_details, dict) else {} part = replace(part, provider_details={**existing, **self.provider_details}) return part raise ValueError( # pragma: no 
cover f'Cannot apply ThinkingPartDeltas to non-ThinkingParts or non-ThinkingPartDeltas ({part=}, {self=})' ) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content_delta ```python content_delta: str | None = None ``` The incremental thinking content to add to the existing `ThinkingPart` content. #### signature_delta ```python signature_delta: str | None = None ``` Optional signature delta. Note this is never treated as a delta — it can replace None. #### provider_name ```python provider_name: str | None = None ``` Optional provider name for the thinking part. Signatures are only sent back to the same provider. Required to be set when `provider_details` is set and the initial ThinkingPart does not have a `provider_name` or it has changed. #### provider_details ```python provider_details: ProviderDetailsDelta = None ``` Additional data returned by the provider that can't be mapped to standard fields. Can be a dict to merge with existing details, or a callable that takes the existing details and returns updated details. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. #### part_delta_kind ```python part_delta_kind: Literal['thinking'] = 'thinking' ``` Part delta type identifier, used as a discriminator. #### apply ```python apply(part: ModelResponsePart) -> ThinkingPart ``` ```python apply( part: ModelResponsePart | ThinkingPartDelta, ) -> ThinkingPart | ThinkingPartDelta ``` ```python apply( part: ModelResponsePart | ThinkingPartDelta, ) -> ThinkingPart | ThinkingPartDelta ``` Apply this thinking delta to an existing `ThinkingPart`. Parameters: | Name | Type | Description | Default | | ------ | ------------------- | ------------------- | --------------------------------------------------------------- | | `part` | \`ModelResponsePart | ThinkingPartDelta\` | The existing model response part, which must be a ThinkingPart. | Returns: | Type | Description | | -------------- | ------------------- | | \`ThinkingPart | ThinkingPartDelta\` | Raises: | Type | Description | | ------------ | ------------------------------ | | `ValueError` | If part is not a ThinkingPart. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def apply(self, part: ModelResponsePart | ThinkingPartDelta) -> ThinkingPart | ThinkingPartDelta: """Apply this thinking delta to an existing `ThinkingPart`. Args: part: The existing model response part, which must be a `ThinkingPart`. Returns: A new `ThinkingPart` with updated thinking content. Raises: ValueError: If `part` is not a `ThinkingPart`. 
""" if isinstance(part, ThinkingPart): new_content = part.content + self.content_delta if self.content_delta else part.content new_signature = self.signature_delta if self.signature_delta is not None else part.signature new_provider_name = self.provider_name if self.provider_name is not None else part.provider_name # Resolve callable provider_details if needed resolved_details = ( self.provider_details(part.provider_details) if callable(self.provider_details) else self.provider_details ) new_provider_details = {**(part.provider_details or {}), **(resolved_details or {})} or None return replace( part, content=new_content, signature=new_signature, provider_name=new_provider_name, provider_details=new_provider_details, ) elif isinstance(part, ThinkingPartDelta): if self.content_delta is None and self.signature_delta is None: raise ValueError('Cannot apply ThinkingPartDelta with no content or signature') if self.content_delta is not None: part = replace(part, content_delta=(part.content_delta or '') + self.content_delta) if self.signature_delta is not None: part = replace(part, signature_delta=self.signature_delta) if self.provider_name is not None: part = replace(part, provider_name=self.provider_name) if self.provider_details is not None: if callable(self.provider_details): if callable(part.provider_details): existing_fn = part.provider_details new_fn = self.provider_details def chained_both(d: dict[str, Any] | None) -> dict[str, Any]: return new_fn(existing_fn(d)) part = replace(part, provider_details=chained_both) else: part = replace(part, provider_details=self.provider_details) # pragma: no cover elif callable(part.provider_details): existing_fn = part.provider_details new_dict = self.provider_details def chained_dict(d: dict[str, Any] | None) -> dict[str, Any]: return {**existing_fn(d), **new_dict} part = replace(part, provider_details=chained_dict) else: existing = part.provider_details if isinstance(part.provider_details, dict) else {} part = replace(part, provider_details={**existing, **self.provider_details}) return part raise ValueError( # pragma: no cover f'Cannot apply ThinkingPartDeltas to non-ThinkingParts or non-ThinkingPartDeltas ({part=}, {self=})' ) ``` ### ToolCallPartDelta A partial update (delta) for a `ToolCallPart` to modify tool name, arguments, or tool call ID. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class ToolCallPartDelta: """A partial update (delta) for a `ToolCallPart` to modify tool name, arguments, or tool call ID.""" tool_name_delta: str | None = None """Incremental text to add to the existing tool name, if any.""" args_delta: str | dict[str, Any] | None = None """Incremental data to add to the tool arguments. If this is a string, it will be appended to existing JSON arguments. If this is a dict, it will be merged with existing dict arguments. """ tool_call_id: str | None = None """Optional tool call identifier, this is used by some models including OpenAI. Note this is never treated as a delta — it can replace None, but otherwise if a non-matching value is provided an error will be raised.""" provider_name: str | None = None """The name of the provider that generated the response. This is required to be set when `provider_details` is set and the initial ToolCallPart does not have a `provider_name` or it has changed. """ provider_details: dict[str, Any] | None = None """Additional data returned by the provider that can't be mapped to standard fields. 
This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. """ part_delta_kind: Literal['tool_call'] = 'tool_call' """Part delta type identifier, used as a discriminator.""" def as_part(self) -> ToolCallPart | None: """Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`. Returns: A `ToolCallPart` if `tool_name_delta` is set, otherwise `None`. """ if self.tool_name_delta is None: return None return ToolCallPart( self.tool_name_delta, self.args_delta, self.tool_call_id or _generate_tool_call_id(), provider_name=self.provider_name, provider_details=self.provider_details, ) @overload def apply(self, part: ModelResponsePart) -> ToolCallPart | BuiltinToolCallPart: ... @overload def apply( self, part: ModelResponsePart | ToolCallPartDelta ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta: ... def apply( self, part: ModelResponsePart | ToolCallPartDelta ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta: """Apply this delta to a part or delta, returning a new part or delta with the changes applied. Args: part: The existing model response part or delta to update. Returns: Either a new `ToolCallPart` or `BuiltinToolCallPart`, or an updated `ToolCallPartDelta`. Raises: ValueError: If `part` is neither a `ToolCallPart`, `BuiltinToolCallPart`, nor a `ToolCallPartDelta`. UnexpectedModelBehavior: If applying JSON deltas to dict arguments or vice versa. """ if isinstance(part, ToolCallPart | BuiltinToolCallPart): return self._apply_to_part(part) if isinstance(part, ToolCallPartDelta): return self._apply_to_delta(part) raise ValueError( # pragma: no cover f'Can only apply ToolCallPartDeltas to ToolCallParts, BuiltinToolCallParts, or ToolCallPartDeltas, not {part}' ) def _apply_to_delta(self, delta: ToolCallPartDelta) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta: """Internal helper to apply this delta to another delta.""" if self.tool_name_delta: # Append incremental text to the existing tool_name_delta updated_tool_name_delta = (delta.tool_name_delta or '') + self.tool_name_delta delta = replace(delta, tool_name_delta=updated_tool_name_delta) if isinstance(self.args_delta, str): if isinstance(delta.args_delta, dict): raise UnexpectedModelBehavior( f'Cannot apply JSON deltas to non-JSON tool arguments ({delta=}, {self=})' ) updated_args_delta = (delta.args_delta or '') + self.args_delta delta = replace(delta, args_delta=updated_args_delta) elif isinstance(self.args_delta, dict): if isinstance(delta.args_delta, str): raise UnexpectedModelBehavior( f'Cannot apply dict deltas to non-dict tool arguments ({delta=}, {self=})' ) updated_args_delta = {**(delta.args_delta or {}), **self.args_delta} delta = replace(delta, args_delta=updated_args_delta) if self.tool_call_id: delta = replace(delta, tool_call_id=self.tool_call_id) if self.provider_name: delta = replace(delta, provider_name=self.provider_name) if self.provider_details: merged_provider_details = {**(delta.provider_details or {}), **self.provider_details} delta = replace(delta, provider_details=merged_provider_details) # If we now have enough data to create a full ToolCallPart, do so if delta.tool_name_delta is not None: return ToolCallPart( delta.tool_name_delta, delta.args_delta, delta.tool_call_id or _generate_tool_call_id(), provider_name=delta.provider_name, provider_details=delta.provider_details, ) return delta def 
_apply_to_part(self, part: ToolCallPart | BuiltinToolCallPart) -> ToolCallPart | BuiltinToolCallPart: """Internal helper to apply this delta directly to a `ToolCallPart` or `BuiltinToolCallPart`.""" if self.tool_name_delta: # Append incremental text to the existing tool_name tool_name = part.tool_name + self.tool_name_delta part = replace(part, tool_name=tool_name) if isinstance(self.args_delta, str): if isinstance(part.args, dict): raise UnexpectedModelBehavior(f'Cannot apply JSON deltas to non-JSON tool arguments ({part=}, {self=})') updated_json = (part.args or '') + self.args_delta part = replace(part, args=updated_json) elif isinstance(self.args_delta, dict): if isinstance(part.args, str): raise UnexpectedModelBehavior(f'Cannot apply dict deltas to non-dict tool arguments ({part=}, {self=})') updated_dict = {**(part.args or {}), **self.args_delta} part = replace(part, args=updated_dict) if self.tool_call_id: part = replace(part, tool_call_id=self.tool_call_id) if self.provider_name: part = replace(part, provider_name=self.provider_name) if self.provider_details: merged_provider_details = {**(part.provider_details or {}), **self.provider_details} part = replace(part, provider_details=merged_provider_details) return part __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name_delta ```python tool_name_delta: str | None = None ``` Incremental text to add to the existing tool name, if any. #### args_delta ```python args_delta: str | dict[str, Any] | None = None ``` Incremental data to add to the tool arguments. If this is a string, it will be appended to existing JSON arguments. If this is a dict, it will be merged with existing dict arguments. #### tool_call_id ```python tool_call_id: str | None = None ``` Optional tool call identifier, this is used by some models including OpenAI. Note this is never treated as a delta — it can replace None, but otherwise if a non-matching value is provided an error will be raised. #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. This is required to be set when `provider_details` is set and the initial ToolCallPart does not have a `provider_name` or it has changed. #### provider_details ```python provider_details: dict[str, Any] | None = None ``` Additional data returned by the provider that can't be mapped to standard fields. This is used for data that is required to be sent back to APIs, as well as data users may want to access programmatically. When this field is set, `provider_name` is required to identify the provider that generated this data. #### part_delta_kind ```python part_delta_kind: Literal['tool_call'] = 'tool_call' ``` Part delta type identifier, used as a discriminator. #### as_part ```python as_part() -> ToolCallPart | None ``` Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`. Returns: | Type | Description | | -------------- | ----------- | | \`ToolCallPart | None\` | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def as_part(self) -> ToolCallPart | None: """Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`. Returns: A `ToolCallPart` if `tool_name_delta` is set, otherwise `None`. 
""" if self.tool_name_delta is None: return None return ToolCallPart( self.tool_name_delta, self.args_delta, self.tool_call_id or _generate_tool_call_id(), provider_name=self.provider_name, provider_details=self.provider_details, ) ``` #### apply ```python apply( part: ModelResponsePart, ) -> ToolCallPart | BuiltinToolCallPart ``` ```python apply( part: ModelResponsePart | ToolCallPartDelta, ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta ``` ```python apply( part: ModelResponsePart | ToolCallPartDelta, ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta ``` Apply this delta to a part or delta, returning a new part or delta with the changes applied. Parameters: | Name | Type | Description | Default | | ------ | ------------------- | ------------------- | ---------------------------------------------------- | | `part` | \`ModelResponsePart | ToolCallPartDelta\` | The existing model response part or delta to update. | Returns: | Type | Description | | -------------- | ------------------- | | \`ToolCallPart | BuiltinToolCallPart | Raises: | Type | Description | | ------------------------- | -------------------------------------------------------------------------------- | | `ValueError` | If part is neither a ToolCallPart, BuiltinToolCallPart, nor a ToolCallPartDelta. | | `UnexpectedModelBehavior` | If applying JSON deltas to dict arguments or vice versa. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def apply( self, part: ModelResponsePart | ToolCallPartDelta ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta: """Apply this delta to a part or delta, returning a new part or delta with the changes applied. Args: part: The existing model response part or delta to update. Returns: Either a new `ToolCallPart` or `BuiltinToolCallPart`, or an updated `ToolCallPartDelta`. Raises: ValueError: If `part` is neither a `ToolCallPart`, `BuiltinToolCallPart`, nor a `ToolCallPartDelta`. UnexpectedModelBehavior: If applying JSON deltas to dict arguments or vice versa. """ if isinstance(part, ToolCallPart | BuiltinToolCallPart): return self._apply_to_part(part) if isinstance(part, ToolCallPartDelta): return self._apply_to_delta(part) raise ValueError( # pragma: no cover f'Can only apply ToolCallPartDeltas to ToolCallParts, BuiltinToolCallParts, or ToolCallPartDeltas, not {part}' ) ``` ### ModelResponsePartDelta ```python ModelResponsePartDelta = Annotated[ TextPartDelta | ThinkingPartDelta | ToolCallPartDelta, Discriminator("part_delta_kind"), ] ``` A partial update (delta) for any model response part. ### PartStartEvent An event indicating that a new part has started. If multiple `PartStartEvent`s are received with the same index, the new one should fully replace the old one. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class PartStartEvent: """An event indicating that a new part has started. If multiple `PartStartEvent`s are received with the same index, the new one should fully replace the old one. """ index: int """The index of the part within the overall response parts list.""" part: ModelResponsePart """The newly started `ModelResponsePart`.""" previous_part_kind: ( Literal['text', 'thinking', 'tool-call', 'builtin-tool-call', 'builtin-tool-return', 'file'] | None ) = None """The kind of the previous part, if any. This is useful for UI event streams to know whether to group parts of the same kind together when emitting events. 
""" event_kind: Literal['part_start'] = 'part_start' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### index ```python index: int ``` The index of the part within the overall response parts list. #### part ```python part: ModelResponsePart ``` The newly started `ModelResponsePart`. #### previous_part_kind ```python previous_part_kind: ( Literal[ "text", "thinking", "tool-call", "builtin-tool-call", "builtin-tool-return", "file", ] | None ) = None ``` The kind of the previous part, if any. This is useful for UI event streams to know whether to group parts of the same kind together when emitting events. #### event_kind ```python event_kind: Literal['part_start'] = 'part_start' ``` Event type identifier, used as a discriminator. ### PartDeltaEvent An event indicating a delta update for an existing part. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class PartDeltaEvent: """An event indicating a delta update for an existing part.""" index: int """The index of the part within the overall response parts list.""" delta: ModelResponsePartDelta """The delta to apply to the specified part.""" event_kind: Literal['part_delta'] = 'part_delta' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### index ```python index: int ``` The index of the part within the overall response parts list. #### delta ```python delta: ModelResponsePartDelta ``` The delta to apply to the specified part. #### event_kind ```python event_kind: Literal['part_delta'] = 'part_delta' ``` Event type identifier, used as a discriminator. ### PartEndEvent An event indicating that a part is complete. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class PartEndEvent: """An event indicating that a part is complete.""" index: int """The index of the part within the overall response parts list.""" part: ModelResponsePart """The complete `ModelResponsePart`.""" next_part_kind: ( Literal['text', 'thinking', 'tool-call', 'builtin-tool-call', 'builtin-tool-return', 'file'] | None ) = None """The kind of the next part, if any. This is useful for UI event streams to know whether to group parts of the same kind together when emitting events. """ event_kind: Literal['part_end'] = 'part_end' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### index ```python index: int ``` The index of the part within the overall response parts list. #### part ```python part: ModelResponsePart ``` The complete `ModelResponsePart`. #### next_part_kind ```python next_part_kind: ( Literal[ "text", "thinking", "tool-call", "builtin-tool-call", "builtin-tool-return", "file", ] | None ) = None ``` The kind of the next part, if any. This is useful for UI event streams to know whether to group parts of the same kind together when emitting events. #### event_kind ```python event_kind: Literal['part_end'] = 'part_end' ``` Event type identifier, used as a discriminator. ### FinalResultEvent An event indicating the response to the current model request matches the output schema and will produce a result. 
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class FinalResultEvent: """An event indicating the response to the current model request matches the output schema and will produce a result.""" tool_name: str | None """The name of the output tool that was called. `None` if the result is from text content and not from a tool.""" tool_call_id: str | None """The tool call ID, if any, that this result is associated with.""" event_kind: Literal['final_result'] = 'final_result' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name ```python tool_name: str | None ``` The name of the output tool that was called. `None` if the result is from text content and not from a tool. #### tool_call_id ```python tool_call_id: str | None ``` The tool call ID, if any, that this result is associated with. #### event_kind ```python event_kind: Literal['final_result'] = 'final_result' ``` Event type identifier, used as a discriminator. ### ModelResponseStreamEvent ```python ModelResponseStreamEvent = Annotated[ PartStartEvent | PartDeltaEvent | PartEndEvent | FinalResultEvent, Discriminator("event_kind"), ] ``` An event in the model response stream, starting a new part, applying a delta to an existing one, indicating a part is complete, or indicating the final result. ### FunctionToolCallEvent An event indicating the start to a call to a function tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class FunctionToolCallEvent: """An event indicating the start to a call to a function tool.""" part: ToolCallPart """The (function) tool call to make.""" _: KW_ONLY event_kind: Literal['function_tool_call'] = 'function_tool_call' """Event type identifier, used as a discriminator.""" @property def tool_call_id(self) -> str: """An ID used for matching details about the call to its result.""" return self.part.tool_call_id @property @deprecated('`call_id` is deprecated, use `tool_call_id` instead.') def call_id(self) -> str: """An ID used for matching details about the call to its result.""" return self.part.tool_call_id # pragma: no cover __repr__ = _utils.dataclasses_no_defaults_repr ``` #### part ```python part: ToolCallPart ``` The (function) tool call to make. #### event_kind ```python event_kind: Literal["function_tool_call"] = ( "function_tool_call" ) ``` Event type identifier, used as a discriminator. #### tool_call_id ```python tool_call_id: str ``` An ID used for matching details about the call to its result. #### call_id ```python call_id: str ``` An ID used for matching details about the call to its result. ### FunctionToolResultEvent An event indicating the result of a function tool call. 
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class FunctionToolResultEvent: """An event indicating the result of a function tool call.""" result: ToolReturnPart | RetryPromptPart """The result of the call to the function tool.""" _: KW_ONLY content: str | Sequence[UserContent] | None = None """The content that will be sent to the model as a UserPromptPart following the result.""" event_kind: Literal['function_tool_result'] = 'function_tool_result' """Event type identifier, used as a discriminator.""" @property def tool_call_id(self) -> str: """An ID used to match the result to its original call.""" return self.result.tool_call_id __repr__ = _utils.dataclasses_no_defaults_repr ``` #### result ```python result: ToolReturnPart | RetryPromptPart ``` The result of the call to the function tool. #### content ```python content: str | Sequence[UserContent] | None = None ``` The content that will be sent to the model as a UserPromptPart following the result. #### event_kind ```python event_kind: Literal["function_tool_result"] = ( "function_tool_result" ) ``` Event type identifier, used as a discriminator. #### tool_call_id ```python tool_call_id: str ``` An ID used to match the result to its original call. ### BuiltinToolCallEvent Deprecated `BuiltinToolCallEvent` is deprecated, look for `PartStartEvent` and `PartDeltaEvent` with `BuiltinToolCallPart` instead. An event indicating the start to a call to a built-in tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @deprecated( '`BuiltinToolCallEvent` is deprecated, look for `PartStartEvent` and `PartDeltaEvent` with `BuiltinToolCallPart` instead.' ) @dataclass(repr=False) class BuiltinToolCallEvent: """An event indicating the start to a call to a built-in tool.""" part: BuiltinToolCallPart """The built-in tool call to make.""" _: KW_ONLY event_kind: Literal['builtin_tool_call'] = 'builtin_tool_call' """Event type identifier, used as a discriminator.""" ``` #### part ```python part: BuiltinToolCallPart ``` The built-in tool call to make. #### event_kind ```python event_kind: Literal["builtin_tool_call"] = ( "builtin_tool_call" ) ``` Event type identifier, used as a discriminator. ### BuiltinToolResultEvent Deprecated `BuiltinToolResultEvent` is deprecated, look for `PartStartEvent` and `PartDeltaEvent` with `BuiltinToolReturnPart` instead. An event indicating the result of a built-in tool call. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @deprecated( '`BuiltinToolResultEvent` is deprecated, look for `PartStartEvent` and `PartDeltaEvent` with `BuiltinToolReturnPart` instead.' ) @dataclass(repr=False) class BuiltinToolResultEvent: """An event indicating the result of a built-in tool call.""" result: BuiltinToolReturnPart """The result of the call to the built-in tool.""" _: KW_ONLY event_kind: Literal['builtin_tool_result'] = 'builtin_tool_result' """Event type identifier, used as a discriminator.""" ``` #### result ```python result: BuiltinToolReturnPart ``` The result of the call to the built-in tool. #### event_kind ```python event_kind: Literal["builtin_tool_result"] = ( "builtin_tool_result" ) ``` Event type identifier, used as a discriminator. ### HandleResponseEvent ```python HandleResponseEvent = Annotated[ FunctionToolCallEvent | FunctionToolResultEvent | BuiltinToolCallEvent | BuiltinToolResultEvent, Discriminator("event_kind"), ] ``` An event yielded when handling a model response, indicating tool calls and results. 
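The delta and event types above are easiest to read together. The sketch below is illustrative only: `describe_event` is a hypothetical helper, and in practice these events would come from a streaming agent run rather than being constructed by hand. It shows how a consumer can discriminate stream events by type and how a `TextPartDelta` is applied immutably to an existing part:

```python
from pydantic_ai.messages import (
    FunctionToolCallEvent,
    FunctionToolResultEvent,
    PartDeltaEvent,
    PartStartEvent,
    TextPart,
    TextPartDelta,
)

def describe_event(event) -> str:
    """Hypothetical helper turning stream events into short log lines."""
    if isinstance(event, PartStartEvent):
        return f'part {event.index} started ({event.part.part_kind})'
    if isinstance(event, PartDeltaEvent) and isinstance(event.delta, TextPartDelta):
        return f'text delta: {event.delta.content_delta!r}'
    if isinstance(event, FunctionToolCallEvent):
        return f'tool call: {event.part.tool_name!r}'
    if isinstance(event, FunctionToolResultEvent):
        return f'tool result for call {event.tool_call_id!r}'
    return event.event_kind

# Deltas never mutate parts in place: `apply` returns a new part.
part = TextPart('Hello')
part = TextPartDelta(content_delta=', world').apply(part)
print(part.content)  # 'Hello, world'
```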
### AgentStreamEvent ```python AgentStreamEvent = Annotated[ ModelResponseStreamEvent | HandleResponseEvent, Discriminator("event_kind"), ] ``` An event in the agent stream: model response stream events and response-handling events. # `pydantic_ai.output` ### OutputDataT ```python OutputDataT = TypeVar( "OutputDataT", default=str, covariant=True ) ``` Covariant type variable for the output data type of a run. ### ToolOutput Bases: `Generic[OutputDataT]` Marker class to use a tool for output and optionally customize the tool. Example: tool_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'openai:gpt-5.2', output_type=[ ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass(init=False) class ToolOutput(Generic[OutputDataT]): """Marker class to use a tool for output and optionally customize the tool. Example: ```python {title="tool_output.py"} from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'openai:gpt-5.2', output_type=[ ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` """ output: OutputTypeOrFunction[OutputDataT] """An output type or function.""" name: str | None """The name of the tool that will be passed to the model. If not specified and only one output is provided, `final_result` will be used. If multiple outputs are provided, the name of the output type or function will be added to the tool name.""" description: str | None """The description of the tool that will be passed to the model. If not specified, the docstring of the output type or function will be used.""" max_retries: int | None """The maximum number of retries for the tool.""" strict: bool | None """Whether to use strict mode for the tool.""" def __init__( self, type_: OutputTypeOrFunction[OutputDataT], *, name: str | None = None, description: str | None = None, max_retries: int | None = None, strict: bool | None = None, ): self.output = type_ self.name = name self.description = description self.max_retries = max_retries self.strict = strict ```` #### output ```python output: OutputTypeOrFunction[OutputDataT] = type_ ``` An output type or function. #### name ```python name: str | None = name ``` The name of the tool that will be passed to the model. If not specified and only one output is provided, `final_result` will be used. If multiple outputs are provided, the name of the output type or function will be added to the tool name. #### description ```python description: str | None = description ``` The description of the tool that will be passed to the model. If not specified, the docstring of the output type or function will be used. #### max_retries ```python max_retries: int | None = max_retries ``` The maximum number of retries for the tool. #### strict ```python strict: bool | None = strict ``` Whether to use strict mode for the tool. 
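Since `output` accepts an output type *or function*, `ToolOutput` can also wrap a plain function, in which case the tool schema is derived from the function signature and its return value becomes the run output. A hedged sketch under that assumption (the function, tool name, and model are illustrative):

```python
from pydantic_ai import Agent, ToolOutput

def convert_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

agent = Agent(
    'openai:gpt-5.2',
    output_type=ToolOutput(
        convert_to_fahrenheit,
        name='final_temperature',  # overrides the default tool name
        max_retries=2,
    ),
)
# result = agent.run_sync('Roughly how hot is 37 degrees Celsius?')
# print(result.output)  # e.g. 98.6
```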
### NativeOutput Bases: `Generic[OutputDataT]` Marker class to use the model's native structured outputs functionality for outputs and optionally customize the name and description. Example: native_output.py ```python from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'openai:gpt-5.2', output_type=NativeOutput( [Fruit, Vehicle], name='Fruit or vehicle', description='Return a fruit or vehicle.' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass(init=False) class NativeOutput(Generic[OutputDataT]): """Marker class to use the model's native structured outputs functionality for outputs and optionally customize the name and description. Example: ```python {title="native_output.py" requires="tool_output.py"} from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'openai:gpt-5.2', output_type=NativeOutput( [Fruit, Vehicle], name='Fruit or vehicle', description='Return a fruit or vehicle.' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` """ outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] """The output types or functions.""" name: str | None """The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used.""" description: str | None """The description of the structured output that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used.""" strict: bool | None """Whether to use strict mode for the output, if the model supports it.""" template: str | None """Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. If no template is specified but the model's profile indicates that it requires the schema to be sent as a prompt, the default template specified on the profile will be used. """ def __init__( self, outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]], *, name: str | None = None, description: str | None = None, strict: bool | None = None, template: str | None = None, ): self.outputs = outputs self.name = name self.description = description self.strict = strict self.template = template ```` #### outputs ```python outputs: ( OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] ) = outputs ``` The output types or functions. #### name ```python name: str | None = name ``` The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used. #### description ```python description: str | None = description ``` The description of the structured output that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used. #### strict ```python strict: bool | None = strict ``` Whether to use strict mode for the output, if the model supports it. #### template ```python template: str | None = template ``` Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. 
If no template is specified but the model's profile indicates that it requires the schema to be sent as a prompt, the default template specified on the profile will be used. ### PromptedOutput Bases: `Generic[OutputDataT]` Marker class to use a prompt to tell the model what to output and optionally customize the prompt. Example: prompted_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], name='Vehicle or device', description='Return a vehicle or device.' ), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass(init=False) class PromptedOutput(Generic[OutputDataT]): """Marker class to use a prompt to tell the model what to output and optionally customize the prompt. Example: ```python {title="prompted_output.py" requires="tool_output.py"} from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], name='Vehicle or device', description='Return a vehicle or device.' ), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'openai:gpt-5.2', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` """ outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] """The output types or functions.""" name: str | None """The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used.""" description: str | None """The description that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used.""" template: str | None """Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. If not specified, the default template specified on the model's profile will be used. """ def __init__( self, outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]], *, name: str | None = None, description: str | None = None, template: str | None = None, ): self.outputs = outputs self.name = name self.description = description self.template = template ```` #### outputs ```python outputs: ( OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] ) = outputs ``` The output types or functions. #### name ```python name: str | None = name ``` The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used. 
#### description ```python description: str | None = description ``` The description that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used. #### template ```python template: str | None = template ``` Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. If not specified, the default template specified on the model's profile will be used. ### TextOutput Bases: `Generic[OutputDataT]` Marker class to use text output for an output function taking a string argument. Example: ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'openai:gpt-5.2', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass class TextOutput(Generic[OutputDataT]): """Marker class to use text output for an output function taking a string argument. Example: ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'openai:gpt-5.2', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` """ output_function: TextOutputFunc[OutputDataT] """The function that will be called to process the model's plain text output. The function must take a single string argument.""" ```` #### output_function ```python output_function: TextOutputFunc[OutputDataT] ``` The function that will be called to process the model's plain text output. The function must take a single string argument. ### StructuredDict ```python StructuredDict( json_schema: JsonSchemaValue, name: str | None = None, description: str | None = None, ) -> type[JsonSchemaValue] ``` Returns a `dict[str, Any]` subclass with a JSON schema attached that will be used for structured output. Parameters: | Name | Type | Description | Default | | ------------- | ----------------- | ------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------- | | `json_schema` | `JsonSchemaValue` | A JSON schema of type object defining the structure of the dictionary content. | *required* | | `name` | \`str | None\` | Optional name of the structured output. If not provided, the title field of the JSON schema will be used if it's present. | | `description` | \`str | None\` | Optional description of the structured output. If not provided, the description field of the JSON schema will be used if it's present. 
| Example: structured_dict.py ```python from pydantic_ai import Agent, StructuredDict schema = { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] } agent = Agent('openai:gpt-5.2', output_type=StructuredDict(schema)) result = agent.run_sync('Create a person') print(result.output) #> {'name': 'John Doe', 'age': 30} ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python def StructuredDict( json_schema: JsonSchemaValue, name: str | None = None, description: str | None = None ) -> type[JsonSchemaValue]: """Returns a `dict[str, Any]` subclass with a JSON schema attached that will be used for structured output. Args: json_schema: A JSON schema of type `object` defining the structure of the dictionary content. name: Optional name of the structured output. If not provided, the `title` field of the JSON schema will be used if it's present. description: Optional description of the structured output. If not provided, the `description` field of the JSON schema will be used if it's present. Example: ```python {title="structured_dict.py"} from pydantic_ai import Agent, StructuredDict schema = { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] } agent = Agent('openai:gpt-5.2', output_type=StructuredDict(schema)) result = agent.run_sync('Create a person') print(result.output) #> {'name': 'John Doe', 'age': 30} ``` """ json_schema = _utils.check_object_json_schema(json_schema) # Pydantic `TypeAdapter` fails when `object.__get_pydantic_json_schema__` has `$defs`, so we inline them # See https://github.com/pydantic/pydantic/issues/12145 if '$defs' in json_schema: json_schema = InlineDefsJsonSchemaTransformer(json_schema).walk() if '$defs' in json_schema: raise exceptions.UserError( '`StructuredDict` does not currently support recursive `$ref`s and `$defs`. See https://github.com/pydantic/pydantic/issues/12145 for more information.' ) if name: json_schema['title'] = name if description: json_schema['description'] = description class _StructuredDict(JsonSchemaValue): __is_model_like__ = True @classmethod def __get_pydantic_core_schema__( cls, source_type: Any, handler: GetCoreSchemaHandler ) -> core_schema.CoreSchema: return core_schema.dict_schema( keys_schema=core_schema.str_schema(), values_schema=core_schema.any_schema(), ) @classmethod def __get_pydantic_json_schema__( cls, core_schema: core_schema.CoreSchema, handler: GetJsonSchemaHandler ) -> JsonSchemaValue: return json_schema return _StructuredDict ```` ### DeferredToolRequests Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a DeferredToolResults object with the same tool call IDs. See [deferred tools docs](https://ai.pydantic.dev/deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class DeferredToolRequests: """Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a [`DeferredToolResults`][pydantic_ai.tools.DeferredToolResults] object with the same tool call IDs. 
See [deferred tools docs](../deferred-tools.md#deferred-tools) for more information. """ calls: list[ToolCallPart] = field(default_factory=list[ToolCallPart]) """Tool calls that require external execution.""" approvals: list[ToolCallPart] = field(default_factory=list[ToolCallPart]) """Tool calls that require human-in-the-loop approval.""" metadata: dict[str, dict[str, Any]] = field(default_factory=dict[str, dict[str, Any]]) """Metadata for deferred tool calls, keyed by `tool_call_id`.""" ``` #### calls ```python calls: list[ToolCallPart] = field( default_factory=list[ToolCallPart] ) ``` Tool calls that require external execution. #### approvals ```python approvals: list[ToolCallPart] = field( default_factory=list[ToolCallPart] ) ``` Tool calls that require human-in-the-loop approval. #### metadata ```python metadata: dict[str, dict[str, Any]] = field( default_factory=dict[str, dict[str, Any]] ) ``` Metadata for deferred tool calls, keyed by `tool_call_id`. # `pydantic_ai.profiles` Describes how requests to and responses from specific models or families of models need to be constructed and processed to get the best results, independent of the model and provider classes used. Source code in `pydantic_ai_slim/pydantic_ai/profiles/__init__.py` ```python @dataclass(kw_only=True) class ModelProfile: """Describes how requests to and responses from specific models or families of models need to be constructed and processed to get the best results, independent of the model and provider classes used.""" supports_tools: bool = True """Whether the model supports tools.""" supports_json_schema_output: bool = False """Whether the model supports JSON schema output. This is also referred to as 'native' support for structured output. Relates to the `NativeOutput` output type. """ supports_json_object_output: bool = False """Whether the model supports a dedicated mode to enforce JSON output, without necessarily sending a schema. E.g. [OpenAI's JSON mode](https://platform.openai.com/docs/guides/structured-outputs#json-mode) Relates to the `PromptedOutput` output type. """ supports_image_output: bool = False """Whether the model supports image output.""" default_structured_output_mode: StructuredOutputMode = 'tool' """The default structured output mode to use for the model.""" prompted_output_template: str = dedent( """ Always respond with a JSON object that's compatible with this schema: {schema} Don't include any text or Markdown fencing before or after. """ ) """The instructions template to use for prompted structured output. The '{schema}' placeholder will be replaced with the JSON schema for the output.""" native_output_requires_schema_in_instructions: bool = False """Whether to add prompted output template in native structured output mode""" json_schema_transformer: type[JsonSchemaTransformer] | None = None """The transformer to use to make JSON schemas for tools and structured output compatible with the model.""" thinking_tags: tuple[str, str] = ('', '') """The tags used to indicate thinking parts in the model's output. Defaults to ('', '').""" ignore_streamed_leading_whitespace: bool = False """Whether to ignore leading whitespace when streaming a response. This is a workaround for models that emit `\n\n\n` or an empty text part ahead of tool calls (e.g. Ollama + Qwen3), which we don't want to end up treating as a final result when using `run_stream` with `str` a valid `output_type`. This is currently only used by `OpenAIChatModel`, `HuggingFaceModel`, and `GroqModel`. 
""" supported_builtin_tools: frozenset[type[AbstractBuiltinTool]] = field( default_factory=lambda: SUPPORTED_BUILTIN_TOOLS ) """The set of builtin tool types that this model/profile supports. Defaults to ALL builtin tools. Profile functions should explicitly restrict this based on model capabilities. """ @classmethod def from_profile(cls, profile: ModelProfile | None) -> Self: """Build a ModelProfile subclass instance from a ModelProfile instance.""" if isinstance(profile, cls): return profile return cls().update(profile) def update(self, profile: ModelProfile | None) -> Self: """Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance.""" if not profile: return self field_names = set(f.name for f in fields(self)) non_default_attrs = { f.name: getattr(profile, f.name) for f in fields(profile) if f.name in field_names and getattr(profile, f.name) != f.default } return replace(self, **non_default_attrs) ``` ### supports_tools ```python supports_tools: bool = True ``` Whether the model supports tools. ### supports_json_schema_output ```python supports_json_schema_output: bool = False ``` Whether the model supports JSON schema output. This is also referred to as 'native' support for structured output. Relates to the `NativeOutput` output type. ### supports_json_object_output ```python supports_json_object_output: bool = False ``` Whether the model supports a dedicated mode to enforce JSON output, without necessarily sending a schema. E.g. [OpenAI's JSON mode](https://platform.openai.com/docs/guides/structured-outputs#json-mode) Relates to the `PromptedOutput` output type. ### supports_image_output ```python supports_image_output: bool = False ``` Whether the model supports image output. ### default_structured_output_mode ```python default_structured_output_mode: StructuredOutputMode = ( "tool" ) ``` The default structured output mode to use for the model. ### prompted_output_template ```python prompted_output_template: str = dedent( "\n Always respond with a JSON object that's compatible with this schema:\n\n {schema}\n\n Don't include any text or Markdown fencing before or after.\n " ) ``` The instructions template to use for prompted structured output. The '{schema}' placeholder will be replaced with the JSON schema for the output. ### native_output_requires_schema_in_instructions ```python native_output_requires_schema_in_instructions: bool = False ``` Whether to add prompted output template in native structured output mode ### json_schema_transformer ```python json_schema_transformer: ( type[JsonSchemaTransformer] | None ) = None ``` The transformer to use to make JSON schemas for tools and structured output compatible with the model. ### thinking_tags ```python thinking_tags: tuple[str, str] = ('', '') ``` The tags used to indicate thinking parts in the model's output. Defaults to ('', ''). ### ignore_streamed_leading_whitespace ```python ignore_streamed_leading_whitespace: bool = False ``` Whether to ignore leading whitespace when streaming a response. ```text This is a workaround for models that emit ` ``` `or an empty text part ahead of tool calls (e.g. Ollama + Qwen3), which we don't want to end up treating as a final result when using`run_stream`with`str`a valid`output_type\`. ```text This is currently only used by `OpenAIChatModel`, `HuggingFaceModel`, and `GroqModel`. 
### supported_builtin_tools ```python supported_builtin_tools: frozenset[ type[AbstractBuiltinTool] ] = field(default_factory=lambda: SUPPORTED_BUILTIN_TOOLS) ``` The set of builtin tool types that this model/profile supports. Defaults to ALL builtin tools. Profile functions should explicitly restrict this based on model capabilities. ### from_profile ```python from_profile(profile: ModelProfile | None) -> Self ``` Build a ModelProfile subclass instance from a ModelProfile instance. Source code in `pydantic_ai_slim/pydantic_ai/profiles/__init__.py` ```python @classmethod def from_profile(cls, profile: ModelProfile | None) -> Self: """Build a ModelProfile subclass instance from a ModelProfile instance.""" if isinstance(profile, cls): return profile return cls().update(profile) ``` ### update ```python update(profile: ModelProfile | None) -> Self ``` Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance. Source code in `pydantic_ai_slim/pydantic_ai/profiles/__init__.py` ```python def update(self, profile: ModelProfile | None) -> Self: """Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance.""" if not profile: return self field_names = set(f.name for f in fields(self)) non_default_attrs = { f.name: getattr(profile, f.name) for f in fields(profile) if f.name in field_names and getattr(profile, f.name) != f.default } return replace(self, **non_default_attrs) ``` ### SAMPLING_PARAMS ```python SAMPLING_PARAMS = ( "temperature", "top_p", "presence_penalty", "frequency_penalty", "logit_bias", "openai_logprobs", "openai_top_logprobs", ) ``` Sampling parameter names that are incompatible with reasoning. These parameters are not supported when reasoning is enabled (reasoning_effort != 'none'). See https://platform.openai.com/docs/guides/reasoning for details. ### OpenAIModelProfile Bases: `ModelProfile` Profile for models used with `OpenAIChatModel`. ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. Source code in `pydantic_ai_slim/pydantic_ai/profiles/openai.py` ```python @dataclass(kw_only=True) class OpenAIModelProfile(ModelProfile): """Profile for models used with `OpenAIChatModel`. ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. """ openai_chat_thinking_field: str | None = None """Non-standard field name used by some providers for model thinking content in Chat Completions API responses. Plenty of providers use custom field names for thinking content. Ollama and newer versions of vLLM use `reasoning`, while DeepSeek, older vLLM and some others use `reasoning_content`. Notice that the thinking field configured here is currently limited to `str` type content. If `openai_chat_send_back_thinking_parts` is set to `'field'`, this field must be set to a non-None value.""" openai_chat_send_back_thinking_parts: Literal['auto', 'tags', 'field', False] = 'auto' """Whether the model includes thinking content in requests. This can be: * `'auto'` (default): Automatically detects how to send thinking content. If thinking was received in a custom field (tracked via `ThinkingPart.id` and `ThinkingPart.provider_name`), it's sent back in that same field. Otherwise, it's sent using tags. Only the `reasoning` and `reasoning_content` fields are checked by default when receiving responses. If your provider uses a different field name, you must explicitly set `openai_chat_thinking_field` to that field name.
* `'tags'`: The thinking content is included in the main `content` field, enclosed within thinking tags as specified in `thinking_tags` profile option. * `'field'`: The thinking content is included in a separate field specified by `openai_chat_thinking_field`. * `False`: No thinking content is sent in the request. Defaults to `'auto'` to ensure thinking is sent back in the format expected by the model/provider.""" openai_supports_strict_tool_definition: bool = True """This can be set by a provider or user if the OpenAI-"compatible" API doesn't support strict tool definitions.""" openai_supports_sampling_settings: bool = True """Turn off to don't send sampling settings like `temperature` and `top_p` to models that don't support them, like OpenAI's o-series reasoning models.""" openai_unsupported_model_settings: Sequence[str] = () """A list of model settings that are not supported by this model.""" # Some OpenAI-compatible providers (e.g. MoonshotAI) currently do **not** accept # `tool_choice="required"`. This flag lets the calling model know whether it's # safe to pass that value along. Default is `True` to preserve existing # behaviour for OpenAI itself and most providers. openai_supports_tool_choice_required: bool = True """Whether the provider accepts the value ``tool_choice='required'`` in the request payload.""" openai_system_prompt_role: OpenAISystemPromptRole | None = None """The role to use for the system prompt message. If not provided, defaults to `'system'`.""" openai_chat_supports_web_search: bool = False """Whether the model supports web search in Chat Completions API.""" openai_chat_audio_input_encoding: Literal['base64', 'uri'] = 'base64' """The encoding to use for audio input in Chat Completions requests. - `'base64'`: Raw base64 encoded string. (Default, used by OpenAI) - `'uri'`: Data URI (e.g. `data:audio/wav;base64,...`). """ openai_chat_supports_file_urls: bool = False """Whether the Chat API supports file URLs directly in the `file_data` field. OpenAI's native Chat API only supports base64-encoded data, but some providers like OpenRouter support passing URLs directly. """ openai_supports_encrypted_reasoning_content: bool = False """Whether the model supports including encrypted reasoning content in the response.""" openai_supports_reasoning: bool = False """Whether the model supports reasoning (o-series, GPT-5+). When True, sampling parameters may need to be dropped depending on reasoning_effort setting.""" openai_supports_reasoning_effort_none: bool = False """Whether the model supports sampling parameters (temperature, top_p, etc.) when reasoning_effort='none'. Models like GPT-5.1 and GPT-5.2 default to reasoning_effort='none' and support sampling params in that mode. When reasoning is enabled (low/medium/high/xhigh), sampling params are not supported.""" openai_responses_requires_function_call_status_none: bool = False """Whether the Responses API requires the `status` field on function tool calls to be `None`. This is required by vLLM Responses API versions before https://github.com/vllm-project/vllm/pull/26706. See https://github.com/pydantic/pydantic-ai/issues/3245 for more details. """ def __post_init__(self): # pragma: no cover if not self.openai_supports_sampling_settings: warnings.warn( 'The `openai_supports_sampling_settings` has no effect, and it will be removed in future versions. 
' 'Use `openai_unsupported_model_settings` instead.', DeprecationWarning, ) if self.openai_chat_send_back_thinking_parts == 'field' and not self.openai_chat_thinking_field: raise UserError( 'If `openai_chat_send_back_thinking_parts` is "field", ' '`openai_chat_thinking_field` must be set to a non-None value.' ) ``` #### openai_chat_thinking_field ```python openai_chat_thinking_field: str | None = None ``` Non-standard field name used by some providers for model thinking content in Chat Completions API responses. Plenty of providers use custom field names for thinking content. Ollama and newer versions of vLLM use `reasoning`, while DeepSeek, older vLLM and some others use `reasoning_content`. Notice that the thinking field configured here is currently limited to `str` type content. If `openai_chat_send_back_thinking_parts` is set to `'field'`, this field must be set to a non-None value. #### openai_chat_send_back_thinking_parts ```python openai_chat_send_back_thinking_parts: Literal[ "auto", "tags", "field", False ] = "auto" ``` Whether the model includes thinking content in requests. This can be: * `'auto'` (default): Automatically detects how to send thinking content. If thinking was received in a custom field (tracked via `ThinkingPart.id` and `ThinkingPart.provider_name`), it's sent back in that same field. Otherwise, it's sent using tags. Only the `reasoning` and `reasoning_content` fields are checked by default when receiving responses. If your provider uses a different field name, you must explicitly set `openai_chat_thinking_field` to that field name. * `'tags'`: The thinking content is included in the main `content` field, enclosed within thinking tags as specified in `thinking_tags` profile option. * `'field'`: The thinking content is included in a separate field specified by `openai_chat_thinking_field`. * `False`: No thinking content is sent in the request. Defaults to `'auto'` to ensure thinking is sent back in the format expected by the model/provider. #### openai_supports_strict_tool_definition ```python openai_supports_strict_tool_definition: bool = True ``` This can be set by a provider or user if the OpenAI-"compatible" API doesn't support strict tool definitions. #### openai_supports_sampling_settings ```python openai_supports_sampling_settings: bool = True ``` Turn off to don't send sampling settings like `temperature` and `top_p` to models that don't support them, like OpenAI's o-series reasoning models. #### openai_unsupported_model_settings ```python openai_unsupported_model_settings: Sequence[str] = () ``` A list of model settings that are not supported by this model. #### openai_supports_tool_choice_required ```python openai_supports_tool_choice_required: bool = True ``` Whether the provider accepts the value `tool_choice='required'` in the request payload. #### openai_system_prompt_role ```python openai_system_prompt_role: OpenAISystemPromptRole | None = ( None ) ``` The role to use for the system prompt message. If not provided, defaults to `'system'`. #### openai_chat_supports_web_search ```python openai_chat_supports_web_search: bool = False ``` Whether the model supports web search in Chat Completions API. #### openai_chat_audio_input_encoding ```python openai_chat_audio_input_encoding: Literal[ "base64", "uri" ] = "base64" ``` The encoding to use for audio input in Chat Completions requests. - `'base64'`: Raw base64 encoded string. (Default, used by OpenAI) - `'uri'`: Data URI (e.g. `data:audio/wav;base64,...`). 
#### openai_chat_supports_file_urls ```python openai_chat_supports_file_urls: bool = False ``` Whether the Chat API supports file URLs directly in the `file_data` field. OpenAI's native Chat API only supports base64-encoded data, but some providers like OpenRouter support passing URLs directly. #### openai_supports_encrypted_reasoning_content ```python openai_supports_encrypted_reasoning_content: bool = False ``` Whether the model supports including encrypted reasoning content in the response. #### openai_supports_reasoning ```python openai_supports_reasoning: bool = False ``` Whether the model supports reasoning (o-series, GPT-5+). When True, sampling parameters may need to be dropped depending on reasoning_effort setting. #### openai_supports_reasoning_effort_none ```python openai_supports_reasoning_effort_none: bool = False ``` Whether the model supports sampling parameters (temperature, top_p, etc.) when reasoning_effort='none'. Models like GPT-5.1 and GPT-5.2 default to reasoning_effort='none' and support sampling params in that mode. When reasoning is enabled (low/medium/high/xhigh), sampling params are not supported. #### openai_responses_requires_function_call_status_none ```python openai_responses_requires_function_call_status_none: ( bool ) = False ``` Whether the Responses API requires the `status` field on function tool calls to be `None`. This is required by vLLM Responses API versions before https://github.com/vllm-project/vllm/pull/26706. See https://github.com/pydantic/pydantic-ai/issues/3245 for more details. ### openai_model_profile ```python openai_model_profile(model_name: str) -> ModelProfile ``` Get the model profile for an OpenAI model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/openai.py` ```python def openai_model_profile(model_name: str) -> ModelProfile: """Get the model profile for an OpenAI model.""" # GPT-5.1+ models use `reasoning={"effort": "none"}` by default, which allows sampling params. is_gpt_5_1_plus = model_name.startswith(('gpt-5.1', 'gpt-5.2')) # doesn't support `reasoning={"effort": "none"}` - default is set at 'medium' # see https://platform.openai.com/docs/guides/reasoning is_gpt_5 = model_name.startswith('gpt-5') and not is_gpt_5_1_plus # always reasoning is_o_series = model_name.startswith('o') thinking_always_enabled = is_o_series or (is_gpt_5 and 'gpt-5-chat' not in model_name) supports_reasoning = thinking_always_enabled or is_gpt_5_1_plus # The o1-mini model doesn't support the `system` role, so we default to `user`. # See https://github.com/pydantic/pydantic-ai/issues/974 for more details. openai_system_prompt_role = 'user' if model_name.startswith('o1-mini') else None # Check if the model supports web search (only specific search-preview models) supports_web_search = '-search-preview' in model_name supports_image_output = ( is_gpt_5 or is_gpt_5_1_plus or 'o3' in model_name or '4.1' in model_name or '4o' in model_name ) # Structured Outputs (output mode 'native') is only supported with the gpt-4o-mini, gpt-4o-mini-2024-07-18, # and gpt-4o-2024-08-06 model snapshots and later. We leave it in here for all models because the # `default_structured_output_mode` is `'tool'`, so `native` is only used when the user specifically uses # the `NativeOutput` marker, so an error from the API is acceptable. 
return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, supports_json_schema_output=True, supports_json_object_output=True, supports_image_output=supports_image_output, openai_system_prompt_role=openai_system_prompt_role, openai_chat_supports_web_search=supports_web_search, openai_supports_encrypted_reasoning_content=supports_reasoning, openai_supports_reasoning=supports_reasoning, openai_supports_reasoning_effort_none=is_gpt_5_1_plus, ) ``` ### OpenAIJsonSchemaTransformer Bases: `JsonSchemaTransformer` Recursively handle the schema to make it compatible with OpenAI strict mode. See https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode for more details, but this basically just requires: * `additionalProperties` must be set to false for each object in the parameters * all fields in properties must be marked as required Source code in `pydantic_ai_slim/pydantic_ai/profiles/openai.py` ```python @dataclass(init=False) class OpenAIJsonSchemaTransformer(JsonSchemaTransformer): """Recursively handle the schema to make it compatible with OpenAI strict mode. See https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode for more details, but this basically just requires: * `additionalProperties` must be set to false for each object in the parameters * all fields in properties must be marked as required """ def __init__(self, schema: JsonSchema, *, strict: bool | None = None): super().__init__(schema, strict=strict) self.root_ref = schema.get('$ref') def walk(self) -> JsonSchema: # Note: OpenAI does not support anyOf at the root in strict mode # However, we don't need to check for it here because we ensure in pydantic_ai._utils.check_object_json_schema # that the root schema either has type 'object' or is recursive. result = super().walk() # For recursive models, we need to tweak the schema to make it compatible with strict mode. # Because the following should never change the semantics of the schema we apply it unconditionally. if self.root_ref is not None: result.pop('$ref', None) # We replace references to the self.root_ref with just '#' in the transform method root_key = re.sub(r'^#/\$defs/', '', self.root_ref) result.update(self.defs.get(root_key) or {}) return result def transform(self, schema: JsonSchema) -> JsonSchema: # noqa: C901 # Remove unnecessary keys schema.pop('title', None) schema.pop('$schema', None) schema.pop('discriminator', None) default = schema.get('default', _sentinel) if default is not _sentinel: # the "default" keyword is not allowed in strict mode, but including it makes some Ollama models behave # better, so we keep it around when not strict if self.strict is True: schema.pop('default', None) elif self.strict is None: # pragma: no branch self.is_strict_compatible = False if schema_ref := schema.get('$ref'): if schema_ref == self.root_ref: schema['$ref'] = '#' if len(schema) > 1: # OpenAI Strict mode doesn't support siblings to "$ref", but _does_ allow siblings to "anyOf". 
# So if there is a "description" field or any other extra info, we move the "$ref" into an "anyOf": schema['anyOf'] = [{'$ref': schema.pop('$ref')}] # Track strict-incompatible keys incompatible_values: dict[str, Any] = {} for key in _STRICT_INCOMPATIBLE_KEYS: value = schema.get(key, _sentinel) if value is not _sentinel: incompatible_values[key] = value if format := schema.get('format'): if format not in _STRICT_COMPATIBLE_STRING_FORMATS: incompatible_values['format'] = format description = schema.get('description') if incompatible_values: if self.strict is True: notes: list[str] = [] for key, value in incompatible_values.items(): schema.pop(key) notes.append(f'{key}={value}') notes_string = ', '.join(notes) schema['description'] = notes_string if not description else f'{description} ({notes_string})' elif self.strict is None: # pragma: no branch self.is_strict_compatible = False schema_type = schema.get('type') if 'oneOf' in schema: # OpenAI does not support oneOf in strict mode if self.strict is True: schema['anyOf'] = schema.pop('oneOf') else: self.is_strict_compatible = False if schema_type == 'object': # Always ensure 'properties' key exists - OpenAI drops objects without it if 'properties' not in schema: schema['properties'] = dict[str, Any]() if self.strict is True: # additional properties are disallowed schema['additionalProperties'] = False # all properties are required schema['required'] = list(schema['properties'].keys()) elif self.strict is None: if schema.get('additionalProperties', None) not in (None, False): self.is_strict_compatible = False else: # additional properties are disallowed by default schema['additionalProperties'] = False if 'properties' not in schema or 'required' not in schema: self.is_strict_compatible = False else: required = schema['required'] for k in schema['properties'].keys(): if k not in required: self.is_strict_compatible = False return schema ``` ### anthropic_model_profile ```python anthropic_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an Anthropic model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/anthropic.py` ```python def anthropic_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an Anthropic model.""" models_that_support_json_schema_output = ( 'claude-haiku-4-5', 'claude-sonnet-4-5', 'claude-opus-4-1', 'claude-opus-4-5', 'claude-opus-4-6', ) """These models support both structured outputs and strict tool calling.""" # TODO update when new models are released that support structured outputs # https://docs.claude.com/en/docs/build-with-claude/structured-outputs#example-usage supports_json_schema_output = model_name.startswith(models_that_support_json_schema_output) return ModelProfile( thinking_tags=('', ''), supports_json_schema_output=supports_json_schema_output, ) ``` ### GoogleModelProfile Bases: `ModelProfile` Profile for models used with `GoogleModel`. ALL FIELDS MUST BE `google_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. Source code in `pydantic_ai_slim/pydantic_ai/profiles/google.py` ```python @dataclass(kw_only=True) class GoogleModelProfile(ModelProfile): """Profile for models used with `GoogleModel`. ALL FIELDS MUST BE `google_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. """ google_supports_native_output_with_builtin_tools: bool = False """Whether the model supports native output with builtin tools. 
See https://ai.google.dev/gemini-api/docs/structured-output?example=recipe#structured_outputs_with_tools""" ``` #### google_supports_native_output_with_builtin_tools ```python google_supports_native_output_with_builtin_tools: bool = ( False ) ``` Whether the model supports native output with builtin tools. See https://ai.google.dev/gemini-api/docs/structured-output?example=recipe#structured_outputs_with_tools ### google_model_profile ```python google_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a Google model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/google.py` ```python def google_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Google model.""" is_image_model = 'image' in model_name is_3_or_newer = 'gemini-3' in model_name return GoogleModelProfile( json_schema_transformer=GoogleJsonSchemaTransformer, supports_image_output=is_image_model, supports_json_schema_output=is_3_or_newer or not is_image_model, supports_json_object_output=is_3_or_newer or not is_image_model, supports_tools=not is_image_model, google_supports_native_output_with_builtin_tools=is_3_or_newer, ) ``` ### GoogleJsonSchemaTransformer Bases: `JsonSchemaTransformer` Transforms the JSON Schema from Pydantic to be suitable for Gemini. Gemini supports [a subset of OpenAPI v3.0.3](https://ai.google.dev/gemini-api/docs/function-calling#function_declarations). Source code in `pydantic_ai_slim/pydantic_ai/profiles/google.py` ```python class GoogleJsonSchemaTransformer(JsonSchemaTransformer): """Transforms the JSON Schema from Pydantic to be suitable for Gemini. Gemini supports [a subset of OpenAPI v3.0.3](https://ai.google.dev/gemini-api/docs/function-calling#function_declarations). """ def transform(self, schema: JsonSchema) -> JsonSchema: # Remove properties not supported by Gemini schema.pop('$schema', None) if (const := schema.pop('const', None)) is not None: # Gemini doesn't support const, but it does support enum with a single value schema['enum'] = [const] # If type is not present, infer it from the const value for Gemini API compatibility if 'type' not in schema: if isinstance(const, str): schema['type'] = 'string' elif isinstance(const, bool): # bool must be checked before int since bool is a subclass of int in Python schema['type'] = 'boolean' elif isinstance(const, int): schema['type'] = 'integer' elif isinstance(const, float): schema['type'] = 'number' schema.pop('discriminator', None) schema.pop('examples', None) # Remove 'title' due to https://github.com/googleapis/python-genai/issues/1732 schema.pop('title', None) type_ = schema.get('type') if type_ == 'string' and (fmt := schema.pop('format', None)): description = schema.get('description') if description: schema['description'] = f'{description} (format: {fmt})' else: schema['description'] = f'Format: {fmt}' # Note: exclusiveMinimum/exclusiveMaximum are NOT yet supported schema.pop('exclusiveMinimum', None) schema.pop('exclusiveMaximum', None) return schema ``` ### meta_model_profile ```python meta_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Meta model. 
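A quick sketch of calling this profile function directly (the model name is arbitrary); in normal use a provider selects it for you via `Provider.model_profile`:

```python
from pydantic_ai.profiles.meta import meta_model_profile

profile = meta_model_profile('llama-3.3-70b-instruct')
if profile is not None:
    # Meta models get a profile whose schema transformer inlines `$defs`.
    print(profile.json_schema_transformer.__name__)
    #> InlineDefsJsonSchemaTransformer
```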
Source code in `pydantic_ai_slim/pydantic_ai/profiles/meta.py` ```python def meta_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Meta model.""" return ModelProfile(json_schema_transformer=InlineDefsJsonSchemaTransformer) ``` ### amazon_model_profile ```python amazon_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an Amazon model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/amazon.py` ```python def amazon_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an Amazon model.""" return ModelProfile(json_schema_transformer=InlineDefsJsonSchemaTransformer) ``` ### deepseek_model_profile ```python deepseek_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a DeepSeek model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/deepseek.py` ```python def deepseek_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a DeepSeek model.""" return ModelProfile(ignore_streamed_leading_whitespace='r1' in model_name) ``` ### GrokModelProfile Bases: `ModelProfile` Profile for Grok models (used with both GrokProvider and XaiProvider). ALL FIELDS MUST BE `grok_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. Source code in `pydantic_ai_slim/pydantic_ai/profiles/grok.py` ```python @dataclass(kw_only=True) class GrokModelProfile(ModelProfile): """Profile for Grok models (used with both GrokProvider and XaiProvider). ALL FIELDS MUST BE `grok_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. """ grok_supports_builtin_tools: bool = False """Whether the model supports builtin tools (web_search, code_execution, mcp).""" grok_supports_tool_choice_required: bool = True """Whether the provider accepts the value ``tool_choice='required'`` in the request payload.""" ``` #### grok_supports_builtin_tools ```python grok_supports_builtin_tools: bool = False ``` Whether the model supports builtin tools (web_search, code_execution, mcp). #### grok_supports_tool_choice_required ```python grok_supports_tool_choice_required: bool = True ``` Whether the provider accepts the value `tool_choice='required'` in the request payload. ### grok_model_profile ```python grok_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Grok model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/grok.py` ```python def grok_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Grok model.""" # Grok-4 models support builtin tools grok_supports_builtin_tools = model_name.startswith('grok-4') or 'code' in model_name # Set supported builtin tools based on model capability supported_builtin_tools: frozenset[type[AbstractBuiltinTool]] = ( SUPPORTED_BUILTIN_TOOLS if grok_supports_builtin_tools else frozenset() ) return GrokModelProfile( # xAI supports tool calling supports_tools=True, # xAI supports JSON schema output for structured responses supports_json_schema_output=True, # xAI supports JSON object output supports_json_object_output=True, # Support for builtin tools (web_search, code_execution, mcp) grok_supports_builtin_tools=grok_supports_builtin_tools, supported_builtin_tools=supported_builtin_tools, ) ``` ### mistral_model_profile ```python mistral_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a Mistral model. 
Source code in `pydantic_ai_slim/pydantic_ai/profiles/mistral.py` ```python def mistral_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Mistral model.""" return None ``` ### qwen_model_profile ```python qwen_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Qwen model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/qwen.py` ```python def qwen_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Qwen model.""" if model_name.startswith('qwen-3-coder'): return OpenAIModelProfile( json_schema_transformer=InlineDefsJsonSchemaTransformer, openai_supports_tool_choice_required=False, openai_supports_strict_tool_definition=False, ignore_streamed_leading_whitespace=True, ) return ModelProfile( json_schema_transformer=InlineDefsJsonSchemaTransformer, ignore_streamed_leading_whitespace=True, ) ``` # `pydantic_ai.providers` Bases: `ABC`, `Generic[InterfaceClient]` Abstract class for a provider. The provider is in charge of providing an authenticated client to the API. Each provider only supports a specific interface. A interface can be supported by multiple providers. For example, the `OpenAIChatModel` interface can be supported by the `OpenAIProvider` and the `DeepSeekProvider`. Source code in `pydantic_ai_slim/pydantic_ai/providers/__init__.py` ```python class Provider(ABC, Generic[InterfaceClient]): """Abstract class for a provider. The provider is in charge of providing an authenticated client to the API. Each provider only supports a specific interface. A interface can be supported by multiple providers. For example, the `OpenAIChatModel` interface can be supported by the `OpenAIProvider` and the `DeepSeekProvider`. """ _client: InterfaceClient @property @abstractmethod def name(self) -> str: """The provider name.""" raise NotImplementedError() @property @abstractmethod def base_url(self) -> str: """The base URL for the provider API.""" raise NotImplementedError() @property @abstractmethod def client(self) -> InterfaceClient: """The client for the provider.""" raise NotImplementedError() def model_profile(self, model_name: str) -> ModelProfile | None: """The model profile for the named model, if available.""" return None # pragma: no cover def __repr__(self) -> str: return f'{self.__class__.__name__}(name={self.name}, base_url={self.base_url})' # pragma: lax no cover ``` ### name ```python name: str ``` The provider name. ### base_url ```python base_url: str ``` The base URL for the provider API. ### client ```python client: InterfaceClient ``` The client for the provider. ### model_profile ```python model_profile(model_name: str) -> ModelProfile | None ``` The model profile for the named model, if available. Source code in `pydantic_ai_slim/pydantic_ai/providers/__init__.py` ```python def model_profile(self, model_name: str) -> ModelProfile | None: """The model profile for the named model, if available.""" return None # pragma: no cover ``` Create a new Gateway provider. Parameters: | Name | Type | Description | Default | | ------------------- | ------------------ | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `upstream_provider` | \`UpstreamProvider | str\` | The upstream provider to use. | | `route` | \`str | None\` | The name of the provider or routing group to use to handle the request. 
If not provided, the default routing group for the API format will be used. | | `api_key` | \`str | None\` | The API key to use for authentication. If not provided, the PYDANTIC_AI_GATEWAY_API_KEY environment variable will be used if available. | | `base_url` | \`str | None\` | The base URL to use for the Gateway. If not provided, the PYDANTIC_AI_GATEWAY_BASE_URL environment variable will be used if available. Otherwise, defaults to https://gateway.pydantic.dev/proxy. | | `http_client` | \`AsyncClient | None\` | The HTTP client to use for the Gateway. | Source code in `pydantic_ai_slim/pydantic_ai/providers/gateway.py` ```python def gateway_provider( upstream_provider: UpstreamProvider | str, /, *, # Every provider route: str | None = None, api_key: str | None = None, base_url: str | None = None, # OpenAI, Groq, Anthropic & Gemini - Only Bedrock doesn't have an HTTPX client. http_client: httpx.AsyncClient | None = None, ) -> Provider[Any]: """Create a new Gateway provider. Args: upstream_provider: The upstream provider to use. route: The name of the provider or routing group to use to handle the request. If not provided, the default routing group for the API format will be used. api_key: The API key to use for authentication. If not provided, the `PYDANTIC_AI_GATEWAY_API_KEY` environment variable will be used if available. base_url: The base URL to use for the Gateway. If not provided, the `PYDANTIC_AI_GATEWAY_BASE_URL` environment variable will be used if available. Otherwise, defaults to `https://gateway.pydantic.dev/proxy`. http_client: The HTTP client to use for the Gateway. """ api_key = api_key or os.getenv('PYDANTIC_AI_GATEWAY_API_KEY', os.getenv('PAIG_API_KEY')) if not api_key: raise UserError( 'Set the `PYDANTIC_AI_GATEWAY_API_KEY` environment variable or pass it via `gateway_provider(..., api_key=...)`' ' to use the Pydantic AI Gateway provider.' ) base_url = ( base_url or os.getenv('PYDANTIC_AI_GATEWAY_BASE_URL', os.getenv('PAIG_BASE_URL')) or _infer_base_url(api_key) ) http_client = http_client or cached_async_http_client(provider=f'gateway/{upstream_provider}') http_client.event_hooks = {'request': [_request_hook(api_key)]} if route is None: # Use the implied providerId as the default route. route = normalize_gateway_provider(upstream_provider) base_url = _merge_url_path(base_url, route) if upstream_provider in ('openai', 'openai-chat', 'openai-responses', 'chat', 'responses'): from .openai import OpenAIProvider return OpenAIProvider(api_key=api_key, base_url=base_url, http_client=http_client) elif upstream_provider == 'groq': from .groq import GroqProvider return GroqProvider(api_key=api_key, base_url=base_url, http_client=http_client) elif upstream_provider == 'anthropic': from anthropic import AsyncAnthropic from .anthropic import AnthropicProvider return AnthropicProvider( anthropic_client=AsyncAnthropic(auth_token=api_key, base_url=base_url, http_client=http_client) ) elif upstream_provider in ('bedrock', 'converse'): from .bedrock import BedrockProvider return BedrockProvider( api_key=api_key, base_url=base_url, region_name='pydantic-ai-gateway', # Fake region name to avoid NoRegionError ) elif upstream_provider in ('google-vertex', 'gemini'): from .google import GoogleProvider return GoogleProvider(vertexai=True, api_key=api_key, base_url=base_url, http_client=http_client) else: raise UserError(f'Unknown upstream provider: {upstream_provider}') ``` Bases: `Provider[AsyncAnthropicClient]` Provider for Anthropic API. 
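A brief usage sketch before the source listing: construct the provider explicitly (rather than relying on the `ANTHROPIC_API_KEY` environment variable) and pass it to a model. This assumes the usual `AnthropicModel(..., provider=...)` pattern from the models API; the API key value is a placeholder.

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

# Explicit provider construction; a custom `base_url`, `http_client`,
# or pre-built `anthropic_client` could be passed instead.
provider = AnthropicProvider(api_key='your-anthropic-api-key')
model = AnthropicModel('claude-sonnet-4-5', provider=provider)
agent = Agent(model, instructions='Reply briefly.')
```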
Source code in `pydantic_ai_slim/pydantic_ai/providers/anthropic.py` ```python class AnthropicProvider(Provider[AsyncAnthropicClient]): """Provider for Anthropic API.""" @property def name(self) -> str: return 'anthropic' @property def base_url(self) -> str: return str(self._client.base_url) @property def client(self) -> AsyncAnthropicClient: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: profile = anthropic_model_profile(model_name) return ModelProfile(json_schema_transformer=AnthropicJsonSchemaTransformer).update(profile) @overload def __init__(self, *, anthropic_client: AsyncAnthropicClient | None = None) -> None: ... @overload def __init__( self, *, api_key: str | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None ) -> None: ... def __init__( self, *, api_key: str | None = None, base_url: str | None = None, anthropic_client: AsyncAnthropicClient | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Anthropic provider. Args: api_key: The API key to use for authentication, if not provided, the `ANTHROPIC_API_KEY` environment variable will be used if available. base_url: The base URL to use for the Anthropic API. anthropic_client: An existing Anthropic client to use. Accepts [`AsyncAnthropic`](https://github.com/anthropics/anthropic-sdk-python), [`AsyncAnthropicBedrock`](https://docs.anthropic.com/en/api/claude-on-amazon-bedrock), [`AsyncAnthropicFoundry`](https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry), or [`AsyncAnthropicVertex`](https://docs.anthropic.com/en/api/claude-on-vertex-ai). If provided, the `api_key` and `http_client` arguments will be ignored. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if anthropic_client is not None: assert http_client is None, 'Cannot provide both `anthropic_client` and `http_client`' assert api_key is None, 'Cannot provide both `anthropic_client` and `api_key`' self._client = anthropic_client else: api_key = api_key or os.getenv('ANTHROPIC_API_KEY') if not api_key: raise UserError( 'Set the `ANTHROPIC_API_KEY` environment variable or pass it via `AnthropicProvider(api_key=...)`' 'to use the Anthropic provider.' ) if http_client is not None: self._client = AsyncAnthropic(api_key=api_key, base_url=base_url, http_client=http_client) else: http_client = cached_async_http_client(provider='anthropic') self._client = AsyncAnthropic(api_key=api_key, base_url=base_url, http_client=http_client) ``` ### __init__ ```python __init__( *, anthropic_client: AsyncAnthropicClient | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, base_url: str | None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, base_url: str | None = None, anthropic_client: AsyncAnthropicClient | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Anthropic provider. Parameters: | Name | Type | Description | Default | | ------------------ | ---------------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the ANTHROPIC_API_KEY environment variable will be used if available. | | `base_url` | \`str | None\` | The base URL to use for the Anthropic API. 
| | `anthropic_client` | \`AsyncAnthropicClient | None\` | An existing Anthropic client to use. Accepts AsyncAnthropic, AsyncAnthropicBedrock, AsyncAnthropicFoundry, or AsyncAnthropicVertex. If provided, the api_key and http_client arguments will be ignored. | | `http_client` | \`AsyncClient | None\` | An existing httpx.AsyncClient to use for making HTTP requests. | Source code in `pydantic_ai_slim/pydantic_ai/providers/anthropic.py` ```python def __init__( self, *, api_key: str | None = None, base_url: str | None = None, anthropic_client: AsyncAnthropicClient | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Anthropic provider. Args: api_key: The API key to use for authentication, if not provided, the `ANTHROPIC_API_KEY` environment variable will be used if available. base_url: The base URL to use for the Anthropic API. anthropic_client: An existing Anthropic client to use. Accepts [`AsyncAnthropic`](https://github.com/anthropics/anthropic-sdk-python), [`AsyncAnthropicBedrock`](https://docs.anthropic.com/en/api/claude-on-amazon-bedrock), [`AsyncAnthropicFoundry`](https://platform.claude.com/docs/en/build-with-claude/claude-in-microsoft-foundry), or [`AsyncAnthropicVertex`](https://docs.anthropic.com/en/api/claude-on-vertex-ai). If provided, the `api_key` and `http_client` arguments will be ignored. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if anthropic_client is not None: assert http_client is None, 'Cannot provide both `anthropic_client` and `http_client`' assert api_key is None, 'Cannot provide both `anthropic_client` and `api_key`' self._client = anthropic_client else: api_key = api_key or os.getenv('ANTHROPIC_API_KEY') if not api_key: raise UserError( 'Set the `ANTHROPIC_API_KEY` environment variable or pass it via `AnthropicProvider(api_key=...)`' 'to use the Anthropic provider.' ) if http_client is not None: self._client = AsyncAnthropic(api_key=api_key, base_url=base_url, http_client=http_client) else: http_client = cached_async_http_client(provider='anthropic') self._client = AsyncAnthropic(api_key=api_key, base_url=base_url, http_client=http_client) ``` ### GoogleProvider Bases: `Provider[Client]` Provider for Google. Source code in `pydantic_ai_slim/pydantic_ai/providers/google.py` ```python class GoogleProvider(Provider[Client]): """Provider for Google.""" @property def name(self) -> str: return 'google-vertex' if self._client._api_client.vertexai else 'google-gla' # type: ignore[reportPrivateUsage] @property def base_url(self) -> str: return str(self._client._api_client._http_options.base_url) # type: ignore[reportPrivateUsage] @property def client(self) -> Client: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return google_model_profile(model_name) @overload def __init__( self, *, api_key: str, http_client: httpx.AsyncClient | None = None, base_url: str | None = None ) -> None: ... @overload def __init__( self, *, credentials: Credentials | None = None, project: str | None = None, location: VertexAILocation | Literal['global'] | str | None = None, http_client: httpx.AsyncClient | None = None, base_url: str | None = None, ) -> None: ... @overload def __init__(self, *, client: Client) -> None: ... @overload def __init__( self, *, vertexai: bool = False, api_key: str | None = None, http_client: httpx.AsyncClient | None = None, base_url: str | None = None, ) -> None: ... 
def __init__( self, *, api_key: str | None = None, credentials: Credentials | None = None, project: str | None = None, location: VertexAILocation | Literal['global'] | str | None = None, vertexai: bool | None = None, client: Client | None = None, http_client: httpx.AsyncClient | None = None, base_url: str | None = None, ) -> None: """Create a new Google provider. Args: api_key: The `API key `_ to use for authentication. It can also be set via the `GOOGLE_API_KEY` environment variable. credentials: The credentials to use for authentication when calling the Vertex AI APIs. Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only. project: The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE_CLOUD_PROJECT). Applies to the Vertex AI API only. location: The location to send API requests to (for example, us-central1). Can be obtained from environment variables. Applies to the Vertex AI API only. vertexai: Force the use of the Vertex AI API. If `False`, the Google Generative Language API will be used. Defaults to `False` unless `location`, `project`, or `credentials` are provided. client: A pre-initialized client to use. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. base_url: The base URL for the Google API. """ if client is None: # NOTE: We are keeping GEMINI_API_KEY for backwards compatibility. api_key = api_key or os.getenv('GOOGLE_API_KEY') or os.getenv('GEMINI_API_KEY') vertex_ai_args_used = bool(location or project or credentials) if vertexai is None: vertexai = vertex_ai_args_used http_client = http_client or cached_async_http_client( provider='google-vertex' if vertexai else 'google-gla' ) # Note: google-genai's HttpOptions.timeout defaults to None, which causes # the SDK to explicitly pass timeout=None to httpx, overriding any timeout # configured on the httpx client. We must set the timeout here to ensure # requests actually time out. Read the timeout from the http_client if set, # otherwise use the default. The value is converted from seconds to milliseconds. timeout_seconds = http_client.timeout.read or DEFAULT_HTTP_TIMEOUT timeout_ms = int(timeout_seconds * 1000) http_options = HttpOptions( base_url=base_url, headers={'User-Agent': get_user_agent()}, httpx_async_client=http_client, timeout=timeout_ms, ) if not vertexai: if api_key is None: raise UserError( 'Set the `GOOGLE_API_KEY` environment variable or pass it via `GoogleProvider(api_key=...)`' 'to use the Google Generative Language API.' ) self._client = Client(vertexai=False, api_key=api_key, http_options=http_options) else: if vertex_ai_args_used: api_key = None if api_key is None: project = project or os.getenv('GOOGLE_CLOUD_PROJECT') # From https://github.com/pydantic/pydantic-ai/pull/2031/files#r2169682149: # Currently `us-central1` supports the most models by far of any region including `global`, but not # all of them. `us-central1` has all google models but is missing some Anthropic partner models, # which use `us-east5` instead. `global` has fewer models but higher availability. 
# For more details, check: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#available-regions location = location or os.getenv('GOOGLE_CLOUD_LOCATION') or 'us-central1' self._client = Client( vertexai=True, api_key=api_key, project=project, location=location, credentials=credentials, http_options=http_options, ) else: self._client = client # pragma: no cover ``` #### __init__ ```python __init__( *, api_key: str, http_client: AsyncClient | None = None, base_url: str | None = None ) -> None ``` ```python __init__( *, credentials: Credentials | None = None, project: str | None = None, location: ( VertexAILocation | Literal["global"] | str | None ) = None, http_client: AsyncClient | None = None, base_url: str | None = None ) -> None ``` ```python __init__(*, client: Client) -> None ``` ```python __init__( *, vertexai: bool = False, api_key: str | None = None, http_client: AsyncClient | None = None, base_url: str | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, credentials: Credentials | None = None, project: str | None = None, location: ( VertexAILocation | Literal["global"] | str | None ) = None, vertexai: bool | None = None, client: Client | None = None, http_client: AsyncClient | None = None, base_url: str | None = None ) -> None ``` Create a new Google provider. Parameters: | Name | Type | Description | Default | | ------------- | ------------------ | ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | The API key \_ to use for authentication. It can also be set via the GOOGLE_API_KEY environment variable. | | `credentials` | \`Credentials | None\` | The credentials to use for authentication when calling the Vertex AI APIs. Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only. | | `project` | \`str | None\` | The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE_CLOUD_PROJECT). Applies to the Vertex AI API only. | | `location` | \`VertexAILocation | Literal['global'] | str | | `vertexai` | \`bool | None\` | Force the use of the Vertex AI API. If False, the Google Generative Language API will be used. Defaults to False unless location, project, or credentials are provided. | | `client` | \`Client | None\` | A pre-initialized client to use. | | `http_client` | \`AsyncClient | None\` | An existing httpx.AsyncClient to use for making HTTP requests. | | `base_url` | \`str | None\` | The base URL for the Google API. | Source code in `pydantic_ai_slim/pydantic_ai/providers/google.py` ```python def __init__( self, *, api_key: str | None = None, credentials: Credentials | None = None, project: str | None = None, location: VertexAILocation | Literal['global'] | str | None = None, vertexai: bool | None = None, client: Client | None = None, http_client: httpx.AsyncClient | None = None, base_url: str | None = None, ) -> None: """Create a new Google provider. Args: api_key: The `API key `_ to use for authentication. It can also be set via the `GOOGLE_API_KEY` environment variable. credentials: The credentials to use for authentication when calling the Vertex AI APIs. 
Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only. project: The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE_CLOUD_PROJECT). Applies to the Vertex AI API only. location: The location to send API requests to (for example, us-central1). Can be obtained from environment variables. Applies to the Vertex AI API only. vertexai: Force the use of the Vertex AI API. If `False`, the Google Generative Language API will be used. Defaults to `False` unless `location`, `project`, or `credentials` are provided. client: A pre-initialized client to use. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. base_url: The base URL for the Google API. """ if client is None: # NOTE: We are keeping GEMINI_API_KEY for backwards compatibility. api_key = api_key or os.getenv('GOOGLE_API_KEY') or os.getenv('GEMINI_API_KEY') vertex_ai_args_used = bool(location or project or credentials) if vertexai is None: vertexai = vertex_ai_args_used http_client = http_client or cached_async_http_client( provider='google-vertex' if vertexai else 'google-gla' ) # Note: google-genai's HttpOptions.timeout defaults to None, which causes # the SDK to explicitly pass timeout=None to httpx, overriding any timeout # configured on the httpx client. We must set the timeout here to ensure # requests actually time out. Read the timeout from the http_client if set, # otherwise use the default. The value is converted from seconds to milliseconds. timeout_seconds = http_client.timeout.read or DEFAULT_HTTP_TIMEOUT timeout_ms = int(timeout_seconds * 1000) http_options = HttpOptions( base_url=base_url, headers={'User-Agent': get_user_agent()}, httpx_async_client=http_client, timeout=timeout_ms, ) if not vertexai: if api_key is None: raise UserError( 'Set the `GOOGLE_API_KEY` environment variable or pass it via `GoogleProvider(api_key=...)`' 'to use the Google Generative Language API.' ) self._client = Client(vertexai=False, api_key=api_key, http_options=http_options) else: if vertex_ai_args_used: api_key = None if api_key is None: project = project or os.getenv('GOOGLE_CLOUD_PROJECT') # From https://github.com/pydantic/pydantic-ai/pull/2031/files#r2169682149: # Currently `us-central1` supports the most models by far of any region including `global`, but not # all of them. `us-central1` has all google models but is missing some Anthropic partner models, # which use `us-east5` instead. `global` has fewer models but higher availability. 
# For more details, check: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#available-regions location = location or os.getenv('GOOGLE_CLOUD_LOCATION') or 'us-central1' self._client = Client( vertexai=True, api_key=api_key, project=project, location=location, credentials=credentials, http_options=http_options, ) else: self._client = client # pragma: no cover ``` ### VertexAILocation ```python VertexAILocation = Literal[ "asia-east1", "asia-east2", "asia-northeast1", "asia-northeast3", "asia-south1", "asia-southeast1", "australia-southeast1", "europe-central2", "europe-north1", "europe-southwest1", "europe-west1", "europe-west2", "europe-west3", "europe-west4", "europe-west6", "europe-west8", "europe-west9", "me-central1", "me-central2", "me-west1", "northamerica-northeast1", "southamerica-east1", "us-central1", "us-east1", "us-east4", "us-east5", "us-south1", "us-west1", "us-west4", ] ``` Regions available for Vertex AI. More details [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#genai-locations). ### OpenAIProvider Bases: `Provider[AsyncOpenAI]` Provider for OpenAI API. Source code in `pydantic_ai_slim/pydantic_ai/providers/openai.py` ```python class OpenAIProvider(Provider[AsyncOpenAI]): """Provider for OpenAI API.""" @property def name(self) -> str: return 'openai' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return openai_model_profile(model_name) @overload def __init__(self, *, openai_client: AsyncOpenAI) -> None: ... @overload def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: None = None, http_client: httpx.AsyncClient | None = None, ) -> None: ... def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new OpenAI provider. Args: base_url: The base url for the OpenAI requests. If not provided, the `OPENAI_BASE_URL` environment variable will be used if available. Otherwise, defaults to OpenAI's base url. api_key: The API key to use for authentication, if not provided, the `OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. 
if api_key is None and 'OPENAI_API_KEY' not in os.environ and base_url is not None and openai_client is None: api_key = 'api-key-not-set' if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='openai') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` #### __init__ ```python __init__(*, openai_client: AsyncOpenAI) -> None ``` ```python __init__( base_url: str | None = None, api_key: str | None = None, openai_client: None = None, http_client: AsyncClient | None = None, ) -> None ``` ```python __init__( base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None, ) -> None ``` Create a new OpenAI provider. Parameters: | Name | Type | Description | Default | | --------------- | ------------- | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `base_url` | \`str | None\` | The base url for the OpenAI requests. If not provided, the OPENAI_BASE_URL environment variable will be used if available. Otherwise, defaults to OpenAI's base url. | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the OPENAI_API_KEY environment variable will be used if available. | | `openai_client` | \`AsyncOpenAI | None\` | An existing AsyncOpenAI client to use. If provided, base_url, api_key, and http_client must be None. | | `http_client` | \`AsyncClient | None\` | An existing httpx.AsyncClient to use for making HTTP requests. | Source code in `pydantic_ai_slim/pydantic_ai/providers/openai.py` ```python def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new OpenAI provider. Args: base_url: The base url for the OpenAI requests. If not provided, the `OPENAI_BASE_URL` environment variable will be used if available. Otherwise, defaults to OpenAI's base url. api_key: The API key to use for authentication, if not provided, the `OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. 
if api_key is None and 'OPENAI_API_KEY' not in os.environ and base_url is not None and openai_client is None: api_key = 'api-key-not-set' if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='openai') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` ### XaiProvider Bases: `Provider[AsyncClient]` Provider for xAI API (native xAI SDK). Source code in `pydantic_ai_slim/pydantic_ai/providers/xai.py` ```python class XaiProvider(Provider[AsyncClient]): """Provider for xAI API (native xAI SDK).""" @property def name(self) -> str: return 'xai' @property def base_url(self) -> str: return 'https://api.x.ai/v1' @property def client(self) -> AsyncClient: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return grok_model_profile(model_name) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, xai_client: AsyncClient) -> None: ... def __init__( self, *, api_key: str | None = None, xai_client: AsyncClient | None = None, ) -> None: """Create a new xAI provider. Args: api_key: The API key to use for authentication, if not provided, the `XAI_API_KEY` environment variable will be used if available. xai_client: An existing `xai_sdk.AsyncClient` to use. This takes precedence over `api_key`. """ if xai_client is not None: self._client = xai_client else: api_key = api_key or os.getenv('XAI_API_KEY') if not api_key: raise UserError( 'Set the `XAI_API_KEY` environment variable or pass it via `XaiProvider(api_key=...)`' 'to use the xAI provider.' ) self._client = AsyncClient(api_key=api_key) ``` #### __init__ ```python __init__() -> None ``` ```python __init__(*, api_key: str) -> None ``` ```python __init__(*, xai_client: AsyncClient) -> None ``` ```python __init__( *, api_key: str | None = None, xai_client: AsyncClient | None = None ) -> None ``` Create a new xAI provider. Parameters: | Name | Type | Description | Default | | ------------ | ------------- | ----------- | ----------------------------------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the XAI_API_KEY environment variable will be used if available. | | `xai_client` | \`AsyncClient | None\` | An existing xai_sdk.AsyncClient to use. This takes precedence over api_key. | Source code in `pydantic_ai_slim/pydantic_ai/providers/xai.py` ```python def __init__( self, *, api_key: str | None = None, xai_client: AsyncClient | None = None, ) -> None: """Create a new xAI provider. Args: api_key: The API key to use for authentication, if not provided, the `XAI_API_KEY` environment variable will be used if available. xai_client: An existing `xai_sdk.AsyncClient` to use. This takes precedence over `api_key`. """ if xai_client is not None: self._client = xai_client else: api_key = api_key or os.getenv('XAI_API_KEY') if not api_key: raise UserError( 'Set the `XAI_API_KEY` environment variable or pass it via `XaiProvider(api_key=...)`' 'to use the xAI provider.' 
) self._client = AsyncClient(api_key=api_key) ``` ### DeepSeekProvider Bases: `Provider[AsyncOpenAI]` Provider for DeepSeek API. Source code in `pydantic_ai_slim/pydantic_ai/providers/deepseek.py` ```python class DeepSeekProvider(Provider[AsyncOpenAI]): """Provider for DeepSeek API.""" @property def name(self) -> str: return 'deepseek' @property def base_url(self) -> str: return 'https://api.deepseek.com' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: profile = deepseek_model_profile(model_name) # As DeepSeekProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly. # This was not the case when using a DeepSeek model with another model class (e.g. BedrockConverseModel or GroqModel), # so we won't do this in `deepseek_model_profile` unless we learn it's always needed. return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, supports_json_object_output=True, openai_chat_thinking_field='reasoning_content', # Starting from DeepSeek v3.2, DeepSeek requires sending thinking parts for optimal agentic performance. openai_chat_send_back_thinking_parts='field', # DeepSeek v3.2 reasoning mode does not support tool_choice=required yet openai_supports_tool_choice_required=(model_name != 'deepseek-reasoner'), ).update(profile) @overload def __init__(self, *, openai_client: AsyncOpenAI) -> None: ... @overload def __init__( self, *, api_key: str | None = None, openai_client: None = None, http_client: httpx.AsyncClient | None = None, ) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('DEEPSEEK_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `DEEPSEEK_API_KEY` environment variable or pass it via `DeepSeekProvider(api_key=...)`' 'to use the DeepSeek provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='deepseek') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` ### BedrockModelProfile Bases: `ModelProfile` Profile for models used with BedrockModel. ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python @dataclass(kw_only=True) class BedrockModelProfile(ModelProfile): """Profile for models used with BedrockModel. ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. """ bedrock_supports_tool_choice: bool = False bedrock_tool_result_format: Literal['text', 'json'] = 'text' bedrock_send_back_thinking_parts: bool = False bedrock_supports_prompt_caching: bool = False bedrock_supports_tool_caching: bool = False ``` ### bedrock_amazon_model_profile ```python bedrock_amazon_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an Amazon model used via Bedrock. 
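An illustrative call (assuming the Bedrock optional dependencies such as `boto3` are installed so the module imports): the function receives the model name with the `amazon.` prefix and version suffix already stripped, which is how `BedrockProvider.model_profile` below invokes it.

```python
from pydantic_ai.providers.bedrock import bedrock_amazon_model_profile

# 'nova-pro' is an illustrative name; any name containing 'nova' yields a
# BedrockModelProfile with tool choice and prompt caching enabled.
profile = bedrock_amazon_model_profile('nova-pro')
print(profile)
```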
Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python def bedrock_amazon_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an Amazon model used via Bedrock.""" profile = _without_builtin_tools(amazon_model_profile(model_name)) if 'nova' in model_name: profile = BedrockModelProfile( bedrock_supports_tool_choice=True, bedrock_supports_prompt_caching=True, ).update(profile) if 'nova-2' in model_name: profile.supported_builtin_tools = frozenset({CodeExecutionTool}) return profile ``` ### bedrock_deepseek_model_profile ```python bedrock_deepseek_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a DeepSeek model used via Bedrock. Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python def bedrock_deepseek_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a DeepSeek model used via Bedrock.""" profile = deepseek_model_profile(model_name) if 'r1' in model_name: return BedrockModelProfile(bedrock_send_back_thinking_parts=True).update(profile) return profile # pragma: no cover ``` ### remove_bedrock_geo_prefix ```python remove_bedrock_geo_prefix(model_name: str) -> str ``` Remove inference geographic prefix from model ID if present. Bedrock supports cross-region inference using geographic prefixes like 'us.', 'eu.', 'apac.', etc. This function strips those prefixes. Example 'us.amazon.titan-embed-text-v2:0' -> 'amazon.titan-embed-text-v2:0' 'amazon.titan-embed-text-v2:0' -> 'amazon.titan-embed-text-v2:0' Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python def remove_bedrock_geo_prefix(model_name: str) -> str: """Remove inference geographic prefix from model ID if present. Bedrock supports cross-region inference using geographic prefixes like 'us.', 'eu.', 'apac.', etc. This function strips those prefixes. Example: 'us.amazon.titan-embed-text-v2:0' -> 'amazon.titan-embed-text-v2:0' 'amazon.titan-embed-text-v2:0' -> 'amazon.titan-embed-text-v2:0' """ for prefix in BEDROCK_GEO_PREFIXES: if model_name.startswith(f'{prefix}.'): return model_name.removeprefix(f'{prefix}.') return model_name ``` ### BedrockProvider Bases: `Provider[BaseClient]` Provider for AWS Bedrock. 
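A minimal usage sketch (not from the source below): the provider is constructed with an explicit region and passed to `BedrockConverseModel`; the region and model id are illustrative placeholders, and AWS credentials are resolved through boto3's usual chain unless passed explicitly.

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel
from pydantic_ai.providers.bedrock import BedrockProvider

# Region and model id are illustrative placeholders; credentials come from the
# standard boto3 chain (environment variables, shared config, IAM roles).
provider = BedrockProvider(region_name='us-east-1')
model = BedrockConverseModel('anthropic.claude-sonnet-4-5-20250929-v1:0', provider=provider)
agent = Agent(model, instructions='Be concise.')
```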
Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python class BedrockProvider(Provider[BaseClient]): """Provider for AWS Bedrock.""" @property def name(self) -> str: return 'bedrock' @property def base_url(self) -> str: return self._client.meta.endpoint_url @property def client(self) -> BaseClient: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile: dict[str, Callable[[str], ModelProfile | None]] = { 'anthropic': lambda model_name: BedrockModelProfile( bedrock_supports_tool_choice=True, bedrock_send_back_thinking_parts=True, bedrock_supports_prompt_caching=True, bedrock_supports_tool_caching=True, ).update(_without_builtin_tools(anthropic_model_profile(model_name))), 'mistral': lambda model_name: BedrockModelProfile(bedrock_tool_result_format='json').update( _without_builtin_tools(mistral_model_profile(model_name)) ), 'cohere': lambda model_name: _without_builtin_tools(cohere_model_profile(model_name)), 'amazon': bedrock_amazon_model_profile, 'meta': lambda model_name: _without_builtin_tools(meta_model_profile(model_name)), 'deepseek': lambda model_name: _without_builtin_tools(bedrock_deepseek_model_profile(model_name)), } # Split the model name into parts parts = model_name.split('.', 2) # Handle regional prefixes if len(parts) > 2 and parts[0] in BEDROCK_GEO_PREFIXES: parts = parts[1:] # required format is provider.model-name-with-version if len(parts) < 2: return None provider = parts[0] model_name_with_version = parts[1] # Remove version suffix if it matches the format (e.g. "-v1:0" or "-v14") version_match = re.match(r'(.+)-v\d+(?::\d+)?$', model_name_with_version) if version_match: model_name = version_match.group(1) else: model_name = model_name_with_version if provider in provider_to_profile: return provider_to_profile[provider](model_name) return None @overload def __init__(self, *, bedrock_client: BaseClient) -> None: ... @overload def __init__( self, *, api_key: str, base_url: str | None = None, region_name: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None: ... @overload def __init__( self, *, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, base_url: str | None = None, region_name: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None: ... def __init__( self, *, bedrock_client: BaseClient | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, base_url: str | None = None, region_name: str | None = None, profile_name: str | None = None, api_key: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None: """Initialize the Bedrock provider. Args: bedrock_client: A boto3 client for Bedrock Runtime. If provided, other arguments are ignored. aws_access_key_id: The AWS access key ID. If not set, the `AWS_ACCESS_KEY_ID` environment variable will be used if available. aws_secret_access_key: The AWS secret access key. If not set, the `AWS_SECRET_ACCESS_KEY` environment variable will be used if available. aws_session_token: The AWS session token. If not set, the `AWS_SESSION_TOKEN` environment variable will be used if available. api_key: The API key for Bedrock client. 
Can be used instead of `aws_access_key_id`, `aws_secret_access_key`, and `aws_session_token`. If not set, the `AWS_BEARER_TOKEN_BEDROCK` environment variable will be used if available. base_url: The base URL for the Bedrock client. region_name: The AWS region name. If not set, the `AWS_DEFAULT_REGION` environment variable will be used if available. profile_name: The AWS profile name. aws_read_timeout: The read timeout for Bedrock client. aws_connect_timeout: The connect timeout for Bedrock client. """ if bedrock_client is not None: self._client = bedrock_client else: read_timeout = aws_read_timeout or float(os.getenv('AWS_READ_TIMEOUT', 300)) connect_timeout = aws_connect_timeout or float(os.getenv('AWS_CONNECT_TIMEOUT', 60)) config: dict[str, Any] = { 'read_timeout': read_timeout, 'connect_timeout': connect_timeout, } api_key = api_key or os.getenv('AWS_BEARER_TOKEN_BEDROCK') try: if api_key is not None: session = boto3.Session( botocore_session=_BearerTokenSession(api_key), region_name=region_name, profile_name=profile_name, ) config['signature_version'] = 'bearer' else: # pragma: lax no cover session = boto3.Session( aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key, aws_session_token=aws_session_token, region_name=region_name, profile_name=profile_name, ) self._client = session.client( # type: ignore[reportUnknownMemberType] 'bedrock-runtime', config=Config(**config), endpoint_url=base_url, ) except NoRegionError as exc: # pragma: no cover raise UserError('You must provide a `region_name` or a boto3 client for Bedrock Runtime.') from exc ``` #### __init__ ```python __init__(*, bedrock_client: BaseClient) -> None ``` ```python __init__( *, api_key: str, base_url: str | None = None, region_name: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None ) -> None ``` ```python __init__( *, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, base_url: str | None = None, region_name: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None ) -> None ``` ```python __init__( *, bedrock_client: BaseClient | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, base_url: str | None = None, region_name: str | None = None, profile_name: str | None = None, api_key: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None ) -> None ``` Initialize the Bedrock provider. Parameters: | Name | Type | Description | Default | | ----------------------- | ------------ | ----------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `bedrock_client` | \`BaseClient | None\` | A boto3 client for Bedrock Runtime. If provided, other arguments are ignored. | | `aws_access_key_id` | \`str | None\` | The AWS access key ID. If not set, the AWS_ACCESS_KEY_ID environment variable will be used if available. | | `aws_secret_access_key` | \`str | None\` | The AWS secret access key. If not set, the AWS_SECRET_ACCESS_KEY environment variable will be used if available. | | `aws_session_token` | \`str | None\` | The AWS session token. 
If not set, the AWS_SESSION_TOKEN environment variable will be used if available. | | `api_key` | \`str | None\` | The API key for Bedrock client. Can be used instead of aws_access_key_id, aws_secret_access_key, and aws_session_token. If not set, the AWS_BEARER_TOKEN_BEDROCK environment variable will be used if available. | | `base_url` | \`str | None\` | The base URL for the Bedrock client. | | `region_name` | \`str | None\` | The AWS region name. If not set, the AWS_DEFAULT_REGION environment variable will be used if available. | | `profile_name` | \`str | None\` | The AWS profile name. | | `aws_read_timeout` | \`float | None\` | The read timeout for Bedrock client. | | `aws_connect_timeout` | \`float | None\` | The connect timeout for Bedrock client. | Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python def __init__( self, *, bedrock_client: BaseClient | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, base_url: str | None = None, region_name: str | None = None, profile_name: str | None = None, api_key: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None: """Initialize the Bedrock provider. Args: bedrock_client: A boto3 client for Bedrock Runtime. If provided, other arguments are ignored. aws_access_key_id: The AWS access key ID. If not set, the `AWS_ACCESS_KEY_ID` environment variable will be used if available. aws_secret_access_key: The AWS secret access key. If not set, the `AWS_SECRET_ACCESS_KEY` environment variable will be used if available. aws_session_token: The AWS session token. If not set, the `AWS_SESSION_TOKEN` environment variable will be used if available. api_key: The API key for Bedrock client. Can be used instead of `aws_access_key_id`, `aws_secret_access_key`, and `aws_session_token`. If not set, the `AWS_BEARER_TOKEN_BEDROCK` environment variable will be used if available. base_url: The base URL for the Bedrock client. region_name: The AWS region name. If not set, the `AWS_DEFAULT_REGION` environment variable will be used if available. profile_name: The AWS profile name. aws_read_timeout: The read timeout for Bedrock client. aws_connect_timeout: The connect timeout for Bedrock client. 
""" if bedrock_client is not None: self._client = bedrock_client else: read_timeout = aws_read_timeout or float(os.getenv('AWS_READ_TIMEOUT', 300)) connect_timeout = aws_connect_timeout or float(os.getenv('AWS_CONNECT_TIMEOUT', 60)) config: dict[str, Any] = { 'read_timeout': read_timeout, 'connect_timeout': connect_timeout, } api_key = api_key or os.getenv('AWS_BEARER_TOKEN_BEDROCK') try: if api_key is not None: session = boto3.Session( botocore_session=_BearerTokenSession(api_key), region_name=region_name, profile_name=profile_name, ) config['signature_version'] = 'bearer' else: # pragma: lax no cover session = boto3.Session( aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key, aws_session_token=aws_session_token, region_name=region_name, profile_name=profile_name, ) self._client = session.client( # type: ignore[reportUnknownMemberType] 'bedrock-runtime', config=Config(**config), endpoint_url=base_url, ) except NoRegionError as exc: # pragma: no cover raise UserError('You must provide a `region_name` or a boto3 client for Bedrock Runtime.') from exc ``` ### groq_moonshotai_model_profile ```python groq_moonshotai_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an MoonshotAI model used with the Groq provider. Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python def groq_moonshotai_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an MoonshotAI model used with the Groq provider.""" return ModelProfile(supports_json_object_output=True, supports_json_schema_output=True).update( moonshotai_model_profile(model_name) ) ``` ### meta_groq_model_profile ```python meta_groq_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a Meta model used with the Groq provider. Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python def meta_groq_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Meta model used with the Groq provider.""" if model_name in {'llama-4-maverick-17b-128e-instruct', 'llama-4-scout-17b-16e-instruct'}: return ModelProfile(supports_json_object_output=True, supports_json_schema_output=True).update( meta_model_profile(model_name) ) else: return meta_model_profile(model_name) ``` ### GroqProvider Bases: `Provider[AsyncGroq]` Provider for Groq API. Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python class GroqProvider(Provider[AsyncGroq]): """Provider for Groq API.""" @property def name(self) -> str: return 'groq' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncGroq: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = { 'llama': meta_model_profile, 'meta-llama/': meta_groq_model_profile, 'gemma': google_model_profile, 'qwen': qwen_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'moonshotai/': groq_moonshotai_model_profile, 'compound-': groq_model_profile, 'openai/': openai_model_profile, } for prefix, profile_func in prefix_to_profile.items(): model_name = model_name.lower() if model_name.startswith(prefix): if prefix.endswith('/'): model_name = model_name[len(prefix) :] return profile_func(model_name) return None @overload def __init__(self, *, groq_client: AsyncGroq | None = None) -> None: ... 
@overload def __init__( self, *, api_key: str | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None ) -> None: ... def __init__( self, *, api_key: str | None = None, base_url: str | None = None, groq_client: AsyncGroq | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Groq provider. Args: api_key: The API key to use for authentication, if not provided, the `GROQ_API_KEY` environment variable will be used if available. base_url: The base url for the Groq requests. If not provided, the `GROQ_BASE_URL` environment variable will be used if available. Otherwise, defaults to Groq's base url. groq_client: An existing [`AsyncGroq`](https://github.com/groq/groq-python?tab=readme-ov-file#async-usage) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `AsyncClient` to use for making HTTP requests. """ if groq_client is not None: assert http_client is None, 'Cannot provide both `groq_client` and `http_client`' assert api_key is None, 'Cannot provide both `groq_client` and `api_key`' assert base_url is None, 'Cannot provide both `groq_client` and `base_url`' self._client = groq_client else: api_key = api_key or os.getenv('GROQ_API_KEY') base_url = base_url or os.getenv('GROQ_BASE_URL', 'https://api.groq.com') if not api_key: raise UserError( 'Set the `GROQ_API_KEY` environment variable or pass it via `GroqProvider(api_key=...)`' 'to use the Groq provider.' ) elif http_client is not None: self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='groq') self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) ``` #### __init__ ```python __init__(*, groq_client: AsyncGroq | None = None) -> None ``` ```python __init__( *, api_key: str | None = None, base_url: str | None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, base_url: str | None = None, groq_client: AsyncGroq | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Groq provider. Parameters: | Name | Type | Description | Default | | ------------- | ------------- | ----------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the GROQ_API_KEY environment variable will be used if available. | | `base_url` | \`str | None\` | The base url for the Groq requests. If not provided, the GROQ_BASE_URL environment variable will be used if available. Otherwise, defaults to Groq's base url. | | `groq_client` | \`AsyncGroq | None\` | An existing AsyncGroq client to use. If provided, api_key and http_client must be None. | | `http_client` | \`AsyncClient | None\` | An existing AsyncClient to use for making HTTP requests. | Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python def __init__( self, *, api_key: str | None = None, base_url: str | None = None, groq_client: AsyncGroq | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Groq provider. Args: api_key: The API key to use for authentication, if not provided, the `GROQ_API_KEY` environment variable will be used if available. base_url: The base url for the Groq requests. 
If not provided, the `GROQ_BASE_URL` environment variable will be used if available. Otherwise, defaults to Groq's base url. groq_client: An existing [`AsyncGroq`](https://github.com/groq/groq-python?tab=readme-ov-file#async-usage) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `AsyncClient` to use for making HTTP requests. """ if groq_client is not None: assert http_client is None, 'Cannot provide both `groq_client` and `http_client`' assert api_key is None, 'Cannot provide both `groq_client` and `api_key`' assert base_url is None, 'Cannot provide both `groq_client` and `base_url`' self._client = groq_client else: api_key = api_key or os.getenv('GROQ_API_KEY') base_url = base_url or os.getenv('GROQ_BASE_URL', 'https://api.groq.com') if not api_key: raise UserError( 'Set the `GROQ_API_KEY` environment variable or pass it via `GroqProvider(api_key=...)`' 'to use the Groq provider.' ) elif http_client is not None: self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='groq') self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) ``` ### AzureProvider Bases: `Provider[AsyncOpenAI]` Provider for Azure OpenAI API. See for more information. Source code in `pydantic_ai_slim/pydantic_ai/providers/azure.py` ```python class AzureProvider(Provider[AsyncOpenAI]): """Provider for Azure OpenAI API. See for more information. """ @property def name(self) -> str: return 'azure' @property def base_url(self) -> str: assert self._base_url is not None return self._base_url @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: model_name = model_name.lower() prefix_to_profile = { 'llama': meta_model_profile, 'meta-': meta_model_profile, 'deepseek': deepseek_model_profile, 'mistralai-': mistral_model_profile, 'mistral': mistral_model_profile, 'cohere-': cohere_model_profile, 'grok': grok_model_profile, } for prefix, profile_func in prefix_to_profile.items(): if model_name.startswith(prefix): if prefix.endswith('-'): model_name = model_name[len(prefix) :] profile = profile_func(model_name) # As AzureProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) # OpenAI models are unprefixed return openai_model_profile(model_name) @overload def __init__(self, *, openai_client: AsyncAzureOpenAI) -> None: ... @overload def __init__( self, *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: ... def __init__( self, *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, openai_client: AsyncAzureOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Azure provider. Args: azure_endpoint: The Azure endpoint to use for authentication, if not provided, the `AZURE_OPENAI_ENDPOINT` environment variable will be used if available. api_version: The API version to use for authentication, if not provided, the `OPENAI_API_VERSION` environment variable will be used if available. 
api_key: The API key to use for authentication, if not provided, the `AZURE_OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncAzureOpenAI`](https://github.com/openai/openai-python#microsoft-azure-openai) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert azure_endpoint is None, 'Cannot provide both `openai_client` and `azure_endpoint`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._base_url = str(openai_client.base_url) self._client = openai_client else: azure_endpoint = azure_endpoint or os.getenv('AZURE_OPENAI_ENDPOINT') if not azure_endpoint: raise UserError( 'Must provide one of the `azure_endpoint` argument or the `AZURE_OPENAI_ENDPOINT` environment variable' ) if not api_key and 'AZURE_OPENAI_API_KEY' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_key` argument or the `AZURE_OPENAI_API_KEY` environment variable' ) if not api_version and 'OPENAI_API_VERSION' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_version` argument or the `OPENAI_API_VERSION` environment variable' ) http_client = http_client or cached_async_http_client(provider='azure') self._client = AsyncAzureOpenAI( azure_endpoint=azure_endpoint, api_key=api_key, api_version=api_version, http_client=http_client, ) self._base_url = str(self._client.base_url) ``` #### __init__ ```python __init__(*, openai_client: AsyncAzureOpenAI) -> None ``` ```python __init__( *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, openai_client: AsyncAzureOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Azure provider. Parameters: | Name | Type | Description | Default | | ---------------- | ------------------ | ----------- | ---------------------------------------------------------------------------------------------------------------------------------------- | | `azure_endpoint` | \`str | None\` | The Azure endpoint to use for authentication, if not provided, the AZURE_OPENAI_ENDPOINT environment variable will be used if available. | | `api_version` | \`str | None\` | The API version to use for authentication, if not provided, the OPENAI_API_VERSION environment variable will be used if available. | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the AZURE_OPENAI_API_KEY environment variable will be used if available. | | `openai_client` | \`AsyncAzureOpenAI | None\` | An existing AsyncAzureOpenAI client to use. If provided, base_url, api_key, and http_client must be None. | | `http_client` | \`AsyncClient | None\` | An existing httpx.AsyncClient to use for making HTTP requests. | Source code in `pydantic_ai_slim/pydantic_ai/providers/azure.py` ```python def __init__( self, *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, openai_client: AsyncAzureOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Azure provider. 
Args: azure_endpoint: The Azure endpoint to use for authentication, if not provided, the `AZURE_OPENAI_ENDPOINT` environment variable will be used if available. api_version: The API version to use for authentication, if not provided, the `OPENAI_API_VERSION` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `AZURE_OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncAzureOpenAI`](https://github.com/openai/openai-python#microsoft-azure-openai) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert azure_endpoint is None, 'Cannot provide both `openai_client` and `azure_endpoint`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._base_url = str(openai_client.base_url) self._client = openai_client else: azure_endpoint = azure_endpoint or os.getenv('AZURE_OPENAI_ENDPOINT') if not azure_endpoint: raise UserError( 'Must provide one of the `azure_endpoint` argument or the `AZURE_OPENAI_ENDPOINT` environment variable' ) if not api_key and 'AZURE_OPENAI_API_KEY' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_key` argument or the `AZURE_OPENAI_API_KEY` environment variable' ) if not api_version and 'OPENAI_API_VERSION' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_version` argument or the `OPENAI_API_VERSION` environment variable' ) http_client = http_client or cached_async_http_client(provider='azure') self._client = AsyncAzureOpenAI( azure_endpoint=azure_endpoint, api_key=api_key, api_version=api_version, http_client=http_client, ) self._base_url = str(self._client.base_url) ``` ### CohereProvider Bases: `Provider[AsyncClientV2]` Provider for Cohere API. Source code in `pydantic_ai_slim/pydantic_ai/providers/cohere.py` ```python class CohereProvider(Provider[AsyncClientV2]): """Provider for Cohere API.""" @property def name(self) -> str: return 'cohere' @property def base_url(self) -> str: client_wrapper = self.client._client_wrapper # type: ignore return str(client_wrapper.get_base_url()) @property def client(self) -> AsyncClientV2: return self._client @property def v1_client(self) -> AsyncClient | None: return self._v1_client def model_profile(self, model_name: str) -> ModelProfile | None: return cohere_model_profile(model_name) def __init__( self, *, api_key: str | None = None, cohere_client: AsyncClientV2 | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Cohere provider. Args: api_key: The API key to use for authentication, if not provided, the `CO_API_KEY` environment variable will be used if available. cohere_client: An existing [AsyncClientV2](https://github.com/cohere-ai/cohere-python) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. 
""" if cohere_client is not None: assert http_client is None, 'Cannot provide both `cohere_client` and `http_client`' assert api_key is None, 'Cannot provide both `cohere_client` and `api_key`' self._client = cohere_client self._v1_client = None else: api_key = api_key or os.getenv('CO_API_KEY') if not api_key: raise UserError( 'Set the `CO_API_KEY` environment variable or pass it via `CohereProvider(api_key=...)`' 'to use the Cohere provider.' ) base_url = os.getenv('CO_BASE_URL') if http_client is not None: self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) self._v1_client = AsyncClient(api_key=api_key, httpx_client=http_client, base_url=base_url) else: http_client = cached_async_http_client(provider='cohere') self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) self._v1_client = AsyncClient(api_key=api_key, httpx_client=http_client, base_url=base_url) ``` #### __init__ ```python __init__( *, api_key: str | None = None, cohere_client: AsyncClientV2 | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Cohere provider. Parameters: | Name | Type | Description | Default | | --------------- | --------------- | ----------- | ---------------------------------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the CO_API_KEY environment variable will be used if available. | | `cohere_client` | \`AsyncClientV2 | None\` | An existing AsyncClientV2 client to use. If provided, api_key and http_client must be None. | | `http_client` | \`AsyncClient | None\` | An existing httpx.AsyncClient to use for making HTTP requests. | Source code in `pydantic_ai_slim/pydantic_ai/providers/cohere.py` ```python def __init__( self, *, api_key: str | None = None, cohere_client: AsyncClientV2 | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Cohere provider. Args: api_key: The API key to use for authentication, if not provided, the `CO_API_KEY` environment variable will be used if available. cohere_client: An existing [AsyncClientV2](https://github.com/cohere-ai/cohere-python) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if cohere_client is not None: assert http_client is None, 'Cannot provide both `cohere_client` and `http_client`' assert api_key is None, 'Cannot provide both `cohere_client` and `api_key`' self._client = cohere_client self._v1_client = None else: api_key = api_key or os.getenv('CO_API_KEY') if not api_key: raise UserError( 'Set the `CO_API_KEY` environment variable or pass it via `CohereProvider(api_key=...)`' 'to use the Cohere provider.' ) base_url = os.getenv('CO_BASE_URL') if http_client is not None: self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) self._v1_client = AsyncClient(api_key=api_key, httpx_client=http_client, base_url=base_url) else: http_client = cached_async_http_client(provider='cohere') self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) self._v1_client = AsyncClient(api_key=api_key, httpx_client=http_client, base_url=base_url) ``` Bases: `Provider[AsyncClient]` Provider for VoyageAI API. 
Source code in `pydantic_ai_slim/pydantic_ai/providers/voyageai.py` ```python class VoyageAIProvider(Provider[AsyncClient]): """Provider for VoyageAI API.""" @property def name(self) -> str: return 'voyageai' @property def base_url(self) -> str: return self._client._params.get('base_url') or 'https://api.voyageai.com/v1' # type: ignore @property def client(self) -> AsyncClient: return self._client @overload def __init__(self, *, voyageai_client: AsyncClient) -> None: ... @overload def __init__(self, *, api_key: str | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, voyageai_client: AsyncClient | None = None, ) -> None: """Create a new VoyageAI provider. Args: api_key: The API key to use for authentication, if not provided, the `VOYAGE_API_KEY` environment variable will be used if available. voyageai_client: An existing [AsyncClient](https://github.com/voyage-ai/voyageai-python) client to use. If provided, `api_key` must be `None`. """ if voyageai_client is not None: assert api_key is None, 'Cannot provide both `voyageai_client` and `api_key`' self._client = voyageai_client else: api_key = api_key or os.getenv('VOYAGE_API_KEY') if not api_key: raise UserError( 'Set the `VOYAGE_API_KEY` environment variable or pass it via `VoyageAIProvider(api_key=...)` ' 'to use the VoyageAI provider.' ) self._client = AsyncClient(api_key=api_key) ``` ### __init__ ```python __init__(*, voyageai_client: AsyncClient) -> None ``` ```python __init__(*, api_key: str | None = None) -> None ``` ```python __init__( *, api_key: str | None = None, voyageai_client: AsyncClient | None = None ) -> None ``` Create a new VoyageAI provider. Parameters: | Name | Type | Description | Default | | ----------------- | ------------- | ----------- | -------------------------------------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the VOYAGE_API_KEY environment variable will be used if available. | | `voyageai_client` | \`AsyncClient | None\` | An existing AsyncClient client to use. If provided, api_key must be None. | Source code in `pydantic_ai_slim/pydantic_ai/providers/voyageai.py` ```python def __init__( self, *, api_key: str | None = None, voyageai_client: AsyncClient | None = None, ) -> None: """Create a new VoyageAI provider. Args: api_key: The API key to use for authentication, if not provided, the `VOYAGE_API_KEY` environment variable will be used if available. voyageai_client: An existing [AsyncClient](https://github.com/voyage-ai/voyageai-python) client to use. If provided, `api_key` must be `None`. """ if voyageai_client is not None: assert api_key is None, 'Cannot provide both `voyageai_client` and `api_key`' self._client = voyageai_client else: api_key = api_key or os.getenv('VOYAGE_API_KEY') if not api_key: raise UserError( 'Set the `VOYAGE_API_KEY` environment variable or pass it via `VoyageAIProvider(api_key=...)` ' 'to use the VoyageAI provider.' ) self._client = AsyncClient(api_key=api_key) ``` Bases: `Provider[AsyncOpenAI]` Provider for Cerebras API. 
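A minimal usage sketch before the source: since the Cerebras API is OpenAI-compatible, `CerebrasProvider` is paired with `OpenAIChatModel`. The model name below is a placeholder, and the key falls back to `CEREBRAS_API_KEY` when `api_key` is omitted.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.cerebras import CerebrasProvider

# Placeholder model name; CerebrasProvider reads CEREBRAS_API_KEY when api_key is omitted.
model = OpenAIChatModel('llama3.1-8b', provider=CerebrasProvider())
agent = Agent(model, instructions='Be concise, reply with one sentence.')
```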
Source code in `pydantic_ai_slim/pydantic_ai/providers/cerebras.py` ```python class CerebrasProvider(Provider[AsyncOpenAI]): """Provider for Cerebras API.""" @property def name(self) -> str: return 'cerebras' @property def base_url(self) -> str: return 'https://api.cerebras.ai/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = { 'llama': meta_model_profile, 'qwen': qwen_model_profile, 'gpt-oss': harmony_model_profile, 'zai': zai_model_profile, } profile = None model_name_lower = model_name.lower() for prefix, profile_func in prefix_to_profile.items(): if model_name_lower.startswith(prefix): profile = profile_func(model_name_lower) break # According to https://inference-docs.cerebras.ai/resources/openai#currently-unsupported-openai-features, # Cerebras doesn't support some model settings. # openai_chat_supports_web_search=False is default, so not required here unsupported_model_settings = ( 'frequency_penalty', 'logit_bias', 'presence_penalty', 'parallel_tool_calls', 'service_tier', ) return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_unsupported_model_settings=unsupported_model_settings, ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Cerebras provider. Args: api_key: The API key to use for authentication, if not provided, the `CEREBRAS_API_KEY` environment variable will be used if available. openai_client: An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ api_key = api_key or os.getenv('CEREBRAS_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `CEREBRAS_API_KEY` environment variable or pass it via `CerebrasProvider(api_key=...)` ' 'to use the Cerebras provider.' ) default_headers = {'X-Cerebras-3rd-Party-Integration': 'pydantic-ai'} if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=default_headers ) else: http_client = cached_async_http_client(provider='cerebras') self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=default_headers ) ``` ### __init__ ```python __init__() -> None ``` ```python __init__(*, api_key: str) -> None ``` ```python __init__(*, api_key: str, http_client: AsyncClient) -> None ``` ```python __init__(*, http_client: AsyncClient) -> None ``` ```python __init__( *, openai_client: AsyncOpenAI | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Cerebras provider. 
Parameters: | Name | Type | Description | Default | | --------------- | ------------- | ----------- | ---------------------------------------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the CEREBRAS_API_KEY environment variable will be used if available. | | `openai_client` | \`AsyncOpenAI | None\` | An existing AsyncOpenAI client to use. If provided, api_key and http_client must be None. | | `http_client` | \`AsyncClient | None\` | An existing httpx.AsyncClient to use for making HTTP requests. | Source code in `pydantic_ai_slim/pydantic_ai/providers/cerebras.py` ```python def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Cerebras provider. Args: api_key: The API key to use for authentication, if not provided, the `CEREBRAS_API_KEY` environment variable will be used if available. openai_client: An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ api_key = api_key or os.getenv('CEREBRAS_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `CEREBRAS_API_KEY` environment variable or pass it via `CerebrasProvider(api_key=...)` ' 'to use the Cerebras provider.' ) default_headers = {'X-Cerebras-3rd-Party-Integration': 'pydantic-ai'} if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=default_headers ) else: http_client = cached_async_http_client(provider='cerebras') self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=default_headers ) ``` Bases: `Provider[Mistral]` Provider for Mistral API. Source code in `pydantic_ai_slim/pydantic_ai/providers/mistral.py` ```python class MistralProvider(Provider[Mistral]): """Provider for Mistral API.""" @property def name(self) -> str: return 'mistral' @property def base_url(self) -> str: return self.client.sdk_configuration.get_server_details()[0] @property def client(self) -> Mistral: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return mistral_model_profile(model_name) @overload def __init__(self, *, mistral_client: Mistral | None = None) -> None: ... @overload def __init__(self, *, api_key: str | None = None, http_client: httpx.AsyncClient | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, mistral_client: Mistral | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Mistral provider. Args: api_key: The API key to use for authentication, if not provided, the `MISTRAL_API_KEY` environment variable will be used if available. mistral_client: An existing `Mistral` client to use, if provided, `api_key` and `http_client` must be `None`. base_url: The base url for the Mistral requests. http_client: An existing async client to use for making HTTP requests. 
""" if mistral_client is not None: assert http_client is None, 'Cannot provide both `mistral_client` and `http_client`' assert api_key is None, 'Cannot provide both `mistral_client` and `api_key`' assert base_url is None, 'Cannot provide both `mistral_client` and `base_url`' self._client = mistral_client else: api_key = api_key or os.getenv('MISTRAL_API_KEY') if not api_key: raise UserError( 'Set the `MISTRAL_API_KEY` environment variable or pass it via `MistralProvider(api_key=...)`' 'to use the Mistral provider.' ) elif http_client is not None: self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) else: http_client = cached_async_http_client(provider='mistral') self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) ``` ### __init__ ```python __init__(*, mistral_client: Mistral | None = None) -> None ``` ```python __init__( *, api_key: str | None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, mistral_client: Mistral | None = None, base_url: str | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Mistral provider. Parameters: | Name | Type | Description | Default | | ---------------- | ------------- | ----------- | --------------------------------------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the MISTRAL_API_KEY environment variable will be used if available. | | `mistral_client` | \`Mistral | None\` | An existing Mistral client to use, if provided, api_key and http_client must be None. | | `base_url` | \`str | None\` | The base url for the Mistral requests. | | `http_client` | \`AsyncClient | None\` | An existing async client to use for making HTTP requests. | Source code in `pydantic_ai_slim/pydantic_ai/providers/mistral.py` ```python def __init__( self, *, api_key: str | None = None, mistral_client: Mistral | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Mistral provider. Args: api_key: The API key to use for authentication, if not provided, the `MISTRAL_API_KEY` environment variable will be used if available. mistral_client: An existing `Mistral` client to use, if provided, `api_key` and `http_client` must be `None`. base_url: The base url for the Mistral requests. http_client: An existing async client to use for making HTTP requests. """ if mistral_client is not None: assert http_client is None, 'Cannot provide both `mistral_client` and `http_client`' assert api_key is None, 'Cannot provide both `mistral_client` and `api_key`' assert base_url is None, 'Cannot provide both `mistral_client` and `base_url`' self._client = mistral_client else: api_key = api_key or os.getenv('MISTRAL_API_KEY') if not api_key: raise UserError( 'Set the `MISTRAL_API_KEY` environment variable or pass it via `MistralProvider(api_key=...)`' 'to use the Mistral provider.' ) elif http_client is not None: self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) else: http_client = cached_async_http_client(provider='mistral') self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) ``` Bases: `Provider[AsyncOpenAI]` Provider for Fireworks AI API. 
Source code in `pydantic_ai_slim/pydantic_ai/providers/fireworks.py` ```python class FireworksProvider(Provider[AsyncOpenAI]): """Provider for Fireworks AI API.""" @property def name(self) -> str: return 'fireworks' @property def base_url(self) -> str: return 'https://api.fireworks.ai/inference/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = { 'llama': meta_model_profile, 'qwen': qwen_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'gemma': google_model_profile, } prefix = 'accounts/fireworks/models/' profile = None if model_name.startswith(prefix): model_name = model_name[len(prefix) :] for provider, profile_func in prefix_to_profile.items(): if model_name.startswith(provider): profile = profile_func(model_name) break # As the Fireworks API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('FIREWORKS_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `FIREWORKS_API_KEY` environment variable or pass it via `FireworksProvider(api_key=...)`' 'to use the Fireworks AI provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='fireworks') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Deprecated `GrokProvider` is deprecated, use `XaiProvider` with `XaiModel` instead for the native xAI SDK. See for more details. Provider for Grok API (OpenAI-compatible interface). Source code in `pydantic_ai_slim/pydantic_ai/providers/grok.py` ```python @deprecated( '`GrokProvider` is deprecated, use `XaiProvider` with `XaiModel` instead for the native xAI SDK. ' 'See for more details.' ) class GrokProvider(Provider[AsyncOpenAI]): """Provider for Grok API (OpenAI-compatible interface).""" @property def name(self) -> str: return 'grok' @property def base_url(self) -> str: return 'https://api.x.ai/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: profile = grok_model_profile(model_name) # As the Grok API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly. # Also, Grok does not support strict tool definitions: https://github.com/pydantic/pydantic-ai/issues/1846 return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_supports_strict_tool_definition=False ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... 
@overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('GROK_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `GROK_API_KEY` environment variable or pass it via `GrokProvider(api_key=...)`' 'to use the Grok provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='grok') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Together AI API. Source code in `pydantic_ai_slim/pydantic_ai/providers/together.py` ```python class TogetherProvider(Provider[AsyncOpenAI]): """Provider for Together AI API.""" @property def name(self) -> str: return 'together' @property def base_url(self) -> str: return 'https://api.together.xyz/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'deepseek-ai': deepseek_model_profile, 'google': google_model_profile, 'qwen': qwen_model_profile, 'meta-llama': meta_model_profile, 'mistralai': mistral_model_profile, } profile = None model_name = model_name.lower() provider, model_name = model_name.split('/', 1) if provider in provider_to_profile: profile = provider_to_profile[provider](model_name) # As the Together API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('TOGETHER_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `TOGETHER_API_KEY` environment variable or pass it via `TogetherProvider(api_key=...)`' 'to use the Together AI provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='together') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Heroku API. Source code in `pydantic_ai_slim/pydantic_ai/providers/heroku.py` ```python class HerokuProvider(Provider[AsyncOpenAI]): """Provider for Heroku API.""" @property def name(self) -> str: return 'heroku' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: # As the Heroku API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer. 
return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: if openai_client is not None: assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client else: api_key = api_key or os.getenv('HEROKU_INFERENCE_KEY') if not api_key: raise UserError( 'Set the `HEROKU_INFERENCE_KEY` environment variable or pass it via `HerokuProvider(api_key=...)`' 'to use the Heroku provider.' ) base_url = base_url or os.getenv('HEROKU_INFERENCE_URL', 'https://us.inference.heroku.com') base_url = base_url.rstrip('/') + '/v1' if http_client is not None: self._client = AsyncOpenAI(api_key=api_key, http_client=http_client, base_url=base_url) else: http_client = cached_async_http_client(provider='heroku') self._client = AsyncOpenAI(api_key=api_key, http_client=http_client, base_url=base_url) ``` Bases: `Provider[AsyncOpenAI]` Provider for GitHub Models API. GitHub Models provides access to various AI models through an OpenAI-compatible API. See for more information. Source code in `pydantic_ai_slim/pydantic_ai/providers/github.py` ```python class GitHubProvider(Provider[AsyncOpenAI]): """Provider for GitHub Models API. GitHub Models provides access to various AI models through an OpenAI-compatible API. See for more information. """ @property def name(self) -> str: return 'github' @property def base_url(self) -> str: return 'https://models.github.ai/inference' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'xai': grok_model_profile, 'meta': meta_model_profile, 'microsoft': openai_model_profile, 'mistral-ai': mistral_model_profile, 'cohere': cohere_model_profile, 'deepseek': deepseek_model_profile, } profile = None # If the model name does not contain a provider prefix, we assume it's an OpenAI model if '/' not in model_name: return openai_model_profile(model_name) provider, model_name = model_name.lower().split('/', 1) if provider in provider_to_profile: model_name, *_ = model_name.split(':', 1) # drop tags profile = provider_to_profile[provider](model_name) # As GitHubProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new GitHub Models provider. Args: api_key: The GitHub token to use for authentication. 
If not provided, the `GITHUB_API_KEY` environment variable will be used if available. openai_client: An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ api_key = api_key or os.getenv('GITHUB_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `GITHUB_API_KEY` environment variable or pass it via `GitHubProvider(api_key=...)`' ' to use the GitHub Models provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='github') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` ### __init__ ```python __init__() -> None ``` ```python __init__(*, api_key: str) -> None ``` ```python __init__(*, api_key: str, http_client: AsyncClient) -> None ``` ```python __init__( *, openai_client: AsyncOpenAI | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new GitHub Models provider. Parameters: | Name | Type | Description | Default | | --------------- | ------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | The GitHub token to use for authentication. If not provided, the GITHUB_API_KEY environment variable will be used if available. | | `openai_client` | \`AsyncOpenAI | None\` | An existing AsyncOpenAI client to use. If provided, api_key and http_client must be None. | | `http_client` | \`AsyncClient | None\` | An existing httpx.AsyncClient to use for making HTTP requests. | Source code in `pydantic_ai_slim/pydantic_ai/providers/github.py` ```python def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new GitHub Models provider. Args: api_key: The GitHub token to use for authentication. If not provided, the `GITHUB_API_KEY` environment variable will be used if available. openai_client: An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ api_key = api_key or os.getenv('GITHUB_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `GITHUB_API_KEY` environment variable or pass it via `GitHubProvider(api_key=...)`' ' to use the GitHub Models provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='github') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for OpenRouter API. 
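As a quick orientation, a usage sketch: `OpenRouterProvider` is paired with `OpenAIChatModel`, and OpenRouter model names use the `provider/model` format. The model name and app title below are placeholders; the key falls back to `OPENROUTER_API_KEY`, and the attribution arguments are optional.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openrouter import OpenRouterProvider

# Placeholder model name in 'provider/model' format; app_title/app_url set optional
# attribution headers and can also come from OPENROUTER_APP_TITLE / OPENROUTER_APP_URL.
provider = OpenRouterProvider(app_title='my-app')  # key read from OPENROUTER_API_KEY
model = OpenAIChatModel('anthropic/claude-sonnet-4-5', provider=provider)
agent = Agent(model, instructions='Be concise, reply with one sentence.')
```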
Source code in `pydantic_ai_slim/pydantic_ai/providers/openrouter.py` ```python class OpenRouterProvider(Provider[AsyncOpenAI]): """Provider for OpenRouter API.""" @property def name(self) -> str: return 'openrouter' @property def base_url(self) -> str: return 'https://openrouter.ai/api/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'google': _openrouter_google_model_profile, 'openai': openai_model_profile, 'anthropic': anthropic_model_profile, 'mistralai': mistral_model_profile, 'qwen': qwen_model_profile, 'x-ai': grok_model_profile, 'cohere': cohere_model_profile, 'amazon': amazon_model_profile, 'deepseek': deepseek_model_profile, 'meta-llama': meta_model_profile, 'moonshotai': moonshotai_model_profile, } profile = None provider, model_name = model_name.split('/', 1) if provider in provider_to_profile: model_name, *_ = model_name.split(':', 1) # drop tags profile = provider_to_profile[provider](model_name) # As OpenRouterProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_chat_send_back_thinking_parts='field', openai_chat_thinking_field='reasoning', openai_chat_supports_file_urls=True, ).update(profile) @overload def __init__(self, *, openai_client: AsyncOpenAI) -> None: ... @overload def __init__( self, *, api_key: str | None = None, app_url: str | None = None, app_title: str | None = None, openai_client: None = None, http_client: httpx.AsyncClient | None = None, ) -> None: ... def __init__( self, *, api_key: str | None = None, app_url: str | None = None, app_title: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Configure the provider with either an API key or prebuilt client. Args: api_key: OpenRouter API key. Falls back to ``OPENROUTER_API_KEY`` when omitted and required unless ``openai_client`` is provided. app_url: Optional url for app attribution. Falls back to ``OPENROUTER_APP_URL`` when omitted. app_title: Optional title for app attribution. Falls back to ``OPENROUTER_APP_TITLE`` when omitted. openai_client: Existing ``AsyncOpenAI`` client to reuse instead of creating one internally. http_client: Custom ``httpx.AsyncClient`` to pass into the ``AsyncOpenAI`` constructor when building a client. Raises: UserError: If no API key is available and no ``openai_client`` is provided. """ api_key = api_key or os.getenv('OPENROUTER_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `OPENROUTER_API_KEY` environment variable or pass it via `OpenRouterProvider(api_key=...)`' 'to use the OpenRouter provider.' 
) attribution_headers: dict[str, str] = {} if http_referer := app_url or os.getenv('OPENROUTER_APP_URL'): attribution_headers['HTTP-Referer'] = http_referer if x_title := app_title or os.getenv('OPENROUTER_APP_TITLE'): attribution_headers['X-Title'] = x_title if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=attribution_headers ) else: http_client = cached_async_http_client(provider='openrouter') self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=attribution_headers ) ``` ### __init__ ```python __init__(*, openai_client: AsyncOpenAI) -> None ``` ```python __init__( *, api_key: str | None = None, app_url: str | None = None, app_title: str | None = None, openai_client: None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, app_url: str | None = None, app_title: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Configure the provider with either an API key or prebuilt client. Parameters: | Name | Type | Description | Default | | --------------- | ------------- | ----------- | ---------------------------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | OpenRouter API key. Falls back to OPENROUTER_API_KEY when omitted and required unless openai_client is provided. | | `app_url` | \`str | None\` | Optional url for app attribution. Falls back to OPENROUTER_APP_URL when omitted. | | `app_title` | \`str | None\` | Optional title for app attribution. Falls back to OPENROUTER_APP_TITLE when omitted. | | `openai_client` | \`AsyncOpenAI | None\` | Existing AsyncOpenAI client to reuse instead of creating one internally. | | `http_client` | \`AsyncClient | None\` | Custom httpx.AsyncClient to pass into the AsyncOpenAI constructor when building a client. | Raises: | Type | Description | | ----------- | ------------------------------------------------------------ | | `UserError` | If no API key is available and no openai_client is provided. | Source code in `pydantic_ai_slim/pydantic_ai/providers/openrouter.py` ```python def __init__( self, *, api_key: str | None = None, app_url: str | None = None, app_title: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Configure the provider with either an API key or prebuilt client. Args: api_key: OpenRouter API key. Falls back to ``OPENROUTER_API_KEY`` when omitted and required unless ``openai_client`` is provided. app_url: Optional url for app attribution. Falls back to ``OPENROUTER_APP_URL`` when omitted. app_title: Optional title for app attribution. Falls back to ``OPENROUTER_APP_TITLE`` when omitted. openai_client: Existing ``AsyncOpenAI`` client to reuse instead of creating one internally. http_client: Custom ``httpx.AsyncClient`` to pass into the ``AsyncOpenAI`` constructor when building a client. Raises: UserError: If no API key is available and no ``openai_client`` is provided. """ api_key = api_key or os.getenv('OPENROUTER_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `OPENROUTER_API_KEY` environment variable or pass it via `OpenRouterProvider(api_key=...)`' 'to use the OpenRouter provider.' 
) attribution_headers: dict[str, str] = {} if http_referer := app_url or os.getenv('OPENROUTER_APP_URL'): attribution_headers['HTTP-Referer'] = http_referer if x_title := app_title or os.getenv('OPENROUTER_APP_TITLE'): attribution_headers['X-Title'] = x_title if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=attribution_headers ) else: http_client = cached_async_http_client(provider='openrouter') self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=attribution_headers ) ``` Bases: `Provider[AsyncOpenAI]` Provider for Vercel AI Gateway API. Source code in `pydantic_ai_slim/pydantic_ai/providers/vercel.py` ```python class VercelProvider(Provider[AsyncOpenAI]): """Provider for Vercel AI Gateway API.""" @property def name(self) -> str: return 'vercel' @property def base_url(self) -> str: return 'https://ai-gateway.vercel.sh/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'anthropic': anthropic_model_profile, 'bedrock': amazon_model_profile, 'cohere': cohere_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'openai': openai_model_profile, 'vertex': google_model_profile, 'xai': grok_model_profile, } profile = None try: provider, model_name = model_name.split('/', 1) except ValueError: raise UserError(f"Model name must be in 'provider/model' format, got: {model_name!r}") if provider in provider_to_profile: profile = provider_to_profile[provider](model_name) # As VercelProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: # Support Vercel AI Gateway's standard environment variables api_key = api_key or os.getenv('VERCEL_AI_GATEWAY_API_KEY') or os.getenv('VERCEL_OIDC_TOKEN') if not api_key and openai_client is None: raise UserError( 'Set the `VERCEL_AI_GATEWAY_API_KEY` or `VERCEL_OIDC_TOKEN` environment variable ' 'or pass the API key via `VercelProvider(api_key=...)` to use the Vercel provider.' ) default_headers = {'http-referer': 'https://ai.pydantic.dev/', 'x-title': 'pydantic-ai'} if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=default_headers ) else: http_client = cached_async_http_client(provider='vercel') self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=default_headers ) ``` Bases: `Provider[AsyncInferenceClient]` Provider for Hugging Face. 
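A minimal sketch, assuming the companion `HuggingFaceModel` class from `pydantic_ai.models.huggingface` (this provider wraps an `AsyncInferenceClient`, so it is not used with the OpenAI model classes). The model ID is a placeholder, and the token falls back to `HF_TOKEN` when `api_key` is omitted.

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel  # assumed companion model class
from pydantic_ai.providers.huggingface import HuggingFaceProvider

# Placeholder model ID; HuggingFaceProvider reads HF_TOKEN when api_key is omitted.
model = HuggingFaceModel('Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider())
agent = Agent(model, instructions='Be concise, reply with one sentence.')
```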
Source code in `pydantic_ai_slim/pydantic_ai/providers/huggingface.py` ```python class HuggingFaceProvider(Provider[AsyncInferenceClient]): """Provider for Hugging Face.""" @property def name(self) -> str: return 'huggingface' @property def base_url(self) -> str: return self.client.model # type: ignore @property def client(self) -> AsyncInferenceClient: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'deepseek-ai': deepseek_model_profile, 'google': google_model_profile, 'qwen': qwen_model_profile, 'meta-llama': meta_model_profile, 'mistralai': mistral_model_profile, 'moonshotai': moonshotai_model_profile, } if '/' not in model_name: return None model_name = model_name.lower() provider, model_name = model_name.split('/', 1) if provider in provider_to_profile: return provider_to_profile[provider](model_name) return None @overload def __init__(self, *, base_url: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, provider_name: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, hf_client: AsyncInferenceClient, api_key: str | None = None) -> None: ... @overload def __init__(self, *, hf_client: AsyncInferenceClient, base_url: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, hf_client: AsyncInferenceClient, provider_name: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, api_key: str | None = None) -> None: ... def __init__( self, base_url: str | None = None, api_key: str | None = None, hf_client: AsyncInferenceClient | None = None, http_client: AsyncClient | None = None, provider_name: str | None = None, ) -> None: """Create a new Hugging Face provider. Args: base_url: The base url for the Hugging Face requests. api_key: The API key to use for authentication, if not provided, the `HF_TOKEN` environment variable will be used if available. hf_client: An existing [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client to use. If not provided, a new instance will be created. http_client: (currently ignored) An existing `httpx.AsyncClient` to use for making HTTP requests. provider_name: Name of the provider to use for inference. available providers can be found in the [HF Inference Providers documentation](https://huggingface.co/docs/inference-providers/index#partners). defaults to "auto", which will select the first available provider for the model, the first of the providers available for the model, sorted by the user's order in https://hf.co/settings/inference-providers. If `base_url` is passed, then `provider_name` is not used. """ api_key = api_key or os.getenv('HF_TOKEN') if api_key is None: raise UserError( 'Set the `HF_TOKEN` environment variable or pass it via `HuggingFaceProvider(api_key=...)`' 'to use the HuggingFace provider.' 
) if http_client is not None: raise ValueError('`http_client` is ignored for HuggingFace provider, please use `hf_client` instead.') if base_url is not None and provider_name is not None: raise ValueError('Cannot provide both `base_url` and `provider_name`.') if hf_client is None: self._client = AsyncInferenceClient(api_key=api_key, provider=provider_name, base_url=base_url) # type: ignore else: self._client = hf_client ``` ### __init__ ```python __init__( *, base_url: str, api_key: str | None = None ) -> None ``` ```python __init__( *, provider_name: str, api_key: str | None = None ) -> None ``` ```python __init__( *, hf_client: AsyncInferenceClient, api_key: str | None = None ) -> None ``` ```python __init__( *, hf_client: AsyncInferenceClient, base_url: str, api_key: str | None = None ) -> None ``` ```python __init__( *, hf_client: AsyncInferenceClient, provider_name: str, api_key: str | None = None ) -> None ``` ```python __init__(*, api_key: str | None = None) -> None ``` ```python __init__( base_url: str | None = None, api_key: str | None = None, hf_client: AsyncInferenceClient | None = None, http_client: AsyncClient | None = None, provider_name: str | None = None, ) -> None ``` Create a new Hugging Face provider. Parameters: | Name | Type | Description | Default | | --------------- | ---------------------- | ----------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `base_url` | \`str | None\` | The base url for the Hugging Face requests. | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the HF_TOKEN environment variable will be used if available. | | `hf_client` | \`AsyncInferenceClient | None\` | An existing AsyncInferenceClient client to use. If not provided, a new instance will be created. | | `http_client` | \`AsyncClient | None\` | (currently ignored) An existing httpx.AsyncClient to use for making HTTP requests. | | `provider_name` | \`str | None\` | Name of the provider to use for inference. available providers can be found in the HF Inference Providers documentation. defaults to "auto", which will select the first available provider for the model, the first of the providers available for the model, sorted by the user's order in https://hf.co/settings/inference-providers. If base_url is passed, then provider_name is not used. | Source code in `pydantic_ai_slim/pydantic_ai/providers/huggingface.py` ```python def __init__( self, base_url: str | None = None, api_key: str | None = None, hf_client: AsyncInferenceClient | None = None, http_client: AsyncClient | None = None, provider_name: str | None = None, ) -> None: """Create a new Hugging Face provider. Args: base_url: The base url for the Hugging Face requests. api_key: The API key to use for authentication, if not provided, the `HF_TOKEN` environment variable will be used if available. hf_client: An existing [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client to use. If not provided, a new instance will be created. http_client: (currently ignored) An existing `httpx.AsyncClient` to use for making HTTP requests. 
provider_name: Name of the provider to use for inference. available providers can be found in the [HF Inference Providers documentation](https://huggingface.co/docs/inference-providers/index#partners). defaults to "auto", which will select the first available provider for the model, the first of the providers available for the model, sorted by the user's order in https://hf.co/settings/inference-providers. If `base_url` is passed, then `provider_name` is not used. """ api_key = api_key or os.getenv('HF_TOKEN') if api_key is None: raise UserError( 'Set the `HF_TOKEN` environment variable or pass it via `HuggingFaceProvider(api_key=...)`' 'to use the HuggingFace provider.' ) if http_client is not None: raise ValueError('`http_client` is ignored for HuggingFace provider, please use `hf_client` instead.') if base_url is not None and provider_name is not None: raise ValueError('Cannot provide both `base_url` and `provider_name`.') if hf_client is None: self._client = AsyncInferenceClient(api_key=api_key, provider=provider_name, base_url=base_url) # type: ignore else: self._client = hf_client ``` Bases: `Provider[AsyncOpenAI]` Provider for MoonshotAI platform (Kimi models). Source code in `pydantic_ai_slim/pydantic_ai/providers/moonshotai.py` ```python class MoonshotAIProvider(Provider[AsyncOpenAI]): """Provider for MoonshotAI platform (Kimi models).""" @property def name(self) -> str: return 'moonshotai' @property def base_url(self) -> str: # OpenAI-compatible endpoint, see MoonshotAI docs return 'https://api.moonshot.ai/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: profile = moonshotai_model_profile(model_name) # As the MoonshotAI API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly. # Also, MoonshotAI does not support strict tool definitions # https://platform.moonshot.ai/docs/guide/migrating-from-openai-to-kimi#about-tool_choice # "Please note that the current version of Kimi API does not support the tool_choice=required parameter." return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_supports_tool_choice_required=False, supports_json_object_output=True, openai_chat_thinking_field='reasoning_content', openai_chat_send_back_thinking_parts='field', ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('MOONSHOTAI_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `MOONSHOTAI_API_KEY` environment variable or pass it via ' '`MoonshotAIProvider(api_key=...)` to use the MoonshotAI provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='moonshotai') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for local or remote Ollama API. 
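For local use, a sketch pointing the provider at an Ollama server's OpenAI-compatible endpoint: the base URL assumes Ollama's default local port, the model name is a placeholder for whatever model you have pulled, and you can set `OLLAMA_BASE_URL` instead of passing `base_url`.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ollama import OllamaProvider

# Assumed local Ollama endpoint and placeholder model name; no real API key is needed locally.
model = OpenAIChatModel(
    'llama3.2',
    provider=OllamaProvider(base_url='http://localhost:11434/v1'),
)
agent = Agent(model, instructions='Be concise, reply with one sentence.')
```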
Source code in `pydantic_ai_slim/pydantic_ai/providers/ollama.py` ```python class OllamaProvider(Provider[AsyncOpenAI]): """Provider for local or remote Ollama API.""" @property def name(self) -> str: return 'ollama' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = { 'llama': meta_model_profile, 'gemma': google_model_profile, 'qwen': qwen_model_profile, 'qwq': qwen_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'command': cohere_model_profile, 'gpt-oss': harmony_model_profile, } profile = None for prefix, profile_func in prefix_to_profile.items(): model_name = model_name.lower() if model_name.startswith(prefix): profile = profile_func(model_name) # As OllamaProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_chat_thinking_field='reasoning', ).update(profile) def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Ollama provider. Args: base_url: The base url for the Ollama requests. If not provided, the `OLLAMA_BASE_URL` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `OLLAMA_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client else: base_url = base_url or os.getenv('OLLAMA_BASE_URL') if not base_url: raise UserError( 'Set the `OLLAMA_BASE_URL` environment variable or pass it via `OllamaProvider(base_url=...)`' 'to use the Ollama provider.' ) # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. api_key = api_key or os.getenv('OLLAMA_API_KEY') or 'api-key-not-set' if http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='ollama') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` ### __init__ ```python __init__( base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None, ) -> None ``` Create a new Ollama provider. Parameters: | Name | Type | Description | Default | | --------------- | ------------- | ----------- | -------------------------------------------------------------------------------------------------------------------------- | | `base_url` | \`str | None\` | The base url for the Ollama requests. 
If not provided, the OLLAMA_BASE_URL environment variable will be used if available. | | `api_key` | \`str | None\` | The API key to use for authentication, if not provided, the OLLAMA_API_KEY environment variable will be used if available. | | `openai_client` | \`AsyncOpenAI | None\` | An existing AsyncOpenAI client to use. If provided, base_url, api_key, and http_client must be None. | | `http_client` | \`AsyncClient | None\` | An existing httpx.AsyncClient to use for making HTTP requests. | Source code in `pydantic_ai_slim/pydantic_ai/providers/ollama.py` ```python def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Ollama provider. Args: base_url: The base url for the Ollama requests. If not provided, the `OLLAMA_BASE_URL` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `OLLAMA_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client else: base_url = base_url or os.getenv('OLLAMA_BASE_URL') if not base_url: raise UserError( 'Set the `OLLAMA_BASE_URL` environment variable or pass it via `OllamaProvider(base_url=...)`' 'to use the Ollama provider.' ) # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. api_key = api_key or os.getenv('OLLAMA_API_KEY') or 'api-key-not-set' if http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='ollama') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for LiteLLM API. 
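A sketch assuming a LiteLLM proxy running at a local address (the URL and model name below are placeholders): `LiteLLMProvider` is paired with `OpenAIChatModel`, and when `api_key` is omitted LiteLLM resolves credentials from its own environment variables.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.litellm import LiteLLMProvider

# Assumed local LiteLLM proxy address and placeholder model name.
provider = LiteLLMProvider(api_base='http://localhost:4000')
model = OpenAIChatModel('anthropic/claude-sonnet-4-5', provider=provider)
agent = Agent(model, instructions='Be concise, reply with one sentence.')
```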
Source code in `pydantic_ai_slim/pydantic_ai/providers/litellm.py` ```python class LiteLLMProvider(Provider[AsyncOpenAI]): """Provider for LiteLLM API.""" @property def name(self) -> str: return 'litellm' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: # Map provider prefixes to their profile functions provider_to_profile = { 'anthropic': anthropic_model_profile, 'openai': openai_model_profile, 'google': google_model_profile, 'mistralai': mistral_model_profile, 'mistral': mistral_model_profile, 'cohere': cohere_model_profile, 'amazon': amazon_model_profile, 'bedrock': amazon_model_profile, 'meta-llama': meta_model_profile, 'meta': meta_model_profile, 'groq': groq_model_profile, 'deepseek': deepseek_model_profile, 'moonshotai': moonshotai_model_profile, 'x-ai': grok_model_profile, 'qwen': qwen_model_profile, } profile = None # Check if model name contains a provider prefix (e.g., "anthropic/claude-3") if '/' in model_name: provider_prefix, model_suffix = model_name.split('/', 1) if provider_prefix in provider_to_profile: profile = provider_to_profile[provider_prefix](model_suffix) # If no profile found, default to OpenAI profile if profile is None: profile = openai_model_profile(model_name) # As LiteLLMProvider is used with OpenAIModel, which uses OpenAIJsonSchemaTransformer, # we maintain that behavior return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__( self, *, api_key: str | None = None, api_base: str | None = None, ) -> None: ... @overload def __init__( self, *, api_key: str | None = None, api_base: str | None = None, http_client: AsyncHTTPClient, ) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI) -> None: ... def __init__( self, *, api_key: str | None = None, api_base: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncHTTPClient | None = None, ) -> None: """Initialize a LiteLLM provider. Args: api_key: API key for the model provider. If None, LiteLLM will try to get it from environment variables. api_base: Base URL for the model provider. Use this for custom endpoints or self-hosted models. openai_client: Pre-configured OpenAI client. If provided, other parameters are ignored. http_client: Custom HTTP client to use. """ if openai_client is not None: self._client = openai_client return # Create OpenAI client that will be used with LiteLLM's completion function # The actual API calls will be intercepted and routed through LiteLLM if http_client is not None: self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) else: http_client = cached_async_http_client(provider='litellm') self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) ``` ### __init__ ```python __init__( *, api_key: str | None = None, api_base: str | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, api_base: str | None = None, http_client: AsyncClient ) -> None ``` ```python __init__(*, openai_client: AsyncOpenAI) -> None ``` ```python __init__( *, api_key: str | None = None, api_base: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Initialize a LiteLLM provider. 
Parameters: | Name | Type | Description | Default | | --------------- | ------------- | ----------- | ----------------------------------------------------------------------------------------------- | | `api_key` | \`str | None\` | API key for the model provider. If None, LiteLLM will try to get it from environment variables. | | `api_base` | \`str | None\` | Base URL for the model provider. Use this for custom endpoints or self-hosted models. | | `openai_client` | \`AsyncOpenAI | None\` | Pre-configured OpenAI client. If provided, other parameters are ignored. | | `http_client` | \`AsyncClient | None\` | Custom HTTP client to use. | Source code in `pydantic_ai_slim/pydantic_ai/providers/litellm.py` ```python def __init__( self, *, api_key: str | None = None, api_base: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncHTTPClient | None = None, ) -> None: """Initialize a LiteLLM provider. Args: api_key: API key for the model provider. If None, LiteLLM will try to get it from environment variables. api_base: Base URL for the model provider. Use this for custom endpoints or self-hosted models. openai_client: Pre-configured OpenAI client. If provided, other parameters are ignored. http_client: Custom HTTP client to use. """ if openai_client is not None: self._client = openai_client return # Create OpenAI client that will be used with LiteLLM's completion function # The actual API calls will be intercepted and routed through LiteLLM if http_client is not None: self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) else: http_client = cached_async_http_client(provider='litellm') self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) ``` Bases: `Provider[AsyncOpenAI]` Provider for Nebius AI Studio API. Source code in `pydantic_ai_slim/pydantic_ai/providers/nebius.py` ```python class NebiusProvider(Provider[AsyncOpenAI]): """Provider for Nebius AI Studio API.""" @property def name(self) -> str: return 'nebius' @property def base_url(self) -> str: return 'https://api.studio.nebius.com/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'meta-llama': meta_model_profile, 'deepseek-ai': deepseek_model_profile, 'qwen': qwen_model_profile, 'google': google_model_profile, 'openai': harmony_model_profile, # used for gpt-oss models on Nebius 'mistralai': mistral_model_profile, 'moonshotai': moonshotai_model_profile, } profile = None try: model_name = model_name.lower() provider, model_name = model_name.split('/', 1) except ValueError: raise UserError(f"Model name must be in 'provider/model' format, got: {model_name!r}") if provider in provider_to_profile: profile = provider_to_profile[provider](model_name) # As NebiusProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... 
def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('NEBIUS_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `NEBIUS_API_KEY` environment variable or pass it via ' '`NebiusProvider(api_key=...)` to use the Nebius AI Studio provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='nebius') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for OVHcloud AI Endpoints. Source code in `pydantic_ai_slim/pydantic_ai/providers/ovhcloud.py` ```python class OVHcloudProvider(Provider[AsyncOpenAI]): """Provider for OVHcloud AI Endpoints.""" @property def name(self) -> str: return 'ovhcloud' @property def base_url(self) -> str: return 'https://oai.endpoints.kepler.ai.cloud.ovh.net/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: model_name = model_name.lower() prefix_to_profile = { 'llama': meta_model_profile, 'meta-': meta_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'gpt': harmony_model_profile, 'qwen': qwen_model_profile, } profile = None for prefix, profile_func in prefix_to_profile.items(): if model_name.startswith(prefix): profile = profile_func(model_name) # As the OVHcloud AI Endpoints API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer. return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('OVHCLOUD_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `OVHCLOUD_API_KEY` environment variable or pass it via ' '`OVHcloudProvider(api_key=...)` to use OVHcloud AI Endpoints provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='ovhcloud') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Alibaba Cloud Model Studio (DashScope) OpenAI-compatible API. 
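Like the Nebius and OVHcloud providers above (and the SambaNova provider below), the Alibaba provider follows the same OpenAI-compatible pattern. A minimal, hedged construction sketch, with a placeholder model name and key:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.alibaba import AlibabaProvider

# The API key can also come from the ALIBABA_API_KEY or DASHSCOPE_API_KEY
# environment variables; the model name below is a placeholder.
provider = AlibabaProvider(api_key='your-dashscope-key')
model = OpenAIChatModel('qwen-plus', provider=provider)
agent = Agent(model, instructions='Be concise, reply with one sentence.')
```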
Source code in `pydantic_ai_slim/pydantic_ai/providers/alibaba.py` ```python class AlibabaProvider(Provider[AsyncOpenAI]): """Provider for Alibaba Cloud Model Studio (DashScope) OpenAI-compatible API.""" @property def name(self) -> str: return 'alibaba' @property def base_url(self) -> str: return self._base_url @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: base_profile = qwen_model_profile(model_name) # Wrap/merge into OpenAIModelProfile openai_profile = OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(base_profile) # For Qwen Omni models, force URI audio input encoding if 'omni' in model_name.lower(): openai_profile = OpenAIModelProfile(openai_chat_audio_input_encoding='uri').update(openai_profile) return openai_profile @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str, base_url: str | None = None) -> None: ... @overload def __init__(self, *, api_key: str, base_url: str | None = None, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, base_url: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: if openai_client is not None: self._client = openai_client self._base_url = str(openai_client.base_url) else: # NOTE: We support DASHSCOPE_API_KEY for compatibility with Alibaba's official docs. api_key = api_key or os.getenv('ALIBABA_API_KEY') or os.getenv('DASHSCOPE_API_KEY') if not api_key: raise UserError( 'Set the `ALIBABA_API_KEY` environment variable or pass it via ' '`AlibabaProvider(api_key=...)` to use the Alibaba provider.' ) self._base_url = base_url or 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1' if http_client is None: http_client = cached_async_http_client(provider='alibaba') self._client = AsyncOpenAI(base_url=self._base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for SambaNova AI models. SambaNova uses an OpenAI-compatible API. Source code in `pydantic_ai_slim/pydantic_ai/providers/sambanova.py` ```python class SambaNovaProvider(Provider[AsyncOpenAI]): """Provider for SambaNova AI models. SambaNova uses an OpenAI-compatible API. """ @property def name(self) -> str: """Return the provider name.""" return 'sambanova' @property def base_url(self) -> str: """Return the base URL.""" return self._base_url @property def client(self) -> AsyncOpenAI: """Return the AsyncOpenAI client.""" return self._client def model_profile(self, model_name: str) -> ModelProfile | None: """Get model profile for SambaNova models. SambaNova serves models from multiple families including Meta Llama, DeepSeek, Qwen, and Mistral. Model profiles are matched based on model name prefixes. 
""" prefix_to_profile = { 'deepseek-': deepseek_model_profile, 'meta-llama-': meta_model_profile, 'llama-': meta_model_profile, 'qwen': qwen_model_profile, 'mistral': mistral_model_profile, } profile = None model_name_lower = model_name.lower() for prefix, profile_func in prefix_to_profile.items(): if model_name_lower.startswith(prefix): profile = profile_func(model_name) break # Wrap into OpenAIModelProfile since SambaNova is OpenAI-compatible return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) def __init__( self, *, api_key: str | None = None, base_url: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Initialize SambaNova provider. Args: api_key: SambaNova API key. If not provided, reads from SAMBANOVA_API_KEY env var. base_url: Custom API base URL. Defaults to https://api.sambanova.ai/v1 openai_client: Optional pre-configured OpenAI client http_client: Optional custom httpx.AsyncClient for making HTTP requests Raises: UserError: If API key is not provided and SAMBANOVA_API_KEY env var is not set """ if openai_client is not None: self._client = openai_client self._base_url = str(openai_client.base_url) else: # Get API key from parameter or environment api_key = api_key or os.getenv('SAMBANOVA_API_KEY') if not api_key: raise UserError( 'Set the `SAMBANOVA_API_KEY` environment variable or pass it via ' '`SambaNovaProvider(api_key=...)` to use the SambaNova provider.' ) # Set base URL (default to SambaNova API endpoint) self._base_url = base_url or os.getenv('SAMBANOVA_BASE_URL', 'https://api.sambanova.ai/v1') # Create http client and AsyncOpenAI client http_client = http_client or cached_async_http_client(provider='sambanova') self._client = AsyncOpenAI(base_url=self._base_url, api_key=api_key, http_client=http_client) ``` ### name ```python name: str ``` Return the provider name. ### base_url ```python base_url: str ``` Return the base URL. ### client ```python client: AsyncOpenAI ``` Return the AsyncOpenAI client. ### model_profile ```python model_profile(model_name: str) -> ModelProfile | None ``` Get model profile for SambaNova models. SambaNova serves models from multiple families including Meta Llama, DeepSeek, Qwen, and Mistral. Model profiles are matched based on model name prefixes. Source code in `pydantic_ai_slim/pydantic_ai/providers/sambanova.py` ```python def model_profile(self, model_name: str) -> ModelProfile | None: """Get model profile for SambaNova models. SambaNova serves models from multiple families including Meta Llama, DeepSeek, Qwen, and Mistral. Model profiles are matched based on model name prefixes. """ prefix_to_profile = { 'deepseek-': deepseek_model_profile, 'meta-llama-': meta_model_profile, 'llama-': meta_model_profile, 'qwen': qwen_model_profile, 'mistral': mistral_model_profile, } profile = None model_name_lower = model_name.lower() for prefix, profile_func in prefix_to_profile.items(): if model_name_lower.startswith(prefix): profile = profile_func(model_name) break # Wrap into OpenAIModelProfile since SambaNova is OpenAI-compatible return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) ``` ### __init__ ```python __init__( *, api_key: str | None = None, base_url: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Initialize SambaNova provider. 
Parameters: | Name | Type | Description | Default | | --------------- | ------------- | ----------- | ------------------------------------------------------------------------- | | `api_key` | \`str | None\` | SambaNova API key. If not provided, reads from SAMBANOVA_API_KEY env var. | | `base_url` | \`str | None\` | Custom API base URL. Defaults to https://api.sambanova.ai/v1 | | `openai_client` | \`AsyncOpenAI | None\` | Optional pre-configured OpenAI client | | `http_client` | \`AsyncClient | None\` | Optional custom httpx.AsyncClient for making HTTP requests | Raises: | Type | Description | | ----------- | ------------------------------------------------------------------- | | `UserError` | If API key is not provided and SAMBANOVA_API_KEY env var is not set | Source code in `pydantic_ai_slim/pydantic_ai/providers/sambanova.py` ```python def __init__( self, *, api_key: str | None = None, base_url: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Initialize SambaNova provider. Args: api_key: SambaNova API key. If not provided, reads from SAMBANOVA_API_KEY env var. base_url: Custom API base URL. Defaults to https://api.sambanova.ai/v1 openai_client: Optional pre-configured OpenAI client http_client: Optional custom httpx.AsyncClient for making HTTP requests Raises: UserError: If API key is not provided and SAMBANOVA_API_KEY env var is not set """ if openai_client is not None: self._client = openai_client self._base_url = str(openai_client.base_url) else: # Get API key from parameter or environment api_key = api_key or os.getenv('SAMBANOVA_API_KEY') if not api_key: raise UserError( 'Set the `SAMBANOVA_API_KEY` environment variable or pass it via ' '`SambaNovaProvider(api_key=...)` to use the SambaNova provider.' ) # Set base URL (default to SambaNova API endpoint) self._base_url = base_url or os.getenv('SAMBANOVA_BASE_URL', 'https://api.sambanova.ai/v1') # Create http client and AsyncOpenAI client http_client = http_client or cached_async_http_client(provider='sambanova') self._client = AsyncOpenAI(base_url=self._base_url, api_key=api_key, http_client=http_client) ``` # `pydantic_ai.result` ### StreamedRunResult Bases: `Generic[AgentDepsT, OutputDataT]` Result of a streamed run that returns structured data via a tool call. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @dataclass(init=False) class StreamedRunResult(Generic[AgentDepsT, OutputDataT]): """Result of a streamed run that returns structured data via a tool call.""" _all_messages: list[_messages.ModelMessage] _new_message_index: int _stream_response: AgentStream[AgentDepsT, OutputDataT] | None = None _on_complete: Callable[[], Awaitable[None]] | None = None _run_result: AgentRunResult[OutputDataT] | None = None is_complete: bool = field(default=False, init=False) """Whether the stream has all been received. This is set to `True` when one of [`stream_output`][pydantic_ai.result.StreamedRunResult.stream_output], [`stream_text`][pydantic_ai.result.StreamedRunResult.stream_text], [`stream_responses`][pydantic_ai.result.StreamedRunResult.stream_responses] or [`get_output`][pydantic_ai.result.StreamedRunResult.get_output] completes. """ @overload def __init__( self, all_messages: list[_messages.ModelMessage], new_message_index: int, stream_response: AgentStream[AgentDepsT, OutputDataT] | None, on_complete: Callable[[], Awaitable[None]] | None, ) -> None: ... 
@overload def __init__( self, all_messages: list[_messages.ModelMessage], new_message_index: int, *, run_result: AgentRunResult[OutputDataT], ) -> None: ... def __init__( self, all_messages: list[_messages.ModelMessage], new_message_index: int, stream_response: AgentStream[AgentDepsT, OutputDataT] | None = None, on_complete: Callable[[], Awaitable[None]] | None = None, run_result: AgentRunResult[OutputDataT] | None = None, ) -> None: self._all_messages = all_messages self._new_message_index = new_message_index self._stream_response = stream_response self._on_complete = on_complete self._run_result = run_result def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ # this is a method to be consistent with the other methods if output_tool_return_content is not None: raise NotImplementedError('Setting output tool return content is not supported for this result type.') return self._all_messages def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return all messages from [`all_messages`][pydantic_ai.result.StreamedRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return new messages from [`new_messages`][pydantic_ai.result.StreamedRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. 
""" return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) @deprecated('`StreamedRunResult.stream` is deprecated, use `stream_output` instead.') async def stream(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: async for output in self.stream_output(debounce_by=debounce_by): yield output async def stream_output(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: """Stream the output as an async iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Args: debounce_by: by how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the response data. """ if self._run_result is not None: yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for output in self._stream_response.stream_output(debounce_by=debounce_by): yield output await self._marked_completed(self.response) else: raise ValueError('No stream response or run result provided') # pragma: no cover async def stream_text(self, *, delta: bool = False, debounce_by: float | None = 0.1) -> AsyncIterator[str]: """Stream the text result as an async iterable. !!! note Result validators will NOT be called on the text result if `delta=True`. Args: delta: if `True`, yield each chunk of text as it is received, if `False` (default), yield the full text up to the current point. debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. """ if self._run_result is not None: # pragma: no cover # We can't really get here, as `_run_result` is only set in `run_stream` when `CallToolsNode` produces `DeferredToolRequests` output # as a result of a tool function raising `CallDeferred` or `ApprovalRequired`. # That'll change if we ever support something like `raise EndRun(output: OutputT)` where `OutputT` could be `str`. if not isinstance(self._run_result.output, str): raise exceptions.UserError('stream_text() can only be used with text responses') yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for text in self._stream_response.stream_text(delta=delta, debounce_by=debounce_by): yield text await self._marked_completed(self.response) else: raise ValueError('No stream response or run result provided') # pragma: no cover @deprecated('`StreamedRunResult.stream_structured` is deprecated, use `stream_responses` instead.') async def stream_structured( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: async for msg, last in self.stream_responses(debounce_by=debounce_by): yield msg, last async def stream_responses( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: """Stream the response as an async iterable of Structured LLM Messages. Args: debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. 
Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the structured response message and whether that is the last message. """ if self._run_result is not None: yield self.response, True await self._marked_completed() elif self._stream_response is not None: # if the message currently has any parts with content, yield before streaming async for msg in self._stream_response.stream_responses(debounce_by=debounce_by): yield msg, False msg = self.response yield msg, True await self._marked_completed(msg) else: raise ValueError('No stream response or run result provided') # pragma: no cover async def get_output(self) -> OutputDataT: """Stream the whole response, validate and return it.""" if self._run_result is not None: output = self._run_result.output await self._marked_completed() return output elif self._stream_response is not None: output = await self._stream_response.get_output() await self._marked_completed(self.response) return output else: raise ValueError('No stream response or run result provided') # pragma: no cover @property def response(self) -> _messages.ModelResponse: """Return the current state of the response.""" if self._run_result is not None: return self._run_result.response elif self._stream_response is not None: return self._stream_response.get() else: raise ValueError('No stream response or run result provided') # pragma: no cover @property def metadata(self) -> dict[str, Any] | None: """Metadata associated with this agent run, if configured.""" if self._run_result is not None: return self._run_result.metadata elif self._stream_response is not None: return self._stream_response.metadata else: return None # TODO (v2): Make this a property def usage(self) -> RunUsage: """Return the usage of the whole run. !!! note This won't return the full usage until the stream is finished. 
""" if self._run_result is not None: return self._run_result.usage() elif self._stream_response is not None: return self._stream_response.usage() else: raise ValueError('No stream response or run result provided') # pragma: no cover # TODO (v2): Make this a property def timestamp(self) -> datetime: """Get the timestamp of the response.""" if self._run_result is not None: return self._run_result.timestamp() elif self._stream_response is not None: return self._stream_response.timestamp() else: raise ValueError('No stream response or run result provided') # pragma: no cover @property def run_id(self) -> str: """The unique identifier for the agent run.""" if self._run_result is not None: return self._run_result.run_id elif self._stream_response is not None: return self._stream_response.run_id else: raise ValueError('No stream response or run result provided') # pragma: no cover @deprecated('`validate_structured_output` is deprecated, use `validate_response_output` instead.') async def validate_structured_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: return await self.validate_response_output(message, allow_partial=allow_partial) async def validate_response_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: """Validate a structured result message.""" if self._run_result is not None: return self._run_result.output elif self._stream_response is not None: return await self._stream_response.validate_response_output(message, allow_partial=allow_partial) else: raise ValueError('No stream response or run result provided') # pragma: no cover async def _marked_completed(self, message: _messages.ModelResponse | None = None) -> None: if self.is_complete: return self.is_complete = True if message is not None: if self._stream_response: # pragma: no branch message.run_id = self._stream_response.run_id self._all_messages.append(message) if self._on_complete is not None: await self._on_complete() ``` #### is_complete ```python is_complete: bool = field(default=False, init=False) ``` Whether the stream has all been received. This is set to `True` when one of stream_output, stream_text, stream_responses or get_output completes. #### all_messages ```python all_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return the history of \_messages. Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | -------------------- | ----------------- | | `list[ModelMessage]` | List of messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. 
This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ # this is a method to be consistent with the other methods if output_tool_return_content is not None: raise NotImplementedError('Setting output tool return content is not supported for this result type.') return self._all_messages ``` #### all_messages_json ```python all_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return all messages from all_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | ------- | ------------------------------------- | | `bytes` | JSON bytes representing the messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return all messages from [`all_messages`][pydantic_ai.result.StreamedRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) ``` #### new_messages ```python new_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | -------------------- | --------------------- | | `list[ModelMessage]` | List of new messages. 
| Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] ``` #### new_messages_json ```python new_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return new messages from new_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | ------- | ----------------------------------------- | | `bytes` | JSON bytes representing the new messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return new messages from [`new_messages`][pydantic_ai.result.StreamedRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) ``` #### stream ```python stream( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[OutputDataT] ``` Deprecated `StreamedRunResult.stream` is deprecated, use `stream_output` instead. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @deprecated('`StreamedRunResult.stream` is deprecated, use `stream_output` instead.') async def stream(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: async for output in self.stream_output(debounce_by=debounce_by): yield output ``` #### stream_output ```python stream_output( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[OutputDataT] ``` Stream the output as an async iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. 
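A hedged usage sketch: obtain a `StreamedRunResult` from `Agent.run_stream` and consume partial outputs as they are validated (the model name and prompt are placeholders):

```python
import asyncio

from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', instructions='Be concise.')


async def main():
    # run_stream returns an async context manager yielding a StreamedRunResult.
    async with agent.run_stream('Name the three largest moons of Jupiter.') as result:
        # Each iteration yields the output validated so far (partial validation for structured outputs).
        async for partial in result.stream_output():
            print(partial)
        print(result.usage())  # complete once the stream has finished


asyncio.run(main())
```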
Parameters: | Name | Type | Description | Default | | ------------- | ------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `debounce_by` | \`float | None\` | by how much (if at all) to debounce/group the output chunks by. None means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. | Returns: | Type | Description | | ---------------------------- | --------------------------------------- | | `AsyncIterator[OutputDataT]` | An async iterable of the response data. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def stream_output(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: """Stream the output as an async iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Args: debounce_by: by how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the response data. """ if self._run_result is not None: yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for output in self._stream_response.stream_output(debounce_by=debounce_by): yield output await self._marked_completed(self.response) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### stream_text ```python stream_text( *, delta: bool = False, debounce_by: float | None = 0.1 ) -> AsyncIterator[str] ``` Stream the text result as an async iterable. Note Result validators will NOT be called on the text result if `delta=True`. Parameters: | Name | Type | Description | Default | | ------------- | ------- | --------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `delta` | `bool` | if True, yield each chunk of text as it is received, if False (default), yield the full text up to the current point. | `False` | | `debounce_by` | \`float | None\` | by how much (if at all) to debounce/group the response chunks by. None means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def stream_text(self, *, delta: bool = False, debounce_by: float | None = 0.1) -> AsyncIterator[str]: """Stream the text result as an async iterable. !!! note Result validators will NOT be called on the text result if `delta=True`. Args: delta: if `True`, yield each chunk of text as it is received, if `False` (default), yield the full text up to the current point. debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. 
Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. """ if self._run_result is not None: # pragma: no cover # We can't really get here, as `_run_result` is only set in `run_stream` when `CallToolsNode` produces `DeferredToolRequests` output # as a result of a tool function raising `CallDeferred` or `ApprovalRequired`. # That'll change if we ever support something like `raise EndRun(output: OutputT)` where `OutputT` could be `str`. if not isinstance(self._run_result.output, str): raise exceptions.UserError('stream_text() can only be used with text responses') yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for text in self._stream_response.stream_text(delta=delta, debounce_by=debounce_by): yield text await self._marked_completed(self.response) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### stream_structured ```python stream_structured( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[ModelResponse, bool]] ``` Deprecated `StreamedRunResult.stream_structured` is deprecated, use `stream_responses` instead. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @deprecated('`StreamedRunResult.stream_structured` is deprecated, use `stream_responses` instead.') async def stream_structured( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: async for msg, last in self.stream_responses(debounce_by=debounce_by): yield msg, last ``` #### stream_responses ```python stream_responses( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[ModelResponse, bool]] ``` Stream the response as an async iterable of Structured LLM Messages. Parameters: | Name | Type | Description | Default | | ------------- | ------- | ----------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `debounce_by` | \`float | None\` | by how much (if at all) to debounce/group the response chunks by. None means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. | Returns: | Type | Description | | ------------------------------------------- | ------------------------------------------------------------------------------------------ | | `AsyncIterator[tuple[ModelResponse, bool]]` | An async iterable of the structured response message and whether that is the last message. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def stream_responses( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: """Stream the response as an async iterable of Structured LLM Messages. Args: debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the structured response message and whether that is the last message. 
""" if self._run_result is not None: yield self.response, True await self._marked_completed() elif self._stream_response is not None: # if the message currently has any parts with content, yield before streaming async for msg in self._stream_response.stream_responses(debounce_by=debounce_by): yield msg, False msg = self.response yield msg, True await self._marked_completed(msg) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### get_output ```python get_output() -> OutputDataT ``` Stream the whole response, validate and return it. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def get_output(self) -> OutputDataT: """Stream the whole response, validate and return it.""" if self._run_result is not None: output = self._run_result.output await self._marked_completed() return output elif self._stream_response is not None: output = await self._stream_response.get_output() await self._marked_completed(self.response) return output else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### response ```python response: ModelResponse ``` Return the current state of the response. #### metadata ```python metadata: dict[str, Any] | None ``` Metadata associated with this agent run, if configured. #### usage ```python usage() -> RunUsage ``` Return the usage of the whole run. Note This won't return the full usage until the stream is finished. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def usage(self) -> RunUsage: """Return the usage of the whole run. !!! note This won't return the full usage until the stream is finished. """ if self._run_result is not None: return self._run_result.usage() elif self._stream_response is not None: return self._stream_response.usage() else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### timestamp ```python timestamp() -> datetime ``` Get the timestamp of the response. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def timestamp(self) -> datetime: """Get the timestamp of the response.""" if self._run_result is not None: return self._run_result.timestamp() elif self._stream_response is not None: return self._stream_response.timestamp() else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### run_id ```python run_id: str ``` The unique identifier for the agent run. #### validate_structured_output ```python validate_structured_output( message: ModelResponse, *, allow_partial: bool = False ) -> OutputDataT ``` Deprecated `validate_structured_output` is deprecated, use `validate_response_output` instead. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @deprecated('`validate_structured_output` is deprecated, use `validate_response_output` instead.') async def validate_structured_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: return await self.validate_response_output(message, allow_partial=allow_partial) ``` #### validate_response_output ```python validate_response_output( message: ModelResponse, *, allow_partial: bool = False ) -> OutputDataT ``` Validate a structured result message. 
Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def validate_response_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: """Validate a structured result message.""" if self._run_result is not None: return self._run_result.output elif self._stream_response is not None: return await self._stream_response.validate_response_output(message, allow_partial=allow_partial) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` ### StreamedRunResultSync Bases: `Generic[AgentDepsT, OutputDataT]` Synchronous wrapper for StreamedRunResult that only exposes sync methods. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @dataclass(init=False) class StreamedRunResultSync(Generic[AgentDepsT, OutputDataT]): """Synchronous wrapper for [`StreamedRunResult`][pydantic_ai.result.StreamedRunResult] that only exposes sync methods.""" _streamed_run_result: StreamedRunResult[AgentDepsT, OutputDataT] def __init__(self, streamed_run_result: StreamedRunResult[AgentDepsT, OutputDataT]) -> None: self._streamed_run_result = streamed_run_result def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ return self._streamed_run_result.all_messages(output_tool_return_content=output_tool_return_content) def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return all messages from [`all_messages`][pydantic_ai.result.StreamedRunResultSync.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return self._streamed_run_result.all_messages_json(output_tool_return_content=output_tool_return_content) def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self._streamed_run_result.new_messages(output_tool_return_content=output_tool_return_content) def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return new messages from [`new_messages`][pydantic_ai.result.StreamedRunResultSync.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. 
This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return self._streamed_run_result.new_messages_json(output_tool_return_content=output_tool_return_content) def stream_output(self, *, debounce_by: float | None = 0.1) -> Iterator[OutputDataT]: """Stream the output as an iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Args: debounce_by: by how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. Returns: An iterable of the response data. """ return _utils.sync_async_iterator(self._streamed_run_result.stream_output(debounce_by=debounce_by)) def stream_text(self, *, delta: bool = False, debounce_by: float | None = 0.1) -> Iterator[str]: """Stream the text result as an iterable. !!! note Result validators will NOT be called on the text result if `delta=True`. Args: delta: if `True`, yield each chunk of text as it is received, if `False` (default), yield the full text up to the current point. debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. """ return _utils.sync_async_iterator(self._streamed_run_result.stream_text(delta=delta, debounce_by=debounce_by)) def stream_responses(self, *, debounce_by: float | None = 0.1) -> Iterator[tuple[_messages.ModelResponse, bool]]: """Stream the response as an iterable of Structured LLM Messages. Args: debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. Returns: An iterable of the structured response message and whether that is the last message. """ return _utils.sync_async_iterator(self._streamed_run_result.stream_responses(debounce_by=debounce_by)) def get_output(self) -> OutputDataT: """Stream the whole response, validate and return it.""" return _utils.get_event_loop().run_until_complete(self._streamed_run_result.get_output()) @property def response(self) -> _messages.ModelResponse: """Return the current state of the response.""" return self._streamed_run_result.response def usage(self) -> RunUsage: """Return the usage of the whole run. !!! note This won't return the full usage until the stream is finished. 
""" return self._streamed_run_result.usage() def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._streamed_run_result.timestamp() @property def run_id(self) -> str: """The unique identifier for the agent run.""" return self._streamed_run_result.run_id @property def metadata(self) -> dict[str, Any] | None: """Metadata associated with this agent run, if configured.""" return self._streamed_run_result.metadata def validate_response_output(self, message: _messages.ModelResponse, *, allow_partial: bool = False) -> OutputDataT: """Validate a structured result message.""" return _utils.get_event_loop().run_until_complete( self._streamed_run_result.validate_response_output(message, allow_partial=allow_partial) ) @property def is_complete(self) -> bool: """Whether the stream has all been received. This is set to `True` when one of [`stream_output`][pydantic_ai.result.StreamedRunResultSync.stream_output], [`stream_text`][pydantic_ai.result.StreamedRunResultSync.stream_text], [`stream_responses`][pydantic_ai.result.StreamedRunResultSync.stream_responses] or [`get_output`][pydantic_ai.result.StreamedRunResultSync.get_output] completes. """ return self._streamed_run_result.is_complete ``` #### all_messages ```python all_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return the history of messages. Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | -------------------- | ----------------- | | `list[ModelMessage]` | List of messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ return self._streamed_run_result.all_messages(output_tool_return_content=output_tool_return_content) ``` #### all_messages_json ```python all_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return all messages from all_messages as JSON bytes. 
Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | ------- | ------------------------------------- | | `bytes` | JSON bytes representing the messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return all messages from [`all_messages`][pydantic_ai.result.StreamedRunResultSync.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return self._streamed_run_result.all_messages_json(output_tool_return_content=output_tool_return_content) ``` #### new_messages ```python new_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | -------------------- | --------------------- | | `list[ModelMessage]` | List of new messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. 
""" return self._streamed_run_result.new_messages(output_tool_return_content=output_tool_return_content) ``` #### new_messages_json ```python new_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return new messages from new_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | ------- | ----------------------------------------- | | `bytes` | JSON bytes representing the new messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return new messages from [`new_messages`][pydantic_ai.result.StreamedRunResultSync.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return self._streamed_run_result.new_messages_json(output_tool_return_content=output_tool_return_content) ``` #### stream_output ```python stream_output( *, debounce_by: float | None = 0.1 ) -> Iterator[OutputDataT] ``` Stream the output as an iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Parameters: | Name | Type | Description | Default | | ------------- | ------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `debounce_by` | \`float | None\` | by how much (if at all) to debounce/group the output chunks by. None means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. | Returns: | Type | Description | | ----------------------- | --------------------------------- | | `Iterator[OutputDataT]` | An iterable of the response data. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def stream_output(self, *, debounce_by: float | None = 0.1) -> Iterator[OutputDataT]: """Stream the output as an iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Args: debounce_by: by how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. 
Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. Returns: An iterable of the response data. """ return _utils.sync_async_iterator(self._streamed_run_result.stream_output(debounce_by=debounce_by)) ``` #### stream_text ```python stream_text( *, delta: bool = False, debounce_by: float | None = 0.1 ) -> Iterator[str] ``` Stream the text result as an iterable. Note Result validators will NOT be called on the text result if `delta=True`. Parameters: | Name | Type | Description | Default | | ------------- | ------- | --------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `delta` | `bool` | if True, yield each chunk of text as it is received, if False (default), yield the full text up to the current point. | `False` | | `debounce_by` | \`float | None\` | by how much (if at all) to debounce/group the response chunks by. None means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def stream_text(self, *, delta: bool = False, debounce_by: float | None = 0.1) -> Iterator[str]: """Stream the text result as an iterable. !!! note Result validators will NOT be called on the text result if `delta=True`. Args: delta: if `True`, yield each chunk of text as it is received, if `False` (default), yield the full text up to the current point. debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. """ return _utils.sync_async_iterator(self._streamed_run_result.stream_text(delta=delta, debounce_by=debounce_by)) ``` #### stream_responses ```python stream_responses( *, debounce_by: float | None = 0.1 ) -> Iterator[tuple[ModelResponse, bool]] ``` Stream the response as an iterable of Structured LLM Messages. Parameters: | Name | Type | Description | Default | | ------------- | ------- | ----------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `debounce_by` | \`float | None\` | by how much (if at all) to debounce/group the response chunks by. None means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. | Returns: | Type | Description | | -------------------------------------- | ------------------------------------------------------------------------------------ | | `Iterator[tuple[ModelResponse, bool]]` | An iterable of the structured response message and whether that is the last message. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def stream_responses(self, *, debounce_by: float | None = 0.1) -> Iterator[tuple[_messages.ModelResponse, bool]]: """Stream the response as an iterable of Structured LLM Messages. 
Args: debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. Returns: An iterable of the structured response message and whether that is the last message. """ return _utils.sync_async_iterator(self._streamed_run_result.stream_responses(debounce_by=debounce_by)) ``` #### get_output ```python get_output() -> OutputDataT ``` Stream the whole response, validate and return it. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def get_output(self) -> OutputDataT: """Stream the whole response, validate and return it.""" return _utils.get_event_loop().run_until_complete(self._streamed_run_result.get_output()) ``` #### response ```python response: ModelResponse ``` Return the current state of the response. #### usage ```python usage() -> RunUsage ``` Return the usage of the whole run. Note This won't return the full usage until the stream is finished. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def usage(self) -> RunUsage: """Return the usage of the whole run. !!! note This won't return the full usage until the stream is finished. """ return self._streamed_run_result.usage() ``` #### timestamp ```python timestamp() -> datetime ``` Get the timestamp of the response. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._streamed_run_result.timestamp() ``` #### run_id ```python run_id: str ``` The unique identifier for the agent run. #### metadata ```python metadata: dict[str, Any] | None ``` Metadata associated with this agent run, if configured. #### validate_response_output ```python validate_response_output( message: ModelResponse, *, allow_partial: bool = False ) -> OutputDataT ``` Validate a structured result message. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def validate_response_output(self, message: _messages.ModelResponse, *, allow_partial: bool = False) -> OutputDataT: """Validate a structured result message.""" return _utils.get_event_loop().run_until_complete( self._streamed_run_result.validate_response_output(message, allow_partial=allow_partial) ) ``` #### is_complete ```python is_complete: bool ``` Whether the stream has all been received. This is set to `True` when one of stream_output, stream_text, stream_responses or get_output completes. # `pydantic_ai.retries` Retries utilities based on tenacity, especially for HTTP requests. This module provides HTTP transport wrappers and wait strategies that integrate with the tenacity library to add retry capabilities to HTTP requests. The transports can be used with HTTP clients that support custom transports (such as httpx), while the wait strategies can be used with any tenacity retry decorator. The module includes: - TenacityTransport: Synchronous HTTP transport with retry capabilities - AsyncTenacityTransport: Asynchronous HTTP transport with retry capabilities - wait_retry_after: Wait strategy that respects HTTP Retry-After headers ### RetryConfig Bases: `TypedDict` The configuration for tenacity-based retrying. These are precisely the arguments to the tenacity `retry` decorator, and they are generally used internally by passing them to that decorator via `@retry(**config)` or similar. 
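For instance, here is a minimal sketch of splatting such a config directly into tenacity's `retry` decorator; the specific stop/wait values and the `fetch_data` function are illustrative, not library defaults:

```python
from tenacity import retry, stop_after_attempt, wait_exponential

from pydantic_ai.retries import RetryConfig

# Illustrative values: give up after 3 attempts, exponential backoff capped at 10s.
config = RetryConfig(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, max=10),
    reraise=True,
)


@retry(**config)
def fetch_data() -> str:
    """A hypothetical flaky operation that tenacity will retry on failure."""
    ...
```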
All fields are optional, and if not provided, the default values from the `tenacity.retry` decorator will be used. Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ```python class RetryConfig(TypedDict, total=False): """The configuration for tenacity-based retrying. These are precisely the arguments to the tenacity `retry` decorator, and they are generally used internally by passing them to that decorator via `@retry(**config)` or similar. All fields are optional, and if not provided, the default values from the `tenacity.retry` decorator will be used. """ sleep: Callable[[int | float], None | Awaitable[None]] """A sleep strategy to use for sleeping between retries. Tenacity's default for this argument is `tenacity.nap.sleep`.""" stop: StopBaseT """ A stop strategy to determine when to stop retrying. Tenacity's default for this argument is `tenacity.stop.stop_never`.""" wait: WaitBaseT """ A wait strategy to determine how long to wait between retries. Tenacity's default for this argument is `tenacity.wait.wait_none`.""" retry: SyncRetryBaseT | RetryBaseT """A retry strategy to determine which exceptions should trigger a retry. Tenacity's default for this argument is `tenacity.retry.retry_if_exception_type()`.""" before: Callable[[RetryCallState], None | Awaitable[None]] """ A callable that is called before each retry attempt. Tenacity's default for this argument is `tenacity.before.before_nothing`.""" after: Callable[[RetryCallState], None | Awaitable[None]] """ A callable that is called after each retry attempt. Tenacity's default for this argument is `tenacity.after.after_nothing`.""" before_sleep: Callable[[RetryCallState], None | Awaitable[None]] | None """ An optional callable that is called before sleeping between retries. Tenacity's default for this argument is `None`.""" reraise: bool """Whether to reraise the last exception if the retry attempts are exhausted, or raise a RetryError instead. Tenacity's default for this argument is `False`.""" retry_error_cls: type[RetryError] """The exception class to raise when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `tenacity.RetryError`.""" retry_error_callback: Callable[[RetryCallState], Any | Awaitable[Any]] | None """An optional callable that is called when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `None`.""" ``` #### sleep ```python sleep: Callable[[int | float], None | Awaitable[None]] ``` A sleep strategy to use for sleeping between retries. Tenacity's default for this argument is `tenacity.nap.sleep`. #### stop ```python stop: StopBaseT ``` A stop strategy to determine when to stop retrying. Tenacity's default for this argument is `tenacity.stop.stop_never`. #### wait ```python wait: WaitBaseT ``` A wait strategy to determine how long to wait between retries. Tenacity's default for this argument is `tenacity.wait.wait_none`. #### retry ```python retry: RetryBaseT | RetryBaseT ``` A retry strategy to determine which exceptions should trigger a retry. Tenacity's default for this argument is `tenacity.retry.retry_if_exception_type()`. #### before ```python before: Callable[[RetryCallState], None | Awaitable[None]] ``` A callable that is called before each retry attempt. Tenacity's default for this argument is `tenacity.before.before_nothing`. #### after ```python after: Callable[[RetryCallState], None | Awaitable[None]] ``` A callable that is called after each retry attempt. 
Tenacity's default for this argument is `tenacity.after.after_nothing`. #### before_sleep ```python before_sleep: ( Callable[[RetryCallState], None | Awaitable[None]] | None ) ``` An optional callable that is called before sleeping between retries. Tenacity's default for this argument is `None`. #### reraise ```python reraise: bool ``` Whether to reraise the last exception if the retry attempts are exhausted, or raise a RetryError instead. Tenacity's default for this argument is `False`. #### retry_error_cls ```python retry_error_cls: type[RetryError] ``` The exception class to raise when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `tenacity.RetryError`. #### retry_error_callback ```python retry_error_callback: ( Callable[[RetryCallState], Any | Awaitable[Any]] | None ) ``` An optional callable that is called when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `None`. ### TenacityTransport Bases: `BaseTransport` Synchronous HTTP transport with tenacity-based retry functionality. This transport wraps another BaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Parameters: | Name | Type | Description | Default | | ------------------- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `wrapped` | \`BaseTransport | None\` | The underlying transport to wrap and add retry functionality to. | | `config` | `RetryConfig` | The arguments to use for the tenacity retry decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. | *required* | | `validate_response` | \`Callable\[[Response], Any\] | None\` | Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. | Example ```python from httpx import Client, HTTPStatusError, HTTPTransport from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import RetryConfig, TenacityTransport, wait_retry_after transport = TenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), HTTPTransport(), validate_response=lambda r: r.raise_for_status() ) client = Client(transport=transport) ``` Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ````python class TenacityTransport(BaseTransport): """Synchronous HTTP transport with tenacity-based retry functionality. This transport wraps another BaseTransport and adds retry capabilities using the tenacity library. 
It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Args: wrapped: The underlying transport to wrap and add retry functionality to. config: The arguments to use for the tenacity `retry` decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. validate_response: Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. Example: ```python from httpx import Client, HTTPStatusError, HTTPTransport from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import RetryConfig, TenacityTransport, wait_retry_after transport = TenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), HTTPTransport(), validate_response=lambda r: r.raise_for_status() ) client = Client(transport=transport) ``` """ def __init__( self, config: RetryConfig, wrapped: BaseTransport | None = None, validate_response: Callable[[Response], Any] | None = None, ): self.config = config self.wrapped = wrapped or HTTPTransport() self.validate_response = validate_response def handle_request(self, request: Request) -> Response: """Handle an HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. """ @retry(**self.config) def handle_request(req: Request) -> Response: response = self.wrapped.handle_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: response.close() raise return response return handle_request(request) def __enter__(self) -> TenacityTransport: self.wrapped.__enter__() return self def __exit__( self, exc_type: type[BaseException] | None = None, exc_value: BaseException | None = None, traceback: TracebackType | None = None, ) -> None: self.wrapped.__exit__(exc_type, exc_value, traceback) def close(self) -> None: self.wrapped.close() # pragma: no cover ```` #### handle_request ```python handle_request(request: Request) -> Response ``` Handle an HTTP request with retry logic. Parameters: | Name | Type | Description | Default | | --------- | --------- | --------------------------- | ---------- | | `request` | `Request` | The HTTP request to handle. | *required* | Returns: | Type | Description | | ---------- | ------------------ | | `Response` | The HTTP response. | Raises: | Type | Description | | -------------- | --------------------------------------------------------------------- | | `RuntimeError` | If the retry controller did not make any attempts. | | `Exception` | Any exception raised by the wrapped transport or validation function. 
| Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ```python def handle_request(self, request: Request) -> Response: """Handle an HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. """ @retry(**self.config) def handle_request(req: Request) -> Response: response = self.wrapped.handle_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: response.close() raise return response return handle_request(request) ``` ### AsyncTenacityTransport Bases: `AsyncBaseTransport` Asynchronous HTTP transport with tenacity-based retry functionality. This transport wraps another AsyncBaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Parameters: | Name | Type | Description | Default | | ------------------- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `wrapped` | \`AsyncBaseTransport | None\` | The underlying async transport to wrap and add retry functionality to. | | `config` | `RetryConfig` | The arguments to use for the tenacity retry decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. | *required* | | `validate_response` | \`Callable\[[Response], Any\] | None\` | Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. | Example ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ````python class AsyncTenacityTransport(AsyncBaseTransport): """Asynchronous HTTP transport with tenacity-based retry functionality. This transport wraps another AsyncBaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. 
The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Args: wrapped: The underlying async transport to wrap and add retry functionality to. config: The arguments to use for the tenacity `retry` decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. validate_response: Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. Example: ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` """ def __init__( self, config: RetryConfig, wrapped: AsyncBaseTransport | None = None, validate_response: Callable[[Response], Any] | None = None, ): self.config = config self.wrapped = wrapped or AsyncHTTPTransport() self.validate_response = validate_response async def handle_async_request(self, request: Request) -> Response: """Handle an async HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. """ @retry(**self.config) async def handle_async_request(req: Request) -> Response: response = await self.wrapped.handle_async_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: await response.aclose() raise return response return await handle_async_request(request) async def __aenter__(self) -> AsyncTenacityTransport: await self.wrapped.__aenter__() return self async def __aexit__( self, exc_type: type[BaseException] | None = None, exc_value: BaseException | None = None, traceback: TracebackType | None = None, ) -> None: await self.wrapped.__aexit__(exc_type, exc_value, traceback) async def aclose(self) -> None: await self.wrapped.aclose() ```` #### handle_async_request ```python handle_async_request(request: Request) -> Response ``` Handle an async HTTP request with retry logic. Parameters: | Name | Type | Description | Default | | --------- | --------- | --------------------------- | ---------- | | `request` | `Request` | The HTTP request to handle. | *required* | Returns: | Type | Description | | ---------- | ------------------ | | `Response` | The HTTP response. | Raises: | Type | Description | | -------------- | --------------------------------------------------------------------- | | `RuntimeError` | If the retry controller did not make any attempts. | | `Exception` | Any exception raised by the wrapped transport or validation function. 
| Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ```python async def handle_async_request(self, request: Request) -> Response: """Handle an async HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. """ @retry(**self.config) async def handle_async_request(req: Request) -> Response: response = await self.wrapped.handle_async_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: await response.aclose() raise return response return await handle_async_request(request) ``` ### wait_retry_after ```python wait_retry_after( fallback_strategy: ( Callable[[RetryCallState], float] | None ) = None, max_wait: float = 300, ) -> Callable[[RetryCallState], float] ``` Create a tenacity-compatible wait strategy that respects HTTP Retry-After headers. This wait strategy checks if the exception contains an HTTPStatusError with a Retry-After header, and if so, waits for the time specified in the header. If no header is present or parsing fails, it falls back to the provided strategy. The Retry-After header can be in two formats: - An integer representing seconds to wait - An HTTP date string representing when to retry Parameters: | Name | Type | Description | Default | | ------------------- | ------------------------------------- | ----------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- | | `fallback_strategy` | \`Callable\[[RetryCallState], float\] | None\` | Wait strategy to use when no Retry-After header is present or parsing fails. Defaults to exponential backoff with max 60s. | | `max_wait` | `float` | Maximum time to wait in seconds, regardless of header value. Defaults to 300 (5 minutes). | `300` | Returns: | Type | Description | | ----------------------------------- | ---------------------------------------------------------------- | | `Callable[[RetryCallState], float]` | A wait function that can be used with tenacity retry decorators. | Example ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=120), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ````python def wait_retry_after( fallback_strategy: Callable[[RetryCallState], float] | None = None, max_wait: float = 300 ) -> Callable[[RetryCallState], float]: """Create a tenacity-compatible wait strategy that respects HTTP Retry-After headers. This wait strategy checks if the exception contains an HTTPStatusError with a Retry-After header, and if so, waits for the time specified in the header. If no header is present or parsing fails, it falls back to the provided strategy. 
The Retry-After header can be in two formats: - An integer representing seconds to wait - An HTTP date string representing when to retry Args: fallback_strategy: Wait strategy to use when no Retry-After header is present or parsing fails. Defaults to exponential backoff with max 60s. max_wait: Maximum time to wait in seconds, regardless of header value. Defaults to 300 (5 minutes). Returns: A wait function that can be used with tenacity retry decorators. Example: ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=120), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` """ if fallback_strategy is None: fallback_strategy = wait_exponential(multiplier=1, max=60) def wait_func(state: RetryCallState) -> float: exc = state.outcome.exception() if state.outcome else None if isinstance(exc, HTTPStatusError): retry_after = exc.response.headers.get('retry-after') if retry_after: try: # Try parsing as seconds first wait_seconds = int(retry_after) return min(float(wait_seconds), max_wait) except ValueError: # Try parsing as HTTP date try: retry_time = cast(datetime, parsedate_to_datetime(retry_after)) assert isinstance(retry_time, datetime) now = datetime.now(timezone.utc) wait_seconds = (retry_time - now).total_seconds() if wait_seconds > 0: return min(wait_seconds, max_wait) except (ValueError, TypeError, AssertionError): # If date parsing fails, fall back to fallback strategy pass # Use fallback strategy return fallback_strategy(state) return wait_func ```` # `pydantic_ai.run` ### AgentRun Bases: `Generic[AgentDepsT, OutputDataT]` A stateful, async-iterable run of an Agent. You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an End is reached, the run finishes and result becomes available. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the next method for more granular control. 
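As a further minimal sketch (reusing the illustrative model name from the example above), the run's cumulative usage can also be inspected while iteration is still in progress:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5.2')


async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            # usage() reflects the requests and tokens consumed by this run so far
            print(agent_run.usage())
        print(agent_run.result.output)
```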
Source code in `pydantic_ai_slim/pydantic_ai/run.py` ````python @dataclasses.dataclass(repr=False) class AgentRun(Generic[AgentDepsT, OutputDataT]): """A stateful, async-iterable run of an [`Agent`][pydantic_ai.agent.Agent]. You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an [`End`][pydantic_graph.nodes.End] is reached, the run finishes and [`result`][pydantic_ai.agent.AgentRun.result] becomes available. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the [`next`][pydantic_ai.agent.AgentRun.next] method for more granular control. """ _graph_run: GraphRun[ _agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any], FinalResult[OutputDataT] ] @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: traceparent = self._graph_run._traceparent(required=False) # type: ignore[reportPrivateUsage] if traceparent is None and required: # pragma: no cover raise AttributeError('No span was created for this agent run') return traceparent @property def ctx(self) -> GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]]: """The current context of the agent run.""" return GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]]( state=self._graph_run.state, deps=self._graph_run.deps ) @property def next_node( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. """ task = self._graph_run.next_task return self._task_to_node(task) @property def result(self) -> AgentRunResult[OutputDataT] | None: """The final result of the run if it has ended, otherwise `None`. Once the run returns an [`End`][pydantic_graph.nodes.End] node, `result` is populated with an [`AgentRunResult`][pydantic_ai.agent.AgentRunResult]. 
""" graph_run_output = self._graph_run.output if graph_run_output is None: return None return AgentRunResult( graph_run_output.output, graph_run_output.tool_name, self._graph_run.state, self._graph_run.deps.new_message_index, self._traceparent(required=False), ) def all_messages(self) -> list[_messages.ModelMessage]: """Return all messages for the run so far. Messages from older runs are included. """ return self.ctx.state.message_history def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRun.all_messages] as JSON bytes. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json(self.all_messages()) def new_messages(self) -> list[_messages.ModelMessage]: """Return new messages for the run so far. Messages from older runs are excluded. """ return self.all_messages()[self.ctx.deps.new_message_index :] def new_messages_json(self) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRun.new_messages] as JSON bytes. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json(self.new_messages()) def __aiter__( self, ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]: """Provide async-iteration over the nodes in the agent run.""" return self async def __anext__( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Advance to the next node automatically based on the last returned node.""" task = await anext(self._graph_run) return self._task_to_node(task) def _task_to_node( self, task: EndMarker[FinalResult[OutputDataT]] | JoinItem | Sequence[GraphTaskRequest] ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: if isinstance(task, Sequence) and len(task) == 1: first_task = task[0] if isinstance(first_task.inputs, BaseNode): # pragma: no branch base_node: BaseNode[ # pyright: ignore[reportUnknownVariableType] _agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, OutputDataT], FinalResult[OutputDataT], ] = first_task.inputs # pyright: ignore[reportUnknownMemberType] if _agent_graph.is_agent_node(node=base_node): # pragma: no branch return base_node if isinstance(task, EndMarker): return End(task.value) raise exceptions.AgentRunError(f'Unexpected node: {task}') # pragma: no cover def _node_to_task(self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT]) -> GraphTaskRequest: return GraphTaskRequest(NodeStep(type(node)).id, inputs=node, fork_stack=()) async def next( self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`][pydantic_graph.nodes.End] node. 
Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-5.2') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Args: node: The node to run next in the graph. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ # Note: It might be nice to expose a synchronous interface for iteration, but we shouldn't do it # on this class, or else IDEs won't warn you if you accidentally use `for` instead of `async for` to iterate. task = [self._node_to_task(node)] try: task = await self._graph_run.next(task) except StopAsyncIteration: pass return self._task_to_node(task) # TODO (v2): Make this a property def usage(self) -> _usage.RunUsage: """Get usage statistics for the run so far, including token usage, model requests, and so on.""" return self._graph_run.state.usage @property def metadata(self) -> dict[str, Any] | None: """Metadata associated with this agent run, if configured.""" return self._graph_run.state.metadata @property def run_id(self) -> str: """The unique identifier for the agent run.""" return self._graph_run.state.run_id def __repr__(self) -> str: # pragma: no cover result = self._graph_run.output result_repr = '' if result is None else repr(result.output) return f'<{type(self).__name__} result={result_repr} usage={self.usage()}>' ```` #### ctx ```python ctx: GraphRunContext[ GraphAgentState, GraphAgentDeps[AgentDepsT, Any] ] ``` The current context of the agent run. #### next_node ```python next_node: ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. #### result ```python result: AgentRunResult[OutputDataT] | None ``` The final result of the run if it has ended, otherwise `None`. Once the run returns an End node, `result` is populated with an AgentRunResult. #### all_messages ```python all_messages() -> list[ModelMessage] ``` Return all messages for the run so far. Messages from older runs are included. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages(self) -> list[_messages.ModelMessage]: """Return all messages for the run so far. Messages from older runs are included. 
""" return self.ctx.state.message_history ``` #### all_messages_json ```python all_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return all messages from all_messages as JSON bytes. Returns: | Type | Description | | ------- | ------------------------------------- | | `bytes` | JSON bytes representing the messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRun.all_messages] as JSON bytes. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json(self.all_messages()) ``` #### new_messages ```python new_messages() -> list[ModelMessage] ``` Return new messages for the run so far. Messages from older runs are excluded. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages(self) -> list[_messages.ModelMessage]: """Return new messages for the run so far. Messages from older runs are excluded. """ return self.all_messages()[self.ctx.deps.new_message_index :] ``` #### new_messages_json ```python new_messages_json() -> bytes ``` Return new messages from new_messages as JSON bytes. Returns: | Type | Description | | ------- | ----------------------------------------- | | `bytes` | JSON bytes representing the new messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages_json(self) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRun.new_messages] as JSON bytes. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json(self.new_messages()) ``` #### __aiter__ ```python __aiter__() -> ( AsyncIterator[ AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ] ) ``` Provide async-iteration over the nodes in the agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def __aiter__( self, ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]: """Provide async-iteration over the nodes in the agent run.""" return self ``` #### __anext__ ```python __anext__() -> ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` Advance to the next node automatically based on the last returned node. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python async def __anext__( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Advance to the next node automatically based on the last returned node.""" task = await anext(self._graph_run) return self._task_to_node(task) ``` #### next ```python next( node: AgentNode[AgentDepsT, OutputDataT], ) -> ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an End node. 
Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-5.2') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | ------ | ------------------------------------ | ---------------------------------- | ---------- | | `node` | `AgentNode[AgentDepsT, OutputDataT]` | The node to run next in the graph. | *required* | Returns: | Type | Description | | ------------------------------------ | --------------------------------- | | \`AgentNode[AgentDepsT, OutputDataT] | End\[FinalResult[OutputDataT]\]\` | | \`AgentNode[AgentDepsT, OutputDataT] | End\[FinalResult[OutputDataT]\]\` | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ````python async def next( self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`][pydantic_graph.nodes.End] node. Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-5.2') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-5.2', timestamp=datetime.datetime(...), run_id='...', ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Args: node: The node to run next in the graph. 
Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ # Note: It might be nice to expose a synchronous interface for iteration, but we shouldn't do it # on this class, or else IDEs won't warn you if you accidentally use `for` instead of `async for` to iterate. task = [self._node_to_task(node)] try: task = await self._graph_run.next(task) except StopAsyncIteration: pass return self._task_to_node(task) ```` #### usage ```python usage() -> RunUsage ``` Get usage statistics for the run so far, including token usage, model requests, and so on. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def usage(self) -> _usage.RunUsage: """Get usage statistics for the run so far, including token usage, model requests, and so on.""" return self._graph_run.state.usage ``` #### metadata ```python metadata: dict[str, Any] | None ``` Metadata associated with this agent run, if configured. #### run_id ```python run_id: str ``` The unique identifier for the agent run. ### AgentRunResult Bases: `Generic[OutputDataT]` The final result of an agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python @dataclasses.dataclass class AgentRunResult(Generic[OutputDataT]): """The final result of an agent run.""" output: OutputDataT """The output data from the agent run.""" _output_tool_name: str | None = dataclasses.field(repr=False, compare=False, default=None) _state: _agent_graph.GraphAgentState = dataclasses.field( repr=False, compare=False, default_factory=_agent_graph.GraphAgentState ) _new_message_index: int = dataclasses.field(repr=False, compare=False, default=0) _traceparent_value: str | None = dataclasses.field(repr=False, compare=False, default=None) @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: if self._traceparent_value is None and required: # pragma: no cover raise AttributeError('No span was created for this agent run') return self._traceparent_value def _set_output_tool_return(self, return_content: str) -> list[_messages.ModelMessage]: """Set return content for the output tool. Useful if you want to continue the conversation and want to set the response to the output tool call. """ if not self._output_tool_name: raise ValueError('Cannot set output tool return content when the return type is `str`.') messages = self._state.message_history last_message = messages[-1] for idx, part in enumerate(last_message.parts): if isinstance(part, _messages.ToolReturnPart) and part.tool_name == self._output_tool_name: # Only do deepcopy when we have to modify copied_messages = list(messages) copied_last = deepcopy(last_message) copied_last.parts[idx].content = return_content # type: ignore[misc] copied_messages[-1] = copied_last return copied_messages raise LookupError(f'No tool call found with tool name {self._output_tool_name!r}.') def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. 
""" if output_tool_return_content is not None: return self._set_output_tool_return(output_tool_return_content) else: return self._state.message_history def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) @property def response(self) -> _messages.ModelResponse: """Return the last response from the message history.""" # The response may not be the very last item if it contained an output tool call. See `CallToolsNode._handle_final_result`. for message in reversed(self.all_messages()): if isinstance(message, _messages.ModelResponse): return message raise ValueError('No response found in the message history') # pragma: no cover # TODO (v2): Make this a property def usage(self) -> _usage.RunUsage: """Return the usage of the whole run.""" return self._state.usage # TODO (v2): Make this a property def timestamp(self) -> datetime: """Return the timestamp of last response.""" return self.response.timestamp @property def metadata(self) -> dict[str, Any] | None: """Metadata associated with this agent run, if configured.""" return self._state.metadata @property def run_id(self) -> str: """The unique identifier for the agent run.""" return self._state.run_id ``` #### output ```python output: OutputDataT ``` The output data from the agent run. #### all_messages ```python all_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return the history of \_messages. 
Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | -------------------- | ----------------- | | `list[ModelMessage]` | List of messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ if output_tool_return_content is not None: return self._set_output_tool_return(output_tool_return_content) else: return self._state.message_history ``` #### all_messages_json ```python all_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return all messages from all_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | ------- | ------------------------------------- | | `bytes` | JSON bytes representing the messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) ``` #### new_messages ```python new_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return new messages associated with this run. 
Messages from older runs are excluded. Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | -------------------- | --------------------- | | `list[ModelMessage]` | List of new messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] ``` #### new_messages_json ```python new_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return new messages from new_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | ---------------------------- | ----- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_tool_return_content` | \`str | None\` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | Returns: | Type | Description | | ------- | ----------------------------------------- | | `bytes` | JSON bytes representing the new messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) ``` #### response ```python response: ModelResponse ``` Return the last response from the message history. 
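For orientation, here is a minimal sketch of how these accessors are typically combined: run an agent, carry only that run's messages into a follow-up run, and serialize the full history. The model string, prompts, and instructions are placeholder assumptions, not part of the API documented above.

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-5', instructions='Be concise.')  # placeholder model string

result = agent.run_sync('What is the capital of France?')
print(result.output)
print(result.usage())  # RunUsage for the whole run

# Continue the conversation: pass only this run's messages as history.
followup = agent.run_sync(
    'And what is its population?',
    message_history=result.new_messages(),
)

# Persist the complete exchange as JSON bytes for storage or replay.
history_json = followup.all_messages_json()
```

Passing `output_tool_return_content` to any of the message accessors rewrites the final output tool return in the copy you get back, which is useful when you want to continue the conversation with a different response to the output tool call.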
#### usage ```python usage() -> RunUsage ``` Return the usage of the whole run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def usage(self) -> _usage.RunUsage: """Return the usage of the whole run.""" return self._state.usage ``` #### timestamp ```python timestamp() -> datetime ``` Return the timestamp of last response. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def timestamp(self) -> datetime: """Return the timestamp of last response.""" return self.response.timestamp ``` #### metadata ```python metadata: dict[str, Any] | None ``` Metadata associated with this agent run, if configured. #### run_id ```python run_id: str ``` The unique identifier for the agent run. ### AgentRunResultEvent Bases: `Generic[OutputDataT]` An event indicating the agent run ended and containing the final result of the agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python @dataclasses.dataclass(repr=False) class AgentRunResultEvent(Generic[OutputDataT]): """An event indicating the agent run ended and containing the final result of the agent run.""" result: AgentRunResult[OutputDataT] """The result of the run.""" _: dataclasses.KW_ONLY event_kind: Literal['agent_run_result'] = 'agent_run_result' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### result ```python result: AgentRunResult[OutputDataT] ``` The result of the run. #### event_kind ```python event_kind: Literal["agent_run_result"] = "agent_run_result" ``` Event type identifier, used as a discriminator. # `pydantic_ai.settings` ### ModelSettings Bases: `TypedDict` Settings to configure an LLM. Here we include only settings which apply to multiple models / model providers, though not all of these settings are supported by all models. Source code in `pydantic_ai_slim/pydantic_ai/settings.py` ```python class ModelSettings(TypedDict, total=False): """Settings to configure an LLM. Here we include only settings which apply to multiple models / model providers, though not all of these settings are supported by all models. """ max_tokens: int """The maximum number of tokens to generate before stopping. Supported by: * Gemini * Anthropic * OpenAI * Groq * Cohere * Mistral * Bedrock * MCP Sampling * Outlines (all providers) * xAI """ temperature: float """Amount of randomness injected into the response. Use `temperature` closer to `0.0` for analytical / multiple choice, and closer to a model's maximum `temperature` for creative and generative tasks. Note that even with `temperature` of `0.0`, the results will not be fully deterministic. Supported by: * Gemini * Anthropic * OpenAI * Groq * Cohere * Mistral * Bedrock * Outlines (Transformers, LlamaCpp, SgLang, VLLMOffline) * xAI """ top_p: float """An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should either alter `temperature` or `top_p`, but not both. Supported by: * Gemini * Anthropic * OpenAI * Groq * Cohere * Mistral * Bedrock * Outlines (Transformers, LlamaCpp, SgLang, VLLMOffline) * xAI """ timeout: float | Timeout """Override the client-level default timeout for a request, in seconds. Supported by: * Gemini * Anthropic * OpenAI * Groq * Mistral * xAI """ parallel_tool_calls: bool """Whether to allow parallel tool calls. 
Supported by: * OpenAI (some models, not o1) * Groq * Anthropic * xAI """ seed: int """The random seed to use for the model, theoretically allowing for deterministic results. Supported by: * OpenAI * Groq * Cohere * Mistral * Gemini * Outlines (LlamaCpp, VLLMOffline) """ presence_penalty: float """Penalize new tokens based on whether they have appeared in the text so far. Supported by: * OpenAI * Groq * Cohere * Gemini * Mistral * Outlines (LlamaCpp, SgLang, VLLMOffline) * xAI """ frequency_penalty: float """Penalize new tokens based on their existing frequency in the text so far. Supported by: * OpenAI * Groq * Cohere * Gemini * Mistral * Outlines (LlamaCpp, SgLang, VLLMOffline) * xAI """ logit_bias: dict[str, int] """Modify the likelihood of specified tokens appearing in the completion. Supported by: * OpenAI * Groq * Outlines (Transformers, LlamaCpp, VLLMOffline) """ stop_sequences: list[str] """Sequences that will cause the model to stop generating. Supported by: * OpenAI * Anthropic * Bedrock * Mistral * Groq * Cohere * Google * xAI """ extra_headers: dict[str, str] """Extra headers to send to the model. Supported by: * OpenAI * Anthropic * Groq * xAI """ extra_body: object """Extra body to send to the model. Supported by: * OpenAI * Anthropic * Groq * Outlines (all providers) """ ``` #### max_tokens ```python max_tokens: int ``` The maximum number of tokens to generate before stopping. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock - MCP Sampling - Outlines (all providers) - xAI #### temperature ```python temperature: float ``` Amount of randomness injected into the response. Use `temperature` closer to `0.0` for analytical / multiple choice, and closer to a model's maximum `temperature` for creative and generative tasks. Note that even with `temperature` of `0.0`, the results will not be fully deterministic. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock - Outlines (Transformers, LlamaCpp, SgLang, VLLMOffline) - xAI #### top_p ```python top_p: float ``` An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should either alter `temperature` or `top_p`, but not both. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock - Outlines (Transformers, LlamaCpp, SgLang, VLLMOffline) - xAI #### timeout ```python timeout: float | Timeout ``` Override the client-level default timeout for a request, in seconds. Supported by: - Gemini - Anthropic - OpenAI - Groq - Mistral - xAI #### parallel_tool_calls ```python parallel_tool_calls: bool ``` Whether to allow parallel tool calls. Supported by: - OpenAI (some models, not o1) - Groq - Anthropic - xAI #### seed ```python seed: int ``` The random seed to use for the model, theoretically allowing for deterministic results. Supported by: - OpenAI - Groq - Cohere - Mistral - Gemini - Outlines (LlamaCpp, VLLMOffline) #### presence_penalty ```python presence_penalty: float ``` Penalize new tokens based on whether they have appeared in the text so far. Supported by: - OpenAI - Groq - Cohere - Gemini - Mistral - Outlines (LlamaCpp, SgLang, VLLMOffline) - xAI #### frequency_penalty ```python frequency_penalty: float ``` Penalize new tokens based on their existing frequency in the text so far. 
Supported by: - OpenAI - Groq - Cohere - Gemini - Mistral - Outlines (LlamaCpp, SgLang, VLLMOffline) - xAI #### logit_bias ```python logit_bias: dict[str, int] ``` Modify the likelihood of specified tokens appearing in the completion. Supported by: - OpenAI - Groq - Outlines (Transformers, LlamaCpp, VLLMOffline) #### stop_sequences ```python stop_sequences: list[str] ``` Sequences that will cause the model to stop generating. Supported by: - OpenAI - Anthropic - Bedrock - Mistral - Groq - Cohere - Google - xAI #### extra_headers ```python extra_headers: dict[str, str] ``` Extra headers to send to the model. Supported by: - OpenAI - Anthropic - Groq - xAI #### extra_body ```python extra_body: object ``` Extra body to send to the model. Supported by: - OpenAI - Anthropic - Groq - Outlines (all providers) # `pydantic_ai.tools` ### AgentDepsT ```python AgentDepsT = TypeVar( "AgentDepsT", default=None, contravariant=True ) ``` Type variable for agent dependencies. ### RunContext Bases: `Generic[RunContextAgentDepsT]` Information about the current call. Source code in `pydantic_ai_slim/pydantic_ai/_run_context.py` ```python @dataclasses.dataclass(repr=False, kw_only=True) class RunContext(Generic[RunContextAgentDepsT]): """Information about the current call.""" deps: RunContextAgentDepsT """Dependencies for the agent.""" model: Model """The model used in this run.""" usage: RunUsage """LLM usage associated with the run.""" prompt: str | Sequence[_messages.UserContent] | None = None """The original user prompt passed to the run.""" messages: list[_messages.ModelMessage] = field(default_factory=list[_messages.ModelMessage]) """Messages exchanged in the conversation so far.""" validation_context: Any = None """Pydantic [validation context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) for tool args and run outputs.""" tracer: Tracer = field(default_factory=NoOpTracer) """The tracer to use for tracing the run.""" trace_include_content: bool = False """Whether to include the content of the messages in the trace.""" instrumentation_version: int = DEFAULT_INSTRUMENTATION_VERSION """Instrumentation settings version, if instrumentation is enabled.""" retries: dict[str, int] = field(default_factory=dict[str, int]) """Number of retries for each tool so far.""" tool_call_id: str | None = None """The ID of the tool call.""" tool_name: str | None = None """Name of the tool being called.""" retry: int = 0 """Number of retries so far. For tool calls, this is the number of retries of the specific tool. For output validation, this is the number of output validation retries. """ max_retries: int = 0 """The maximum number of retries allowed. For tool calls, this is the maximum retries for the specific tool. For output validation, this is the maximum output validation retries. 
""" run_step: int = 0 """The current step in the run.""" tool_call_approved: bool = False """Whether a tool call that required approval has now been approved.""" tool_call_metadata: Any = None """Metadata from `DeferredToolResults.metadata[tool_call_id]`, available when `tool_call_approved=True`.""" partial_output: bool = False """Whether the output passed to an output validator is partial.""" run_id: str | None = None """"Unique identifier for the agent run.""" metadata: dict[str, Any] | None = None """Metadata associated with this agent run, if configured.""" @property def last_attempt(self) -> bool: """Whether this is the last attempt at running this tool before an error is raised.""" return self.retry == self.max_retries __repr__ = _utils.dataclasses_no_defaults_repr ``` #### deps ```python deps: RunContextAgentDepsT ``` Dependencies for the agent. #### model ```python model: Model ``` The model used in this run. #### usage ```python usage: RunUsage ``` LLM usage associated with the run. #### prompt ```python prompt: str | Sequence[UserContent] | None = None ``` The original user prompt passed to the run. #### messages ```python messages: list[ModelMessage] = field( default_factory=list[ModelMessage] ) ``` Messages exchanged in the conversation so far. #### validation_context ```python validation_context: Any = None ``` Pydantic [validation context](https://docs.pydantic.dev/latest/concepts/validators/#validation-context) for tool args and run outputs. #### tracer ```python tracer: Tracer = field(default_factory=NoOpTracer) ``` The tracer to use for tracing the run. #### trace_include_content ```python trace_include_content: bool = False ``` Whether to include the content of the messages in the trace. #### instrumentation_version ```python instrumentation_version: int = ( DEFAULT_INSTRUMENTATION_VERSION ) ``` Instrumentation settings version, if instrumentation is enabled. #### retries ```python retries: dict[str, int] = field( default_factory=dict[str, int] ) ``` Number of retries for each tool so far. #### tool_call_id ```python tool_call_id: str | None = None ``` The ID of the tool call. #### tool_name ```python tool_name: str | None = None ``` Name of the tool being called. #### retry ```python retry: int = 0 ``` Number of retries so far. For tool calls, this is the number of retries of the specific tool. For output validation, this is the number of output validation retries. #### max_retries ```python max_retries: int = 0 ``` The maximum number of retries allowed. For tool calls, this is the maximum retries for the specific tool. For output validation, this is the maximum output validation retries. #### run_step ```python run_step: int = 0 ``` The current step in the run. #### tool_call_approved ```python tool_call_approved: bool = False ``` Whether a tool call that required approval has now been approved. #### tool_call_metadata ```python tool_call_metadata: Any = None ``` Metadata from `DeferredToolResults.metadata[tool_call_id]`, available when `tool_call_approved=True`. #### partial_output ```python partial_output: bool = False ``` Whether the output passed to an output validator is partial. #### run_id ```python run_id: str | None = None ``` "Unique identifier for the agent run. #### metadata ```python metadata: dict[str, Any] | None = None ``` Metadata associated with this agent run, if configured. #### last_attempt ```python last_attempt: bool ``` Whether this is the last attempt at running this tool before an error is raised. 
### ToolParams ```python ToolParams = ParamSpec('ToolParams', default=...) ``` Retrieval function param spec. ### SystemPromptFunc ```python SystemPromptFunc: TypeAlias = ( Callable[[RunContext[AgentDepsT]], str | None] | Callable[ [RunContext[AgentDepsT]], Awaitable[str | None] ] | Callable[[], str | None] | Callable[[], Awaitable[str | None]] ) ``` A function that may or maybe not take `RunContext` as an argument, and may or may not be async. Functions which return None are excluded from model requests. Usage `SystemPromptFunc[AgentDepsT]`. ### ToolFuncContext ```python ToolFuncContext: TypeAlias = Callable[ Concatenate[RunContext[AgentDepsT], ToolParams], Any ] ``` A tool function that takes `RunContext` as the first argument. Usage `ToolContextFunc[AgentDepsT, ToolParams]`. ### ToolFuncPlain ```python ToolFuncPlain: TypeAlias = Callable[ToolParams, Any] ``` A tool function that does not take `RunContext` as the first argument. Usage `ToolPlainFunc[ToolParams]`. ### ToolFuncEither ```python ToolFuncEither: TypeAlias = ( ToolFuncContext[AgentDepsT, ToolParams] | ToolFuncPlain[ToolParams] ) ``` Either kind of tool function. This is just a union of ToolFuncContext and ToolFuncPlain. Usage `ToolFuncEither[AgentDepsT, ToolParams]`. ### ToolPrepareFunc ```python ToolPrepareFunc: TypeAlias = Callable[ [RunContext[AgentDepsT], "ToolDefinition"], Awaitable["ToolDefinition | None"], ] ``` Definition of a function that can prepare a tool definition at call time. See [tool docs](https://ai.pydantic.dev/tools-advanced/#tool-prepare) for more information. Example — here `only_if_42` is valid as a `ToolPrepareFunc`: ```python from pydantic_ai import RunContext, Tool from pydantic_ai.tools import ToolDefinition async def only_if_42( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: if ctx.deps == 42: return tool_def def hitchhiker(ctx: RunContext[int], answer: str) -> str: return f'{ctx.deps} {answer}' hitchhiker = Tool(hitchhiker, prepare=only_if_42) ``` Usage `ToolPrepareFunc[AgentDepsT]`. ### ToolsPrepareFunc ```python ToolsPrepareFunc: TypeAlias = Callable[ [RunContext[AgentDepsT], list["ToolDefinition"]], Awaitable["list[ToolDefinition] | None"], ] ``` Definition of a function that can prepare the tool definition of all tools for each step. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. Example — here `turn_on_strict_if_openai` is valid as a `ToolsPrepareFunc`: ```python from dataclasses import replace from pydantic_ai import Agent, RunContext from pydantic_ai.tools import ToolDefinition async def turn_on_strict_if_openai( ctx: RunContext[None], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: if ctx.model.system == 'openai': return [replace(tool_def, strict=True) for tool_def in tool_defs] return tool_defs agent = Agent('openai:gpt-5.2', prepare_tools=turn_on_strict_if_openai) ``` Usage `ToolsPrepareFunc[AgentDepsT]`. ### BuiltinToolFunc ```python BuiltinToolFunc: TypeAlias = Callable[ [RunContext[AgentDepsT]], Awaitable[AbstractBuiltinTool | None] | AbstractBuiltinTool | None, ] ``` Definition of a function that can prepare a builtin tool at call time. This is useful if you want to customize the builtin tool based on the run context (e.g. user dependencies), or omit it completely from a step. ### DocstringFormat ```python DocstringFormat: TypeAlias = Literal[ "google", "numpy", "sphinx", "auto" ] ``` Supported docstring formats. 
- `'google'` — [Google-style](https://google.github.io/styleguide/pyguide.html#381-docstrings) docstrings. - `'numpy'` — [Numpy-style](https://numpydoc.readthedocs.io/en/latest/format.html) docstrings. - `'sphinx'` — [Sphinx-style](https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html#the-sphinx-docstring-format) docstrings. - `'auto'` — Automatically infer the format based on the structure of the docstring. ### DeferredToolRequests Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a DeferredToolResults object with the same tool call IDs. See [deferred tools docs](https://ai.pydantic.dev/deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class DeferredToolRequests: """Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a [`DeferredToolResults`][pydantic_ai.tools.DeferredToolResults] object with the same tool call IDs. See [deferred tools docs](../deferred-tools.md#deferred-tools) for more information. """ calls: list[ToolCallPart] = field(default_factory=list[ToolCallPart]) """Tool calls that require external execution.""" approvals: list[ToolCallPart] = field(default_factory=list[ToolCallPart]) """Tool calls that require human-in-the-loop approval.""" metadata: dict[str, dict[str, Any]] = field(default_factory=dict[str, dict[str, Any]]) """Metadata for deferred tool calls, keyed by `tool_call_id`.""" ``` #### calls ```python calls: list[ToolCallPart] = field( default_factory=list[ToolCallPart] ) ``` Tool calls that require external execution. #### approvals ```python approvals: list[ToolCallPart] = field( default_factory=list[ToolCallPart] ) ``` Tool calls that require human-in-the-loop approval. #### metadata ```python metadata: dict[str, dict[str, Any]] = field( default_factory=dict[str, dict[str, Any]] ) ``` Metadata for deferred tool calls, keyed by `tool_call_id`. ### ToolApproved Indicates that a tool call has been approved and that the tool function should be executed. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class ToolApproved: """Indicates that a tool call has been approved and that the tool function should be executed.""" override_args: dict[str, Any] | None = None """Optional tool call arguments to use instead of the original arguments.""" kind: Literal['tool-approved'] = 'tool-approved' ``` #### override_args ```python override_args: dict[str, Any] | None = None ``` Optional tool call arguments to use instead of the original arguments. ### ToolDenied Indicates that a tool call has been denied and that a denial message should be returned to the model. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass class ToolDenied: """Indicates that a tool call has been denied and that a denial message should be returned to the model.""" message: str = 'The tool call was denied.' """The message to return to the model.""" _: KW_ONLY kind: Literal['tool-denied'] = 'tool-denied' ``` #### message ```python message: str = 'The tool call was denied.' ``` The message to return to the model. 
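To show how these pieces fit together, here is a hedged sketch of the human-in-the-loop flow: a tool flagged with `requires_approval=True` produces a `DeferredToolRequests` output, which is answered with `ToolApproved`/`ToolDenied` via a `DeferredToolResults` object (documented below) on the next run. The model string, tool, and approval rule are placeholder assumptions.

```python
from pydantic_ai import Agent
from pydantic_ai.tools import (
    DeferredToolRequests,
    DeferredToolResults,
    ToolApproved,
    ToolDenied,
)

agent = Agent('openai:gpt-5', output_type=[str, DeferredToolRequests])  # placeholder model

@agent.tool_plain(requires_approval=True)
def delete_file(path: str) -> str:
    return f'{path} deleted'

result = agent.run_sync('Delete `old_draft.txt`')

if isinstance(result.output, DeferredToolRequests):
    results = DeferredToolResults()
    for call in result.output.approvals:
        # Stand-in for a real approval UI: only allow deletions inside `drafts/`.
        ok = call.args_as_dict().get('path', '').startswith('drafts/')
        results.approvals[call.tool_call_id] = (
            ToolApproved() if ok else ToolDenied('Deleting that file is not allowed.')
        )
    # Resume the run with the approval decisions and the previous message history.
    result = agent.run_sync(
        message_history=result.all_messages(),
        deferred_tool_results=results,
    )

print(result.output)
```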
### DeferredToolApprovalResult ```python DeferredToolApprovalResult: TypeAlias = Annotated[ ToolApproved | ToolDenied, Discriminator("kind") ] ``` Result for a tool call that required human-in-the-loop approval. ### DeferredToolCallResult ```python DeferredToolCallResult: TypeAlias = Annotated[ Annotated[ToolReturn, Tag("tool-return")] | Annotated[ModelRetry, Tag("model-retry")] | Annotated[RetryPromptPart, Tag("retry-prompt")], Discriminator(_deferred_tool_call_result_discriminator), ] ``` Result for a tool call that required external execution. ### DeferredToolResult ```python DeferredToolResult = ( DeferredToolApprovalResult | DeferredToolCallResult ) ``` Result for a tool call that required approval or external execution. ### DeferredToolResults Results for deferred tool calls from a previous run that required approval or external execution. The tool call IDs need to match those from the DeferredToolRequests output object from the previous run. See [deferred tools docs](https://ai.pydantic.dev/deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class DeferredToolResults: """Results for deferred tool calls from a previous run that required approval or external execution. The tool call IDs need to match those from the [`DeferredToolRequests`][pydantic_ai.output.DeferredToolRequests] output object from the previous run. See [deferred tools docs](../deferred-tools.md#deferred-tools) for more information. """ calls: dict[str, DeferredToolCallResult | Any] = field(default_factory=dict[str, DeferredToolCallResult | Any]) """Map of tool call IDs to results for tool calls that required external execution.""" approvals: dict[str, bool | DeferredToolApprovalResult] = field( default_factory=dict[str, bool | DeferredToolApprovalResult] ) """Map of tool call IDs to results for tool calls that required human-in-the-loop approval.""" metadata: dict[str, dict[str, Any]] = field(default_factory=dict[str, dict[str, Any]]) """Metadata for deferred tool calls, keyed by `tool_call_id`. Each value will be available in the tool's RunContext as `tool_call_metadata`.""" ``` #### calls ```python calls: dict[str, DeferredToolCallResult | Any] = field( default_factory=dict[str, DeferredToolCallResult | Any] ) ``` Map of tool call IDs to results for tool calls that required external execution. #### approvals ```python approvals: dict[str, bool | DeferredToolApprovalResult] = ( field( default_factory=dict[ str, bool | DeferredToolApprovalResult ] ) ) ``` Map of tool call IDs to results for tool calls that required human-in-the-loop approval. #### metadata ```python metadata: dict[str, dict[str, Any]] = field( default_factory=dict[str, dict[str, Any]] ) ``` Metadata for deferred tool calls, keyed by `tool_call_id`. Each value will be available in the tool's RunContext as `tool_call_metadata`. ### ToolAgentDepsT ```python ToolAgentDepsT = TypeVar( "ToolAgentDepsT", default=object, contravariant=True ) ``` Type variable for agent dependencies for a tool. ### Tool Bases: `Generic[ToolAgentDepsT]` A tool function for an agent. 
Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ````python @dataclass(init=False) class Tool(Generic[ToolAgentDepsT]): """A tool function for an agent.""" function: ToolFuncEither[ToolAgentDepsT] takes_ctx: bool max_retries: int | None name: str description: str | None prepare: ToolPrepareFunc[ToolAgentDepsT] | None docstring_format: DocstringFormat require_parameter_descriptions: bool strict: bool | None sequential: bool requires_approval: bool metadata: dict[str, Any] | None timeout: float | None function_schema: _function_schema.FunctionSchema """ The base JSON schema for the tool's parameters. This schema may be modified by the `prepare` function or by the Model class prior to including it in an API request. """ def __init__( self, function: ToolFuncEither[ToolAgentDepsT], *, takes_ctx: bool | None = None, max_retries: int | None = None, name: str | None = None, description: str | None = None, prepare: ToolPrepareFunc[ToolAgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, function_schema: _function_schema.FunctionSchema | None = None, ): """Create a new tool instance. Example usage: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' agent = Agent('test', tools=[Tool(my_tool)]) ``` or with a custom prepare method: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool from pydantic_ai.tools import ToolDefinition async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' async def prep_my_tool( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: # only register the tool if `deps == 42` if ctx.deps == 42: return tool_def agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)]) ``` Args: function: The Python function to call as the tool. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] first argument, this is inferred if unset. max_retries: Maximum number of retries allowed for this tool, set to the agent default if `None`. name: Name of the tool, inferred from the function if `None`. description: Description of the tool, inferred from the function if `None`. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. 
See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. timeout: Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). function_schema: The function schema to use for the tool. If not provided, it will be generated. """ self.function = function self.function_schema = function_schema or _function_schema.function_schema( function, schema_generator, takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, ) self.takes_ctx = self.function_schema.takes_ctx self.max_retries = max_retries self.name = name or function.__name__ self.description = description or self.function_schema.description self.prepare = prepare self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.strict = strict self.sequential = sequential self.requires_approval = requires_approval self.metadata = metadata self.timeout = timeout @classmethod def from_schema( cls, function: Callable[..., Any], name: str, description: str | None, json_schema: JsonSchemaValue, takes_ctx: bool = False, sequential: bool = False, ) -> Self: """Creates a Pydantic tool from a function and a JSON schema. Args: function: The function to call. This will be called with keywords only, and no validation of the arguments will be performed. name: The unique name of the tool that clearly communicates its purpose description: Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description. json_schema: The schema for the function arguments takes_ctx: An optional boolean parameter indicating whether the function accepts the context object as an argument. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. Returns: A Pydantic tool that calls the function """ function_schema = _function_schema.FunctionSchema( function=function, description=description, validator=SchemaValidator(schema=core_schema.any_schema()), json_schema=json_schema, takes_ctx=takes_ctx, is_async=_utils.is_async_callable(function), ) return cls( function, takes_ctx=takes_ctx, name=name, description=description, function_schema=function_schema, sequential=sequential, ) @property def tool_def(self): return ToolDefinition( name=self.name, description=self.description, parameters_json_schema=self.function_schema.json_schema, strict=self.strict, sequential=self.sequential, metadata=self.metadata, timeout=self.timeout, kind='unapproved' if self.requires_approval else 'function', ) async def prepare_tool_def(self, ctx: RunContext[ToolAgentDepsT]) -> ToolDefinition | None: """Get the tool definition. By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set. Returns: return a `ToolDefinition` or `None` if the tools should not be registered for this run. 
""" base_tool_def = self.tool_def if self.prepare is not None: return await self.prepare(ctx, base_tool_def) else: return base_tool_def ```` #### __init__ ```python __init__( function: ToolFuncEither[ToolAgentDepsT], *, takes_ctx: bool | None = None, max_retries: int | None = None, name: str | None = None, description: str | None = None, prepare: ToolPrepareFunc[ToolAgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, function_schema: FunctionSchema | None = None ) ``` Create a new tool instance. Example usage: ```python from pydantic_ai import Agent, RunContext, Tool async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' agent = Agent('test', tools=[Tool(my_tool)]) ``` or with a custom prepare method: ```python from pydantic_ai import Agent, RunContext, Tool from pydantic_ai.tools import ToolDefinition async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' async def prep_my_tool( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: # only register the tool if `deps == 42` if ctx.deps == 42: return tool_def agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)]) ``` Parameters: | Name | Type | Description | Default | | -------------------------------- | --------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `function` | `ToolFuncEither[ToolAgentDepsT]` | The Python function to call as the tool. | *required* | | `takes_ctx` | \`bool | None\` | Whether the function takes a RunContext first argument, this is inferred if unset. | | `max_retries` | \`int | None\` | Maximum number of retries allowed for this tool, set to the agent default if None. | | `name` | \`str | None\` | Name of the tool, inferred from the function if None. | | `description` | \`str | None\` | Description of the tool, inferred from the function if None. | | `prepare` | \`ToolPrepareFunc[ToolAgentDepsT] | None\` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | | `docstring_format` | `DocstringFormat` | The format of the docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use. Defaults to GenerateToolJsonSchema. | `GenerateToolJsonSchema` | | `strict` | \`bool | None\` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. 
| `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. | `False` | | `metadata` | \`dict[str, Any] | None\` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. | | `timeout` | \`float | None\` | Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). | | `function_schema` | \`FunctionSchema | None\` | The function schema to use for the tool. If not provided, it will be generated. | Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ````python def __init__( self, function: ToolFuncEither[ToolAgentDepsT], *, takes_ctx: bool | None = None, max_retries: int | None = None, name: str | None = None, description: str | None = None, prepare: ToolPrepareFunc[ToolAgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, timeout: float | None = None, function_schema: _function_schema.FunctionSchema | None = None, ): """Create a new tool instance. Example usage: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' agent = Agent('test', tools=[Tool(my_tool)]) ``` or with a custom prepare method: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool from pydantic_ai.tools import ToolDefinition async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' async def prep_my_tool( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: # only register the tool if `deps == 42` if ctx.deps == 42: return tool_def agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)]) ``` Args: function: The Python function to call as the tool. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] first argument, this is inferred if unset. max_retries: Maximum number of retries allowed for this tool, set to the agent default if `None`. name: Name of the tool, inferred from the function if `None`. description: Description of the tool, inferred from the function if `None`. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. 
See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. timeout: Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). function_schema: The function schema to use for the tool. If not provided, it will be generated. """ self.function = function self.function_schema = function_schema or _function_schema.function_schema( function, schema_generator, takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, ) self.takes_ctx = self.function_schema.takes_ctx self.max_retries = max_retries self.name = name or function.__name__ self.description = description or self.function_schema.description self.prepare = prepare self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.strict = strict self.sequential = sequential self.requires_approval = requires_approval self.metadata = metadata self.timeout = timeout ```` #### function_schema ```python function_schema: FunctionSchema = ( function_schema or function_schema( function, schema_generator, takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, ) ) ``` The base JSON schema for the tool's parameters. This schema may be modified by the `prepare` function or by the Model class prior to including it in an API request. #### from_schema ```python from_schema( function: Callable[..., Any], name: str, description: str | None, json_schema: JsonSchemaValue, takes_ctx: bool = False, sequential: bool = False, ) -> Self ``` Creates a Pydantic tool from a function and a JSON schema. Parameters: | Name | Type | Description | Default | | ------------- | -------------------- | ------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- | | `function` | `Callable[..., Any]` | The function to call. This will be called with keywords only, and no validation of the arguments will be performed. | *required* | | `name` | `str` | The unique name of the tool that clearly communicates its purpose | *required* | | `description` | \`str | None\` | Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description. | | `json_schema` | `JsonSchemaValue` | The schema for the function arguments | *required* | | `takes_ctx` | `bool` | An optional boolean parameter indicating whether the function accepts the context object as an argument. | `False` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | Returns: | Type | Description | | ------ | --------------------------------------- | | `Self` | A Pydantic tool that calls the function | Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @classmethod def from_schema( cls, function: Callable[..., Any], name: str, description: str | None, json_schema: JsonSchemaValue, takes_ctx: bool = False, sequential: bool = False, ) -> Self: """Creates a Pydantic tool from a function and a JSON schema. Args: function: The function to call. 
This will be called with keywords only, and no validation of the arguments will be performed. name: The unique name of the tool that clearly communicates its purpose description: Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description. json_schema: The schema for the function arguments takes_ctx: An optional boolean parameter indicating whether the function accepts the context object as an argument. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. Returns: A Pydantic tool that calls the function """ function_schema = _function_schema.FunctionSchema( function=function, description=description, validator=SchemaValidator(schema=core_schema.any_schema()), json_schema=json_schema, takes_ctx=takes_ctx, is_async=_utils.is_async_callable(function), ) return cls( function, takes_ctx=takes_ctx, name=name, description=description, function_schema=function_schema, sequential=sequential, ) ``` #### prepare_tool_def ```python prepare_tool_def( ctx: RunContext[ToolAgentDepsT], ) -> ToolDefinition | None ``` Get the tool definition. By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set. Returns: | Type | Description | | ---------------- | ----------- | | \`ToolDefinition | None\` | Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python async def prepare_tool_def(self, ctx: RunContext[ToolAgentDepsT]) -> ToolDefinition | None: """Get the tool definition. By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set. Returns: return a `ToolDefinition` or `None` if the tools should not be registered for this run. """ base_tool_def = self.tool_def if self.prepare is not None: return await self.prepare(ctx, base_tool_def) else: return base_tool_def ``` ### ObjectJsonSchema ```python ObjectJsonSchema: TypeAlias = dict[str, Any] ``` Type representing JSON schema of an object, e.g. where `"type": "object"`. This type is used to define tools parameters (aka arguments) in ToolDefinition. With PEP-728 this should be a TypedDict with `type: Literal['object']`, and `extra_parts=Any` ### ToolKind ```python ToolKind: TypeAlias = Literal[ "function", "output", "external", "unapproved" ] ``` Kind of tool. ### ToolDefinition Definition of a tool passed to a model. This is used for both function tools and output tools. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(repr=False, kw_only=True) class ToolDefinition: """Definition of a tool passed to a model. This is used for both function tools and output tools. """ name: str """The name of the tool.""" parameters_json_schema: ObjectJsonSchema = field(default_factory=lambda: {'type': 'object', 'properties': {}}) """The JSON schema for the tool's parameters.""" description: str | None = None """The description of the tool.""" outer_typed_dict_key: str | None = None """The key in the outer [TypedDict] that wraps an output tool. This will only be set for output tools which don't have an `object` JSON schema. """ strict: bool | None = None """Whether to enforce (vendor-specific) strict JSON schema validation for tool calls. Setting this to `True` while using a supported model generally imposes some restrictions on the tool's JSON schema in exchange for guaranteeing the API responses strictly match that schema. When `False`, the model may be free to generate other properties or types (depending on the vendor). 
When `None` (the default), the value will be inferred based on the compatibility of the parameters_json_schema. Note: this is currently supported by OpenAI and Anthropic models. """ sequential: bool = False """Whether this tool requires a sequential/serial execution environment.""" kind: ToolKind = field(default='function') """The kind of tool: - `'function'`: a tool that will be executed by Pydantic AI during an agent run and has its result returned to the model - `'output'`: a tool that passes through an output value that ends the run - `'external'`: a tool whose result will be produced outside of the Pydantic AI agent run in which it was called, because it depends on an upstream service (or user) or could take longer to generate than it's reasonable to keep the agent process running. See the [tools documentation](../deferred-tools.md#deferred-tools) for more info. - `'unapproved'`: a tool that requires human-in-the-loop approval. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. """ metadata: dict[str, Any] | None = None """Tool metadata that can be set by the toolset this tool came from. It is not sent to the model, but can be used for filtering and tool behavior customization. For MCP tools, this contains the `meta`, `annotations`, and `output_schema` fields from the tool definition. """ timeout: float | None = None """Timeout in seconds for tool execution. If the tool takes longer than this, a retry prompt is returned to the model. Defaults to None (no timeout). """ @property def defer(self) -> bool: """Whether calls to this tool will be deferred. See the [tools documentation](../deferred-tools.md#deferred-tools) for more info. """ return self.kind in ('external', 'unapproved') __repr__ = _utils.dataclasses_no_defaults_repr ``` #### name ```python name: str ``` The name of the tool. #### parameters_json_schema ```python parameters_json_schema: ObjectJsonSchema = field( default_factory=lambda: { "type": "object", "properties": {}, } ) ``` The JSON schema for the tool's parameters. #### description ```python description: str | None = None ``` The description of the tool. #### outer_typed_dict_key ```python outer_typed_dict_key: str | None = None ``` The key in the outer [TypedDict] that wraps an output tool. This will only be set for output tools which don't have an `object` JSON schema. #### strict ```python strict: bool | None = None ``` Whether to enforce (vendor-specific) strict JSON schema validation for tool calls. Setting this to `True` while using a supported model generally imposes some restrictions on the tool's JSON schema in exchange for guaranteeing the API responses strictly match that schema. When `False`, the model may be free to generate other properties or types (depending on the vendor). When `None` (the default), the value will be inferred based on the compatibility of the parameters_json_schema. Note: this is currently supported by OpenAI and Anthropic models. #### sequential ```python sequential: bool = False ``` Whether this tool requires a sequential/serial execution environment. 
#### kind ```python kind: ToolKind = field(default='function') ``` The kind of tool: - `'function'`: a tool that will be executed by Pydantic AI during an agent run and has its result returned to the model - `'output'`: a tool that passes through an output value that ends the run - `'external'`: a tool whose result will be produced outside of the Pydantic AI agent run in which it was called, because it depends on an upstream service (or user) or could take longer to generate than it's reasonable to keep the agent process running. See the [tools documentation](https://ai.pydantic.dev/deferred-tools/#deferred-tools) for more info. - `'unapproved'`: a tool that requires human-in-the-loop approval. See the [tools documentation](https://ai.pydantic.dev/deferred-tools/#human-in-the-loop-tool-approval) for more info. #### metadata ```python metadata: dict[str, Any] | None = None ``` Tool metadata that can be set by the toolset this tool came from. It is not sent to the model, but can be used for filtering and tool behavior customization. For MCP tools, this contains the `meta`, `annotations`, and `output_schema` fields from the tool definition. #### timeout ```python timeout: float | None = None ``` Timeout in seconds for tool execution. If the tool takes longer than this, a retry prompt is returned to the model. Defaults to None (no timeout). #### defer ```python defer: bool ``` Whether calls to this tool will be deferred. See the [tools documentation](https://ai.pydantic.dev/deferred-tools/#deferred-tools) for more info. # `pydantic_ai.toolsets` ### AbstractToolset Bases: `ABC`, `Generic[AgentDepsT]` A toolset is a collection of tools that can be used by an agent. It is responsible for: - Listing the tools it contains - Validating the arguments of the tools - Calling the tools See [toolset docs](https://ai.pydantic.dev/toolsets/index.md) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python class AbstractToolset(ABC, Generic[AgentDepsT]): """A toolset is a collection of tools that can be used by an agent. It is responsible for: - Listing the tools it contains - Validating the arguments of the tools - Calling the tools See [toolset docs](../toolsets.md) for more information. """ @property @abstractmethod def id(self) -> str | None: """An ID for the toolset that is unique among all toolsets registered with the same agent. If you're implementing a concrete implementation that users can instantiate more than once, you should let them optionally pass a custom ID to the constructor and return that here. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. """ raise NotImplementedError() @property def label(self) -> str: """The name of the toolset for use in error messages.""" label = self.__class__.__name__ if self.id: # pragma: no branch label += f' {self.id!r}' return label @property def tool_name_conflict_hint(self) -> str: """A hint for how to avoid name conflicts with other toolsets for use in error messages.""" return 'Rename the tool or wrap the toolset in a `PrefixedToolset` to avoid name conflicts.' async def __aenter__(self) -> Self: """Enter the toolset context. This is where you can set up network connections in a concrete implementation. """ return self async def __aexit__(self, *args: Any) -> bool | None: """Exit the toolset context. 
This is where you can tear down network connections in a concrete implementation. """ return None @abstractmethod async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: """The tools that are available in this toolset.""" raise NotImplementedError() @abstractmethod async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: """Call a tool with the given arguments. Args: name: The name of the tool to call. tool_args: The arguments to pass to the tool. ctx: The run context. tool: The tool definition returned by [`get_tools`][pydantic_ai.toolsets.AbstractToolset.get_tools] that was called. """ raise NotImplementedError() def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: """Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling).""" visitor(self) def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: """Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function.""" return visitor(self) def filtered( self, filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool] ) -> FilteredToolset[AgentDepsT]: """Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition. See [toolset docs](../toolsets.md#filtering-tools) for more information. """ from .filtered import FilteredToolset return FilteredToolset(self, filter_func) def prefixed(self, prefix: str) -> PrefixedToolset[AgentDepsT]: """Returns a new toolset that prefixes the names of this toolset's tools. See [toolset docs](../toolsets.md#prefixing-tool-names) for more information. """ from .prefixed import PrefixedToolset return PrefixedToolset(self, prefix) def prepared(self, prepare_func: ToolsPrepareFunc[AgentDepsT]) -> PreparedToolset[AgentDepsT]: """Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../toolsets.md#preparing-tool-definitions) for more information. """ from .prepared import PreparedToolset return PreparedToolset(self, prepare_func) def renamed(self, name_map: dict[str, str]) -> RenamedToolset[AgentDepsT]: """Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names. See [toolset docs](../toolsets.md#renaming-tools) for more information. """ from .renamed import RenamedToolset return RenamedToolset(self, name_map) def approval_required( self, approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = ( lambda ctx, tool_def, tool_args: True ), ) -> ApprovalRequiredToolset[AgentDepsT]: """Returns a new toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../toolsets.md#requiring-tool-approval) for more information. """ from .approval_required import ApprovalRequiredToolset return ApprovalRequiredToolset(self, approval_required_func) ``` #### id ```python id: str | None ``` An ID for the toolset that is unique among all toolsets registered with the same agent. 
If you're implementing a concrete implementation that users can instantiate more than once, you should let them optionally pass a custom ID to the constructor and return that here. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. #### label ```python label: str ``` The name of the toolset for use in error messages. #### tool_name_conflict_hint ```python tool_name_conflict_hint: str ``` A hint for how to avoid name conflicts with other toolsets for use in error messages. #### __aenter__ ```python __aenter__() -> Self ``` Enter the toolset context. This is where you can set up network connections in a concrete implementation. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python async def __aenter__(self) -> Self: """Enter the toolset context. This is where you can set up network connections in a concrete implementation. """ return self ``` #### __aexit__ ```python __aexit__(*args: Any) -> bool | None ``` Exit the toolset context. This is where you can tear down network connections in a concrete implementation. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python async def __aexit__(self, *args: Any) -> bool | None: """Exit the toolset context. This is where you can tear down network connections in a concrete implementation. """ return None ``` #### get_tools ```python get_tools( ctx: RunContext[AgentDepsT], ) -> dict[str, ToolsetTool[AgentDepsT]] ``` The tools that are available in this toolset. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python @abstractmethod async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: """The tools that are available in this toolset.""" raise NotImplementedError() ``` #### call_tool ```python call_tool( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any ``` Call a tool with the given arguments. Parameters: | Name | Type | Description | Default | | ----------- | ------------------------- | ---------------------------------------------------------- | ---------- | | `name` | `str` | The name of the tool to call. | *required* | | `tool_args` | `dict[str, Any]` | The arguments to pass to the tool. | *required* | | `ctx` | `RunContext[AgentDepsT]` | The run context. | *required* | | `tool` | `ToolsetTool[AgentDepsT]` | The tool definition returned by get_tools that was called. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python @abstractmethod async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: """Call a tool with the given arguments. Args: name: The name of the tool to call. tool_args: The arguments to pass to the tool. ctx: The run context. tool: The tool definition returned by [`get_tools`][pydantic_ai.toolsets.AbstractToolset.get_tools] that was called. """ raise NotImplementedError() ``` #### apply ```python apply( visitor: Callable[[AbstractToolset[AgentDepsT]], None], ) -> None ``` Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling). Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: """Run a visitor function on all "leaf" toolsets (i.e. 
those that implement their own tool listing and calling).""" visitor(self) ``` #### visit_and_replace ```python visit_and_replace( visitor: Callable[ [AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT], ], ) -> AbstractToolset[AgentDepsT] ``` Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: """Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function.""" return visitor(self) ``` #### filtered ```python filtered( filter_func: Callable[ [RunContext[AgentDepsT], ToolDefinition], bool ], ) -> FilteredToolset[AgentDepsT] ``` Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition. See [toolset docs](https://ai.pydantic.dev/toolsets/#filtering-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def filtered( self, filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool] ) -> FilteredToolset[AgentDepsT]: """Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition. See [toolset docs](../toolsets.md#filtering-tools) for more information. """ from .filtered import FilteredToolset return FilteredToolset(self, filter_func) ``` #### prefixed ```python prefixed(prefix: str) -> PrefixedToolset[AgentDepsT] ``` Returns a new toolset that prefixes the names of this toolset's tools. See [toolset docs](https://ai.pydantic.dev/toolsets/#prefixing-tool-names) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def prefixed(self, prefix: str) -> PrefixedToolset[AgentDepsT]: """Returns a new toolset that prefixes the names of this toolset's tools. See [toolset docs](../toolsets.md#prefixing-tool-names) for more information. """ from .prefixed import PrefixedToolset return PrefixedToolset(self, prefix) ``` #### prepared ```python prepared( prepare_func: ToolsPrepareFunc[AgentDepsT], ) -> PreparedToolset[AgentDepsT] ``` Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](https://ai.pydantic.dev/toolsets/#preparing-tool-definitions) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def prepared(self, prepare_func: ToolsPrepareFunc[AgentDepsT]) -> PreparedToolset[AgentDepsT]: """Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../toolsets.md#preparing-tool-definitions) for more information. """ from .prepared import PreparedToolset return PreparedToolset(self, prepare_func) ``` #### renamed ```python renamed( name_map: dict[str, str], ) -> RenamedToolset[AgentDepsT] ``` Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names. See [toolset docs](https://ai.pydantic.dev/toolsets/#renaming-tools) for more information. 
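For a quick, concrete feel for these wrapper methods, here is a minimal sketch (the `get_weather` and `get_time` tools are hypothetical, not part of the library):

```python
from pydantic_ai import FunctionToolset

toolset = FunctionToolset()


@toolset.tool
def get_weather(city: str) -> str:
    return f'Sunny in {city}'


@toolset.tool
def get_time(timezone: str) -> str:
    return f'12:00 in {timezone}'


# Each method returns a new toolset wrapping the original, so they can also be chained.
renamed = toolset.renamed({'weather': 'get_weather'})  # the model sees 'weather' instead of 'get_weather'
prefixed = toolset.prefixed('util')  # 'util_get_weather', 'util_get_time'
filtered = toolset.filtered(lambda ctx, tool_def: tool_def.name != 'get_time')
```

The original `toolset` is left unchanged; each wrapper only changes how its tools are presented to the model.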
Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def renamed(self, name_map: dict[str, str]) -> RenamedToolset[AgentDepsT]: """Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names. See [toolset docs](../toolsets.md#renaming-tools) for more information. """ from .renamed import RenamedToolset return RenamedToolset(self, name_map) ``` #### approval_required ```python approval_required( approval_required_func: Callable[ [ RunContext[AgentDepsT], ToolDefinition, dict[str, Any], ], bool, ] = lambda ctx, tool_def, tool_args: True ) -> ApprovalRequiredToolset[AgentDepsT] ``` Returns a new toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](https://ai.pydantic.dev/toolsets/#requiring-tool-approval) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def approval_required( self, approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = ( lambda ctx, tool_def, tool_args: True ), ) -> ApprovalRequiredToolset[AgentDepsT]: """Returns a new toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../toolsets.md#requiring-tool-approval) for more information. """ from .approval_required import ApprovalRequiredToolset return ApprovalRequiredToolset(self, approval_required_func) ``` ### CombinedToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that combines multiple toolsets. See [toolset docs](https://ai.pydantic.dev/toolsets/#combining-toolsets) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/combined.py` ```python @dataclass class CombinedToolset(AbstractToolset[AgentDepsT]): """A toolset that combines multiple toolsets. See [toolset docs](../toolsets.md#combining-toolsets) for more information. """ toolsets: Sequence[AbstractToolset[AgentDepsT]] _enter_lock: Lock = field(compare=False, init=False, default_factory=Lock) _entered_count: int = field(init=False, default=0) _exit_stack: AsyncExitStack | None = field(init=False, default=None) @property def id(self) -> str | None: return None # pragma: no cover @property def label(self) -> str: return f'{self.__class__.__name__}({", ".join(toolset.label for toolset in self.toolsets)})' # pragma: no cover async def __aenter__(self) -> Self: async with self._enter_lock: if self._entered_count == 0: async with AsyncExitStack() as exit_stack: for toolset in self.toolsets: await exit_stack.enter_async_context(toolset) self._exit_stack = exit_stack.pop_all() self._entered_count += 1 return self async def __aexit__(self, *args: Any) -> bool | None: async with self._enter_lock: self._entered_count -= 1 if self._entered_count == 0 and self._exit_stack is not None: await self._exit_stack.aclose() self._exit_stack = None async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: toolsets_tools = await asyncio.gather(*(toolset.get_tools(ctx) for toolset in self.toolsets)) all_tools: dict[str, ToolsetTool[AgentDepsT]] = {} for toolset, tools in zip(self.toolsets, toolsets_tools): for name, tool in tools.items(): tool_toolset = tool.toolset if existing_tool := all_tools.get(name): capitalized_toolset_label = tool_toolset.label[0].upper() + tool_toolset.label[1:] raise UserError( f'{capitalized_toolset_label} defines a tool whose name conflicts with existing tool from {existing_tool.toolset.label}: {name!r}. 
{toolset.tool_name_conflict_hint}' ) all_tools[name] = _CombinedToolsetTool( toolset=tool_toolset, tool_def=tool.tool_def, max_retries=tool.max_retries, args_validator=tool.args_validator, source_toolset=toolset, source_tool=tool, ) return all_tools async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: assert isinstance(tool, _CombinedToolsetTool) return await tool.source_toolset.call_tool(name, tool_args, ctx, tool.source_tool) def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: for toolset in self.toolsets: toolset.apply(visitor) def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: return replace(self, toolsets=[toolset.visit_and_replace(visitor) for toolset in self.toolsets]) ``` ### ExternalToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that holds tools whose results will be produced outside of the Pydantic AI agent run in which they were called. See [toolset docs](https://ai.pydantic.dev/toolsets/#external-toolset) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/external.py` ```python class ExternalToolset(AbstractToolset[AgentDepsT]): """A toolset that holds tools whose results will be produced outside of the Pydantic AI agent run in which they were called. See [toolset docs](../toolsets.md#external-toolset) for more information. """ tool_defs: list[ToolDefinition] _id: str | None def __init__(self, tool_defs: list[ToolDefinition], *, id: str | None = None): self.tool_defs = tool_defs self._id = id @property def id(self) -> str | None: return self._id async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return { tool_def.name: ToolsetTool( toolset=self, tool_def=replace(tool_def, kind='external'), max_retries=0, args_validator=TOOL_SCHEMA_VALIDATOR, ) for tool_def in self.tool_defs } async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: raise NotImplementedError('External tools cannot be called directly') ``` ### ApprovalRequiredToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](https://ai.pydantic.dev/toolsets/#requiring-tool-approval) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/approval_required.py` ```python @dataclass class ApprovalRequiredToolset(WrapperToolset[AgentDepsT]): """A toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../toolsets.md#requiring-tool-approval) for more information. """ approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = ( lambda ctx, tool_def, tool_args: True ) async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: if not ctx.tool_call_approved and self.approval_required_func(ctx, tool.tool_def, tool_args): raise ApprovalRequired return await super().call_tool(name, tool_args, ctx, tool) ``` ### FilteredToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that filters the tools it contains using a filter function that takes the agent context and the tool definition. See [toolset docs](https://ai.pydantic.dev/toolsets/#filtering-tools) for more information. 
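As an illustration, here is a minimal sketch (the `User` deps type and tool names are hypothetical) of hiding a tool from non-admin users via the run context:

```python
from dataclasses import dataclass

from pydantic_ai import Agent, FunctionToolset, RunContext
from pydantic_ai.tools import ToolDefinition


@dataclass
class User:
    is_admin: bool


toolset = FunctionToolset()


@toolset.tool
def delete_record(record_id: int) -> str:
    return f'deleted {record_id}'


@toolset.tool
def read_record(record_id: int) -> str:
    return f'record {record_id}'


def admin_only_filter(ctx: RunContext[User], tool_def: ToolDefinition) -> bool:
    # Hide destructive tools unless the current user is an admin.
    return ctx.deps.is_admin or tool_def.name != 'delete_record'


agent = Agent('test', deps_type=User, toolsets=[toolset.filtered(admin_only_filter)])
```

Because the filter function receives the `RunContext`, the set of tools offered to the model can differ from run to run.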
Source code in `pydantic_ai_slim/pydantic_ai/toolsets/filtered.py` ```python @dataclass class FilteredToolset(WrapperToolset[AgentDepsT]): """A toolset that filters the tools it contains using a filter function that takes the agent context and the tool definition. See [toolset docs](../toolsets.md#filtering-tools) for more information. """ filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool] async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return { name: tool for name, tool in (await super().get_tools(ctx)).items() if self.filter_func(ctx, tool.tool_def) } ``` ### FunctionToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that lets Python functions be used as tools. See [toolset docs](https://ai.pydantic.dev/toolsets/#function-toolset) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ````python class FunctionToolset(AbstractToolset[AgentDepsT]): """A toolset that lets Python functions be used as tools. See [toolset docs](../toolsets.md#function-toolset) for more information. """ tools: dict[str, Tool[Any]] max_retries: int timeout: float | None _id: str | None docstring_format: DocstringFormat require_parameter_descriptions: bool schema_generator: type[GenerateJsonSchema] def __init__( self, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = [], *, max_retries: int = 1, timeout: float | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, id: str | None = None, ): """Build a new function toolset. Args: tools: The tools to add to the toolset. max_retries: The maximum number of retries for each tool during a run. Applies to all tools, unless overridden when adding a tool. timeout: Timeout in seconds for tool execution. If a tool takes longer than this, a retry prompt is returned to the model. Individual tools can override this with their own timeout. Defaults to None (no timeout). docstring_format: Format of tool docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. Applies to all tools, unless overridden when adding a tool. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. Applies to all tools, unless overridden when adding a tool. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. Applies to all tools, unless overridden when adding a tool. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. Applies to all tools, unless overridden when adding a tool. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. 
Applies to all tools, unless overridden when adding a tool, which will be merged with the toolset's metadata. id: An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. """ self.max_retries = max_retries self.timeout = timeout self._id = id self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.schema_generator = schema_generator self.strict = strict self.sequential = sequential self.requires_approval = requires_approval self.metadata = metadata self.tools = {} for tool in tools: if isinstance(tool, Tool): self.add_tool(tool) # pyright: ignore[reportUnknownArgumentType] else: self.add_function(tool) @property def id(self) -> str | None: return self._id @overload def tool(self, func: ToolFuncEither[AgentDepsT, ToolParams], /) -> ToolFuncEither[AgentDepsT, ToolParams]: ... @overload def tool( self, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Callable[[ToolFuncEither[AgentDepsT, ToolParams]], ToolFuncEither[AgentDepsT, ToolParams]]: ... def tool( self, func: ToolFuncEither[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool` is obscured. Example: ```python from pydantic_ai import Agent, FunctionToolset, RunContext toolset = FunctionToolset() @toolset.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @toolset.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y agent = Agent('test', toolsets=[toolset], deps_type=int) result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool,defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. 
This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. If `None`, the default value is determined by the toolset. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. timeout: Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). """ def tool_decorator( func_: ToolFuncEither[AgentDepsT, ToolParams], ) -> ToolFuncEither[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self.add_function( func=func_, takes_ctx=None, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, timeout=timeout, ) return func_ return tool_decorator if func is None else tool_decorator(func) def add_function( self, func: ToolFuncEither[AgentDepsT, ToolParams], takes_ctx: bool | None = None, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> None: """Add a function as a tool to the toolset. Can take a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). Args: func: The tool function to register. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. If `None`, this is inferred from the function signature. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. 
prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. If `None`, the default value is determined by the toolset. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. timeout: Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). """ if docstring_format is None: docstring_format = self.docstring_format if require_parameter_descriptions is None: require_parameter_descriptions = self.require_parameter_descriptions if schema_generator is None: schema_generator = self.schema_generator if strict is None: strict = self.strict if sequential is None: sequential = self.sequential if requires_approval is None: requires_approval = self.requires_approval tool = Tool[AgentDepsT]( func, takes_ctx=takes_ctx, name=name, description=description, max_retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, timeout=timeout, ) self.add_tool(tool) def add_tool(self, tool: Tool[AgentDepsT]) -> None: """Add a tool to the toolset. Args: tool: The tool to add. 
""" if tool.name in self.tools: raise UserError(f'Tool name conflicts with existing tool: {tool.name!r}') if tool.max_retries is None: tool.max_retries = self.max_retries if self.metadata is not None: tool.metadata = self.metadata | (tool.metadata or {}) self.tools[tool.name] = tool async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: tools: dict[str, ToolsetTool[AgentDepsT]] = {} for original_name, tool in self.tools.items(): max_retries = tool.max_retries if tool.max_retries is not None else self.max_retries run_context = replace( ctx, tool_name=original_name, retry=ctx.retries.get(original_name, 0), max_retries=max_retries, ) tool_def = await tool.prepare_tool_def(run_context) if not tool_def: continue new_name = tool_def.name if new_name in tools: if new_name != original_name: raise UserError(f'Renaming tool {original_name!r} to {new_name!r} conflicts with existing tool.') else: raise UserError(f'Tool name conflicts with previously renamed tool: {new_name!r}.') tools[new_name] = FunctionToolsetTool( toolset=self, tool_def=tool_def, max_retries=max_retries, args_validator=tool.function_schema.validator, call_func=tool.function_schema.call, is_async=tool.function_schema.is_async, timeout=tool_def.timeout, ) return tools async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: assert isinstance(tool, FunctionToolsetTool) # Per-tool timeout takes precedence over toolset timeout timeout = tool.timeout if tool.timeout is not None else self.timeout if timeout is not None: try: with anyio.fail_after(timeout): return await tool.call_func(tool_args, ctx) except TimeoutError: raise ModelRetry(f'Timed out after {timeout} seconds.') from None else: return await tool.call_func(tool_args, ctx) ```` #### __init__ ```python __init__( tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] = [], *, max_retries: int = 1, timeout: float | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, id: str | None = None ) ``` Build a new function toolset. Parameters: | Name | Type | Description | Default | | -------------------------------- | ---------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `tools` | \`Sequence\[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]\]\` | The tools to add to the toolset. | | `max_retries` | `int` | The maximum number of retries for each tool during a run. Applies to all tools, unless overridden when adding a tool. | `1` | | `timeout` | \`float | None\` | Timeout in seconds for tool execution. If a tool takes longer than this, a retry prompt is returned to the model. Individual tools can override this with their own timeout. Defaults to None (no timeout). | | `docstring_format` | `DocstringFormat` | Format of tool docstring, see DocstringFormat. 
Defaults to 'auto', such that the format is inferred from the structure of the docstring. Applies to all tools, unless overridden when adding a tool. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use for this tool. Defaults to GenerateToolJsonSchema. Applies to all tools, unless overridden when adding a tool. | `GenerateToolJsonSchema` | | `strict` | \`bool | None\` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. Applies to all tools, unless overridden when adding a tool. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. Applies to all tools, unless overridden when adding a tool. | `False` | | `metadata` | \`dict[str, Any] | None\` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. Applies to all tools, unless overridden when adding a tool, which will be merged with the toolset's metadata. | | `id` | \`str | None\` | An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ```python def __init__( self, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = [], *, max_retries: int = 1, timeout: float | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, id: str | None = None, ): """Build a new function toolset. Args: tools: The tools to add to the toolset. max_retries: The maximum number of retries for each tool during a run. Applies to all tools, unless overridden when adding a tool. timeout: Timeout in seconds for tool execution. If a tool takes longer than this, a retry prompt is returned to the model. Individual tools can override this with their own timeout. Defaults to None (no timeout). docstring_format: Format of tool docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. Applies to all tools, unless overridden when adding a tool. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. Applies to all tools, unless overridden when adding a tool. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. Applies to all tools, unless overridden when adding a tool. 
requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. Applies to all tools, unless overridden when adding a tool. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. Applies to all tools, unless overridden when adding a tool, which will be merged with the toolset's metadata. id: An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. """ self.max_retries = max_retries self.timeout = timeout self._id = id self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.schema_generator = schema_generator self.strict = strict self.sequential = sequential self.requires_approval = requires_approval self.metadata = metadata self.tools = {} for tool in tools: if isinstance(tool, Tool): self.add_tool(tool) # pyright: ignore[reportUnknownArgumentType] else: self.add_function(tool) ``` #### tool ```python tool( func: ToolFuncEither[AgentDepsT, ToolParams], ) -> ToolFuncEither[AgentDepsT, ToolParams] ``` ```python tool( *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: ( type[GenerateJsonSchema] | None ) = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None ) -> Callable[ [ToolFuncEither[AgentDepsT, ToolParams]], ToolFuncEither[AgentDepsT, ToolParams], ] ``` ```python tool( func: ( ToolFuncEither[AgentDepsT, ToolParams] | None ) = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: ( type[GenerateJsonSchema] | None ) = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Any ``` Decorator to register a tool function which takes RunContext as its first argument. Can decorate a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](https://ai.pydantic.dev/tools/#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool` is obscured.
Example: ```python from pydantic_ai import Agent, FunctionToolset, RunContext toolset = FunctionToolset() @toolset.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @toolset.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y agent = Agent('test', toolsets=[toolset], deps_type=int) result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Parameters: | Name | Type | Description | Default | | -------------------------------- | ---------------------------------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `func` | \`ToolFuncEither[AgentDepsT, ToolParams] | None\` | The tool function to register. | | `name` | \`str | None\` | The name of the tool, defaults to the function name. | | `description` | \`str | None\` | The description of the tool,defaults to the function docstring. | | `retries` | \`int | None\` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. | | `prepare` | \`ToolPrepareFunc[AgentDepsT] | None\` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | | `docstring_format` | \`DocstringFormat | None\` | The format of the docstring, see DocstringFormat. If None, the default value is determined by the toolset. | | `require_parameter_descriptions` | \`bool | None\` | If True, raise an error if a parameter description is missing. If None, the default value is determined by the toolset. | | `schema_generator` | \`type[GenerateJsonSchema] | None\` | The JSON schema generator class to use for this tool. If None, the default value is determined by the toolset. | | `strict` | \`bool | None\` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. If None, the default value is determined by the toolset. | | `sequential` | \`bool | None\` | Whether the function requires a sequential/serial execution environment. Defaults to False. If None, the default value is determined by the toolset. | | `requires_approval` | \`bool | None\` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. If None, the default value is determined by the toolset. | | `metadata` | \`dict[str, Any] | None\` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If None, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. | | `timeout` | \`float | None\` | Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). 
| Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ````python def tool( self, func: ToolFuncEither[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool` is obscured. Example: ```python from pydantic_ai import Agent, FunctionToolset, RunContext toolset = FunctionToolset() @toolset.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @toolset.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y agent = Agent('test', toolsets=[toolset], deps_type=int) result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool,defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. If `None`, the default value is determined by the toolset. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. timeout: Timeout in seconds for tool execution. 
If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). """ def tool_decorator( func_: ToolFuncEither[AgentDepsT, ToolParams], ) -> ToolFuncEither[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self.add_function( func=func_, takes_ctx=None, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, timeout=timeout, ) return func_ return tool_decorator if func is None else tool_decorator(func) ```` #### add_function ```python add_function( func: ToolFuncEither[AgentDepsT, ToolParams], takes_ctx: bool | None = None, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: ( type[GenerateJsonSchema] | None ) = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> None ``` Add a function as a tool to the toolset. Can take a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](https://ai.pydantic.dev/tools/#function-tools-and-schema). Parameters: | Name | Type | Description | Default | | -------------------------------- | ---------------------------------------- | ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `func` | `ToolFuncEither[AgentDepsT, ToolParams]` | The tool function to register. | *required* | | `takes_ctx` | \`bool | None\` | Whether the function takes a RunContext as its first argument. If None, this is inferred from the function signature. | | `name` | \`str | None\` | The name of the tool, defaults to the function name. | | `description` | \`str | None\` | The description of the tool, defaults to the function docstring. | | `retries` | \`int | None\` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. | | `prepare` | \`ToolPrepareFunc[AgentDepsT] | None\` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | | `docstring_format` | \`DocstringFormat | None\` | The format of the docstring, see DocstringFormat. If None, the default value is determined by the toolset. | | `require_parameter_descriptions` | \`bool | None\` | If True, raise an error if a parameter description is missing. If None, the default value is determined by the toolset. | | `schema_generator` | \`type[GenerateJsonSchema] | None\` | The JSON schema generator class to use for this tool. If None, the default value is determined by the toolset. | | `strict` | \`bool | None\` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. If None, the default value is determined by the toolset. 
| | `sequential` | \`bool | None\` | Whether the function requires a sequential/serial execution environment. Defaults to False. If None, the default value is determined by the toolset. | | `requires_approval` | \`bool | None\` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. If None, the default value is determined by the toolset. | | `metadata` | \`dict[str, Any] | None\` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If None, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. | | `timeout` | \`float | None\` | Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ```python def add_function( self, func: ToolFuncEither[AgentDepsT, ToolParams], takes_ctx: bool | None = None, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, timeout: float | None = None, ) -> None: """Add a function as a tool to the toolset. Can take a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). Args: func: The tool function to register. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. If `None`, this is inferred from the function signature. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. If `None`, the default value is determined by the toolset. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. 
If `None`, the default value is determined by the toolset. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. timeout: Timeout in seconds for tool execution. If the tool takes longer, a retry prompt is returned to the model. Defaults to None (no timeout). """ if docstring_format is None: docstring_format = self.docstring_format if require_parameter_descriptions is None: require_parameter_descriptions = self.require_parameter_descriptions if schema_generator is None: schema_generator = self.schema_generator if strict is None: strict = self.strict if sequential is None: sequential = self.sequential if requires_approval is None: requires_approval = self.requires_approval tool = Tool[AgentDepsT]( func, takes_ctx=takes_ctx, name=name, description=description, max_retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, timeout=timeout, ) self.add_tool(tool) ``` #### add_tool ```python add_tool(tool: Tool[AgentDepsT]) -> None ``` Add a tool to the toolset. Parameters: | Name | Type | Description | Default | | ------ | ------------------ | ---------------- | ---------- | | `tool` | `Tool[AgentDepsT]` | The tool to add. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ```python def add_tool(self, tool: Tool[AgentDepsT]) -> None: """Add a tool to the toolset. Args: tool: The tool to add. """ if tool.name in self.tools: raise UserError(f'Tool name conflicts with existing tool: {tool.name!r}') if tool.max_retries is None: tool.max_retries = self.max_retries if self.metadata is not None: tool.metadata = self.metadata | (tool.metadata or {}) self.tools[tool.name] = tool ``` ### PrefixedToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that prefixes the names of the tools it contains. See [toolset docs](https://ai.pydantic.dev/toolsets/#prefixing-tool-names) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/prefixed.py` ```python @dataclass class PrefixedToolset(WrapperToolset[AgentDepsT]): """A toolset that prefixes the names of the tools it contains. See [toolset docs](../toolsets.md#prefixing-tool-names) for more information. """ prefix: str @property def tool_name_conflict_hint(self) -> str: return 'Change the `prefix` attribute to avoid name conflicts.' async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return { new_name: replace( tool, toolset=self, tool_def=replace(tool.tool_def, name=new_name), ) for name, tool in (await super().get_tools(ctx)).items() if (new_name := f'{self.prefix}_{name}') } async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: original_name = name.removeprefix(self.prefix + '_') ctx = replace(ctx, tool_name=original_name) tool = replace(tool, tool_def=replace(tool.tool_def, name=original_name)) return await super().call_tool(original_name, tool_args, ctx, tool) ``` ### RenamedToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that renames the tools it contains using a dictionary mapping new names to original names. 
See [toolset docs](https://ai.pydantic.dev/toolsets/#renaming-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/renamed.py` ```python @dataclass class RenamedToolset(WrapperToolset[AgentDepsT]): """A toolset that renames the tools it contains using a dictionary mapping new names to original names. See [toolset docs](../toolsets.md#renaming-tools) for more information. """ name_map: dict[str, str] async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: original_to_new_name_map = {v: k for k, v in self.name_map.items()} original_tools = await super().get_tools(ctx) tools: dict[str, ToolsetTool[AgentDepsT]] = {} for original_name, tool in original_tools.items(): new_name = original_to_new_name_map.get(original_name, None) if new_name: tools[new_name] = replace( tool, toolset=self, tool_def=replace(tool.tool_def, name=new_name), ) else: tools[original_name] = tool return tools async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: original_name = self.name_map.get(name, name) ctx = replace(ctx, tool_name=original_name) tool = replace(tool, tool_def=replace(tool.tool_def, name=original_name)) return await super().call_tool(original_name, tool_args, ctx, tool) ``` ### PreparedToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that prepares the tools it contains using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](https://ai.pydantic.dev/toolsets/#preparing-tool-definitions) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/prepared.py` ```python @dataclass class PreparedToolset(WrapperToolset[AgentDepsT]): """A toolset that prepares the tools it contains using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../toolsets.md#preparing-tool-definitions) for more information. """ prepare_func: ToolsPrepareFunc[AgentDepsT] async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: original_tools = await super().get_tools(ctx) original_tool_defs = [tool.tool_def for tool in original_tools.values()] prepared_tool_defs_by_name = { tool_def.name: tool_def for tool_def in (await self.prepare_func(ctx, original_tool_defs) or []) } if len(prepared_tool_defs_by_name.keys() - original_tools.keys()) > 0: raise UserError( 'Prepare function cannot add or rename tools. Use `FunctionToolset.add_function()` or `RenamedToolset` instead.' ) return { name: replace(original_tools[name], tool_def=tool_def) for name, tool_def in prepared_tool_defs_by_name.items() } ``` ### WrapperToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that wraps another toolset and delegates to it. See [toolset docs](https://ai.pydantic.dev/toolsets/#wrapping-a-toolset) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/wrapper.py` ```python @dataclass class WrapperToolset(AbstractToolset[AgentDepsT]): """A toolset that wraps another toolset and delegates to it. See [toolset docs](../toolsets.md#wrapping-a-toolset) for more information. 
""" wrapped: AbstractToolset[AgentDepsT] @property def id(self) -> str | None: return None # pragma: no cover @property def label(self) -> str: return f'{self.__class__.__name__}({self.wrapped.label})' async def __aenter__(self) -> Self: await self.wrapped.__aenter__() return self async def __aexit__(self, *args: Any) -> bool | None: return await self.wrapped.__aexit__(*args) async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return await self.wrapped.get_tools(ctx) async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: return await self.wrapped.call_tool(name, tool_args, ctx, tool) def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: self.wrapped.apply(visitor) def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: return replace(self, wrapped=self.wrapped.visit_and_replace(visitor)) ``` ### ToolsetFunc ```python ToolsetFunc: TypeAlias = Callable[ [RunContext[AgentDepsT]], AbstractToolset[AgentDepsT] | None | Awaitable[AbstractToolset[AgentDepsT] | None], ] ``` A sync/async function which takes a run context and returns a toolset. ### FastMCPToolset Bases: `AbstractToolset[AgentDepsT]` A FastMCP Toolset that uses the FastMCP Client to call tools from a local or remote MCP Server. The Toolset can accept a FastMCP Client, a FastMCP Transport, or any other object which a FastMCP Transport can be created from. See https://gofastmcp.com/clients/transports for a full list of transports available. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/fastmcp.py` ```python @dataclass(init=False) class FastMCPToolset(AbstractToolset[AgentDepsT]): """A FastMCP Toolset that uses the FastMCP Client to call tools from a local or remote MCP Server. The Toolset can accept a FastMCP Client, a FastMCP Transport, or any other object which a FastMCP Transport can be created from. See https://gofastmcp.com/clients/transports for a full list of transports available. 
""" client: Client[Any] """The FastMCP client to use.""" _: KW_ONLY tool_error_behavior: Literal['model_retry', 'error'] """The behavior to take when a tool error occurs.""" max_retries: int """The maximum number of retries to attempt if a tool call fails.""" _id: str | None def __init__( self, client: Client[Any] | ClientTransport | FastMCP | FastMCP1Server | AnyUrl | Path | MCPConfig | dict[str, Any] | str, *, max_retries: int = 1, tool_error_behavior: Literal['model_retry', 'error'] = 'model_retry', id: str | None = None, ) -> None: if isinstance(client, Client): self.client = client else: self.client = Client[Any](transport=client) self._id = id self.max_retries = max_retries self.tool_error_behavior = tool_error_behavior self._enter_lock: Lock = Lock() self._running_count: int = 0 self._exit_stack: AsyncExitStack | None = None @property def id(self) -> str | None: return self._id async def __aenter__(self) -> Self: async with self._enter_lock: if self._running_count == 0: self._exit_stack = AsyncExitStack() await self._exit_stack.enter_async_context(self.client) self._running_count += 1 return self async def __aexit__(self, *args: Any) -> bool | None: async with self._enter_lock: self._running_count -= 1 if self._running_count == 0 and self._exit_stack: await self._exit_stack.aclose() self._exit_stack = None return None async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: async with self: return { mcp_tool.name: self.tool_for_tool_def( ToolDefinition( name=mcp_tool.name, description=mcp_tool.description, parameters_json_schema=mcp_tool.inputSchema, metadata={ 'meta': mcp_tool.meta, 'annotations': mcp_tool.annotations.model_dump() if mcp_tool.annotations else None, 'output_schema': mcp_tool.outputSchema or None, }, ) ) for mcp_tool in await self.client.list_tools() } async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: async with self: try: call_tool_result: CallToolResult = await self.client.call_tool(name=name, arguments=tool_args) except ToolError as e: if self.tool_error_behavior == 'model_retry': raise ModelRetry(message=str(e)) from e else: raise e # If we have structured content, return that if call_tool_result.structured_content: return call_tool_result.structured_content # Otherwise, return the content return _map_fastmcp_tool_results(parts=call_tool_result.content) def tool_for_tool_def(self, tool_def: ToolDefinition) -> ToolsetTool[AgentDepsT]: return ToolsetTool[AgentDepsT]( tool_def=tool_def, toolset=self, max_retries=self.max_retries, args_validator=TOOL_SCHEMA_VALIDATOR, ) ``` #### client ```python client: Client[Any] ``` The FastMCP client to use. #### max_retries ```python max_retries: int = max_retries ``` The maximum number of retries to attempt if a tool call fails. #### tool_error_behavior ```python tool_error_behavior: Literal["model_retry", "error"] = ( tool_error_behavior ) ``` The behavior to take when a tool error occurs. 
# `pydantic_ai.usage` ### UsageBase Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) class UsageBase: input_tokens: Annotated[ int, # `request_tokens` is deprecated, but we still want to support deserializing model responses stored in a DB before the name was changed Field(validation_alias=AliasChoices('input_tokens', 'request_tokens')), ] = 0 """Number of input/prompt tokens.""" cache_write_tokens: int = 0 """Number of tokens written to the cache.""" cache_read_tokens: int = 0 """Number of tokens read from the cache.""" output_tokens: Annotated[ int, # `response_tokens` is deprecated, but we still want to support deserializing model responses stored in a DB before the name was changed Field(validation_alias=AliasChoices('output_tokens', 'response_tokens')), ] = 0 """Number of output/completion tokens.""" input_audio_tokens: int = 0 """Number of audio input tokens.""" cache_audio_read_tokens: int = 0 """Number of audio tokens read from the cache.""" output_audio_tokens: int = 0 """Number of audio output tokens.""" details: Annotated[ dict[str, int], # `details` can not be `None` any longer, but we still want to support deserializing model responses stored in a DB before this was changed BeforeValidator(lambda d: d or {}), ] = dataclasses.field(default_factory=dict[str, int]) """Any extra details returned by the model.""" @property @deprecated('`request_tokens` is deprecated, use `input_tokens` instead') def request_tokens(self) -> int: return self.input_tokens @property @deprecated('`response_tokens` is deprecated, use `output_tokens` instead') def response_tokens(self) -> int: return self.output_tokens @property def total_tokens(self) -> int: """Sum of `input_tokens + output_tokens`.""" return self.input_tokens + self.output_tokens def opentelemetry_attributes(self) -> dict[str, int]: """Get the token usage values as OpenTelemetry attributes.""" result: dict[str, int] = {} if self.input_tokens: result['gen_ai.usage.input_tokens'] = self.input_tokens if self.output_tokens: result['gen_ai.usage.output_tokens'] = self.output_tokens details = self.details.copy() if self.cache_write_tokens: details['cache_write_tokens'] = self.cache_write_tokens if self.cache_read_tokens: details['cache_read_tokens'] = self.cache_read_tokens if self.input_audio_tokens: details['input_audio_tokens'] = self.input_audio_tokens if self.cache_audio_read_tokens: details['cache_audio_read_tokens'] = self.cache_audio_read_tokens if self.output_audio_tokens: details['output_audio_tokens'] = self.output_audio_tokens if details: prefix = 'gen_ai.usage.details.' for key, value in details.items(): # Skipping check for value since spec implies all detail values are relevant if value: result[prefix + key] = value return result def __repr__(self): kv_pairs = (f'{f.name}={value!r}' for f in fields(self) if (value := getattr(self, f.name))) return f'{self.__class__.__qualname__}({", ".join(kv_pairs)})' def has_values(self) -> bool: """Whether any values are set and non-zero.""" return any(dataclasses.asdict(self).values()) ``` #### input_tokens ```python input_tokens: Annotated[ int, Field( validation_alias=AliasChoices( input_tokens, request_tokens ) ), ] = 0 ``` Number of input/prompt tokens. #### cache_write_tokens ```python cache_write_tokens: int = 0 ``` Number of tokens written to the cache. #### cache_read_tokens ```python cache_read_tokens: int = 0 ``` Number of tokens read from the cache. 
#### output_tokens ```python output_tokens: Annotated[ int, Field( validation_alias=AliasChoices( output_tokens, response_tokens ) ), ] = 0 ``` Number of output/completion tokens. #### input_audio_tokens ```python input_audio_tokens: int = 0 ``` Number of audio input tokens. #### cache_audio_read_tokens ```python cache_audio_read_tokens: int = 0 ``` Number of audio tokens read from the cache. #### output_audio_tokens ```python output_audio_tokens: int = 0 ``` Number of audio output tokens. #### details ```python details: Annotated[ dict[str, int], BeforeValidator(lambda d: d or {}) ] = field(default_factory=dict[str, int]) ``` Any extra details returned by the model. #### total_tokens ```python total_tokens: int ``` Sum of `input_tokens + output_tokens`. #### opentelemetry_attributes ```python opentelemetry_attributes() -> dict[str, int] ``` Get the token usage values as OpenTelemetry attributes. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def opentelemetry_attributes(self) -> dict[str, int]: """Get the token usage values as OpenTelemetry attributes.""" result: dict[str, int] = {} if self.input_tokens: result['gen_ai.usage.input_tokens'] = self.input_tokens if self.output_tokens: result['gen_ai.usage.output_tokens'] = self.output_tokens details = self.details.copy() if self.cache_write_tokens: details['cache_write_tokens'] = self.cache_write_tokens if self.cache_read_tokens: details['cache_read_tokens'] = self.cache_read_tokens if self.input_audio_tokens: details['input_audio_tokens'] = self.input_audio_tokens if self.cache_audio_read_tokens: details['cache_audio_read_tokens'] = self.cache_audio_read_tokens if self.output_audio_tokens: details['output_audio_tokens'] = self.output_audio_tokens if details: prefix = 'gen_ai.usage.details.' for key, value in details.items(): # Skipping check for value since spec implies all detail values are relevant if value: result[prefix + key] = value return result ``` #### has_values ```python has_values() -> bool ``` Whether any values are set and non-zero. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def has_values(self) -> bool: """Whether any values are set and non-zero.""" return any(dataclasses.asdict(self).values()) ``` ### RequestUsage Bases: `UsageBase` LLM usage associated with a single request. This is an implementation of `genai_prices.types.AbstractUsage` so it can be used to calculate the price of the request using [genai-prices](https://github.com/pydantic/genai-prices). Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) class RequestUsage(UsageBase): """LLM usage associated with a single request. This is an implementation of `genai_prices.types.AbstractUsage` so it can be used to calculate the price of the request using [genai-prices](https://github.com/pydantic/genai-prices). """ @property def requests(self): return 1 def incr(self, incr_usage: RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ return _incr_usage_tokens(self, incr_usage) def __add__(self, other: RequestUsage) -> RequestUsage: """Add two RequestUsages together. This is provided so it's trivial to sum usage information from multiple parts of a response. **WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations. 
""" new_usage = copy(self) new_usage.incr(other) return new_usage @classmethod def extract( cls, data: Any, *, provider: str, provider_url: str, provider_fallback: str, api_flavor: str = 'default', details: dict[str, Any] | None = None, ) -> RequestUsage: """Extract usage information from the response data using genai-prices. Args: data: The response data from the model API. provider: The actual provider ID provider_url: The provider base_url provider_fallback: The fallback provider ID to use if the actual provider is not found in genai-prices. For example, an OpenAI model should set this to "openai" in case it has an obscure provider ID. api_flavor: The API flavor to use when extracting usage information, e.g. 'chat' or 'responses' for OpenAI. details: Becomes the `details` field on the returned `RequestUsage` for convenience. """ details = details or {} for provider_id, provider_api_url in [(None, provider_url), (provider, None), (provider_fallback, None)]: try: provider_obj = get_snapshot().find_provider(None, provider_id, provider_api_url) _model_ref, extracted_usage = provider_obj.extract_usage(data, api_flavor=api_flavor) return cls(**{k: v for k, v in extracted_usage.__dict__.items() if v is not None}, details=details) except Exception: pass return cls(details=details) ``` #### incr ```python incr(incr_usage: RequestUsage) -> None ``` Increment the usage in place. Parameters: | Name | Type | Description | Default | | ------------ | -------------- | -------------------------- | ---------- | | `incr_usage` | `RequestUsage` | The usage to increment by. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def incr(self, incr_usage: RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ return _incr_usage_tokens(self, incr_usage) ``` #### __add__ ```python __add__(other: RequestUsage) -> RequestUsage ``` Add two RequestUsages together. This is provided so it's trivial to sum usage information from multiple parts of a response. **WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def __add__(self, other: RequestUsage) -> RequestUsage: """Add two RequestUsages together. This is provided so it's trivial to sum usage information from multiple parts of a response. **WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations. """ new_usage = copy(self) new_usage.incr(other) return new_usage ``` #### extract ```python extract( data: Any, *, provider: str, provider_url: str, provider_fallback: str, api_flavor: str = "default", details: dict[str, Any] | None = None ) -> RequestUsage ``` Extract usage information from the response data using genai-prices. Parameters: | Name | Type | Description | Default | | ------------------- | ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------- | | `data` | `Any` | The response data from the model API. | *required* | | `provider` | `str` | The actual provider ID | *required* | | `provider_url` | `str` | The provider base_url | *required* | | `provider_fallback` | `str` | The fallback provider ID to use if the actual provider is not found in genai-prices. 
For example, an OpenAI model should set this to "openai" in case it has an obscure provider ID. | *required* | | `api_flavor` | `str` | The API flavor to use when extracting usage information, e.g. 'chat' or 'responses' for OpenAI. | `'default'` | | `details` | \`dict[str, Any] | None\` | Becomes the details field on the returned RequestUsage for convenience. | Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @classmethod def extract( cls, data: Any, *, provider: str, provider_url: str, provider_fallback: str, api_flavor: str = 'default', details: dict[str, Any] | None = None, ) -> RequestUsage: """Extract usage information from the response data using genai-prices. Args: data: The response data from the model API. provider: The actual provider ID provider_url: The provider base_url provider_fallback: The fallback provider ID to use if the actual provider is not found in genai-prices. For example, an OpenAI model should set this to "openai" in case it has an obscure provider ID. api_flavor: The API flavor to use when extracting usage information, e.g. 'chat' or 'responses' for OpenAI. details: Becomes the `details` field on the returned `RequestUsage` for convenience. """ details = details or {} for provider_id, provider_api_url in [(None, provider_url), (provider, None), (provider_fallback, None)]: try: provider_obj = get_snapshot().find_provider(None, provider_id, provider_api_url) _model_ref, extracted_usage = provider_obj.extract_usage(data, api_flavor=api_flavor) return cls(**{k: v for k, v in extracted_usage.__dict__.items() if v is not None}, details=details) except Exception: pass return cls(details=details) ``` ### RunUsage Bases: `UsageBase` LLM usage associated with an agent run. Responsibility for calculating request usage is on the model; Pydantic AI simply sums the usage information across requests. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) class RunUsage(UsageBase): """LLM usage associated with an agent run. Responsibility for calculating request usage is on the model; Pydantic AI simply sums the usage information across requests. """ requests: int = 0 """Number of requests made to the LLM API.""" tool_calls: int = 0 """Number of successful tool calls executed during the run.""" input_tokens: int = 0 """Total number of input/prompt tokens.""" cache_write_tokens: int = 0 """Total number of tokens written to the cache.""" cache_read_tokens: int = 0 """Total number of tokens read from the cache.""" input_audio_tokens: int = 0 """Total number of audio input tokens.""" cache_audio_read_tokens: int = 0 """Total number of audio tokens read from the cache.""" output_tokens: int = 0 """Total number of output/completion tokens.""" details: dict[str, int] = dataclasses.field(default_factory=dict[str, int]) """Any extra details returned by the model.""" def incr(self, incr_usage: RunUsage | RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ if isinstance(incr_usage, RunUsage): self.requests += incr_usage.requests self.tool_calls += incr_usage.tool_calls return _incr_usage_tokens(self, incr_usage) def __add__(self, other: RunUsage | RequestUsage) -> RunUsage: """Add two RunUsages together. This is provided so it's trivial to sum usage information from multiple runs. """ new_usage = copy(self) new_usage.incr(other) return new_usage ``` #### requests ```python requests: int = 0 ``` Number of requests made to the LLM API. 
#### tool_calls ```python tool_calls: int = 0 ``` Number of successful tool calls executed during the run. #### input_tokens ```python input_tokens: int = 0 ``` Total number of input/prompt tokens. #### cache_write_tokens ```python cache_write_tokens: int = 0 ``` Total number of tokens written to the cache. #### cache_read_tokens ```python cache_read_tokens: int = 0 ``` Total number of tokens read from the cache. #### input_audio_tokens ```python input_audio_tokens: int = 0 ``` Total number of audio input tokens. #### cache_audio_read_tokens ```python cache_audio_read_tokens: int = 0 ``` Total number of audio tokens read from the cache. #### output_tokens ```python output_tokens: int = 0 ``` Total number of output/completion tokens. #### details ```python details: dict[str, int] = field( default_factory=dict[str, int] ) ``` Any extra details returned by the model. #### incr ```python incr(incr_usage: RunUsage | RequestUsage) -> None ``` Increment the usage in place. Parameters: | Name | Type | Description | Default | | ------------ | ---------- | -------------- | -------------------------- | | `incr_usage` | \`RunUsage | RequestUsage\` | The usage to increment by. | Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def incr(self, incr_usage: RunUsage | RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ if isinstance(incr_usage, RunUsage): self.requests += incr_usage.requests self.tool_calls += incr_usage.tool_calls return _incr_usage_tokens(self, incr_usage) ``` #### __add__ ```python __add__(other: RunUsage | RequestUsage) -> RunUsage ``` Add two RunUsages together. This is provided so it's trivial to sum usage information from multiple runs. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def __add__(self, other: RunUsage | RequestUsage) -> RunUsage: """Add two RunUsages together. This is provided so it's trivial to sum usage information from multiple runs. """ new_usage = copy(self) new_usage.incr(other) return new_usage ``` ### Usage Bases: `RunUsage` Deprecated `Usage` is deprecated, use `RunUsage` instead Deprecated alias for `RunUsage`. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) @deprecated('`Usage` is deprecated, use `RunUsage` instead') class Usage(RunUsage): """Deprecated alias for `RunUsage`.""" ``` ### UsageLimits Limits on model usage. The request count is tracked by pydantic_ai, and the request limit is checked before each request to the model. Token counts are provided in responses from the model, and the token limits are checked after each response. Each of the limits can be set to `None` to disable that limit. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) class UsageLimits: """Limits on model usage. The request count is tracked by pydantic_ai, and the request limit is checked before each request to the model. Token counts are provided in responses from the model, and the token limits are checked after each response. Each of the limits can be set to `None` to disable that limit. 
""" request_limit: int | None = 50 """The maximum number of requests allowed to the model.""" tool_calls_limit: int | None = None """The maximum number of successful tool calls allowed to be executed.""" input_tokens_limit: int | None = None """The maximum number of input/prompt tokens allowed.""" output_tokens_limit: int | None = None """The maximum number of output/response tokens allowed.""" total_tokens_limit: int | None = None """The maximum number of tokens allowed in requests and responses combined.""" count_tokens_before_request: bool = False """If True, perform a token counting pass before sending the request to the model, to enforce `request_tokens_limit` ahead of time. This may incur additional overhead (from calling the model's `count_tokens` API before making the actual request) and is disabled by default. Supported by: - Anthropic - Google - Bedrock Converse Support for OpenAI is in development: https://github.com/pydantic/pydantic-ai/issues/3430 """ @property @deprecated('`request_tokens_limit` is deprecated, use `input_tokens_limit` instead') def request_tokens_limit(self) -> int | None: return self.input_tokens_limit @property @deprecated('`response_tokens_limit` is deprecated, use `output_tokens_limit` instead') def response_tokens_limit(self) -> int | None: return self.output_tokens_limit @overload def __init__( self, *, request_limit: int | None = 50, tool_calls_limit: int | None = None, input_tokens_limit: int | None = None, output_tokens_limit: int | None = None, total_tokens_limit: int | None = None, count_tokens_before_request: bool = False, ) -> None: self.request_limit = request_limit self.tool_calls_limit = tool_calls_limit self.input_tokens_limit = input_tokens_limit self.output_tokens_limit = output_tokens_limit self.total_tokens_limit = total_tokens_limit self.count_tokens_before_request = count_tokens_before_request @overload @deprecated( 'Use `input_tokens_limit` instead of `request_tokens_limit` and `output_tokens_limit` and `total_tokens_limit`' ) def __init__( self, *, request_limit: int | None = 50, tool_calls_limit: int | None = None, request_tokens_limit: int | None = None, response_tokens_limit: int | None = None, total_tokens_limit: int | None = None, count_tokens_before_request: bool = False, ) -> None: self.request_limit = request_limit self.tool_calls_limit = tool_calls_limit self.input_tokens_limit = request_tokens_limit self.output_tokens_limit = response_tokens_limit self.total_tokens_limit = total_tokens_limit self.count_tokens_before_request = count_tokens_before_request def __init__( self, *, request_limit: int | None = 50, tool_calls_limit: int | None = None, input_tokens_limit: int | None = None, output_tokens_limit: int | None = None, total_tokens_limit: int | None = None, count_tokens_before_request: bool = False, # deprecated: request_tokens_limit: int | None = None, response_tokens_limit: int | None = None, ): self.request_limit = request_limit self.tool_calls_limit = tool_calls_limit self.input_tokens_limit = input_tokens_limit or request_tokens_limit self.output_tokens_limit = output_tokens_limit or response_tokens_limit self.total_tokens_limit = total_tokens_limit self.count_tokens_before_request = count_tokens_before_request def has_token_limits(self) -> bool: """Returns `True` if this instance places any limits on token counts. If this returns `False`, the `check_tokens` method will never raise an error. This is useful because if we have token limits, we need to check them after receiving each streamed message. 
If there are no limits, we can skip that processing in the streaming response iterator. """ return any( limit is not None for limit in (self.input_tokens_limit, self.output_tokens_limit, self.total_tokens_limit) ) def check_before_request(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits.""" request_limit = self.request_limit if request_limit is not None and usage.requests >= request_limit: raise UsageLimitExceeded(f'The next request would exceed the request_limit of {request_limit}') input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded( f'The next request would exceed the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded( # pragma: lax no cover f'The next request would exceed the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})' ) def check_tokens(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits.""" input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded(f'Exceeded the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})') output_tokens = usage.output_tokens if self.output_tokens_limit is not None and output_tokens > self.output_tokens_limit: raise UsageLimitExceeded( f'Exceeded the output_tokens_limit of {self.output_tokens_limit} ({output_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded(f'Exceeded the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})') def check_before_tool_call(self, projected_usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next tool call(s) would exceed the tool call limit.""" tool_calls_limit = self.tool_calls_limit tool_calls = projected_usage.tool_calls if tool_calls_limit is not None and tool_calls > tool_calls_limit: raise UsageLimitExceeded( f'The next tool call(s) would exceed the tool_calls_limit of {tool_calls_limit} ({tool_calls=}).' ) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### request_limit ```python request_limit: int | None = request_limit ``` The maximum number of requests allowed to the model. #### tool_calls_limit ```python tool_calls_limit: int | None = tool_calls_limit ``` The maximum number of successful tool calls allowed to be executed. #### input_tokens_limit ```python input_tokens_limit: int | None = ( input_tokens_limit or request_tokens_limit ) ``` The maximum number of input/prompt tokens allowed. #### output_tokens_limit ```python output_tokens_limit: int | None = ( output_tokens_limit or response_tokens_limit ) ``` The maximum number of output/response tokens allowed. #### total_tokens_limit ```python total_tokens_limit: int | None = total_tokens_limit ``` The maximum number of tokens allowed in requests and responses combined. #### count_tokens_before_request ```python count_tokens_before_request: bool = ( count_tokens_before_request ) ``` If True, perform a token counting pass before sending the request to the model, to enforce `request_tokens_limit` ahead of time. 
This may incur additional overhead (from calling the model's `count_tokens` API before making the actual request) and is disabled by default. Supported by: - Anthropic - Google - Bedrock Converse Support for OpenAI is in development: https://github.com/pydantic/pydantic-ai/issues/3430 #### has_token_limits ```python has_token_limits() -> bool ``` Returns `True` if this instance places any limits on token counts. If this returns `False`, the `check_tokens` method will never raise an error. This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def has_token_limits(self) -> bool: """Returns `True` if this instance places any limits on token counts. If this returns `False`, the `check_tokens` method will never raise an error. This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator. """ return any( limit is not None for limit in (self.input_tokens_limit, self.output_tokens_limit, self.total_tokens_limit) ) ``` #### check_before_request ```python check_before_request(usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def check_before_request(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits.""" request_limit = self.request_limit if request_limit is not None and usage.requests >= request_limit: raise UsageLimitExceeded(f'The next request would exceed the request_limit of {request_limit}') input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded( f'The next request would exceed the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded( # pragma: lax no cover f'The next request would exceed the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})' ) ``` #### check_tokens ```python check_tokens(usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits. 
Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def check_tokens(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits.""" input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded(f'Exceeded the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})') output_tokens = usage.output_tokens if self.output_tokens_limit is not None and output_tokens > self.output_tokens_limit: raise UsageLimitExceeded( f'Exceeded the output_tokens_limit of {self.output_tokens_limit} ({output_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded(f'Exceeded the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})') ``` #### check_before_tool_call ```python check_before_tool_call(projected_usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the next tool call(s) would exceed the tool call limit. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def check_before_tool_call(self, projected_usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next tool call(s) would exceed the tool call limit.""" tool_calls_limit = self.tool_calls_limit tool_calls = projected_usage.tool_calls if tool_calls_limit is not None and tool_calls > tool_calls_limit: raise UsageLimitExceeded( f'The next tool call(s) would exceed the tool_calls_limit of {tool_calls_limit} ({tool_calls=}).' ) ``` # `pydantic_ai.models.anthropic` ## Setup For details on how to set up authentication with this model, see [model configuration for Anthropic](https://ai.pydantic.dev/models/anthropic/index.md). ### LatestAnthropicModelNames ```python LatestAnthropicModelNames = ModelParam ``` Latest Anthropic models. ### AnthropicModelName ```python AnthropicModelName = str | LatestAnthropicModelNames ``` Possible Anthropic model names. Since Anthropic supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Anthropic docs](https://docs.anthropic.com/en/docs/about-claude/models) for a full list. ### AnthropicModelSettings Bases: `ModelSettings` Settings used for an Anthropic model request. Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python class AnthropicModelSettings(ModelSettings, total=False): """Settings used for an Anthropic model request.""" # ALL FIELDS MUST BE `anthropic_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. anthropic_metadata: BetaMetadataParam """An object describing metadata about the request. Contains `user_id`, an external identifier for the user who is associated with the request. """ anthropic_thinking: BetaThinkingConfigParam """Determine whether the model should generate a thinking block. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking) for more information. """ anthropic_cache_tool_definitions: bool | Literal['5m', '1h'] """Whether to add `cache_control` to the last tool definition. When enabled, the last tool in the `tools` array will have `cache_control` set, allowing Anthropic to cache tool definitions and reduce costs. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. TTL is automatically omitted for Bedrock, as it does not support explicit TTL. 
See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information. """ anthropic_cache_instructions: bool | Literal['5m', '1h'] """Whether to add `cache_control` to the last system prompt block. When enabled, the last system prompt will have `cache_control` set, allowing Anthropic to cache system instructions and reduce costs. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. TTL is automatically omitted for Bedrock, as it does not support explicit TTL. See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information. """ anthropic_cache_messages: bool | Literal['5m', '1h'] """Convenience setting to enable caching for the last user message. When enabled, this automatically adds a cache point to the last content block in the final user message, which is useful for caching conversation history or context in multi-turn conversations. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. TTL is automatically omitted for Bedrock, as it does not support explicit TTL. Note: Uses 1 of Anthropic's 4 available cache points per request. Any additional CachePoint markers in messages will be automatically limited to respect the 4-cache-point maximum. See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information. """ anthropic_effort: Literal['low', 'medium', 'high', 'max'] | None """The effort level for the model to use when generating a response. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/effort) for more information. """ anthropic_container: BetaContainerParams | Literal[False] """Container configuration for multi-turn conversations. By default, if previous messages contain a container_id (from a prior response), it will be reused automatically. Set to `False` to force a fresh container (ignore any `container_id` from history). Set to a dict (e.g. `{'id': 'container_xxx'}`) to explicitly specify a container. """ anthropic_betas: list[AnthropicBetaParam] """List of Anthropic beta features to enable for API requests. Each item can be a known beta name (e.g. 'interleaved-thinking-2025-05-14') or a custom string. Merged with auto-added betas (e.g. structured-outputs, builtin tools) and any betas from extra_headers['anthropic-beta']. See the Anthropic docs for available beta features. """ ``` #### anthropic_metadata ```python anthropic_metadata: BetaMetadataParam ``` An object describing metadata about the request. Contains `user_id`, an external identifier for the user who is associated with the request. #### anthropic_thinking ```python anthropic_thinking: BetaThinkingConfigParam ``` Determine whether the model should generate a thinking block. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking) for more information. #### anthropic_cache_tool_definitions ```python anthropic_cache_tool_definitions: bool | Literal["5m", "1h"] ``` Whether to add `cache_control` to the last tool definition. When enabled, the last tool in the `tools` array will have `cache_control` set, allowing Anthropic to cache tool definitions and reduce costs. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. TTL is automatically omitted for Bedrock, as it does not support explicit TTL. See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information. 
#### anthropic_cache_instructions ```python anthropic_cache_instructions: bool | Literal['5m', '1h'] ``` Whether to add `cache_control` to the last system prompt block. When enabled, the last system prompt will have `cache_control` set, allowing Anthropic to cache system instructions and reduce costs. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. TTL is automatically omitted for Bedrock, as it does not support explicit TTL. See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information. #### anthropic_cache_messages ```python anthropic_cache_messages: bool | Literal['5m', '1h'] ``` Convenience setting to enable caching for the last user message. When enabled, this automatically adds a cache point to the last content block in the final user message, which is useful for caching conversation history or context in multi-turn conversations. If `True`, uses TTL='5m'. You can also specify '5m' or '1h' directly. TTL is automatically omitted for Bedrock, as it does not support explicit TTL. Note: Uses 1 of Anthropic's 4 available cache points per request. Any additional CachePoint markers in messages will be automatically limited to respect the 4-cache-point maximum. See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information. #### anthropic_effort ```python anthropic_effort: ( Literal["low", "medium", "high", "max"] | None ) ``` The effort level for the model to use when generating a response. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/effort) for more information. #### anthropic_container ```python anthropic_container: BetaContainerParams | Literal[False] ``` Container configuration for multi-turn conversations. By default, if previous messages contain a container_id (from a prior response), it will be reused automatically. Set to `False` to force a fresh container (ignore any `container_id` from history). Set to a dict (e.g. `{'id': 'container_xxx'}`) to explicitly specify a container. #### anthropic_betas ```python anthropic_betas: list[AnthropicBetaParam] ``` List of Anthropic beta features to enable for API requests. Each item can be a known beta name (e.g. 'interleaved-thinking-2025-05-14') or a custom string. Merged with auto-added betas (e.g. structured-outputs, builtin tools) and any betas from extra_headers['anthropic-beta']. See the Anthropic docs for available beta features. ### AnthropicModel Bases: `Model` A model that uses the Anthropic API. Internally, this uses the [Anthropic Python client](https://github.com/anthropics/anthropic-sdk-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python @dataclass(init=False) class AnthropicModel(Model): """A model that uses the Anthropic API. Internally, this uses the [Anthropic Python client](https://github.com/anthropics/anthropic-sdk-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncAnthropicClient = field(repr=False) _model_name: AnthropicModelName = field(repr=False) _provider: Provider[AsyncAnthropicClient] = field(repr=False) def __init__( self, model_name: AnthropicModelName, *, provider: Literal['anthropic', 'gateway'] | Provider[AsyncAnthropicClient] = 'anthropic', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Anthropic model. 
Args: model_name: The name of the Anthropic model to use. List of model names available [here](https://docs.anthropic.com/en/docs/about-claude/models). provider: The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of `Provider[AsyncAnthropicClient]`. Defaults to 'anthropic'. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. The default 'anthropic' provider will use the default `..profiles.anthropic.anthropic_model_profile`. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/anthropic' if provider == 'gateway' else provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return str(self.client.base_url) @property def model_name(self) -> AnthropicModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """The set of builtin tool types this model can handle.""" return frozenset({WebSearchTool, CodeExecutionTool, WebFetchTool, MemoryTool, MCPServerTool}) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._messages_create( messages, False, cast(AnthropicModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> usage.RequestUsage: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._messages_count_tokens( messages, cast(AnthropicModelSettings, model_settings or {}), model_request_parameters ) return usage.RequestUsage(input_tokens=response.input_tokens) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._messages_create( messages, True, cast(AnthropicModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters ) -> tuple[ModelSettings | None, ModelRequestParameters]: settings = merge_model_settings(self.settings, model_settings) if ( model_request_parameters.output_tools and settings and (thinking := settings.get('anthropic_thinking')) and thinking.get('type') in ('enabled', 'adaptive') ): if model_request_parameters.output_mode == 'auto': output_mode = 'native' if self.profile.supports_json_schema_output else 'prompted' model_request_parameters = 
replace(model_request_parameters, output_mode=output_mode) elif ( model_request_parameters.output_mode == 'tool' and not model_request_parameters.allow_text_output ): # pragma: no branch # This would result in `tool_choice=required`, which Anthropic does not support with thinking. suggested_output_type = 'NativeOutput' if self.profile.supports_json_schema_output else 'PromptedOutput' raise UserError( f'Anthropic does not support thinking and output tools at the same time. Use `output_type={suggested_output_type}(...)` instead.' ) if model_request_parameters.output_mode == 'native': assert model_request_parameters.output_object is not None if model_request_parameters.output_object.strict is False: raise UserError( 'Setting `strict=False` on `output_type=NativeOutput(...)` is not allowed for Anthropic models.' ) model_request_parameters = replace( model_request_parameters, output_object=replace(model_request_parameters.output_object, strict=True) ) return super().prepare_request(model_settings, model_request_parameters) @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[BetaRawMessageStreamEvent]: pass @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> BetaMessage: pass async def _messages_create( self, messages: list[ModelMessage], stream: bool, model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> BetaMessage | AsyncStream[BetaRawMessageStreamEvent]: """Calls the Anthropic API to create a message. This is the last step before sending the request to the API. Most preprocessing has happened in `prepare_request()`. 
""" tools = self._get_tools(model_request_parameters, model_settings) tools, mcp_servers, builtin_tool_betas = self._add_builtin_tools(tools, model_request_parameters) tool_choice = self._infer_tool_choice(tools, model_settings, model_request_parameters) system_prompt, anthropic_messages = await self._map_message(messages, model_request_parameters, model_settings) self._limit_cache_points(system_prompt, anthropic_messages, tools) output_config = self._build_output_config(model_request_parameters, model_settings) betas, extra_headers = self._get_betas_and_extra_headers(tools, model_request_parameters, model_settings) betas.update(builtin_tool_betas) container = self._get_container(messages, model_settings) try: return await self.client.beta.messages.create( max_tokens=model_settings.get('max_tokens', 4096), system=system_prompt or OMIT, messages=anthropic_messages, model=self._model_name, tools=tools or OMIT, tool_choice=tool_choice or OMIT, mcp_servers=mcp_servers or OMIT, output_config=output_config or OMIT, betas=sorted(betas) or OMIT, stream=stream, thinking=model_settings.get('anthropic_thinking', OMIT), stop_sequences=model_settings.get('stop_sequences', OMIT), temperature=model_settings.get('temperature', OMIT), top_p=model_settings.get('top_p', OMIT), timeout=model_settings.get('timeout', NOT_GIVEN), metadata=model_settings.get('anthropic_metadata', OMIT), container=container or OMIT, extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise ModelAPIError(model_name=self.model_name, message=e.message) from e # pragma: lax no cover except APIConnectionError as e: raise ModelAPIError(model_name=self.model_name, message=e.message) from e def _get_betas_and_extra_headers( self, tools: list[BetaToolUnionParam], model_request_parameters: ModelRequestParameters, model_settings: AnthropicModelSettings, ) -> tuple[set[str], dict[str, str]]: """Prepare beta features list and extra headers for API request. Handles merging custom `anthropic-beta` header from `extra_headers` into betas set and ensuring `User-Agent` is set. 
""" extra_headers = dict(model_settings.get('extra_headers', {})) extra_headers.setdefault('User-Agent', get_user_agent()) betas: set[str] = set() has_strict_tools = any(tool.get('strict') for tool in tools) if has_strict_tools or model_request_parameters.output_mode == 'native': betas.add('structured-outputs-2025-11-13') if betas_from_setting := model_settings.get('anthropic_betas'): betas.update(str(b) for b in betas_from_setting) if beta_header := extra_headers.pop('anthropic-beta', None): betas.update({stripped_beta for beta in beta_header.split(',') if (stripped_beta := beta.strip())}) return betas, extra_headers def _get_container( self, messages: list[ModelMessage], model_settings: AnthropicModelSettings ) -> BetaContainerParams | None: """Get container config for the API request.""" if (container := model_settings.get('anthropic_container')) is not None: return None if container is False else container for m in reversed(messages): if isinstance(m, ModelResponse) and m.provider_name == self.system and m.provider_details: if cid := m.provider_details.get('container_id'): return BetaContainerParams(id=cid) return None async def _messages_count_tokens( self, messages: list[ModelMessage], model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> BetaMessageTokensCount: if isinstance(self.client, AsyncAnthropicBedrock): raise UserError('AsyncAnthropicBedrock client does not support `count_tokens` api.') # standalone function to make it easier to override tools = self._get_tools(model_request_parameters, model_settings) tools, mcp_servers, builtin_tool_betas = self._add_builtin_tools(tools, model_request_parameters) tool_choice = self._infer_tool_choice(tools, model_settings, model_request_parameters) system_prompt, anthropic_messages = await self._map_message(messages, model_request_parameters, model_settings) self._limit_cache_points(system_prompt, anthropic_messages, tools) output_config = self._build_output_config(model_request_parameters, model_settings) betas, extra_headers = self._get_betas_and_extra_headers(tools, model_request_parameters, model_settings) betas.update(builtin_tool_betas) try: return await self.client.beta.messages.count_tokens( system=system_prompt or OMIT, messages=anthropic_messages, model=self._model_name, tools=tools or OMIT, tool_choice=tool_choice or OMIT, mcp_servers=mcp_servers or OMIT, betas=sorted(betas) or OMIT, output_config=output_config or OMIT, thinking=model_settings.get('anthropic_thinking', OMIT), timeout=model_settings.get('timeout', NOT_GIVEN), extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise ModelAPIError(model_name=self.model_name, message=e.message) from e # pragma: lax no cover except APIConnectionError as e: raise ModelAPIError(model_name=self.model_name, message=e.message) from e def _process_response(self, response: BetaMessage) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" items: list[ModelResponsePart] = [] builtin_tool_calls: dict[str, BuiltinToolCallPart] = {} for item in response.content: if isinstance(item, BetaTextBlock): items.append(TextPart(content=item.text)) elif isinstance(item, BetaServerToolUseBlock): call_part = _map_server_tool_use_block(item, self.system) builtin_tool_calls[call_part.tool_call_id] = call_part items.append(call_part) elif 
isinstance(item, BetaWebSearchToolResultBlock): items.append(_map_web_search_tool_result_block(item, self.system)) elif isinstance(item, BetaCodeExecutionToolResultBlock): items.append(_map_code_execution_tool_result_block(item, self.system)) elif isinstance(item, BetaWebFetchToolResultBlock): items.append(_map_web_fetch_tool_result_block(item, self.system)) elif isinstance(item, BetaRedactedThinkingBlock): items.append( ThinkingPart(id='redacted_thinking', content='', signature=item.data, provider_name=self.system) ) elif isinstance(item, BetaThinkingBlock): items.append(ThinkingPart(content=item.thinking, signature=item.signature, provider_name=self.system)) elif isinstance(item, BetaMCPToolUseBlock): call_part = _map_mcp_server_use_block(item, self.system) builtin_tool_calls[call_part.tool_call_id] = call_part items.append(call_part) elif isinstance(item, BetaMCPToolResultBlock): call_part = builtin_tool_calls.get(item.tool_use_id) items.append(_map_mcp_server_result_block(item, call_part, self.system)) else: assert isinstance(item, BetaToolUseBlock), f'unexpected item type {type(item)}' items.append( ToolCallPart( tool_name=item.name, args=cast(dict[str, Any], item.input), tool_call_id=item.id, ) ) finish_reason: FinishReason | None = None provider_details: dict[str, Any] | None = None if raw_finish_reason := response.stop_reason: # pragma: no branch provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) if response.container: provider_details = provider_details or {} provider_details['container_id'] = response.container.id return ModelResponse( parts=items, usage=_map_usage(response, self._provider.name, self._provider.base_url, self._model_name), model_name=response.model, provider_response_id=response.id, provider_name=self._provider.name, provider_url=self._provider.base_url, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncStream[BetaRawMessageStreamEvent], model_request_parameters: ModelRequestParameters ) -> StreamedResponse: peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior('Streamed response ended without content or tool calls') # pragma: no cover assert isinstance(first_chunk, BetaRawMessageStartEvent) return AnthropicStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.message.model, _response=peekable_response, _provider_name=self._provider.name, _provider_url=self._provider.base_url, ) def _get_tools( self, model_request_parameters: ModelRequestParameters, model_settings: AnthropicModelSettings ) -> list[BetaToolUnionParam]: tools: list[BetaToolUnionParam] = [ self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values() ] # Add cache_control to the last tool if enabled if tools and (cache_tool_defs := model_settings.get('anthropic_cache_tool_definitions')): # If True, use '5m'; otherwise use the specified ttl value ttl: Literal['5m', '1h'] = '5m' if cache_tool_defs is True else cache_tool_defs last_tool = tools[-1] last_tool['cache_control'] = self._build_cache_control(ttl) return tools def _add_builtin_tools( self, tools: list[BetaToolUnionParam], model_request_parameters: ModelRequestParameters ) -> tuple[list[BetaToolUnionParam], list[BetaRequestMCPServerURLDefinitionParam], set[str]]: beta_features: set[str] = set() mcp_servers: 
list[BetaRequestMCPServerURLDefinitionParam] = [] for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): user_location = UserLocation(type='approximate', **tool.user_location) if tool.user_location else None tools.append( BetaWebSearchTool20250305Param( name='web_search', type='web_search_20250305', max_uses=tool.max_uses, allowed_domains=tool.allowed_domains, blocked_domains=tool.blocked_domains, user_location=user_location, ) ) elif isinstance(tool, CodeExecutionTool): # pragma: no branch tools.append(BetaCodeExecutionTool20250522Param(name='code_execution', type='code_execution_20250522')) beta_features.add('code-execution-2025-05-22') elif isinstance(tool, WebFetchTool): # pragma: no branch citations = BetaCitationsConfigParam(enabled=tool.enable_citations) if tool.enable_citations else None tools.append( BetaWebFetchTool20250910Param( name='web_fetch', type='web_fetch_20250910', max_uses=tool.max_uses, allowed_domains=tool.allowed_domains, blocked_domains=tool.blocked_domains, citations=citations, max_content_tokens=tool.max_content_tokens, ) ) beta_features.add('web-fetch-2025-09-10') elif isinstance(tool, MemoryTool): # pragma: no branch if 'memory' not in model_request_parameters.tool_defs: raise UserError("Built-in `MemoryTool` requires a 'memory' tool to be defined.") # Replace the memory tool definition with the built-in memory tool tools = [tool for tool in tools if tool.get('name') != 'memory'] tools.append(BetaMemoryTool20250818Param(name='memory', type='memory_20250818')) beta_features.add('context-management-2025-06-27') elif isinstance(tool, MCPServerTool) and tool.url: mcp_server_url_definition_param = BetaRequestMCPServerURLDefinitionParam( type='url', name=tool.id, url=tool.url, ) if tool.allowed_tools is not None: # pragma: no branch mcp_server_url_definition_param['tool_configuration'] = BetaRequestMCPServerToolConfigurationParam( enabled=bool(tool.allowed_tools), allowed_tools=tool.allowed_tools, ) if tool.authorization_token: # pragma: no cover mcp_server_url_definition_param['authorization_token'] = tool.authorization_token mcp_servers.append(mcp_server_url_definition_param) beta_features.add('mcp-client-2025-04-04') else: raise UserError( # pragma: no cover f'`{tool.__class__.__name__}` is not supported by `AnthropicModel`. If it should be, please file an issue.' 
) return tools, mcp_servers, beta_features def _infer_tool_choice( self, tools: list[BetaToolUnionParam], model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> BetaToolChoiceParam | None: if not tools: return None else: tool_choice: BetaToolChoiceParam if not model_request_parameters.allow_text_output: tool_choice = {'type': 'any'} else: tool_choice = {'type': 'auto'} if 'parallel_tool_calls' in model_settings: tool_choice['disable_parallel_tool_use'] = not model_settings['parallel_tool_calls'] return tool_choice async def _map_message( # noqa: C901 self, messages: list[ModelMessage], model_request_parameters: ModelRequestParameters, model_settings: AnthropicModelSettings, ) -> tuple[str | list[BetaTextBlockParam], list[BetaMessageParam]]: """Just maps a `pydantic_ai.Message` to a `anthropic.types.MessageParam`.""" system_prompt_parts: list[str] = [] anthropic_messages: list[BetaMessageParam] = [] for m in messages: if isinstance(m, ModelRequest): user_content_params: list[BetaContentBlockParam] = [] for request_part in m.parts: if isinstance(request_part, SystemPromptPart): system_prompt_parts.append(request_part.content) elif isinstance(request_part, UserPromptPart): async for content in self._map_user_prompt(request_part): if isinstance(content, CachePoint): self._add_cache_control_to_last_param(user_content_params, ttl=content.ttl) else: user_content_params.append(content) elif isinstance(request_part, ToolReturnPart): tool_result_block_param = BetaToolResultBlockParam( tool_use_id=_guard_tool_call_id(t=request_part), type='tool_result', content=request_part.model_response_str(), is_error=False, ) user_content_params.append(tool_result_block_param) elif isinstance(request_part, RetryPromptPart): # pragma: no branch if request_part.tool_name is None: text = request_part.model_response() # pragma: no cover retry_param = BetaTextBlockParam(type='text', text=text) # pragma: no cover else: retry_param = BetaToolResultBlockParam( tool_use_id=_guard_tool_call_id(t=request_part), type='tool_result', content=request_part.model_response(), is_error=True, ) user_content_params.append(retry_param) if len(user_content_params) > 0: anthropic_messages.append(BetaMessageParam(role='user', content=user_content_params)) elif isinstance(m, ModelResponse): assistant_content_params: list[ BetaTextBlockParam | BetaToolUseBlockParam | BetaServerToolUseBlockParam | BetaWebSearchToolResultBlockParam | BetaCodeExecutionToolResultBlockParam | BetaWebFetchToolResultBlockParam | BetaThinkingBlockParam | BetaRedactedThinkingBlockParam | BetaMCPToolUseBlockParam | BetaMCPToolResultBlock ] = [] for response_part in m.parts: if isinstance(response_part, TextPart): if response_part.content: assistant_content_params.append(BetaTextBlockParam(text=response_part.content, type='text')) elif isinstance(response_part, ToolCallPart): tool_use_block_param = BetaToolUseBlockParam( id=_guard_tool_call_id(t=response_part), type='tool_use', name=response_part.tool_name, input=response_part.args_as_dict(), ) assistant_content_params.append(tool_use_block_param) elif isinstance(response_part, ThinkingPart): if ( response_part.provider_name == self.system and response_part.signature is not None ): # pragma: no branch if response_part.id == 'redacted_thinking': assistant_content_params.append( BetaRedactedThinkingBlockParam( data=response_part.signature, type='redacted_thinking', ) ) else: assistant_content_params.append( BetaThinkingBlockParam( thinking=response_part.content, 
signature=response_part.signature, type='thinking', ) ) elif response_part.content: # pragma: no branch start_tag, end_tag = self.profile.thinking_tags assistant_content_params.append( BetaTextBlockParam( text='\n'.join([start_tag, response_part.content, end_tag]), type='text' ) ) elif isinstance(response_part, BuiltinToolCallPart): if response_part.provider_name == self.system: tool_use_id = _guard_tool_call_id(t=response_part) if response_part.tool_name == WebSearchTool.kind: server_tool_use_block_param = BetaServerToolUseBlockParam( id=tool_use_id, type='server_tool_use', name='web_search', input=response_part.args_as_dict(), ) assistant_content_params.append(server_tool_use_block_param) elif response_part.tool_name == CodeExecutionTool.kind: server_tool_use_block_param = BetaServerToolUseBlockParam( id=tool_use_id, type='server_tool_use', name='code_execution', input=response_part.args_as_dict(), ) assistant_content_params.append(server_tool_use_block_param) elif response_part.tool_name == WebFetchTool.kind: server_tool_use_block_param = BetaServerToolUseBlockParam( id=tool_use_id, type='server_tool_use', name='web_fetch', input=response_part.args_as_dict(), ) assistant_content_params.append(server_tool_use_block_param) elif ( response_part.tool_name.startswith(MCPServerTool.kind) and (server_id := response_part.tool_name.split(':', 1)[1]) and (args := response_part.args_as_dict()) and (tool_name := args.get('tool_name')) and (tool_args := args.get('tool_args')) ): # pragma: no branch mcp_tool_use_block_param = BetaMCPToolUseBlockParam( id=tool_use_id, type='mcp_tool_use', server_name=server_id, name=tool_name, input=tool_args, ) assistant_content_params.append(mcp_tool_use_block_param) elif isinstance(response_part, BuiltinToolReturnPart): if response_part.provider_name == self.system: tool_use_id = _guard_tool_call_id(t=response_part) if response_part.tool_name in ( WebSearchTool.kind, 'web_search_tool_result', # Backward compatibility ) and isinstance(response_part.content, dict | list): assistant_content_params.append( BetaWebSearchToolResultBlockParam( tool_use_id=tool_use_id, type='web_search_tool_result', content=cast( BetaWebSearchToolResultBlockParamContentParam, response_part.content, # pyright: ignore[reportUnknownMemberType] ), ) ) elif response_part.tool_name in ( # pragma: no branch CodeExecutionTool.kind, 'code_execution_tool_result', # Backward compatibility ) and isinstance(response_part.content, dict): assistant_content_params.append( BetaCodeExecutionToolResultBlockParam( tool_use_id=tool_use_id, type='code_execution_tool_result', content=cast( BetaCodeExecutionToolResultBlockParamContentParam, response_part.content, # pyright: ignore[reportUnknownMemberType] ), ) ) elif response_part.tool_name == WebFetchTool.kind and isinstance( response_part.content, dict ): assistant_content_params.append( BetaWebFetchToolResultBlockParam( tool_use_id=tool_use_id, type='web_fetch_tool_result', content=cast( WebFetchToolResultBlockParamContent, response_part.content, # pyright: ignore[reportUnknownMemberType] ), ) ) elif response_part.tool_name.startswith(MCPServerTool.kind) and isinstance( response_part.content, dict ): # pragma: no branch assistant_content_params.append( BetaMCPToolResultBlock( tool_use_id=tool_use_id, type='mcp_tool_result', **response_part.content, # pyright: ignore[reportUnknownMemberType] ) ) elif isinstance(response_part, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. 
pass else: assert_never(response_part) if len(assistant_content_params) > 0: anthropic_messages.append(BetaMessageParam(role='assistant', content=assistant_content_params)) else: assert_never(m) if instructions := self._get_instructions(messages, model_request_parameters): system_prompt_parts.append(instructions) system_prompt = '\n\n'.join(system_prompt_parts) # Add cache_control to the last message content if anthropic_cache_messages is enabled if anthropic_messages and (cache_messages := model_settings.get('anthropic_cache_messages')): ttl: Literal['5m', '1h'] = '5m' if cache_messages is True else cache_messages m = anthropic_messages[-1] content = m['content'] if isinstance(content, str): # Convert string content to list format with cache_control m['content'] = [ # pragma: no cover BetaTextBlockParam( text=content, type='text', cache_control=self._build_cache_control(ttl), ) ] else: # Add cache_control to the last content block content = cast(list[BetaContentBlockParam], content) self._add_cache_control_to_last_param(content, ttl) # If anthropic_cache_instructions is enabled, return system prompt as a list with cache_control if system_prompt and (cache_instructions := model_settings.get('anthropic_cache_instructions')): # If True, use '5m'; otherwise use the specified ttl value ttl: Literal['5m', '1h'] = '5m' if cache_instructions is True else cache_instructions system_prompt_blocks = [ BetaTextBlockParam( type='text', text=system_prompt, cache_control=self._build_cache_control(ttl), ) ] return system_prompt_blocks, anthropic_messages return system_prompt, anthropic_messages @staticmethod def _limit_cache_points( system_prompt: str | list[BetaTextBlockParam], anthropic_messages: list[BetaMessageParam], tools: list[BetaToolUnionParam], ) -> None: """Limit the number of cache points in the request to Anthropic's maximum. Anthropic enforces a maximum of 4 cache points per request. This method ensures compliance by counting existing cache points and removing excess ones from messages. Strategy: 1. Count cache points in system_prompt (can be multiple if list of blocks) 2. Count cache points in tools (can be in any position, not just last) 3. Raise UserError if system + tools already exceed MAX_CACHE_POINTS 4. Calculate remaining budget for message cache points 5. Traverse messages from newest to oldest, keeping the most recent cache points within the remaining budget 6. Remove excess cache points from older messages to stay within limit Cache point priority (always preserved): - System prompt cache points - Tool definition cache points - Message cache points (newest first, oldest removed if needed) Raises: UserError: If system_prompt and tools combined already exceed MAX_CACHE_POINTS (4). This indicates a configuration error that cannot be auto-fixed. """ MAX_CACHE_POINTS = 4 # Count existing cache points in system prompt used_cache_points = ( sum(1 for block in system_prompt if 'cache_control' in cast(dict[str, Any], block)) if isinstance(system_prompt, list) else 0 ) # Count existing cache points in tools (any tool may have cache_control) # Note: cache_control can be in the middle of tools list if builtin tools are added after for tool in tools: if 'cache_control' in tool: used_cache_points += 1 # Calculate remaining cache points budget for messages remaining_budget = MAX_CACHE_POINTS - used_cache_points if remaining_budget < 0: # pragma: no cover raise UserError( f'Too many cache points for Anthropic request. 
' f'System prompt and tool definitions already use {used_cache_points} cache points, ' f'which exceeds the maximum of {MAX_CACHE_POINTS}.' ) # Remove excess cache points from messages (newest to oldest) for message in reversed(anthropic_messages): content = message['content'] if isinstance(content, str): # pragma: no cover continue # Process content blocks in reverse order (newest first) for block in reversed(cast(list[BetaContentBlockParam], content)): block_dict = cast(dict[str, Any], block) if 'cache_control' in block_dict: if remaining_budget > 0: remaining_budget -= 1 else: # Exceeded limit, remove this cache point del block_dict['cache_control'] def _build_cache_control(self, ttl: Literal['5m', '1h'] = '5m') -> BetaCacheControlEphemeralParam: """Build cache control dict, automatically omitting TTL for Bedrock clients. Args: ttl: The cache time-to-live ('5m' or '1h'). Ignored for Bedrock clients. Returns: A cache control dict suitable for the current client type. """ if isinstance(self.client, AsyncAnthropicBedrock): # Bedrock doesn't support TTL, use cast to satisfy type checker return cast(BetaCacheControlEphemeralParam, {'type': 'ephemeral'}) return BetaCacheControlEphemeralParam(type='ephemeral', ttl=ttl) def _add_cache_control_to_last_param( self, params: list[BetaContentBlockParam], ttl: Literal['5m', '1h'] = '5m' ) -> None: """Add cache control to the last content block param. See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching for more information. Args: params: List of content block params to modify. ttl: The cache time-to-live ('5m' or '1h'). This is automatically ignored for Bedrock clients, which don't support explicit TTL parameters. """ if not params: raise UserError( 'CachePoint cannot be the first content in a user message - there must be previous content to attach the CachePoint to. ' 'To cache system instructions or tool definitions, use the `anthropic_cache_instructions` or `anthropic_cache_tool_definitions` settings instead.' ) # Only certain types support cache_control # See https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#what-can-be-cached cacheable_types = {'text', 'tool_use', 'server_tool_use', 'image', 'tool_result', 'document'} # Cast needed because BetaContentBlockParam is a union including response Block types (Pydantic models) # that don't support dict operations, even though at runtime we only have request Param types (TypedDicts). 
last_param = cast(dict[str, Any], params[-1]) if last_param['type'] not in cacheable_types: raise UserError(f'Cache control not supported for param type: {last_param["type"]}') # Add cache_control to the last param last_param['cache_control'] = self._build_cache_control(ttl) @staticmethod def _map_binary_data(data: bytes, media_type: str) -> BetaContentBlockParam: # Anthropic SDK accepts file-like objects (IO[bytes]) and handles base64 encoding internally if media_type.startswith('image/'): return BetaImageBlockParam( source={'data': io.BytesIO(data), 'media_type': media_type, 'type': 'base64'}, # type: ignore type='image', ) elif media_type == 'application/pdf': return BetaRequestDocumentBlockParam( source=BetaBase64PDFSourceParam( data=io.BytesIO(data), media_type='application/pdf', type='base64', ), type='document', ) elif media_type == 'text/plain': return BetaRequestDocumentBlockParam( source=BetaPlainTextSourceParam(data=data.decode('utf-8'), media_type=media_type, type='text'), type='document', ) else: raise RuntimeError(f'Unsupported binary content media type for Anthropic: {media_type}') @staticmethod async def _map_user_prompt( part: UserPromptPart, ) -> AsyncGenerator[BetaContentBlockParam | CachePoint]: if isinstance(part.content, str): if part.content: # Only yield non-empty text yield BetaTextBlockParam(text=part.content, type='text') else: for item in part.content: if isinstance(item, str): if item: # Only yield non-empty text yield BetaTextBlockParam(text=item, type='text') elif isinstance(item, CachePoint): yield item elif isinstance(item, BinaryContent): yield AnthropicModel._map_binary_data(item.data, item.media_type) elif isinstance(item, ImageUrl): if item.force_download: downloaded = await download_item(item, data_format='bytes') yield AnthropicModel._map_binary_data(downloaded['data'], item.media_type) else: yield BetaImageBlockParam(source={'type': 'url', 'url': item.url}, type='image') elif isinstance(item, DocumentUrl): if item.media_type == 'application/pdf': if item.force_download: downloaded = await download_item(item, data_format='bytes') yield AnthropicModel._map_binary_data(downloaded['data'], item.media_type) else: yield BetaRequestDocumentBlockParam( source={'url': item.url, 'type': 'url'}, type='document' ) elif item.media_type == 'text/plain': downloaded_item = await download_item(item, data_format='text') yield BetaRequestDocumentBlockParam( source=BetaPlainTextSourceParam( data=downloaded_item['data'], media_type=item.media_type, type='text' ), type='document', ) else: # pragma: no cover raise RuntimeError(f'Unsupported media type: {item.media_type}') else: raise RuntimeError(f'Unsupported content type: {type(item)}') # pragma: no cover def _map_tool_definition(self, f: ToolDefinition) -> BetaToolParam: """Maps a `ToolDefinition` dataclass to an Anthropic `BetaToolParam` dictionary.""" tool_param: BetaToolParam = { 'name': f.name, 'description': f.description or '', 'input_schema': f.parameters_json_schema, } if f.strict and self.profile.supports_json_schema_output: tool_param['strict'] = f.strict return tool_param @staticmethod def _build_output_config( model_request_parameters: ModelRequestParameters, model_settings: AnthropicModelSettings ) -> BetaOutputConfigParam | None: output_format: BetaJSONOutputFormatParam | None = None if model_request_parameters.output_mode == 'native': assert model_request_parameters.output_object is not None output_format = {'type': 'json_schema', 'schema': model_request_parameters.output_object.json_schema} effort = 
model_settings.get('anthropic_effort') if output_format is None and effort is None: return None config: BetaOutputConfigParam = {} if output_format is not None: config['format'] = output_format if effort is not None: config['effort'] = effort return config
```

#### __init__

```python
__init__(
    model_name: AnthropicModelName,
    *,
    provider: (
        Literal["anthropic", "gateway"]
        | Provider[AsyncAnthropicClient]
    ) = "anthropic",
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None
)
```

Initialize an Anthropic model.

Parameters:

| Name | Type | Description | Default |
| ------------ | -------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------- |
| `model_name` | `AnthropicModelName` | The name of the Anthropic model to use. List of model names available [here](https://docs.anthropic.com/en/docs/about-claude/models). | *required* |
| `provider` | `Literal['anthropic', 'gateway'] \| Provider[AsyncAnthropicClient]` | The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of `Provider[AsyncAnthropicClient]`. Defaults to 'anthropic'. | `'anthropic'` |
| `profile` | `ModelProfileSpec \| None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. The default 'anthropic' provider will use the default `..profiles.anthropic.anthropic_model_profile`. | `None` |
| `settings` | `ModelSettings \| None` | Default model settings for this model instance. | `None` |

Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py`

```python
def __init__(
    self,
    model_name: AnthropicModelName,
    *,
    provider: Literal['anthropic', 'gateway'] | Provider[AsyncAnthropicClient] = 'anthropic',
    profile: ModelProfileSpec | None = None,
    settings: ModelSettings | None = None,
):
    """Initialize an Anthropic model.

    Args:
        model_name: The name of the Anthropic model to use. List of model names available
            [here](https://docs.anthropic.com/en/docs/about-claude/models).
        provider: The provider to use for the Anthropic API. Can be either the string
            'anthropic' or an instance of `Provider[AsyncAnthropicClient]`. Defaults to 'anthropic'.
        profile: The model profile to use. Defaults to a profile picked by the provider based on the model name.
            The default 'anthropic' provider will use the default `..profiles.anthropic.anthropic_model_profile`.
        settings: Default model settings for this model instance.
    """
    self._model_name = model_name
    if isinstance(provider, str):
        provider = infer_provider('gateway/anthropic' if provider == 'gateway' else provider)
    self._provider = provider
    self.client = provider.client
    super().__init__(settings=settings, profile=profile or provider.model_profile)
```

#### model_name

```python
model_name: AnthropicModelName
```

The model name.

#### system

```python
system: str
```

The model provider.

#### supported_builtin_tools

```python
supported_builtin_tools() -> (
    frozenset[type[AbstractBuiltinTool]]
)
```

The set of builtin tool types this model can handle.
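As a usage sketch (not part of the generated reference; it assumes `AnthropicProvider` from `pydantic_ai.providers.anthropic`, the `builtin_tools` argument on `Agent`, and a valid API key), you might construct the model explicitly and consult this set before enabling a builtin tool:

```python
from pydantic_ai import Agent
from pydantic_ai.builtin_tools import WebSearchTool
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.anthropic import AnthropicProvider

# Explicit construction, per `__init__` above; passing the plain string
# 'anthropic:claude-sonnet-4-5' to Agent would use the default provider instead.
model = AnthropicModel('claude-sonnet-4-5', provider=AnthropicProvider(api_key='your-api-key'))

# supported_builtin_tools() reports which builtin tool classes this model class implements.
if WebSearchTool in AnthropicModel.supported_builtin_tools():
    agent = Agent(model, builtin_tools=[WebSearchTool()])
```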
Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """The set of builtin tool types this model can handle.""" return frozenset({WebSearchTool, CodeExecutionTool, WebFetchTool, MemoryTool, MCPServerTool}) ``` ### AnthropicStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Anthropic models. Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python @dataclass class AnthropicStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Anthropic models.""" _model_name: AnthropicModelName _response: AsyncIterable[BetaRawMessageStreamEvent] _provider_name: str _provider_url: str _timestamp: datetime = field(default_factory=_utils.now_utc) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 current_block: BetaContentBlock | None = None builtin_tool_calls: dict[str, BuiltinToolCallPart] = {} async for event in self._response: if isinstance(event, BetaRawMessageStartEvent): self._usage = _map_usage(event, self._provider_name, self._provider_url, self._model_name) self.provider_response_id = event.message.id if event.message.container: self.provider_details = self.provider_details or {} self.provider_details['container_id'] = event.message.container.id elif isinstance(event, BetaRawContentBlockStartEvent): current_block = event.content_block if isinstance(current_block, BetaTextBlock) and current_block.text: for event_ in self._parts_manager.handle_text_delta( vendor_part_id=event.index, content=current_block.text ): yield event_ elif isinstance(current_block, BetaThinkingBlock): for event_ in self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, content=current_block.thinking, signature=current_block.signature, provider_name=self.provider_name, ): yield event_ elif isinstance(current_block, BetaRedactedThinkingBlock): for event_ in self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, id='redacted_thinking', signature=current_block.data, provider_name=self.provider_name, ): yield event_ elif isinstance(current_block, BetaToolUseBlock): maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=event.index, tool_name=current_block.name, args=cast(dict[str, Any], current_block.input) or None, tool_call_id=current_block.id, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(current_block, BetaServerToolUseBlock): call_part = _map_server_tool_use_block(current_block, self.provider_name) builtin_tool_calls[call_part.tool_call_id] = call_part yield self._parts_manager.handle_part( vendor_part_id=event.index, part=call_part, ) elif isinstance(current_block, BetaWebSearchToolResultBlock): yield self._parts_manager.handle_part( vendor_part_id=event.index, part=_map_web_search_tool_result_block(current_block, self.provider_name), ) elif isinstance(current_block, BetaCodeExecutionToolResultBlock): yield self._parts_manager.handle_part( vendor_part_id=event.index, part=_map_code_execution_tool_result_block(current_block, self.provider_name), ) elif isinstance(current_block, BetaWebFetchToolResultBlock): # pragma: lax no cover yield self._parts_manager.handle_part( vendor_part_id=event.index, part=_map_web_fetch_tool_result_block(current_block, self.provider_name), ) elif isinstance(current_block, BetaMCPToolUseBlock): call_part = _map_mcp_server_use_block(current_block, self.provider_name) 
builtin_tool_calls[call_part.tool_call_id] = call_part args_json = call_part.args_as_json_str() # Drop the final `{}}` so that we can add tool args deltas args_json_delta = args_json[:-3] assert args_json_delta.endswith('"tool_args":'), ( f'Expected {args_json_delta!r} to end in `"tool_args":`' ) yield self._parts_manager.handle_part( vendor_part_id=event.index, part=replace(call_part, args=None) ) maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=event.index, args=args_json_delta, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(current_block, BetaMCPToolResultBlock): call_part = builtin_tool_calls.get(current_block.tool_use_id) yield self._parts_manager.handle_part( vendor_part_id=event.index, part=_map_mcp_server_result_block(current_block, call_part, self.provider_name), ) elif isinstance(event, BetaRawContentBlockDeltaEvent): if isinstance(event.delta, BetaTextDelta): for event_ in self._parts_manager.handle_text_delta( vendor_part_id=event.index, content=event.delta.text ): yield event_ elif isinstance(event.delta, BetaThinkingDelta): for event_ in self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, content=event.delta.thinking, provider_name=self.provider_name, ): yield event_ elif isinstance(event.delta, BetaSignatureDelta): for event_ in self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, signature=event.delta.signature, provider_name=self.provider_name, ): yield event_ elif isinstance(event.delta, BetaInputJSONDelta): maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=event.index, args=event.delta.partial_json, ) if maybe_event is not None: # pragma: no branch yield maybe_event # TODO(Marcelo): We need to handle citations. elif isinstance(event.delta, BetaCitationsDelta): pass elif isinstance(event, BetaRawMessageDeltaEvent): self._usage = _map_usage(event, self._provider_name, self._provider_url, self._model_name, self._usage) if raw_finish_reason := event.delta.stop_reason: # pragma: no branch self.provider_details = self.provider_details or {} self.provider_details['finish_reason'] = raw_finish_reason self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) elif isinstance(event, BetaRawContentBlockStopEvent): # pragma: no branch if isinstance(current_block, BetaMCPToolUseBlock): maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=event.index, args='}', ) if maybe_event is not None: # pragma: no branch yield maybe_event current_block = None elif isinstance(event, BetaRawMessageStopEvent): # pragma: no branch current_block = None @property def model_name(self) -> AnthropicModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def provider_url(self) -> str: """Get the provider base URL.""" return self._provider_url @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: AnthropicModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### provider_url ```python provider_url: str ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models` Logic related to making requests to an LLM. 
The aim here is to make a common interface for different LLMs, so that the rest of the code can be agnostic to the specific LLM being used. ### KnownModelName ```python KnownModelName = TypeAliasType( "KnownModelName", Literal[ "anthropic:claude-3-5-haiku-20241022", "anthropic:claude-3-5-haiku-latest", "anthropic:claude-3-7-sonnet-20250219", "anthropic:claude-3-7-sonnet-latest", "anthropic:claude-3-haiku-20240307", "anthropic:claude-3-opus-20240229", "anthropic:claude-3-opus-latest", "anthropic:claude-4-opus-20250514", "anthropic:claude-4-sonnet-20250514", "anthropic:claude-haiku-4-5-20251001", "anthropic:claude-haiku-4-5", "anthropic:claude-opus-4-0", "anthropic:claude-opus-4-1-20250805", "anthropic:claude-opus-4-20250514", "anthropic:claude-opus-4-5-20251101", "anthropic:claude-opus-4-5", "anthropic:claude-opus-4-6", "anthropic:claude-sonnet-4-0", "anthropic:claude-sonnet-4-20250514", "anthropic:claude-sonnet-4-5-20250929", "anthropic:claude-sonnet-4-5", "bedrock:amazon.titan-text-express-v1", "bedrock:amazon.titan-text-lite-v1", "bedrock:amazon.titan-tg1-large", "bedrock:anthropic.claude-3-5-haiku-20241022-v1:0", "bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0", "bedrock:anthropic.claude-3-5-sonnet-20241022-v2:0", "bedrock:anthropic.claude-3-7-sonnet-20250219-v1:0", "bedrock:anthropic.claude-3-haiku-20240307-v1:0", "bedrock:anthropic.claude-3-opus-20240229-v1:0", "bedrock:anthropic.claude-3-sonnet-20240229-v1:0", "bedrock:anthropic.claude-haiku-4-5-20251001-v1:0", "bedrock:anthropic.claude-instant-v1", "bedrock:anthropic.claude-opus-4-20250514-v1:0", "bedrock:anthropic.claude-sonnet-4-20250514-v1:0", "bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0", "bedrock:anthropic.claude-v2:1", "bedrock:anthropic.claude-v2", "bedrock:cohere.command-light-text-v14", "bedrock:cohere.command-r-plus-v1:0", "bedrock:cohere.command-r-v1:0", "bedrock:cohere.command-text-v14", "bedrock:eu.anthropic.claude-haiku-4-5-20251001-v1:0", "bedrock:eu.anthropic.claude-sonnet-4-20250514-v1:0", "bedrock:eu.anthropic.claude-sonnet-4-5-20250929-v1:0", "bedrock:global.anthropic.claude-opus-4-5-20251101-v1:0", "bedrock:meta.llama3-1-405b-instruct-v1:0", "bedrock:meta.llama3-1-70b-instruct-v1:0", "bedrock:meta.llama3-1-8b-instruct-v1:0", "bedrock:meta.llama3-70b-instruct-v1:0", "bedrock:meta.llama3-8b-instruct-v1:0", "bedrock:mistral.mistral-7b-instruct-v0:2", "bedrock:mistral.mistral-large-2402-v1:0", "bedrock:mistral.mistral-large-2407-v1:0", "bedrock:mistral.mixtral-8x7b-instruct-v0:1", "bedrock:us.amazon.nova-2-lite-v1:0", "bedrock:us.amazon.nova-lite-v1:0", "bedrock:us.amazon.nova-micro-v1:0", "bedrock:us.amazon.nova-pro-v1:0", "bedrock:us.anthropic.claude-3-5-haiku-20241022-v1:0", "bedrock:us.anthropic.claude-3-5-sonnet-20240620-v1:0", "bedrock:us.anthropic.claude-3-5-sonnet-20241022-v2:0", "bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0", "bedrock:us.anthropic.claude-3-haiku-20240307-v1:0", "bedrock:us.anthropic.claude-3-opus-20240229-v1:0", "bedrock:us.anthropic.claude-3-sonnet-20240229-v1:0", "bedrock:us.anthropic.claude-haiku-4-5-20251001-v1:0", "bedrock:us.anthropic.claude-opus-4-20250514-v1:0", "bedrock:us.anthropic.claude-sonnet-4-20250514-v1:0", "bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0", "bedrock:us.meta.llama3-1-70b-instruct-v1:0", "bedrock:us.meta.llama3-1-8b-instruct-v1:0", "bedrock:us.meta.llama3-2-11b-instruct-v1:0", "bedrock:us.meta.llama3-2-1b-instruct-v1:0", "bedrock:us.meta.llama3-2-3b-instruct-v1:0", "bedrock:us.meta.llama3-2-90b-instruct-v1:0", 
"bedrock:us.meta.llama3-3-70b-instruct-v1:0", "cerebras:gpt-oss-120b", "cerebras:llama-3.3-70b", "cerebras:llama3.1-8b", "cerebras:qwen-3-235b-a22b-instruct-2507", "cerebras:qwen-3-32b", "cerebras:qwen-3-coder-480b", "cerebras:zai-glm-4.6", "cerebras:zai-glm-4.7", "cohere:c4ai-aya-expanse-32b", "cohere:c4ai-aya-expanse-8b", "cohere:command-nightly", "cohere:command-r-08-2024", "cohere:command-r-plus-08-2024", "cohere:command-r7b-12-2024", "deepseek:deepseek-chat", "deepseek:deepseek-reasoner", "gateway/anthropic:claude-3-5-haiku-20241022", "gateway/anthropic:claude-3-5-haiku-latest", "gateway/anthropic:claude-3-7-sonnet-20250219", "gateway/anthropic:claude-3-7-sonnet-latest", "gateway/anthropic:claude-3-haiku-20240307", "gateway/anthropic:claude-3-opus-20240229", "gateway/anthropic:claude-3-opus-latest", "gateway/anthropic:claude-4-opus-20250514", "gateway/anthropic:claude-4-sonnet-20250514", "gateway/anthropic:claude-haiku-4-5-20251001", "gateway/anthropic:claude-haiku-4-5", "gateway/anthropic:claude-opus-4-0", "gateway/anthropic:claude-opus-4-1-20250805", "gateway/anthropic:claude-opus-4-20250514", "gateway/anthropic:claude-opus-4-5-20251101", "gateway/anthropic:claude-opus-4-5", "gateway/anthropic:claude-opus-4-6", "gateway/anthropic:claude-sonnet-4-0", "gateway/anthropic:claude-sonnet-4-20250514", "gateway/anthropic:claude-sonnet-4-5-20250929", "gateway/anthropic:claude-sonnet-4-5", "gateway/bedrock:amazon.titan-text-express-v1", "gateway/bedrock:amazon.titan-text-lite-v1", "gateway/bedrock:amazon.titan-tg1-large", "gateway/bedrock:anthropic.claude-3-5-haiku-20241022-v1:0", "gateway/bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0", "gateway/bedrock:anthropic.claude-3-5-sonnet-20241022-v2:0", "gateway/bedrock:anthropic.claude-3-7-sonnet-20250219-v1:0", "gateway/bedrock:anthropic.claude-3-haiku-20240307-v1:0", "gateway/bedrock:anthropic.claude-3-opus-20240229-v1:0", "gateway/bedrock:anthropic.claude-3-sonnet-20240229-v1:0", "gateway/bedrock:anthropic.claude-haiku-4-5-20251001-v1:0", "gateway/bedrock:anthropic.claude-instant-v1", "gateway/bedrock:anthropic.claude-opus-4-20250514-v1:0", "gateway/bedrock:anthropic.claude-sonnet-4-20250514-v1:0", "gateway/bedrock:anthropic.claude-sonnet-4-5-20250929-v1:0", "gateway/bedrock:anthropic.claude-v2:1", "gateway/bedrock:anthropic.claude-v2", "gateway/bedrock:cohere.command-light-text-v14", "gateway/bedrock:cohere.command-r-plus-v1:0", "gateway/bedrock:cohere.command-r-v1:0", "gateway/bedrock:cohere.command-text-v14", "gateway/bedrock:eu.anthropic.claude-haiku-4-5-20251001-v1:0", "gateway/bedrock:eu.anthropic.claude-sonnet-4-20250514-v1:0", "gateway/bedrock:eu.anthropic.claude-sonnet-4-5-20250929-v1:0", "gateway/bedrock:global.anthropic.claude-opus-4-5-20251101-v1:0", "gateway/bedrock:meta.llama3-1-405b-instruct-v1:0", "gateway/bedrock:meta.llama3-1-70b-instruct-v1:0", "gateway/bedrock:meta.llama3-1-8b-instruct-v1:0", "gateway/bedrock:meta.llama3-70b-instruct-v1:0", "gateway/bedrock:meta.llama3-8b-instruct-v1:0", "gateway/bedrock:mistral.mistral-7b-instruct-v0:2", "gateway/bedrock:mistral.mistral-large-2402-v1:0", "gateway/bedrock:mistral.mistral-large-2407-v1:0", "gateway/bedrock:mistral.mixtral-8x7b-instruct-v0:1", "gateway/bedrock:us.amazon.nova-2-lite-v1:0", "gateway/bedrock:us.amazon.nova-lite-v1:0", "gateway/bedrock:us.amazon.nova-micro-v1:0", "gateway/bedrock:us.amazon.nova-pro-v1:0", "gateway/bedrock:us.anthropic.claude-3-5-haiku-20241022-v1:0", "gateway/bedrock:us.anthropic.claude-3-5-sonnet-20240620-v1:0", 
"gateway/bedrock:us.anthropic.claude-3-5-sonnet-20241022-v2:0", "gateway/bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0", "gateway/bedrock:us.anthropic.claude-3-haiku-20240307-v1:0", "gateway/bedrock:us.anthropic.claude-3-opus-20240229-v1:0", "gateway/bedrock:us.anthropic.claude-3-sonnet-20240229-v1:0", "gateway/bedrock:us.anthropic.claude-haiku-4-5-20251001-v1:0", "gateway/bedrock:us.anthropic.claude-opus-4-20250514-v1:0", "gateway/bedrock:us.anthropic.claude-sonnet-4-20250514-v1:0", "gateway/bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0", "gateway/bedrock:us.meta.llama3-1-70b-instruct-v1:0", "gateway/bedrock:us.meta.llama3-1-8b-instruct-v1:0", "gateway/bedrock:us.meta.llama3-2-11b-instruct-v1:0", "gateway/bedrock:us.meta.llama3-2-1b-instruct-v1:0", "gateway/bedrock:us.meta.llama3-2-3b-instruct-v1:0", "gateway/bedrock:us.meta.llama3-2-90b-instruct-v1:0", "gateway/bedrock:us.meta.llama3-3-70b-instruct-v1:0", "gateway/google-vertex:gemini-2.0-flash-lite", "gateway/google-vertex:gemini-2.0-flash", "gateway/google-vertex:gemini-2.5-flash-image", "gateway/google-vertex:gemini-2.5-flash-lite-preview-09-2025", "gateway/google-vertex:gemini-2.5-flash-lite", "gateway/google-vertex:gemini-2.5-flash-preview-09-2025", "gateway/google-vertex:gemini-2.5-flash", "gateway/google-vertex:gemini-2.5-pro", "gateway/google-vertex:gemini-3-flash-preview", "gateway/google-vertex:gemini-3-pro-image-preview", "gateway/google-vertex:gemini-3-pro-preview", "gateway/google-vertex:gemini-flash-latest", "gateway/google-vertex:gemini-flash-lite-latest", "gateway/groq:llama-3.1-8b-instant", "gateway/groq:llama-3.3-70b-versatile", "gateway/groq:meta-llama/llama-guard-4-12b", "gateway/groq:openai/gpt-oss-120b", "gateway/groq:openai/gpt-oss-20b", "gateway/groq:whisper-large-v3", "gateway/groq:whisper-large-v3-turbo", "gateway/groq:meta-llama/llama-4-maverick-17b-128e-instruct", "gateway/groq:meta-llama/llama-4-scout-17b-16e-instruct", "gateway/groq:meta-llama/llama-prompt-guard-2-22m", "gateway/groq:meta-llama/llama-prompt-guard-2-86m", "gateway/groq:moonshotai/kimi-k2-instruct-0905", "gateway/groq:openai/gpt-oss-safeguard-20b", "gateway/groq:playai-tts", "gateway/groq:playai-tts-arabic", "gateway/groq:qwen/qwen-3-32b", "gateway/openai:chatgpt-4o-latest", "gateway/openai:codex-mini-latest", "gateway/openai:computer-use-preview-2025-03-11", "gateway/openai:computer-use-preview", "gateway/openai:gpt-3.5-turbo-0125", "gateway/openai:gpt-3.5-turbo-0301", "gateway/openai:gpt-3.5-turbo-0613", "gateway/openai:gpt-3.5-turbo-1106", "gateway/openai:gpt-3.5-turbo-16k-0613", "gateway/openai:gpt-3.5-turbo-16k", "gateway/openai:gpt-3.5-turbo", "gateway/openai:gpt-4-0125-preview", "gateway/openai:gpt-4-0314", "gateway/openai:gpt-4-0613", "gateway/openai:gpt-4-1106-preview", "gateway/openai:gpt-4-32k-0314", "gateway/openai:gpt-4-32k-0613", "gateway/openai:gpt-4-32k", "gateway/openai:gpt-4-turbo-2024-04-09", "gateway/openai:gpt-4-turbo-preview", "gateway/openai:gpt-4-turbo", "gateway/openai:gpt-4-vision-preview", "gateway/openai:gpt-4.1-2025-04-14", "gateway/openai:gpt-4.1-mini-2025-04-14", "gateway/openai:gpt-4.1-mini", "gateway/openai:gpt-4.1-nano-2025-04-14", "gateway/openai:gpt-4.1-nano", "gateway/openai:gpt-4.1", "gateway/openai:gpt-4", "gateway/openai:gpt-4o-2024-05-13", "gateway/openai:gpt-4o-2024-08-06", "gateway/openai:gpt-4o-2024-11-20", "gateway/openai:gpt-4o-audio-preview-2024-10-01", "gateway/openai:gpt-4o-audio-preview-2024-12-17", "gateway/openai:gpt-4o-audio-preview-2025-06-03", 
"gateway/openai:gpt-4o-audio-preview", "gateway/openai:gpt-4o-mini-2024-07-18", "gateway/openai:gpt-4o-mini-audio-preview-2024-12-17", "gateway/openai:gpt-4o-mini-audio-preview", "gateway/openai:gpt-4o-mini-search-preview-2025-03-11", "gateway/openai:gpt-4o-mini-search-preview", "gateway/openai:gpt-4o-mini", "gateway/openai:gpt-4o-search-preview-2025-03-11", "gateway/openai:gpt-4o-search-preview", "gateway/openai:gpt-4o", "gateway/openai:gpt-5-2025-08-07", "gateway/openai:gpt-5-chat-latest", "gateway/openai:gpt-5-codex", "gateway/openai:gpt-5-mini-2025-08-07", "gateway/openai:gpt-5-mini", "gateway/openai:gpt-5-nano-2025-08-07", "gateway/openai:gpt-5-nano", "gateway/openai:gpt-5-pro-2025-10-06", "gateway/openai:gpt-5-pro", "gateway/openai:gpt-5.1-2025-11-13", "gateway/openai:gpt-5.1-chat-latest", "gateway/openai:gpt-5.1-codex-max", "gateway/openai:gpt-5.1-codex", "gateway/openai:gpt-5.1-mini", "gateway/openai:gpt-5.1", "gateway/openai:gpt-5.2-2025-12-11", "gateway/openai:gpt-5.2-chat-latest", "gateway/openai:gpt-5.2-pro-2025-12-11", "gateway/openai:gpt-5.2-pro", "gateway/openai:gpt-5.2", "gateway/openai:gpt-5", "gateway/openai:o1-2024-12-17", "gateway/openai:o1-mini-2024-09-12", "gateway/openai:o1-mini", "gateway/openai:o1-preview-2024-09-12", "gateway/openai:o1-preview", "gateway/openai:o1-pro-2025-03-19", "gateway/openai:o1-pro", "gateway/openai:o1", "gateway/openai:o3-2025-04-16", "gateway/openai:o3-deep-research-2025-06-26", "gateway/openai:o3-deep-research", "gateway/openai:o3-mini-2025-01-31", "gateway/openai:o3-mini", "gateway/openai:o3-pro-2025-06-10", "gateway/openai:o3-pro", "gateway/openai:o3", "gateway/openai:o4-mini-2025-04-16", "gateway/openai:o4-mini-deep-research-2025-06-26", "gateway/openai:o4-mini-deep-research", "gateway/openai:o4-mini", "google-gla:gemini-2.0-flash-lite", "google-gla:gemini-2.0-flash", "google-gla:gemini-2.5-flash-image", "google-gla:gemini-2.5-flash-lite-preview-09-2025", "google-gla:gemini-2.5-flash-lite", "google-gla:gemini-2.5-flash-preview-09-2025", "google-gla:gemini-2.5-flash", "google-gla:gemini-2.5-pro", "google-gla:gemini-3-flash-preview", "google-gla:gemini-3-pro-image-preview", "google-gla:gemini-3-pro-preview", "google-gla:gemini-flash-latest", "google-gla:gemini-flash-lite-latest", "google-vertex:gemini-2.0-flash-lite", "google-vertex:gemini-2.0-flash", "google-vertex:gemini-2.5-flash-image", "google-vertex:gemini-2.5-flash-lite-preview-09-2025", "google-vertex:gemini-2.5-flash-lite", "google-vertex:gemini-2.5-flash-preview-09-2025", "google-vertex:gemini-2.5-flash", "google-vertex:gemini-2.5-pro", "google-vertex:gemini-3-flash-preview", "google-vertex:gemini-3-pro-image-preview", "google-vertex:gemini-3-pro-preview", "google-vertex:gemini-flash-latest", "google-vertex:gemini-flash-lite-latest", "grok:grok-2-image-1212", "grok:grok-2-vision-1212", "grok:grok-3-fast", "grok:grok-3-mini-fast", "grok:grok-3-mini", "grok:grok-3", "grok:grok-4-0709", "grok:grok-4-latest", "grok:grok-4-1-fast-non-reasoning", "grok:grok-4-1-fast-reasoning", "grok:grok-4-1-fast", "grok:grok-4-fast-non-reasoning", "grok:grok-4-fast-reasoning", "grok:grok-4-fast", "grok:grok-4", "grok:grok-code-fast-1", "xai:grok-3", "xai:grok-3-fast", "xai:grok-3-fast-latest", "xai:grok-3-latest", "xai:grok-3-mini", "xai:grok-3-mini-fast", "xai:grok-3-mini-fast-latest", "xai:grok-4", "xai:grok-4-0709", "xai:grok-4-1-fast", "xai:grok-4-1-fast-non-reasoning", "xai:grok-4-1-fast-non-reasoning-latest", "xai:grok-4-1-fast-reasoning", "xai:grok-4-1-fast-reasoning-latest", 
"xai:grok-4-fast", "xai:grok-4-fast-non-reasoning", "xai:grok-4-fast-non-reasoning-latest", "xai:grok-4-fast-reasoning", "xai:grok-4-fast-reasoning-latest", "xai:grok-4-latest", "xai:grok-code-fast-1", "groq:llama-3.1-8b-instant", "groq:llama-3.3-70b-versatile", "groq:meta-llama/llama-guard-4-12b", "groq:openai/gpt-oss-120b", "groq:openai/gpt-oss-20b", "groq:whisper-large-v3", "groq:whisper-large-v3-turbo", "groq:meta-llama/llama-4-maverick-17b-128e-instruct", "groq:meta-llama/llama-4-scout-17b-16e-instruct", "groq:meta-llama/llama-prompt-guard-2-22m", "groq:meta-llama/llama-prompt-guard-2-86m", "groq:moonshotai/kimi-k2-instruct-0905", "groq:openai/gpt-oss-safeguard-20b", "groq:playai-tts", "groq:playai-tts-arabic", "groq:qwen/qwen-3-32b", "heroku:claude-3-5-haiku", "heroku:claude-3-5-sonnet-latest", "heroku:claude-3-7-sonnet", "heroku:claude-3-haiku", "heroku:claude-4-5-haiku", "heroku:claude-4-5-sonnet", "heroku:claude-4-sonnet", "heroku:claude-opus-4-5", "heroku:gpt-oss-120b", "heroku:kimi-k2-thinking", "heroku:minimax-m2", "heroku:qwen3-235b", "heroku:qwen3-coder-480b", "heroku:nova-2-lite", "heroku:nova-lite", "heroku:nova-pro", "huggingface:deepseek-ai/DeepSeek-R1", "huggingface:meta-llama/Llama-3.3-70B-Instruct", "huggingface:meta-llama/Llama-4-Maverick-17B-128E-Instruct", "huggingface:meta-llama/Llama-4-Scout-17B-16E-Instruct", "huggingface:Qwen/Qwen2.5-72B-Instruct", "huggingface:Qwen/Qwen3-235B-A22B", "huggingface:Qwen/Qwen3-32B", "huggingface:Qwen/QwQ-32B", "mistral:codestral-latest", "mistral:mistral-large-latest", "mistral:mistral-moderation-latest", "mistral:mistral-small-latest", "moonshotai:kimi-k2-0711-preview", "moonshotai:kimi-latest", "moonshotai:kimi-thinking-preview", "moonshotai:moonshot-v1-128k-vision-preview", "moonshotai:moonshot-v1-128k", "moonshotai:moonshot-v1-32k-vision-preview", "moonshotai:moonshot-v1-32k", "moonshotai:moonshot-v1-8k-vision-preview", "moonshotai:moonshot-v1-8k", "openai:chatgpt-4o-latest", "openai:codex-mini-latest", "openai:computer-use-preview-2025-03-11", "openai:computer-use-preview", "openai:gpt-3.5-turbo-0125", "openai:gpt-3.5-turbo-0301", "openai:gpt-3.5-turbo-0613", "openai:gpt-3.5-turbo-1106", "openai:gpt-3.5-turbo-16k-0613", "openai:gpt-3.5-turbo-16k", "openai:gpt-3.5-turbo", "openai:gpt-4-0125-preview", "openai:gpt-4-0314", "openai:gpt-4-0613", "openai:gpt-4-1106-preview", "openai:gpt-4-32k-0314", "openai:gpt-4-32k-0613", "openai:gpt-4-32k", "openai:gpt-4-turbo-2024-04-09", "openai:gpt-4-turbo-preview", "openai:gpt-4-turbo", "openai:gpt-4-vision-preview", "openai:gpt-4.1-2025-04-14", "openai:gpt-4.1-mini-2025-04-14", "openai:gpt-4.1-mini", "openai:gpt-4.1-nano-2025-04-14", "openai:gpt-4.1-nano", "openai:gpt-4.1", "openai:gpt-4", "openai:gpt-4o-2024-05-13", "openai:gpt-4o-2024-08-06", "openai:gpt-4o-2024-11-20", "openai:gpt-4o-audio-preview-2024-10-01", "openai:gpt-4o-audio-preview-2024-12-17", "openai:gpt-4o-audio-preview-2025-06-03", "openai:gpt-4o-audio-preview", "openai:gpt-4o-mini-2024-07-18", "openai:gpt-4o-mini-audio-preview-2024-12-17", "openai:gpt-4o-mini-audio-preview", "openai:gpt-4o-mini-search-preview-2025-03-11", "openai:gpt-4o-mini-search-preview", "openai:gpt-4o-mini", "openai:gpt-4o-search-preview-2025-03-11", "openai:gpt-4o-search-preview", "openai:gpt-4o", "openai:gpt-5-2025-08-07", "openai:gpt-5-chat-latest", "openai:gpt-5-codex", "openai:gpt-5-mini-2025-08-07", "openai:gpt-5-mini", "openai:gpt-5-nano-2025-08-07", "openai:gpt-5-nano", "openai:gpt-5-pro-2025-10-06", "openai:gpt-5-pro", 
"openai:gpt-5.1-2025-11-13", "openai:gpt-5.1-chat-latest", "openai:gpt-5.1-codex-max", "openai:gpt-5.1-codex", "openai:gpt-5.1-mini", "openai:gpt-5.1", "openai:gpt-5.2-2025-12-11", "openai:gpt-5.2-chat-latest", "openai:gpt-5.2-pro-2025-12-11", "openai:gpt-5.2-pro", "openai:gpt-5.2", "openai:gpt-5", "openai:o1-2024-12-17", "openai:o1-mini-2024-09-12", "openai:o1-mini", "openai:o1-preview-2024-09-12", "openai:o1-preview", "openai:o1-pro-2025-03-19", "openai:o1-pro", "openai:o1", "openai:o3-2025-04-16", "openai:o3-deep-research-2025-06-26", "openai:o3-deep-research", "openai:o3-mini-2025-01-31", "openai:o3-mini", "openai:o3-pro-2025-06-10", "openai:o3-pro", "openai:o3", "openai:o4-mini-2025-04-16", "openai:o4-mini-deep-research-2025-06-26", "openai:o4-mini-deep-research", "openai:o4-mini", "test", ], ) ``` Known model names that can be used with the `model` parameter of Agent. `KnownModelName` is provided as a concise way to specify a model. ### ModelRequestParameters Configuration for an agent's request to a model, specifically related to tools and output handling. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @dataclass(repr=False, kw_only=True) class ModelRequestParameters: """Configuration for an agent's request to a model, specifically related to tools and output handling.""" function_tools: list[ToolDefinition] = field(default_factory=list[ToolDefinition]) builtin_tools: list[AbstractBuiltinTool] = field(default_factory=list[AbstractBuiltinTool]) output_mode: OutputMode = 'text' output_object: OutputObjectDefinition | None = None output_tools: list[ToolDefinition] = field(default_factory=list[ToolDefinition]) prompted_output_template: str | None = None allow_text_output: bool = True allow_image_output: bool = False @cached_property def tool_defs(self) -> dict[str, ToolDefinition]: return {tool_def.name: tool_def for tool_def in [*self.function_tools, *self.output_tools]} @cached_property def prompted_output_instructions(self) -> str | None: if self.prompted_output_template and self.output_object: return StructuredTextOutputSchema.build_instructions(self.prompted_output_template, self.output_object) return None __repr__ = _utils.dataclasses_no_defaults_repr ``` ### Model Bases: `ABC` Abstract class for a model. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python class Model(ABC): """Abstract class for a model.""" _profile: ModelProfileSpec | None = None _settings: ModelSettings | None = None def __init__( self, *, settings: ModelSettings | None = None, profile: ModelProfileSpec | None = None, ) -> None: """Initialize the model with optional settings and profile. Args: settings: Model-specific settings that will be used as defaults for this model. profile: The model profile to use. """ self._settings = settings self._profile = profile @property def settings(self) -> ModelSettings | None: """Get the model settings.""" return self._settings @abstractmethod async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a request to the model. This is ultimately called by `pydantic_ai._agent_graph.ModelRequestNode._make_request(...)`. 
""" raise NotImplementedError() async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage: """Make a request to the model for counting tokens.""" # This method is not required, but you need to implement it if you want to support `UsageLimits.count_tokens_before_request`. raise NotImplementedError(f'Token counting ahead of the request is not supported by {self.__class__.__name__}') @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a request to the model and return a streaming response.""" # This method is not required, but you need to implement it if you want to support streamed responses raise NotImplementedError(f'Streamed requests not supported by this {self.__class__.__name__}') # yield is required to make this a generator for type checking # noinspection PyUnreachableCode yield # pragma: no cover def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters: """Customize the request parameters for the model. This method can be overridden by subclasses to modify the request parameters before sending them to the model. In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons. """ if transformer := self.profile.json_schema_transformer: model_request_parameters = replace( model_request_parameters, function_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.function_tools], output_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.output_tools], ) if output_object := model_request_parameters.output_object: model_request_parameters = replace( model_request_parameters, output_object=_customize_output_object(transformer, output_object), ) return model_request_parameters def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters]: """Prepare request inputs before they are passed to the provider. This merges the given `model_settings` with the model's own `settings` attribute and ensures `customize_request_parameters` is applied to the resolved [`ModelRequestParameters`][pydantic_ai.models.ModelRequestParameters]. Subclasses can override this method if they need to customize the preparation flow further, but most implementations should simply call `self.prepare_request(...)` at the start of their `request` (and related) methods. 
""" model_settings = merge_model_settings(self.settings, model_settings) params = self.customize_request_parameters(model_request_parameters) if builtin_tools := params.builtin_tools: # Deduplicate builtin tools params = replace( params, builtin_tools=list({tool.unique_id: tool for tool in builtin_tools}.values()), ) if params.output_mode == 'auto': output_mode = self.profile.default_structured_output_mode params = replace( params, output_mode=output_mode, allow_text_output=output_mode in ('native', 'prompted'), ) # Reset irrelevant fields if params.output_tools and params.output_mode != 'tool': params = replace(params, output_tools=[]) if params.output_object and params.output_mode not in ('native', 'prompted'): params = replace(params, output_object=None) if params.prompted_output_template and params.output_mode not in ('prompted', 'native'): params = replace(params, prompted_output_template=None) # pragma: no cover # Set default prompted output template if ( params.output_mode == 'prompted' or (params.output_mode == 'native' and self.profile.native_output_requires_schema_in_instructions) ) and not params.prompted_output_template: params = replace(params, prompted_output_template=self.profile.prompted_output_template) # Check if output mode is supported if params.output_mode == 'native' and not self.profile.supports_json_schema_output: raise UserError('Native structured output is not supported by this model.') if params.output_mode == 'tool' and not self.profile.supports_tools: raise UserError('Tool output is not supported by this model.') if params.allow_image_output and not self.profile.supports_image_output: raise UserError('Image output is not supported by this model.') # Check if builtin tools are supported if params.builtin_tools: supported_types = self.profile.supported_builtin_tools unsupported = [tool for tool in params.builtin_tools if not isinstance(tool, tuple(supported_types))] if unsupported: unsupported_names = [type(tool).__name__ for tool in unsupported] supported_names = [t.__name__ for t in supported_types] raise UserError( f'Builtin tool(s) {unsupported_names} not supported by this model. Supported: {supported_names}' ) return model_settings, params @property @abstractmethod def model_name(self) -> str: """The model name.""" raise NotImplementedError() @property def label(self) -> str: """Human-friendly display label for the model. Handles common patterns: - gpt-5 -> GPT 5 - claude-sonnet-4-5 -> Claude Sonnet 4.5 - gemini-2.5-pro -> Gemini 2.5 Pro - meta-llama/llama-3-70b -> Llama 3 70b (OpenRouter style) """ label = self.model_name # Handle OpenRouter-style names with / (e.g., meta-llama/llama-3-70b) if '/' in label: label = label.split('/')[-1] parts = label.split('-') result: list[str] = [] for i, part in enumerate(parts): if i == 0 and part.lower() == 'gpt': result.append(part.upper()) elif part.replace('.', '').isdigit(): if result and result[-1].replace('.', '').isdigit(): result[-1] = f'{result[-1]}.{part}' else: result.append(part) else: result.append(part.capitalize()) return ' '.join(result) @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """Return the set of builtin tool types this model class can handle. Subclasses should override this to reflect their actual capabilities. Default is empty set - subclasses must explicitly declare support. """ return frozenset() @cached_property def profile(self) -> ModelProfile: """The model profile. 
We use this to compute the intersection of the profile's supported_builtin_tools and the model's implemented tools, ensuring model.profile.supported_builtin_tools is the single source of truth for what builtin tools are actually usable. """ _profile = self._profile if callable(_profile): _profile = _profile(self.model_name) if _profile is None: _profile = DEFAULT_PROFILE # Compute intersection: profile's allowed tools & model's implemented tools model_supported = self.__class__.supported_builtin_tools() profile_supported = _profile.supported_builtin_tools effective_tools = profile_supported & model_supported if effective_tools != profile_supported: _profile = replace(_profile, supported_builtin_tools=effective_tools) return _profile @property @abstractmethod def system(self) -> str: """The model provider, ex: openai. Use to populate the `gen_ai.system` OpenTelemetry semantic convention attribute, so should use well-known values listed in https://opentelemetry.io/docs/specs/semconv/attributes-registry/gen-ai/#gen-ai-system when applicable. """ raise NotImplementedError() @property def base_url(self) -> str | None: """The base URL for the provider API, if available.""" return None @staticmethod def _get_instructions( messages: Sequence[ModelMessage], model_request_parameters: ModelRequestParameters | None = None ) -> str | None: """Get instructions from the first ModelRequest found when iterating messages in reverse. In the case that a "mock" request was generated to include a tool-return part for a result tool, we want to use the instructions from the second-to-most-recent request (which should correspond to the original request that generated the response that resulted in the tool-return part). """ instructions = None last_two_requests: list[ModelRequest] = [] for message in reversed(messages): if isinstance(message, ModelRequest): last_two_requests.append(message) if len(last_two_requests) == 2: break if message.instructions is not None: instructions = message.instructions break # If we don't have two requests, and we didn't already return instructions, there are definitely not any: if instructions is None and len(last_two_requests) == 2: most_recent_request = last_two_requests[0] second_most_recent_request = last_two_requests[1] # If we've gotten this far and the most recent request consists of only tool-return parts or retry-prompt parts, # we use the instructions from the second-to-most-recent request. This is necessary because when handling # result tools, we generate a "mock" ModelRequest with a tool-return part for it, and that ModelRequest will not # have the relevant instructions from the agent. # While it's possible that you could have a message history where the most recent request has only tool returns, # I believe there is no way to achieve that would _change_ the instructions without manually crafting the most # recent message. That might make sense in principle for some usage pattern, but it's enough of an edge case # that I think it's not worth worrying about, since you can work around this by inserting another ModelRequest # with no parts at all immediately before the request that has the tool calls (that works because we only look # at the two most recent ModelRequests here). # If you have a use case where this causes pain, please open a GitHub issue and we can discuss alternatives. 
if all(p.part_kind == 'tool-return' or p.part_kind == 'retry-prompt' for p in most_recent_request.parts): instructions = second_most_recent_request.instructions if model_request_parameters and (output_instructions := model_request_parameters.prompted_output_instructions): if instructions: instructions = '\n\n'.join([instructions, output_instructions]) else: instructions = output_instructions return instructions ``` #### __init__ ```python __init__( *, settings: ModelSettings | None = None, profile: ModelProfileSpec | None = None ) -> None ``` Initialize the model with optional settings and profile. Parameters: | Name | Type | Description | Default | | ---------- | ------------------ | ----------- | --------------------------------------------------------------------- | | `settings` | \`ModelSettings | None\` | Model-specific settings that will be used as defaults for this model. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. | Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def __init__( self, *, settings: ModelSettings | None = None, profile: ModelProfileSpec | None = None, ) -> None: """Initialize the model with optional settings and profile. Args: settings: Model-specific settings that will be used as defaults for this model. profile: The model profile to use. """ self._settings = settings self._profile = profile ``` #### settings ```python settings: ModelSettings | None ``` Get the model settings. #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a request to the model. This is ultimately called by `pydantic_ai._agent_graph.ModelRequestNode._make_request(...)`. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @abstractmethod async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a request to the model. This is ultimately called by `pydantic_ai._agent_graph.ModelRequestNode._make_request(...)`. """ raise NotImplementedError() ``` #### count_tokens ```python count_tokens( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage ``` Make a request to the model for counting tokens. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage: """Make a request to the model for counting tokens.""" # This method is not required, but you need to implement it if you want to support `UsageLimits.count_tokens_before_request`. raise NotImplementedError(f'Token counting ahead of the request is not supported by {self.__class__.__name__}') ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a request to the model and return a streaming response. 
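Taken together, `request`, `count_tokens`, and `request_stream` are the surface a custom model has to implement (only `request` is strictly required). Here is a rough, hypothetical sketch of how the pieces fit together; the `CannedModel` class and its fixed reply are invented for illustration and are not part of the library:

```python
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models import Model, ModelRequestParameters
from pydantic_ai.settings import ModelSettings


class CannedModel(Model):
    """Toy model that always answers with the same text (illustration only)."""

    @property
    def model_name(self) -> str:
        return 'canned'

    @property
    def system(self) -> str:
        return 'canned-provider'

    async def request(
        self,
        messages: list[ModelMessage],
        model_settings: ModelSettings | None,
        model_request_parameters: ModelRequestParameters,
    ) -> ModelResponse:
        # Merge this model's default settings and apply profile-driven
        # customization before talking to the (here imaginary) provider.
        model_settings, model_request_parameters = self.prepare_request(
            model_settings, model_request_parameters
        )
        return ModelResponse(parts=[TextPart(content='canned reply')], model_name=self.model_name)
```

A real implementation would also call `check_allow_model_requests()` and translate `messages` into the provider's wire format, as the concrete models documented further down do.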
Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a request to the model and return a streaming response.""" # This method is not required, but you need to implement it if you want to support streamed responses raise NotImplementedError(f'Streamed requests not supported by this {self.__class__.__name__}') # yield is required to make this a generator for type checking # noinspection PyUnreachableCode yield # pragma: no cover ``` #### customize_request_parameters ```python customize_request_parameters( model_request_parameters: ModelRequestParameters, ) -> ModelRequestParameters ``` Customize the request parameters for the model. This method can be overridden by subclasses to modify the request parameters before sending them to the model. In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters: """Customize the request parameters for the model. This method can be overridden by subclasses to modify the request parameters before sending them to the model. In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons. """ if transformer := self.profile.json_schema_transformer: model_request_parameters = replace( model_request_parameters, function_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.function_tools], output_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.output_tools], ) if output_object := model_request_parameters.output_object: model_request_parameters = replace( model_request_parameters, output_object=_customize_output_object(transformer, output_object), ) return model_request_parameters ``` #### prepare_request ```python prepare_request( model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters] ``` Prepare request inputs before they are passed to the provider. This merges the given `model_settings` with the model's own `settings` attribute and ensures `customize_request_parameters` is applied to the resolved ModelRequestParameters. Subclasses can override this method if they need to customize the preparation flow further, but most implementations should simply call `self.prepare_request(...)` at the start of their `request` (and related) methods. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters]: """Prepare request inputs before they are passed to the provider. This merges the given `model_settings` with the model's own `settings` attribute and ensures `customize_request_parameters` is applied to the resolved [`ModelRequestParameters`][pydantic_ai.models.ModelRequestParameters]. 
Subclasses can override this method if they need to customize the preparation flow further, but most implementations should simply call `self.prepare_request(...)` at the start of their `request` (and related) methods. """ model_settings = merge_model_settings(self.settings, model_settings) params = self.customize_request_parameters(model_request_parameters) if builtin_tools := params.builtin_tools: # Deduplicate builtin tools params = replace( params, builtin_tools=list({tool.unique_id: tool for tool in builtin_tools}.values()), ) if params.output_mode == 'auto': output_mode = self.profile.default_structured_output_mode params = replace( params, output_mode=output_mode, allow_text_output=output_mode in ('native', 'prompted'), ) # Reset irrelevant fields if params.output_tools and params.output_mode != 'tool': params = replace(params, output_tools=[]) if params.output_object and params.output_mode not in ('native', 'prompted'): params = replace(params, output_object=None) if params.prompted_output_template and params.output_mode not in ('prompted', 'native'): params = replace(params, prompted_output_template=None) # pragma: no cover # Set default prompted output template if ( params.output_mode == 'prompted' or (params.output_mode == 'native' and self.profile.native_output_requires_schema_in_instructions) ) and not params.prompted_output_template: params = replace(params, prompted_output_template=self.profile.prompted_output_template) # Check if output mode is supported if params.output_mode == 'native' and not self.profile.supports_json_schema_output: raise UserError('Native structured output is not supported by this model.') if params.output_mode == 'tool' and not self.profile.supports_tools: raise UserError('Tool output is not supported by this model.') if params.allow_image_output and not self.profile.supports_image_output: raise UserError('Image output is not supported by this model.') # Check if builtin tools are supported if params.builtin_tools: supported_types = self.profile.supported_builtin_tools unsupported = [tool for tool in params.builtin_tools if not isinstance(tool, tuple(supported_types))] if unsupported: unsupported_names = [type(tool).__name__ for tool in unsupported] supported_names = [t.__name__ for t in supported_types] raise UserError( f'Builtin tool(s) {unsupported_names} not supported by this model. Supported: {supported_names}' ) return model_settings, params ``` #### model_name ```python model_name: str ``` The model name. #### label ```python label: str ``` Human-friendly display label for the model. Handles common patterns: - gpt-5 -> GPT 5 - claude-sonnet-4-5 -> Claude Sonnet 4.5 - gemini-2.5-pro -> Gemini 2.5 Pro - meta-llama/llama-3-70b -> Llama 3 70b (OpenRouter style) #### supported_builtin_tools ```python supported_builtin_tools() -> ( frozenset[type[AbstractBuiltinTool]] ) ``` Return the set of builtin tool types this model class can handle. Subclasses should override this to reflect their actual capabilities. Default is empty set - subclasses must explicitly declare support. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """Return the set of builtin tool types this model class can handle. Subclasses should override this to reflect their actual capabilities. Default is empty set - subclasses must explicitly declare support. """ return frozenset() ``` #### profile ```python profile: ModelProfile ``` The model profile. 
We use this to compute the intersection of the profile's supported_builtin_tools and the model's implemented tools, ensuring model.profile.supported_builtin_tools is the single source of truth for what builtin tools are actually usable. #### system ```python system: str ``` The model provider, ex: openai. Use to populate the `gen_ai.system` OpenTelemetry semantic convention attribute, so should use well-known values listed in https://opentelemetry.io/docs/specs/semconv/attributes-registry/gen-ai/#gen-ai-system when applicable. #### base_url ```python base_url: str | None ``` The base URL for the provider API, if available. ### StreamedResponse Bases: `ABC` Streamed response from an LLM when calling a tool. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @dataclass class StreamedResponse(ABC): """Streamed response from an LLM when calling a tool.""" model_request_parameters: ModelRequestParameters final_result_event: FinalResultEvent | None = field(default=None, init=False) provider_response_id: str | None = field(default=None, init=False) provider_details: dict[str, Any] | None = field(default=None, init=False) finish_reason: FinishReason | None = field(default=None, init=False) _parts_manager: ModelResponsePartsManager = field(default_factory=ModelResponsePartsManager, init=False) _event_iterator: AsyncIterator[ModelResponseStreamEvent] | None = field(default=None, init=False) _usage: RequestUsage = field(default_factory=RequestUsage, init=False) def __aiter__(self) -> AsyncIterator[ModelResponseStreamEvent]: """Stream the response as an async iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a [`FinalResultEvent`][pydantic_ai.messages.FinalResultEvent] if/when the first match is found. """ if self._event_iterator is None: async def iterator_with_final_event( iterator: AsyncIterator[ModelResponseStreamEvent], ) -> AsyncIterator[ModelResponseStreamEvent]: async for event in iterator: yield event if ( final_result_event := _get_final_result_event(event, self.model_request_parameters) ) is not None: self.final_result_event = final_result_event yield final_result_event break # If we broke out of the above loop, we need to yield the rest of the events # If we didn't, this will just be a no-op async for event in iterator: yield event async def iterator_with_part_end( iterator: AsyncIterator[ModelResponseStreamEvent], ) -> AsyncIterator[ModelResponseStreamEvent]: last_start_event: PartStartEvent | None = None def part_end_event(next_part: ModelResponsePart | None = None) -> PartEndEvent | None: if not last_start_event: return None index = last_start_event.index part = self._parts_manager.get_parts()[index] if not isinstance(part, TextPart | ThinkingPart | BaseToolCallPart): # Parts other than these 3 don't have deltas, so don't need an end part. 
return None return PartEndEvent( index=index, part=part, next_part_kind=next_part.part_kind if next_part else None, ) async for event in iterator: if isinstance(event, PartStartEvent): if last_start_event: end_event = part_end_event(event.part) if end_event: yield end_event event.previous_part_kind = last_start_event.part.part_kind last_start_event = event yield event end_event = part_end_event() if end_event: yield end_event self._event_iterator = iterator_with_part_end(iterator_with_final_event(self._get_event_iterator())) return self._event_iterator @abstractmethod async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: """Return an async iterator of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This method should be implemented by subclasses to translate the vendor-specific stream of events into pydantic_ai-format events. It should use the `_parts_manager` to handle deltas, and should update the `_usage` attributes as it goes. """ raise NotImplementedError() # noinspection PyUnreachableCode yield def get(self) -> ModelResponse: """Build a [`ModelResponse`][pydantic_ai.messages.ModelResponse] from the data received from the stream so far.""" return ModelResponse( parts=self._parts_manager.get_parts(), model_name=self.model_name, timestamp=self.timestamp, usage=self.usage(), provider_name=self.provider_name, provider_url=self.provider_url, provider_response_id=self.provider_response_id, provider_details=self.provider_details, finish_reason=self.finish_reason, ) # TODO (v2): Make this a property def usage(self) -> RequestUsage: """Get the usage of the response so far. This will not be the final usage until the stream is exhausted.""" return self._usage @property @abstractmethod def model_name(self) -> str: """Get the model name of the response.""" raise NotImplementedError() @property @abstractmethod def provider_name(self) -> str | None: """Get the provider name.""" raise NotImplementedError() @property @abstractmethod def provider_url(self) -> str | None: """Get the provider base URL.""" raise NotImplementedError() @property @abstractmethod def timestamp(self) -> datetime: """Get the timestamp of the response.""" raise NotImplementedError() ``` #### __aiter__ ```python __aiter__() -> AsyncIterator[ModelResponseStreamEvent] ``` Stream the response as an async iterable of ModelResponseStreamEvents. This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a FinalResultEvent if/when the first match is found. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def __aiter__(self) -> AsyncIterator[ModelResponseStreamEvent]: """Stream the response as an async iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a [`FinalResultEvent`][pydantic_ai.messages.FinalResultEvent] if/when the first match is found. 
""" if self._event_iterator is None: async def iterator_with_final_event( iterator: AsyncIterator[ModelResponseStreamEvent], ) -> AsyncIterator[ModelResponseStreamEvent]: async for event in iterator: yield event if ( final_result_event := _get_final_result_event(event, self.model_request_parameters) ) is not None: self.final_result_event = final_result_event yield final_result_event break # If we broke out of the above loop, we need to yield the rest of the events # If we didn't, this will just be a no-op async for event in iterator: yield event async def iterator_with_part_end( iterator: AsyncIterator[ModelResponseStreamEvent], ) -> AsyncIterator[ModelResponseStreamEvent]: last_start_event: PartStartEvent | None = None def part_end_event(next_part: ModelResponsePart | None = None) -> PartEndEvent | None: if not last_start_event: return None index = last_start_event.index part = self._parts_manager.get_parts()[index] if not isinstance(part, TextPart | ThinkingPart | BaseToolCallPart): # Parts other than these 3 don't have deltas, so don't need an end part. return None return PartEndEvent( index=index, part=part, next_part_kind=next_part.part_kind if next_part else None, ) async for event in iterator: if isinstance(event, PartStartEvent): if last_start_event: end_event = part_end_event(event.part) if end_event: yield end_event event.previous_part_kind = last_start_event.part.part_kind last_start_event = event yield event end_event = part_end_event() if end_event: yield end_event self._event_iterator = iterator_with_part_end(iterator_with_final_event(self._get_event_iterator())) return self._event_iterator ``` #### get ```python get() -> ModelResponse ``` Build a ModelResponse from the data received from the stream so far. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def get(self) -> ModelResponse: """Build a [`ModelResponse`][pydantic_ai.messages.ModelResponse] from the data received from the stream so far.""" return ModelResponse( parts=self._parts_manager.get_parts(), model_name=self.model_name, timestamp=self.timestamp, usage=self.usage(), provider_name=self.provider_name, provider_url=self.provider_url, provider_response_id=self.provider_response_id, provider_details=self.provider_details, finish_reason=self.finish_reason, ) ``` #### usage ```python usage() -> RequestUsage ``` Get the usage of the response so far. This will not be the final usage until the stream is exhausted. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def usage(self) -> RequestUsage: """Get the usage of the response so far. This will not be the final usage until the stream is exhausted.""" return self._usage ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str | None ``` Get the provider name. #### provider_url ```python provider_url: str | None ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. ### ALLOW_MODEL_REQUESTS ```python ALLOW_MODEL_REQUESTS = True ``` Whether to allow requests to models. This global setting allows you to disable request to most models, e.g. to make sure you don't accidentally make costly requests to a model during tests. The testing models TestModel and FunctionModel are no affected by this setting. ### check_allow_model_requests ```python check_allow_model_requests() -> None ``` Check if model requests are allowed. 
If you're defining your own models that have costs or latency associated with their use, you should call this in Model.request and Model.request_stream. Raises: | Type | Description | | -------------- | ---------------------------------- | | `RuntimeError` | If model requests are not allowed. | Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def check_allow_model_requests() -> None: """Check if model requests are allowed. If you're defining your own models that have costs or latency associated with their use, you should call this in [`Model.request`][pydantic_ai.models.Model.request] and [`Model.request_stream`][pydantic_ai.models.Model.request_stream]. Raises: RuntimeError: If model requests are not allowed. """ if not ALLOW_MODEL_REQUESTS: raise RuntimeError('Model requests are not allowed, since ALLOW_MODEL_REQUESTS is False') ``` ### override_allow_model_requests ```python override_allow_model_requests( allow_model_requests: bool, ) -> Iterator[None] ``` Context manager to temporarily override ALLOW_MODEL_REQUESTS. Parameters: | Name | Type | Description | Default | | ---------------------- | ------ | --------------------------------------------------- | ---------- | | `allow_model_requests` | `bool` | Whether to allow model requests within the context. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @contextmanager def override_allow_model_requests(allow_model_requests: bool) -> Iterator[None]: """Context manager to temporarily override [`ALLOW_MODEL_REQUESTS`][pydantic_ai.models.ALLOW_MODEL_REQUESTS]. Args: allow_model_requests: Whether to allow model requests within the context. """ global ALLOW_MODEL_REQUESTS old_value = ALLOW_MODEL_REQUESTS ALLOW_MODEL_REQUESTS = allow_model_requests # pyright: ignore[reportConstantRedefinition] try: yield finally: ALLOW_MODEL_REQUESTS = old_value # pyright: ignore[reportConstantRedefinition] ``` # `pydantic_ai.models.bedrock` ## Setup For details on how to set up authentication with this model, see [model configuration for Bedrock](https://ai.pydantic.dev/models/bedrock/index.md). 
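If you just want to talk to a Bedrock-hosted model, the high-level `Agent` API accepts a `bedrock:`-prefixed model name. A minimal sketch, assuming your AWS credentials and region are already available through the standard boto3 environment variables or config files (the model ID is one example from the list below):

```python
from pydantic_ai import Agent

# Credentials and region come from the usual boto3 lookup chain
# (environment variables, ~/.aws/config, instance metadata, ...).
agent = Agent('bedrock:us.amazon.nova-micro-v1:0')

result = agent.run_sync('What is the capital of France?')
print(result.output)
```

The classes documented below are what that shorthand resolves to under the hood; use them directly when you need a custom boto3 client, provider, or default settings.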
### LatestBedrockModelNames ```python LatestBedrockModelNames = Literal[ "amazon.titan-tg1-large", "amazon.titan-text-lite-v1", "amazon.titan-text-express-v1", "us.amazon.nova-2-lite-v1:0", "us.amazon.nova-pro-v1:0", "us.amazon.nova-lite-v1:0", "us.amazon.nova-micro-v1:0", "anthropic.claude-3-5-sonnet-20241022-v2:0", "us.anthropic.claude-3-5-sonnet-20241022-v2:0", "anthropic.claude-3-5-haiku-20241022-v1:0", "us.anthropic.claude-3-5-haiku-20241022-v1:0", "anthropic.claude-instant-v1", "anthropic.claude-v2:1", "anthropic.claude-v2", "anthropic.claude-3-sonnet-20240229-v1:0", "us.anthropic.claude-3-sonnet-20240229-v1:0", "anthropic.claude-3-haiku-20240307-v1:0", "us.anthropic.claude-3-haiku-20240307-v1:0", "anthropic.claude-3-opus-20240229-v1:0", "us.anthropic.claude-3-opus-20240229-v1:0", "anthropic.claude-3-5-sonnet-20240620-v1:0", "us.anthropic.claude-3-5-sonnet-20240620-v1:0", "anthropic.claude-3-7-sonnet-20250219-v1:0", "us.anthropic.claude-3-7-sonnet-20250219-v1:0", "anthropic.claude-opus-4-20250514-v1:0", "us.anthropic.claude-opus-4-20250514-v1:0", "global.anthropic.claude-opus-4-5-20251101-v1:0", "anthropic.claude-sonnet-4-20250514-v1:0", "us.anthropic.claude-sonnet-4-20250514-v1:0", "eu.anthropic.claude-sonnet-4-20250514-v1:0", "anthropic.claude-sonnet-4-5-20250929-v1:0", "us.anthropic.claude-sonnet-4-5-20250929-v1:0", "eu.anthropic.claude-sonnet-4-5-20250929-v1:0", "anthropic.claude-haiku-4-5-20251001-v1:0", "us.anthropic.claude-haiku-4-5-20251001-v1:0", "eu.anthropic.claude-haiku-4-5-20251001-v1:0", "cohere.command-text-v14", "cohere.command-r-v1:0", "cohere.command-r-plus-v1:0", "cohere.command-light-text-v14", "meta.llama3-8b-instruct-v1:0", "meta.llama3-70b-instruct-v1:0", "meta.llama3-1-8b-instruct-v1:0", "us.meta.llama3-1-8b-instruct-v1:0", "meta.llama3-1-70b-instruct-v1:0", "us.meta.llama3-1-70b-instruct-v1:0", "meta.llama3-1-405b-instruct-v1:0", "us.meta.llama3-2-11b-instruct-v1:0", "us.meta.llama3-2-90b-instruct-v1:0", "us.meta.llama3-2-1b-instruct-v1:0", "us.meta.llama3-2-3b-instruct-v1:0", "us.meta.llama3-3-70b-instruct-v1:0", "mistral.mistral-7b-instruct-v0:2", "mistral.mixtral-8x7b-instruct-v0:1", "mistral.mistral-large-2402-v1:0", "mistral.mistral-large-2407-v1:0", ] ``` Latest Bedrock models. ### BedrockModelName ```python BedrockModelName = str | LatestBedrockModelNames ``` Possible Bedrock model names. Since Bedrock supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Bedrock docs](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for a full list. ### BedrockModelSettings Bases: `ModelSettings` Settings for Bedrock models. See [the Bedrock Converse API docs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_RequestSyntax) for a full list. See [the boto3 implementation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html) of the Bedrock Converse API. Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python class BedrockModelSettings(ModelSettings, total=False): """Settings for Bedrock models. See [the Bedrock Converse API docs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_RequestSyntax) for a full list. See [the boto3 implementation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html) of the Bedrock Converse API. 
""" # ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. bedrock_guardrail_config: GuardrailConfigurationTypeDef """Content moderation and safety settings for Bedrock API requests. See more about it on . """ bedrock_performance_configuration: PerformanceConfigurationTypeDef """Performance optimization settings for model inference. See more about it on . """ bedrock_request_metadata: dict[str, str] """Additional metadata to attach to Bedrock API requests. See more about it on . """ bedrock_additional_model_response_fields_paths: list[str] """JSON paths to extract additional fields from model responses. See more about it on . """ bedrock_prompt_variables: Mapping[str, PromptVariableValuesTypeDef] """Variables for substitution into prompt templates. See more about it on . """ bedrock_additional_model_requests_fields: Mapping[str, Any] """Additional model-specific parameters to include in requests. See more about it on . """ bedrock_cache_tool_definitions: bool """Whether to add a cache point after the last tool definition. When enabled, the last tool in the `tools` array will include a `cachePoint`, allowing Bedrock to cache tool definitions and reduce costs for compatible models. See https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html for more information. """ bedrock_cache_instructions: bool """Whether to add a cache point after the system prompt blocks. When enabled, an extra `cachePoint` is appended to the system prompt so Bedrock can cache system instructions. See https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html for more information. """ bedrock_cache_messages: bool """Convenience setting to enable caching for the last user message. When enabled, this automatically adds a cache point to the last content block in the final user message, which is useful for caching conversation history or context in multi-turn conversations. Note: Uses 1 of Bedrock's 4 available cache points per request. Any additional CachePoint markers in messages will be automatically limited to respect the 4-cache-point maximum. See https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html for more information. """ bedrock_service_tier: ServiceTierTypeDef """Setting for optimizing performance and cost See more about it on . """ ``` #### bedrock_guardrail_config ```python bedrock_guardrail_config: GuardrailConfigurationTypeDef ``` Content moderation and safety settings for Bedrock API requests. See more about it on . #### bedrock_performance_configuration ```python bedrock_performance_configuration: ( PerformanceConfigurationTypeDef ) ``` Performance optimization settings for model inference. See more about it on . #### bedrock_request_metadata ```python bedrock_request_metadata: dict[str, str] ``` Additional metadata to attach to Bedrock API requests. See more about it on . #### bedrock_additional_model_response_fields_paths ```python bedrock_additional_model_response_fields_paths: list[str] ``` JSON paths to extract additional fields from model responses. See more about it on . #### bedrock_prompt_variables ```python bedrock_prompt_variables: Mapping[ str, PromptVariableValuesTypeDef ] ``` Variables for substitution into prompt templates. See more about it on . #### bedrock_additional_model_requests_fields ```python bedrock_additional_model_requests_fields: Mapping[str, Any] ``` Additional model-specific parameters to include in requests. See more about it on . 
#### bedrock_cache_tool_definitions ```python bedrock_cache_tool_definitions: bool ``` Whether to add a cache point after the last tool definition. When enabled, the last tool in the `tools` array will include a `cachePoint`, allowing Bedrock to cache tool definitions and reduce costs for compatible models. See https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html for more information. #### bedrock_cache_instructions ```python bedrock_cache_instructions: bool ``` Whether to add a cache point after the system prompt blocks. When enabled, an extra `cachePoint` is appended to the system prompt so Bedrock can cache system instructions. See https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html for more information. #### bedrock_cache_messages ```python bedrock_cache_messages: bool ``` Convenience setting to enable caching for the last user message. When enabled, this automatically adds a cache point to the last content block in the final user message, which is useful for caching conversation history or context in multi-turn conversations. Note: Uses 1 of Bedrock's 4 available cache points per request. Any additional CachePoint markers in messages will be automatically limited to respect the 4-cache-point maximum. See https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html for more information. #### bedrock_service_tier ```python bedrock_service_tier: ServiceTierTypeDef ``` Setting for optimizing performance and cost See more about it on . ### BedrockConverseModel Bases: `Model` A model that uses the Bedrock Converse API. Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python @dataclass(init=False) class BedrockConverseModel(Model): """A model that uses the Bedrock Converse API.""" client: BedrockRuntimeClient _model_name: BedrockModelName = field(repr=False) _provider: Provider[BaseClient] = field(repr=False) def __init__( self, model_name: BedrockModelName, *, provider: Literal['bedrock', 'gateway'] | Provider[BaseClient] = 'bedrock', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Bedrock model. Args: model_name: The name of the model to use. model_name: The name of the Bedrock model to use. List of model names available [here](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). provider: The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of `Provider[BaseClient]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/bedrock' if provider == 'gateway' else provider) self._provider = provider self.client = cast('BedrockRuntimeClient', provider.client) super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return str(self.client.meta.endpoint_url) @property def model_name(self) -> str: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """The set of builtin tool types this model can handle.""" return frozenset({CodeExecutionTool}) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[ToolTypeDef]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] @staticmethod def _map_tool_definition(f: ToolDefinition) -> ToolTypeDef: tool_spec: ToolSpecificationTypeDef = {'name': f.name, 'inputSchema': {'json': f.parameters_json_schema}} if f.description: # pragma: no branch tool_spec['description'] = f.description return {'toolSpec': tool_spec} async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) settings = cast(BedrockModelSettings, model_settings or {}) response = await self._messages_create(messages, False, settings, model_request_parameters) model_response = await self._process_response(response) return model_response async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> usage.RequestUsage: """Count the number of tokens, works with limited models. 
Check the actual supported models on """ model_settings, model_request_parameters = self.prepare_request(model_settings, model_request_parameters) settings = cast(BedrockModelSettings, model_settings or {}) system_prompt, bedrock_messages = await self._map_messages(messages, model_request_parameters, settings) params: CountTokensRequestTypeDef = { 'modelId': remove_bedrock_geo_prefix(self.model_name), 'input': { 'converse': { 'messages': bedrock_messages, 'system': system_prompt, }, }, } try: response = await anyio.to_thread.run_sync(functools.partial(self.client.count_tokens, **params)) except ClientError as e: status_code = e.response.get('ResponseMetadata', {}).get('HTTPStatusCode') if isinstance(status_code, int): raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.response) from e raise ModelAPIError(model_name=self.model_name, message=str(e)) from e return usage.RequestUsage(input_tokens=response['inputTokens']) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) settings = cast(BedrockModelSettings, model_settings or {}) response = await self._messages_create(messages, True, settings, model_request_parameters) yield BedrockStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self.model_name, _event_stream=response['stream'], _provider_name=self._provider.name, _provider_url=self.base_url, _provider_response_id=response.get('ResponseMetadata', {}).get('RequestId', None), ) async def _process_response(self, response: ConverseResponseTypeDef) -> ModelResponse: items: list[ModelResponsePart] = [] if message := response['output'].get('message'): # pragma: no branch for item in message['content']: if reasoning_content := item.get('reasoningContent'): if redacted_content := reasoning_content.get('redactedContent'): items.append( ThinkingPart( id='redacted_content', content='', signature=redacted_content.decode('utf-8'), provider_name=self.system, ) ) elif reasoning_text := reasoning_content.get('reasoningText'): # pragma: no branch signature = reasoning_text.get('signature') items.append( ThinkingPart( content=reasoning_text['text'], signature=signature, provider_name=self.system if signature else None, ) ) if text := item.get('text'): items.append(TextPart(content=text)) elif tool_use := item.get('toolUse'): if tool_use.get('type') == 'server_tool_use': if tool_use['name'] == 'nova_code_interpreter': # pragma: no branch items.append( BuiltinToolCallPart( provider_name=self.system, tool_name=CodeExecutionTool.kind, args=tool_use['input'], tool_call_id=tool_use['toolUseId'], ) ) else: items.append( ToolCallPart( tool_name=tool_use['name'], args=tool_use['input'], tool_call_id=tool_use['toolUseId'], ), ) elif tool_result := item.get('toolResult'): if tool_result.get('type') == 'nova_code_interpreter_result': # pragma: no branch items.append( BuiltinToolReturnPart( provider_name=self.system, tool_name=CodeExecutionTool.kind, content=tool_result['content'][0].get('json') if tool_result['content'] else None, tool_call_id=tool_result.get('toolUseId'), provider_details={'status': tool_result['status']} if 'status' in tool_result else {}, ) ) input_tokens = response['usage']['inputTokens'] output_tokens = 
response['usage']['outputTokens'] cache_read_tokens = response['usage'].get('cacheReadInputTokens', 0) cache_write_tokens = response['usage'].get('cacheWriteInputTokens', 0) u = usage.RequestUsage( input_tokens=input_tokens + cache_write_tokens + cache_read_tokens, output_tokens=output_tokens, cache_read_tokens=cache_read_tokens, cache_write_tokens=cache_write_tokens, ) response_id = response.get('ResponseMetadata', {}).get('RequestId', None) raw_finish_reason = response['stopReason'] provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=u, model_name=self.model_name, provider_response_id=response_id, provider_name=self._provider.name, provider_url=self.base_url, finish_reason=finish_reason, provider_details=provider_details, ) @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: BedrockModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ConverseStreamResponseTypeDef: pass @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: BedrockModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ConverseResponseTypeDef: pass async def _messages_create( self, messages: list[ModelMessage], stream: bool, model_settings: BedrockModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ConverseResponseTypeDef | ConverseStreamResponseTypeDef: settings = model_settings or BedrockModelSettings() system_prompt, bedrock_messages = await self._map_messages(messages, model_request_parameters, settings) inference_config = self._map_inference_config(settings) params: ConverseRequestTypeDef = { 'modelId': self.model_name, 'messages': bedrock_messages, 'system': system_prompt, 'inferenceConfig': inference_config, } tool_config = self._map_tool_config(model_request_parameters, settings) if tool_config: params['toolConfig'] = tool_config tools: list[ToolTypeDef] = list(tool_config['tools']) if tool_config else [] self._limit_cache_points(system_prompt, bedrock_messages, tools) # Bedrock supports a set of specific extra parameters if model_settings: if guardrail_config := model_settings.get('bedrock_guardrail_config', None): params['guardrailConfig'] = guardrail_config if performance_configuration := model_settings.get('bedrock_performance_configuration', None): params['performanceConfig'] = performance_configuration if request_metadata := model_settings.get('bedrock_request_metadata', None): params['requestMetadata'] = request_metadata if additional_model_response_fields_paths := model_settings.get( 'bedrock_additional_model_response_fields_paths', None ): params['additionalModelResponseFieldPaths'] = additional_model_response_fields_paths if additional_model_requests_fields := model_settings.get('bedrock_additional_model_requests_fields', None): params['additionalModelRequestFields'] = additional_model_requests_fields if prompt_variables := model_settings.get('bedrock_prompt_variables', None): params['promptVariables'] = prompt_variables if service_tier := model_settings.get('bedrock_service_tier', None): params['serviceTier'] = service_tier try: if stream: model_response = await anyio.to_thread.run_sync( functools.partial(self.client.converse_stream, **params) ) else: model_response = await anyio.to_thread.run_sync(functools.partial(self.client.converse, **params)) except ClientError as e: status_code = 
e.response.get('ResponseMetadata', {}).get('HTTPStatusCode') if isinstance(status_code, int): raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.response) from e raise ModelAPIError(model_name=self.model_name, message=str(e)) from e return model_response @staticmethod def _map_inference_config( model_settings: ModelSettings | None, ) -> InferenceConfigurationTypeDef: model_settings = model_settings or {} inference_config: InferenceConfigurationTypeDef = {} if max_tokens := model_settings.get('max_tokens'): inference_config['maxTokens'] = max_tokens if (temperature := model_settings.get('temperature')) is not None: inference_config['temperature'] = temperature if top_p := model_settings.get('top_p'): inference_config['topP'] = top_p if stop_sequences := model_settings.get('stop_sequences'): inference_config['stopSequences'] = stop_sequences return inference_config def _map_tool_config( self, model_request_parameters: ModelRequestParameters, model_settings: BedrockModelSettings | None, ) -> ToolConfigurationTypeDef | None: tools = self._get_tools(model_request_parameters) for tool in model_request_parameters.builtin_tools: if tool.kind == CodeExecutionTool.kind: tools.append({'systemTool': {'name': 'nova_code_interpreter'}}) else: raise NotImplementedError( f"Builtin tool '{tool.kind}' is not supported yet. If it should be, please file an issue." ) if not tools: return None profile = BedrockModelProfile.from_profile(self.profile) if ( model_settings and model_settings.get('bedrock_cache_tool_definitions') and profile.bedrock_supports_tool_caching ): tools.append({'cachePoint': {'type': 'default'}}) tool_choice: ToolChoiceTypeDef if not model_request_parameters.allow_text_output: tool_choice = {'any': {}} else: tool_choice = {'auto': {}} tool_config: ToolConfigurationTypeDef = {'tools': tools} if tool_choice and BedrockModelProfile.from_profile(self.profile).bedrock_supports_tool_choice: tool_config['toolChoice'] = tool_choice return tool_config async def _map_messages( # noqa: C901 self, messages: Sequence[ModelMessage], model_request_parameters: ModelRequestParameters, model_settings: BedrockModelSettings | None, ) -> tuple[list[SystemContentBlockTypeDef], list[MessageUnionTypeDef]]: """Maps a `pydantic_ai.Message` to the Bedrock `MessageUnionTypeDef`. Groups consecutive ToolReturnPart objects into a single user message as required by Bedrock Claude/Nova models. 
""" settings = model_settings or BedrockModelSettings() profile = BedrockModelProfile.from_profile(self.profile) system_prompt: list[SystemContentBlockTypeDef] = [] bedrock_messages: list[MessageUnionTypeDef] = [] document_count: Iterator[int] = count(1) for message in messages: if isinstance(message, ModelRequest): for part in message.parts: if isinstance(part, SystemPromptPart): if part.content: # pragma: no branch system_prompt.append({'text': part.content}) elif isinstance(part, UserPromptPart): bedrock_messages.extend( await self._map_user_prompt(part, document_count, profile.bedrock_supports_prompt_caching) ) elif isinstance(part, ToolReturnPart): assert part.tool_call_id is not None bedrock_messages.append( { 'role': 'user', 'content': [ { 'toolResult': { 'toolUseId': part.tool_call_id, 'content': [ {'text': part.model_response_str()} if profile.bedrock_tool_result_format == 'text' else {'json': part.model_response_object()} ], 'status': 'success', } } ], } ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: bedrock_messages.append({'role': 'user', 'content': [{'text': part.model_response()}]}) else: assert part.tool_call_id is not None bedrock_messages.append( { 'role': 'user', 'content': [ { 'toolResult': { 'toolUseId': part.tool_call_id, 'content': [{'text': part.model_response()}], 'status': 'error', } } ], } ) else: assert_never(part) elif isinstance(message, ModelResponse): content: list[ContentBlockOutputTypeDef] = [] for item in message.parts: if isinstance(item, TextPart): content.append({'text': item.content}) elif isinstance(item, ThinkingPart): if ( item.provider_name == self.system and item.signature and BedrockModelProfile.from_profile(self.profile).bedrock_send_back_thinking_parts ): if item.id == 'redacted_content': reasoning_content: ReasoningContentBlockOutputTypeDef = { 'redactedContent': item.signature.encode('utf-8'), } else: reasoning_content: ReasoningContentBlockOutputTypeDef = { 'reasoningText': { 'text': item.content, 'signature': item.signature, } } content.append({'reasoningContent': reasoning_content}) else: start_tag, end_tag = self.profile.thinking_tags content.append({'text': '\n'.join([start_tag, item.content, end_tag])}) elif isinstance(item, BuiltinToolCallPart): if item.provider_name == self.system: if item.tool_name == CodeExecutionTool.kind: server_tool_use_block_param: ToolUseBlockOutputTypeDef = { 'toolUseId': _utils.guard_tool_call_id(t=item), 'name': 'nova_code_interpreter', 'input': item.args_as_dict(), 'type': 'server_tool_use', } content.append({'toolUse': server_tool_use_block_param}) elif isinstance(item, BuiltinToolReturnPart): if item.provider_name == self.system: if item.tool_name == CodeExecutionTool.kind: tool_result: ToolResultBlockOutputTypeDef = { 'toolUseId': _utils.guard_tool_call_id(t=item), 'content': [{'json': cast(Any, item.content)}] if item.content else [], 'type': 'nova_code_interpreter_result', } if item.provider_details and 'status' in item.provider_details: tool_result['status'] = item.provider_details['status'] content.append({'toolResult': tool_result}) else: assert isinstance(item, ToolCallPart) content.append(self._map_tool_call(item)) if content: bedrock_messages.append({'role': 'assistant', 'content': content}) else: assert_never(message) # Merge together sequential user messages. 
processed_messages: list[MessageUnionTypeDef] = [] last_message: dict[str, Any] | None = None for current_message in bedrock_messages: if ( last_message is not None and current_message['role'] == last_message['role'] and current_message['role'] == 'user' ): # Add the new user content onto the existing user message. last_content = list(last_message['content']) last_content.extend(current_message['content']) last_message['content'] = last_content continue # Add the entire message to the list of messages. processed_messages.append(current_message) last_message = cast(dict[str, Any], current_message) if instructions := self._get_instructions(messages, model_request_parameters): system_prompt.append({'text': instructions}) if system_prompt and settings.get('bedrock_cache_instructions') and profile.bedrock_supports_prompt_caching: system_prompt.append({'cachePoint': {'type': 'default'}}) if processed_messages and settings.get('bedrock_cache_messages') and profile.bedrock_supports_prompt_caching: last_user_content = self._get_last_user_message_content(processed_messages) if last_user_content is not None: # Note: _get_last_user_message_content ensures content doesn't already end with a cachePoint. _insert_cache_point_before_trailing_documents(last_user_content) return system_prompt, processed_messages @staticmethod def _get_last_user_message_content(messages: list[MessageUnionTypeDef]) -> list[Any] | None: """Get the content list from the last user message that can receive a cache point. Returns the content list if: - A user message exists - It has a non-empty content list - The last content block doesn't already have a cache point Returns None otherwise. """ user_messages = [msg for msg in messages if msg.get('role') == 'user'] if not user_messages: return None content = user_messages[-1].get('content') # Last user message if not content or not isinstance(content, list) or len(content) == 0: return None last_block = content[-1] if not isinstance(last_block, dict): return None if 'cachePoint' in last_block: # Skip if already has a cache point return None return content @staticmethod async def _map_user_prompt( # noqa: C901 part: UserPromptPart, document_count: Iterator[int], supports_prompt_caching: bool, ) -> list[MessageUnionTypeDef]: content: list[ContentBlockUnionTypeDef] = [] if isinstance(part.content, str): content.append({'text': part.content}) else: for item in part.content: if isinstance(item, str): content.append({'text': item}) elif isinstance(item, BinaryContent): format = item.format if item.is_document: name = f'Document {next(document_count)}' assert format in ('pdf', 'txt', 'csv', 'doc', 'docx', 'xls', 'xlsx', 'html', 'md') content.append({'document': {'name': name, 'format': format, 'source': {'bytes': item.data}}}) elif item.is_image: assert format in ('jpeg', 'png', 'gif', 'webp') content.append({'image': {'format': format, 'source': {'bytes': item.data}}}) elif item.is_video: assert format in ('mkv', 'mov', 'mp4', 'webm', 'flv', 'mpeg', 'mpg', 'wmv', 'three_gp') content.append({'video': {'format': format, 'source': {'bytes': item.data}}}) else: raise NotImplementedError('Binary content is not supported yet.') elif isinstance(item, ImageUrl | DocumentUrl | VideoUrl): source: DocumentSourceTypeDef if item.url.startswith('s3://'): parsed = urlparse(item.url) s3_location: S3LocationTypeDef = {'uri': f'{parsed.scheme}://{parsed.netloc}{parsed.path}'} if bucket_owner := parse_qs(parsed.query).get('bucketOwner', [None])[0]: s3_location['bucketOwner'] = bucket_owner source = 
{'s3Location': s3_location} else: downloaded_item = await download_item(item, data_format='bytes', type_format='extension') source = {'bytes': downloaded_item['data']} if item.kind == 'image-url': format = item.media_type.split('/')[1] assert format in ('jpeg', 'png', 'gif', 'webp'), f'Unsupported image format: {format}' image: ImageBlockTypeDef = {'format': format, 'source': source} content.append({'image': image}) elif item.kind == 'document-url': name = f'Document {next(document_count)}' document: DocumentBlockTypeDef = { 'name': name, 'format': item.format, 'source': source, } content.append({'document': document}) elif item.kind == 'video-url': # pragma: no branch format = item.media_type.split('/')[1] assert format in ( 'mkv', 'mov', 'mp4', 'webm', 'flv', 'mpeg', 'mpg', 'wmv', 'three_gp', ), f'Unsupported video format: {format}' video: VideoBlockTypeDef = {'format': format, 'source': source} content.append({'video': video}) elif isinstance(item, AudioUrl): # pragma: no cover raise NotImplementedError('Audio is not supported yet.') elif isinstance(item, CachePoint): if not supports_prompt_caching: # Silently skip CachePoint for models that don't support prompt caching continue if not content or 'cachePoint' in content[-1]: raise UserError( 'CachePoint cannot be the first content in a user message - there must be previous content to cache when using Bedrock. ' 'To cache system instructions or tool definitions, use the `bedrock_cache_instructions` or `bedrock_cache_tool_definitions` settings instead.' ) _insert_cache_point_before_trailing_documents(content, raise_if_cannot_insert=True) else: assert_never(item) return [{'role': 'user', 'content': content}] @staticmethod def _map_tool_call(t: ToolCallPart) -> ContentBlockOutputTypeDef: return { 'toolUse': {'toolUseId': _utils.guard_tool_call_id(t=t), 'name': t.tool_name, 'input': t.args_as_dict()} } @staticmethod def _limit_cache_points( system_prompt: list[SystemContentBlockTypeDef], bedrock_messages: list[MessageUnionTypeDef], tools: list[ToolTypeDef], ) -> None: """Limit the number of cache points in the request to Bedrock's maximum. Bedrock enforces a maximum of 4 cache points per request. This method ensures compliance by counting existing cache points and removing excess ones from messages. Strategy: 1. Count cache points in system_prompt 2. Count cache points in tools 3. Raise UserError if system + tools already exceed MAX_CACHE_POINTS 4. Calculate remaining budget for message cache points 5. Traverse messages from newest to oldest, keeping the most recent cache points within the remaining budget 6. Remove excess cache points from older messages to stay within limit Cache point priority (always preserved): - System prompt cache points - Tool definition cache points - Message cache points (newest first, oldest removed if needed) Raises: UserError: If system_prompt and tools combined already exceed MAX_CACHE_POINTS (4). This indicates a configuration error that cannot be auto-fixed. """ MAX_CACHE_POINTS = 4 # Count existing cache points in system prompt used_cache_points = sum(1 for block in system_prompt if 'cachePoint' in block) # Count existing cache points in tools for tool in tools: if 'cachePoint' in tool: used_cache_points += 1 # Calculate remaining cache points budget for messages remaining_budget = MAX_CACHE_POINTS - used_cache_points if remaining_budget < 0: # pragma: no cover raise UserError( f'Too many cache points for Bedrock request. 
' f'System prompt and tool definitions already use {used_cache_points} cache points, ' f'which exceeds the maximum of {MAX_CACHE_POINTS}.' ) # Remove excess cache points from messages (newest to oldest) for message in reversed(bedrock_messages): content = message.get('content') if not content or not isinstance(content, list): # pragma: no cover continue # Build a new content list, keeping only cache points within budget new_content: list[Any] = [] for block in reversed(content): # Process newest first is_cache_point = isinstance(block, dict) and 'cachePoint' in block if is_cache_point: if remaining_budget > 0: remaining_budget -= 1 new_content.append(block) else: new_content.append(block) message['content'] = list(reversed(new_content)) # Restore original order ``` #### __init__ ```python __init__( model_name: BedrockModelName, *, provider: ( Literal["bedrock", "gateway"] | Provider[BaseClient] ) = "bedrock", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Bedrock model. Parameters: | Name | Type | Description | Default | | ------------ | ------------------------------- | ------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model_name` | `BedrockModelName` | The name of the model to use. | *required* | | `model_name` | `BedrockModelName` | The name of the Bedrock model to use. List of model names available here. | *required* | | `provider` | \`Literal['bedrock', 'gateway'] | Provider[BaseClient]\` | The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of Provider[BaseClient]. If not provided, a new provider will be created using the other parameters. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | | `settings` | \`ModelSettings | None\` | Model-specific settings that will be used as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python def __init__( self, model_name: BedrockModelName, *, provider: Literal['bedrock', 'gateway'] | Provider[BaseClient] = 'bedrock', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Bedrock model. Args: model_name: The name of the model to use. model_name: The name of the Bedrock model to use. List of model names available [here](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). provider: The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of `Provider[BaseClient]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/bedrock' if provider == 'gateway' else provider) self._provider = provider self.client = cast('BedrockRuntimeClient', provider.client) super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: str ``` The model name. #### system ```python system: str ``` The model provider. 
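To spell the construction out explicitly, rather than relying on the `bedrock:` model-name shorthand, you can pass a provider and default settings yourself. A minimal sketch, assuming a `us-east-1` region and a model that supports prompt caching:

```python
from pydantic_ai import Agent
from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings
from pydantic_ai.providers.bedrock import BedrockProvider

model = BedrockConverseModel(
    'us.anthropic.claude-sonnet-4-20250514-v1:0',
    provider=BedrockProvider(region_name='us-east-1'),
    # Default settings for every run with this model; caching only takes
    # effect on Bedrock models that actually support prompt caching.
    settings=BedrockModelSettings(
        bedrock_cache_instructions=True,
        bedrock_cache_tool_definitions=True,
    ),
)
agent = Agent(model, instructions='Be terse.')
```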
#### supported_builtin_tools ```python supported_builtin_tools() -> ( frozenset[type[AbstractBuiltinTool]] ) ``` The set of builtin tool types this model can handle. Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """The set of builtin tool types this model can handle.""" return frozenset({CodeExecutionTool}) ``` #### count_tokens ```python count_tokens( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage ``` Count the number of tokens, works with limited models. Check the actual supported models on Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> usage.RequestUsage: """Count the number of tokens, works with limited models. Check the actual supported models on """ model_settings, model_request_parameters = self.prepare_request(model_settings, model_request_parameters) settings = cast(BedrockModelSettings, model_settings or {}) system_prompt, bedrock_messages = await self._map_messages(messages, model_request_parameters, settings) params: CountTokensRequestTypeDef = { 'modelId': remove_bedrock_geo_prefix(self.model_name), 'input': { 'converse': { 'messages': bedrock_messages, 'system': system_prompt, }, }, } try: response = await anyio.to_thread.run_sync(functools.partial(self.client.count_tokens, **params)) except ClientError as e: status_code = e.response.get('ResponseMetadata', {}).get('HTTPStatusCode') if isinstance(status_code, int): raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.response) from e raise ModelAPIError(model_name=self.model_name, message=str(e)) from e return usage.RequestUsage(input_tokens=response['inputTokens']) ``` ### BedrockStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Bedrock models. Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python @dataclass class BedrockStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Bedrock models.""" _model_name: BedrockModelName _event_stream: EventStream[ConverseStreamOutputTypeDef] _provider_name: str _provider_url: str _timestamp: datetime = field(default_factory=_utils.now_utc) _provider_response_id: str | None = None async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 """Return an async iterator of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This method should be implemented by subclasses to translate the vendor-specific stream of events into pydantic_ai-format events. """ if self._provider_response_id is not None: # pragma: no cover self.provider_response_id = self._provider_response_id chunk: ConverseStreamOutputTypeDef tool_ids: dict[int, str] = {} # Bedrock has deltas for built-in tool returns, which aren't supported by parts manager. 
# We accumulate the deltas here and yield the complete return part once the content block ends builtin_tool_returns: dict[int, BuiltinToolReturnPart] = {} async for chunk in _AsyncIteratorWrapper(self._event_stream): match chunk: case {'messageStart': _}: continue case {'messageStop': message_stop}: raw_finish_reason = message_stop['stopReason'] self.provider_details = {'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) case {'metadata': metadata}: if 'usage' in metadata: # pragma: no branch self._usage += self._map_usage(metadata) case {'contentBlockStart': content_block_start}: index = content_block_start['contentBlockIndex'] start = content_block_start['start'] if 'toolUse' in start: tool_use_start = start['toolUse'] tool_id = tool_use_start['toolUseId'] tool_ids[index] = tool_id tool_name = tool_use_start['name'] if tool_use_start.get('type') == 'server_tool_use': if tool_name == 'nova_code_interpreter': # pragma: no branch part = BuiltinToolCallPart( tool_name=CodeExecutionTool.kind, tool_call_id=tool_id, provider_name=self.provider_name, ) yield self._parts_manager.handle_part(vendor_part_id=index, part=part) elif maybe_event := self._parts_manager.handle_tool_call_delta( vendor_part_id=index, tool_name=tool_name, args=None, tool_call_id=tool_id, ): # pragma: no branch yield maybe_event elif 'toolResult' in start: # pragma: no branch tool_result_start = start['toolResult'] tool_id = tool_result_start['toolUseId'] if tool_result_start.get('type') == 'nova_code_interpreter_result': # pragma: no branch return_part = BuiltinToolReturnPart( provider_name=self.provider_name, tool_name=CodeExecutionTool.kind, content=None, tool_call_id=tool_id, provider_details={'status': tool_result_start['status']} if 'status' in tool_result_start else {}, ) builtin_tool_returns[index] = return_part # Don't yield anything yet - we wait for content block end case {'contentBlockDelta': content_block_delta}: index = content_block_delta['contentBlockIndex'] delta = content_block_delta['delta'] if 'reasoningContent' in delta: if redacted_content := delta['reasoningContent'].get('redactedContent'): for event in self._parts_manager.handle_thinking_delta( vendor_part_id=index, id='redacted_content', signature=redacted_content.decode('utf-8'), provider_name=self.provider_name, ): yield event else: signature = delta['reasoningContent'].get('signature') for event in self._parts_manager.handle_thinking_delta( vendor_part_id=index, content=delta['reasoningContent'].get('text'), signature=signature, provider_name=self.provider_name if signature else None, ): yield event if text := delta.get('text'): for event in self._parts_manager.handle_text_delta(vendor_part_id=index, content=text): yield event if 'toolUse' in delta: tool_use = delta['toolUse'] maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=index, tool_name=tool_use.get('name'), args=tool_use.get('input'), tool_call_id=tool_ids[index], ) if maybe_event: # pragma: no branch yield maybe_event if 'toolResult' in delta: # pragma: no branch if ( return_part := builtin_tool_returns.get(index) ) and return_part.tool_name == CodeExecutionTool.kind: # pragma: no branch # For now, only process `contentBlockDelta.toolResult` for Code Exe tool. if tr_content := delta['toolResult']: # pragma: no branch # Goal here is to convert to object form. # This assumes the first item is the relevant one. 
return_part.content = tr_content[0].get('json') # Don't yield anything yet - we wait for content block end case {'contentBlockStop': content_block_stop}: index = content_block_stop['contentBlockIndex'] if return_part := builtin_tool_returns.get(index): # Emit the complete built-in tool return only once when the block closes. yield self._parts_manager.handle_part(vendor_part_id=index, part=return_part) tool_ids.pop(index, None) builtin_tool_returns.pop(index, None) case _: # pragma: no cover pass # pyright wants match statements to be exhaustive @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def provider_url(self) -> str: """Get the provider base URL.""" return self._provider_url @property def timestamp(self) -> datetime: return self._timestamp def _map_usage(self, metadata: ConverseStreamMetadataEventTypeDef) -> usage.RequestUsage: input_tokens = metadata['usage']['inputTokens'] output_tokens = metadata['usage']['outputTokens'] cache_read_tokens = metadata['usage'].get('cacheReadInputTokens', 0) cache_write_tokens = metadata['usage'].get('cacheWriteInputTokens', 0) return usage.RequestUsage( input_tokens=input_tokens + cache_write_tokens + cache_read_tokens, output_tokens=output_tokens, cache_read_tokens=cache_read_tokens, cache_write_tokens=cache_write_tokens, ) ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### provider_url ```python provider_url: str ``` Get the provider base URL. # `pydantic_ai.models.cerebras` ## Setup For details on how to set up authentication with this model, see [model configuration for Cerebras](https://ai.pydantic.dev/models/cerebras/index.md). Cerebras model implementation using OpenAI-compatible API. ### CerebrasModelName ```python CerebrasModelName = str | LatestCerebrasModelNames ``` Possible Cerebras model names. Since Cerebras supports a variety of models and the list changes frequently, we explicitly list known models but allow any name in the type hints. See for an up to date list of models. ### CerebrasModelSettings Bases: `ModelSettings` Settings used for a Cerebras model request. ALL FIELDS MUST BE `cerebras_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. Source code in `pydantic_ai_slim/pydantic_ai/models/cerebras.py` ```python class CerebrasModelSettings(ModelSettings, total=False): """Settings used for a Cerebras model request. ALL FIELDS MUST BE `cerebras_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. """ cerebras_disable_reasoning: bool """Disable reasoning for the model. This setting is only supported on reasoning models: `zai-glm-4.6` and `gpt-oss-120b`. See [the Cerebras docs](https://inference-docs.cerebras.ai/resources/openai#passing-non-standard-parameters) for more details. """ ``` #### cerebras_disable_reasoning ```python cerebras_disable_reasoning: bool ``` Disable reasoning for the model. This setting is only supported on reasoning models: `zai-glm-4.6` and `gpt-oss-120b`. See [the Cerebras docs](https://inference-docs.cerebras.ai/resources/openai#passing-non-standard-parameters) for more details. ### CerebrasModel Bases: `OpenAIChatModel` A model that uses Cerebras's OpenAI-compatible API. Cerebras provides ultra-fast inference powered by the Wafer-Scale Engine (WSE). Apart from `__init__`, all methods are private or match those of the base class. 
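As a usage illustration before the source listing, here is a hedged sketch of the setting described above; the model name is one of the reasoning models the setting documents, and the other details are illustrative.

```python
from pydantic_ai import Agent
from pydantic_ai.models.cerebras import CerebrasModel, CerebrasModelSettings

model = CerebrasModel('zai-glm-4.6')  # a reasoning model, per the setting's docs

agent = Agent(
    model,
    # Skip the model's reasoning phase for latency-sensitive calls;
    # only honored on reasoning models such as zai-glm-4.6 and gpt-oss-120b.
    model_settings=CerebrasModelSettings(cerebras_disable_reasoning=True),
)

result = agent.run_sync('In one sentence, what is the Wafer-Scale Engine?')
print(result.output)
```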
Source code in `pydantic_ai_slim/pydantic_ai/models/cerebras.py` ```python @dataclass(init=False) class CerebrasModel(OpenAIChatModel): """A model that uses Cerebras's OpenAI-compatible API. Cerebras provides ultra-fast inference powered by the Wafer-Scale Engine (WSE). Apart from `__init__`, all methods are private or match those of the base class. """ def __init__( self, model_name: CerebrasModelName, *, provider: Literal['cerebras'] | Provider[AsyncOpenAI] = 'cerebras', profile: ModelProfileSpec | None = None, settings: CerebrasModelSettings | None = None, ): """Initialize a Cerebras model. Args: model_name: The name of the Cerebras model to use. provider: The provider to use. Defaults to 'cerebras'. profile: The model profile to use. Defaults to a profile based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ super().__init__(model_name, provider=provider, profile=profile, settings=settings) @override def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters]: merged_settings, customized_parameters = super().prepare_request(model_settings, model_request_parameters) new_settings = _cerebras_settings_to_openai_settings(cast(CerebrasModelSettings, merged_settings or {})) return new_settings, customized_parameters ``` #### __init__ ```python __init__( model_name: CerebrasModelName, *, provider: ( Literal["cerebras"] | Provider[AsyncOpenAI] ) = "cerebras", profile: ModelProfileSpec | None = None, settings: CerebrasModelSettings | None = None ) ``` Initialize a Cerebras model. Parameters: | Name | Type | Description | Default | | ------------ | ----------------------- | -------------------------------------- | ------------------------------------------------------------------------ | | `model_name` | `CerebrasModelName` | The name of the Cerebras model to use. | *required* | | `provider` | \`Literal['cerebras'] | Provider[AsyncOpenAI]\` | The provider to use. Defaults to 'cerebras'. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile based on the model name. | | `settings` | \`CerebrasModelSettings | None\` | Model-specific settings that will be used as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/models/cerebras.py` ```python def __init__( self, model_name: CerebrasModelName, *, provider: Literal['cerebras'] | Provider[AsyncOpenAI] = 'cerebras', profile: ModelProfileSpec | None = None, settings: CerebrasModelSettings | None = None, ): """Initialize a Cerebras model. Args: model_name: The name of the Cerebras model to use. provider: The provider to use. Defaults to 'cerebras'. profile: The model profile to use. Defaults to a profile based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ super().__init__(model_name, provider=provider, profile=profile, settings=settings) ``` # `pydantic_ai.models.cohere` ## Setup For details on how to set up authentication with this model, see [model configuration for Cohere](https://ai.pydantic.dev/models/cohere/index.md). ### LatestCohereModelNames ```python LatestCohereModelNames = Literal[ "c4ai-aya-expanse-32b", "c4ai-aya-expanse-8b", "command-nightly", "command-r-08-2024", "command-r-plus-08-2024", "command-r7b-12-2024", ] ``` Latest Cohere models. ### CohereModelName ```python CohereModelName = str | LatestCohereModelNames ``` Possible Cohere model names. 
Since Cohere supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [Cohere's docs](https://docs.cohere.com/v2/docs/models) for a list of all available models. ### CohereModelSettings Bases: `ModelSettings` Settings used for a Cohere model request. Source code in `pydantic_ai_slim/pydantic_ai/models/cohere.py` ```python class CohereModelSettings(ModelSettings, total=False): """Settings used for a Cohere model request.""" ``` ### CohereModel Bases: `Model` A model that uses the Cohere API. Internally, this uses the [Cohere Python client](https://github.com/cohere-ai/cohere-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/cohere.py` ```python @dataclass(init=False) class CohereModel(Model): """A model that uses the Cohere API. Internally, this uses the [Cohere Python client]( https://github.com/cohere-ai/cohere-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncClientV2 = field(repr=False) _model_name: CohereModelName = field(repr=False) _provider: Provider[AsyncClientV2] = field(repr=False) def __init__( self, model_name: CohereModelName, *, provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Cohere model. Args: model_name: The name of the Cohere model to use. List of model names available [here](https://docs.cohere.com/docs/models#command). provider: The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of `Provider[AsyncClientV2]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: client_wrapper = self.client._client_wrapper # type: ignore return str(client_wrapper.get_base_url()) @property def model_name(self) -> CohereModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._chat(messages, cast(CohereModelSettings, model_settings or {}), model_request_parameters) model_response = self._process_response(response) return model_response async def _chat( self, messages: list[ModelMessage], model_settings: CohereModelSettings, model_request_parameters: ModelRequestParameters, ) -> V2ChatResponse: tools = self._get_tools(model_request_parameters) cohere_messages = self._map_messages(messages, model_request_parameters) try: return await self.client.chat( model=self._model_name, messages=cohere_messages, tools=tools or OMIT, max_tokens=model_settings.get('max_tokens', OMIT), stop_sequences=model_settings.get('stop_sequences', OMIT), temperature=model_settings.get('temperature', OMIT), p=model_settings.get('top_p', OMIT), seed=model_settings.get('seed', OMIT), presence_penalty=model_settings.get('presence_penalty', OMIT), frequency_penalty=model_settings.get('frequency_penalty', OMIT), ) except ApiError as e: if (status_code := e.status_code) and status_code >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise ModelAPIError(model_name=self.model_name, message=str(e)) from e def _process_response(self, response: V2ChatResponse) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" parts: list[ModelResponsePart] = [] if response.message.content is not None: for content in response.message.content: if content.type == 'text': parts.append(TextPart(content=content.text)) elif content.type == 'thinking': # pragma: no branch parts.append(ThinkingPart(content=content.thinking)) for c in response.message.tool_calls or []: if c.function and c.function.name and c.function.arguments: # pragma: no branch parts.append( ToolCallPart( tool_name=c.function.name, args=c.function.arguments, tool_call_id=c.id or _generate_tool_call_id(), ) ) raw_finish_reason = response.finish_reason provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=parts, usage=_map_usage(response), model_name=self._model_name, provider_name=self._provider.name, provider_url=self.base_url, finish_reason=finish_reason, provider_details=provider_details, ) def _map_messages( self, messages: list[ModelMessage], model_request_parameters: ModelRequestParameters ) -> list[ChatMessageV2]: """Just maps a `pydantic_ai.Message` to a `cohere.ChatMessageV2`.""" cohere_messages: list[ChatMessageV2] = [] for message in messages: if isinstance(message, ModelRequest): cohere_messages.extend(self._map_user_message(message)) elif isinstance(message, ModelResponse): texts: list[str] = [] 
thinking: list[str] = [] tool_calls: list[ToolCallV2] = [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif isinstance(item, ThinkingPart): thinking.append(item.content) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # This is currently never returned from cohere pass elif isinstance(item, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. pass else: assert_never(item) message_param = AssistantChatMessageV2(role='assistant') if texts or thinking: contents: list[AssistantMessageV2ContentItem] = [] if thinking: contents.append(ThinkingAssistantMessageV2ContentItem(thinking='\n\n'.join(thinking))) if texts: # pragma: no branch contents.append(TextAssistantMessageV2ContentItem(text='\n\n'.join(texts))) message_param.content = contents if tool_calls: message_param.tool_calls = tool_calls cohere_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages, model_request_parameters): system_prompt_count = sum(1 for m in cohere_messages if isinstance(m, SystemChatMessageV2)) cohere_messages.insert(system_prompt_count, SystemChatMessageV2(role='system', content=instructions)) return cohere_messages def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[ToolV2]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] @staticmethod def _map_tool_call(t: ToolCallPart) -> ToolCallV2: return ToolCallV2( id=_guard_tool_call_id(t=t), type='function', function=ToolCallV2Function( name=t.tool_name, arguments=t.args_as_json_str(), ), ) @staticmethod def _map_tool_definition(f: ToolDefinition) -> ToolV2: return ToolV2( type='function', function=ToolV2Function( name=f.name, description=f.description, parameters=f.parameters_json_schema, ), ) @classmethod def _map_user_message(cls, message: ModelRequest) -> Iterable[ChatMessageV2]: for part in message.parts: if isinstance(part, SystemPromptPart): yield SystemChatMessageV2(role='system', content=part.content) elif isinstance(part, UserPromptPart): if isinstance(part.content, str): yield UserChatMessageV2(role='user', content=part.content) else: raise RuntimeError('Cohere does not yet support multi-modal inputs.') elif isinstance(part, ToolReturnPart): yield ToolChatMessageV2( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield UserChatMessageV2(role='user', content=part.model_response()) # pragma: no cover else: yield ToolChatMessageV2( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response(), ) else: assert_never(part) ``` #### __init__ ```python __init__( model_name: CohereModelName, *, provider: ( Literal["cohere"] | Provider[AsyncClientV2] ) = "cohere", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize an Cohere model. 
Parameters: | Name | Type | Description | Default | | ------------ | ------------------- | ------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model_name` | `CohereModelName` | The name of the Cohere model to use. List of model names available here. | *required* | | `provider` | \`Literal['cohere'] | Provider[AsyncClientV2]\` | The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of Provider[AsyncClientV2]. If not provided, a new provider will be created using the other parameters. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | | `settings` | \`ModelSettings | None\` | Model-specific settings that will be used as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/models/cohere.py` ```python def __init__( self, model_name: CohereModelName, *, provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Cohere model. Args: model_name: The name of the Cohere model to use. List of model names available [here](https://docs.cohere.com/docs/models#command). provider: The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of `Provider[AsyncClientV2]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: CohereModelName ``` The model name. #### system ```python system: str ``` The model provider. # pydantic_ai.models.fallback ### FallbackModel Bases: `Model` A model that uses one or more fallback models upon failure. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python @dataclass(init=False) class FallbackModel(Model): """A model that uses one or more fallback models upon failure. Apart from `__init__`, all methods are private or match those of the base class. """ models: list[Model] _model_name: str = field(repr=False) _fallback_on: Callable[[Exception], bool] def __init__( self, default_model: Model | KnownModelName | str, *fallback_models: Model | KnownModelName | str, fallback_on: Callable[[Exception], bool] | tuple[type[Exception], ...] = (ModelAPIError,), ): """Initialize a fallback model instance. Args: default_model: The name or instance of the default model to use. fallback_models: The names or instances of the fallback models to use upon failure. fallback_on: A callable or tuple of exceptions that should trigger a fallback. 
""" super().__init__() self.models = [infer_model(default_model), *[infer_model(m) for m in fallback_models]] if isinstance(fallback_on, tuple): self._fallback_on = _default_fallback_condition_factory(fallback_on) # pyright: ignore[reportUnknownArgumentType] else: self._fallback_on = fallback_on @property def model_name(self) -> str: """The model name.""" return f'fallback:{",".join(model.model_name for model in self.models)}' @property def system(self) -> str: return f'fallback:{",".join(model.system for model in self.models)}' @property def base_url(self) -> str | None: return self.models[0].base_url async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Try each model in sequence until one succeeds. In case of failure, raise a FallbackExceptionGroup with all exceptions. """ exceptions: list[Exception] = [] for model in self.models: try: _, prepared_parameters = model.prepare_request(model_settings, model_request_parameters) response = await model.request(messages, model_settings, model_request_parameters) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc self._set_span_attributes(model, prepared_parameters) return response raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Try each model in sequence until one succeeds.""" exceptions: list[Exception] = [] for model in self.models: async with AsyncExitStack() as stack: try: _, prepared_parameters = model.prepare_request(model_settings, model_request_parameters) response = await stack.enter_async_context( model.request_stream(messages, model_settings, model_request_parameters, run_context) ) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc # pragma: no cover self._set_span_attributes(model, prepared_parameters) yield response return raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) @cached_property def profile(self) -> ModelProfile: raise NotImplementedError('FallbackModel does not have its own model profile.') def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters: return model_request_parameters # pragma: no cover def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters ) -> tuple[ModelSettings | None, ModelRequestParameters]: return model_settings, model_request_parameters def _set_span_attributes(self, model: Model, model_request_parameters: ModelRequestParameters): with suppress(Exception): span = get_current_span() if span.is_recording(): attributes = getattr(span, 'attributes', {}) if attributes.get('gen_ai.request.model') == self.model_name: # pragma: no branch span.set_attributes( { **InstrumentedModel.model_attributes(model), **InstrumentedModel.model_request_parameters_attributes(model_request_parameters), } ) ``` #### __init__ ```python __init__( default_model: Model | KnownModelName | str, *fallback_models: Model | KnownModelName | str, fallback_on: ( Callable[[Exception], bool] | tuple[type[Exception], ...] ) = (ModelAPIError,) ) ``` Initialize a fallback model instance. 
Parameters: | Name | Type | Description | Default | | ----------------- | ------------------------------- | ------------------------------- | ----------------------------------------------------------------- | | `default_model` | \`Model | KnownModelName | str\` | The name or instance of the default model to use. | | `fallback_models` | \`Model | KnownModelName | str\` | The names or instances of the fallback models to use upon failure. | | `fallback_on` | \`Callable\[[Exception], bool\] | tuple\[type[Exception], ...\]\` | A callable or tuple of exceptions that should trigger a fallback. | Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python def __init__( self, default_model: Model | KnownModelName | str, *fallback_models: Model | KnownModelName | str, fallback_on: Callable[[Exception], bool] | tuple[type[Exception], ...] = (ModelAPIError,), ): """Initialize a fallback model instance. Args: default_model: The name or instance of the default model to use. fallback_models: The names or instances of the fallback models to use upon failure. fallback_on: A callable or tuple of exceptions that should trigger a fallback. """ super().__init__() self.models = [infer_model(default_model), *[infer_model(m) for m in fallback_models]] if isinstance(fallback_on, tuple): self._fallback_on = _default_fallback_condition_factory(fallback_on) # pyright: ignore[reportUnknownArgumentType] else: self._fallback_on = fallback_on ``` #### model_name ```python model_name: str ``` The model name. #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Try each model in sequence until one succeeds. In case of failure, raise a FallbackExceptionGroup with all exceptions. Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Try each model in sequence until one succeeds. In case of failure, raise a FallbackExceptionGroup with all exceptions. """ exceptions: list[Exception] = [] for model in self.models: try: _, prepared_parameters = model.prepare_request(model_settings, model_request_parameters) response = await model.request(messages, model_settings, model_request_parameters) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc self._set_span_attributes(model, prepared_parameters) return response raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Try each model in sequence until one succeeds.
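In practice the fallback chain is built once and handed to an `Agent`; the sketch below is illustrative (the model name strings reuse examples from elsewhere in these docs, and the narrowed `fallback_on` tuple is just one option).

```python
from pydantic_ai import Agent
from pydantic_ai.exceptions import ModelHTTPError
from pydantic_ai.models.fallback import FallbackModel

# Try the first model, falling back to the second if an exception matching
# `fallback_on` (by default, any ModelAPIError) is raised.
fallback_model = FallbackModel('openai:gpt-5.2', 'anthropic:claude-sonnet-4-5')

agent = Agent(fallback_model)
result = agent.run_sync('What is the capital of France?')
print(result.output)

# The fallback condition can be narrowed, e.g. to fall back only on HTTP errors:
picky_model = FallbackModel(
    'openai:gpt-5.2',
    'anthropic:claude-sonnet-4-5',
    fallback_on=(ModelHTTPError,),
)
```

If every model in the chain fails, a `FallbackExceptionGroup` containing all collected exceptions is raised, as shown in the `request` source above.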
Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Try each model in sequence until one succeeds.""" exceptions: list[Exception] = [] for model in self.models: async with AsyncExitStack() as stack: try: _, prepared_parameters = model.prepare_request(model_settings, model_request_parameters) response = await stack.enter_async_context( model.request_stream(messages, model_settings, model_request_parameters, run_context) ) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc # pragma: no cover self._set_span_attributes(model, prepared_parameters) yield response return raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) ``` # `pydantic_ai.models.function` A model controlled by a local function. FunctionModel is similar to [`TestModel`](https://ai.pydantic.dev/api/models/test/index.md), but allows greater control over the model's behavior. Its primary use case is for more advanced unit testing than is possible with `TestModel`. Here's a minimal example: [Learn about Gateway](https://ai.pydantic.dev/gateway) function_model_usage.py ```python from pydantic_ai import Agent from pydantic_ai import ModelMessage, ModelResponse, TextPart from pydantic_ai.models.function import FunctionModel, AgentInfo my_agent = Agent('gateway/openai:gpt-5.2') async def model_function( messages: list[ModelMessage], info: AgentInfo ) -> ModelResponse: print(messages) """ [ ModelRequest( parts=[ UserPromptPart( content='Testing my agent...', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ] """ print(info) """ AgentInfo( function_tools=[], allow_text_output=True, output_tools=[], model_settings=None, model_request_parameters=ModelRequestParameters( function_tools=[], builtin_tools=[], output_tools=[] ), instructions=None, ) """ return ModelResponse(parts=[TextPart('hello world')]) async def test_my_agent(): """Unit test for my_agent, to be run by pytest.""" with my_agent.override(model=FunctionModel(model_function)): result = await my_agent.run('Testing my agent...') assert result.output == 'hello world' ``` function_model_usage.py ```python from pydantic_ai import Agent from pydantic_ai import ModelMessage, ModelResponse, TextPart from pydantic_ai.models.function import FunctionModel, AgentInfo my_agent = Agent('openai:gpt-5.2') async def model_function( messages: list[ModelMessage], info: AgentInfo ) -> ModelResponse: print(messages) """ [ ModelRequest( parts=[ UserPromptPart( content='Testing my agent...', timestamp=datetime.datetime(...), ) ], timestamp=datetime.datetime(...), run_id='...', ) ] """ print(info) """ AgentInfo( function_tools=[], allow_text_output=True, output_tools=[], model_settings=None, model_request_parameters=ModelRequestParameters( function_tools=[], builtin_tools=[], output_tools=[] ), instructions=None, ) """ return ModelResponse(parts=[TextPart('hello world')]) async def test_my_agent(): """Unit test for my_agent, to be run by pytest.""" with my_agent.override(model=FunctionModel(model_function)): result = await my_agent.run('Testing my agent...') assert result.output == 'hello world' ``` See [Unit testing with `FunctionModel`](https://ai.pydantic.dev/testing/#unit-testing-with-functionmodel) for detailed 
documentation. ### FunctionModel Bases: `Model` A model controlled by a local function. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass(init=False) class FunctionModel(Model): """A model controlled by a local function. Apart from `__init__`, all methods are private or match those of the base class. """ function: FunctionDef | None stream_function: StreamFunctionDef | None _model_name: str = field(repr=False) _system: str = field(default='function', repr=False) @overload def __init__( self, function: FunctionDef, *, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... @overload def __init__( self, *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... @overload def __init__( self, function: FunctionDef, *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... def __init__( self, function: FunctionDef | None = None, *, stream_function: StreamFunctionDef | None = None, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a `FunctionModel`. Either `function` or `stream_function` must be provided, providing both is allowed. Args: function: The function to call for non-streamed requests. stream_function: The function to call for streamed requests. model_name: The name of the model. If not provided, a name is generated from the function names. profile: The model profile to use. settings: Model-specific settings that will be used as defaults for this model. 
""" if function is None and stream_function is None: raise TypeError('Either `function` or `stream_function` must be provided') self.function = function self.stream_function = stream_function function_name = self.function.__name__ if self.function is not None else '' stream_function_name = self.stream_function.__name__ if self.stream_function is not None else '' self._model_name = model_name or f'function:{function_name}:{stream_function_name}' # Use a default profile that supports JSON schema and object output if none provided if profile is None: profile = ModelProfile( supports_json_schema_output=True, supports_json_object_output=True, ) super().__init__(settings=settings, profile=profile) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) agent_info = AgentInfo( function_tools=model_request_parameters.function_tools, allow_text_output=model_request_parameters.allow_text_output, output_tools=model_request_parameters.output_tools, model_settings=model_settings, model_request_parameters=model_request_parameters, instructions=self._get_instructions(messages, model_request_parameters), ) assert self.function is not None, 'FunctionModel must receive a `function` to support non-streamed requests' if inspect.iscoroutinefunction(self.function): response = await self.function(messages, agent_info) else: response_ = await _utils.run_in_executor(self.function, messages, agent_info) assert isinstance(response_, ModelResponse), response_ response = response_ response.model_name = self._model_name # Add usage data if not already present if not response.usage.has_values(): # pragma: no branch response.usage = _estimate_usage(chain(messages, [response])) return response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) agent_info = AgentInfo( function_tools=model_request_parameters.function_tools, allow_text_output=model_request_parameters.allow_text_output, output_tools=model_request_parameters.output_tools, model_settings=model_settings, model_request_parameters=model_request_parameters, instructions=self._get_instructions(messages, model_request_parameters), ) assert self.stream_function is not None, ( 'FunctionModel must receive a `stream_function` to support streamed requests' ) response_stream = PeekableAsyncStream(self.stream_function(messages, agent_info)) first = await response_stream.peek() if isinstance(first, _utils.Unset): raise ValueError('Stream function must return at least one item') yield FunctionStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self._model_name, _iter=response_stream, ) @property def model_name(self) -> str: """The model name.""" return self._model_name @property def system(self) -> str: """The system / model provider.""" return self._system @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """FunctionModel supports all builtin tools for testing flexibility.""" from ..builtin_tools import SUPPORTED_BUILTIN_TOOLS return SUPPORTED_BUILTIN_TOOLS ``` #### __init__ ```python 
__init__( function: FunctionDef, *, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( function: FunctionDef, *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( function: FunctionDef | None = None, *, stream_function: StreamFunctionDef | None = None, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a `FunctionModel`. Either `function` or `stream_function` must be provided, providing both is allowed. Parameters: | Name | Type | Description | Default | | ----------------- | ------------------- | ----------- | ------------------------------------------------------------------------------------ | | `function` | \`FunctionDef | None\` | The function to call for non-streamed requests. | | `stream_function` | \`StreamFunctionDef | None\` | The function to call for streamed requests. | | `model_name` | \`str | None\` | The name of the model. If not provided, a name is generated from the function names. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. | | `settings` | \`ModelSettings | None\` | Model-specific settings that will be used as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python def __init__( self, function: FunctionDef | None = None, *, stream_function: StreamFunctionDef | None = None, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a `FunctionModel`. Either `function` or `stream_function` must be provided, providing both is allowed. Args: function: The function to call for non-streamed requests. stream_function: The function to call for streamed requests. model_name: The name of the model. If not provided, a name is generated from the function names. profile: The model profile to use. settings: Model-specific settings that will be used as defaults for this model. """ if function is None and stream_function is None: raise TypeError('Either `function` or `stream_function` must be provided') self.function = function self.stream_function = stream_function function_name = self.function.__name__ if self.function is not None else '' stream_function_name = self.stream_function.__name__ if self.stream_function is not None else '' self._model_name = model_name or f'function:{function_name}:{stream_function_name}' # Use a default profile that supports JSON schema and object output if none provided if profile is None: profile = ModelProfile( supports_json_schema_output=True, supports_json_object_output=True, ) super().__init__(settings=settings, profile=profile) ``` #### model_name ```python model_name: str ``` The model name. #### system ```python system: str ``` The system / model provider. #### supported_builtin_tools ```python supported_builtin_tools() -> ( frozenset[type[AbstractBuiltinTool]] ) ``` FunctionModel supports all builtin tools for testing flexibility. 
Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """FunctionModel supports all builtin tools for testing flexibility.""" from ..builtin_tools import SUPPORTED_BUILTIN_TOOLS return SUPPORTED_BUILTIN_TOOLS ``` ### AgentInfo Information about an agent. This is passed as the second to functions used within FunctionModel. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass(frozen=True, kw_only=True) class AgentInfo: """Information about an agent. This is passed as the second to functions used within [`FunctionModel`][pydantic_ai.models.function.FunctionModel]. """ function_tools: list[ToolDefinition] """The function tools available on this agent. These are the tools registered via the [`tool`][pydantic_ai.agent.Agent.tool] and [`tool_plain`][pydantic_ai.agent.Agent.tool_plain] decorators. """ allow_text_output: bool """Whether a plain text output is allowed.""" output_tools: list[ToolDefinition] """The tools that can called to produce the final output of the run.""" model_settings: ModelSettings | None """The model settings passed to the run call.""" model_request_parameters: ModelRequestParameters """The model request parameters passed to the run call.""" instructions: str | None """The instructions passed to model.""" ``` #### function_tools ```python function_tools: list[ToolDefinition] ``` The function tools available on this agent. These are the tools registered via the tool and tool_plain decorators. #### allow_text_output ```python allow_text_output: bool ``` Whether a plain text output is allowed. #### output_tools ```python output_tools: list[ToolDefinition] ``` The tools that can called to produce the final output of the run. #### model_settings ```python model_settings: ModelSettings | None ``` The model settings passed to the run call. #### model_request_parameters ```python model_request_parameters: ModelRequestParameters ``` The model request parameters passed to the run call. #### instructions ```python instructions: str | None ``` The instructions passed to model. ### DeltaToolCall Incremental change to a tool call. Used to describe a chunk when streaming structured responses. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass class DeltaToolCall: """Incremental change to a tool call. Used to describe a chunk when streaming structured responses. """ name: str | None = None """Incremental change to the name of the tool.""" json_args: str | None = None """Incremental change to the arguments as JSON""" _: KW_ONLY tool_call_id: str | None = None """Incremental change to the tool call ID.""" ``` #### name ```python name: str | None = None ``` Incremental change to the name of the tool. #### json_args ```python json_args: str | None = None ``` Incremental change to the arguments as JSON #### tool_call_id ```python tool_call_id: str | None = None ``` Incremental change to the tool call ID. ### DeltaThinkingPart Incremental change to a thinking part. Used to describe a chunk when streaming thinking responses. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass(kw_only=True) class DeltaThinkingPart: """Incremental change to a thinking part. Used to describe a chunk when streaming thinking responses. 
""" content: str | None = None """Incremental change to the thinking content.""" signature: str | None = None """Incremental change to the thinking signature.""" ``` #### content ```python content: str | None = None ``` Incremental change to the thinking content. #### signature ```python signature: str | None = None ``` Incremental change to the thinking signature. ### DeltaToolCalls ```python DeltaToolCalls: TypeAlias = dict[int, DeltaToolCall] ``` A mapping of tool call IDs to incremental changes. ### DeltaThinkingCalls ```python DeltaThinkingCalls: TypeAlias = dict[int, DeltaThinkingPart] ``` A mapping of thinking call IDs to incremental changes. ### FunctionDef ```python FunctionDef: TypeAlias = Callable[ [list[ModelMessage], AgentInfo], ModelResponse | Awaitable[ModelResponse], ] ``` A function used to generate a non-streamed response. ### StreamFunctionDef ```python StreamFunctionDef: TypeAlias = Callable[ [list[ModelMessage], AgentInfo], AsyncIterator[ str | DeltaToolCalls | DeltaThinkingCalls | BuiltinToolCallsReturns ], ] ``` A function used to generate a streamed response. While this is defined as having return type of `AsyncIterator[str | DeltaToolCalls | DeltaThinkingCalls | BuiltinTools]`, it should really be considered as `AsyncIterator[str] | AsyncIterator[DeltaToolCalls] | AsyncIterator[DeltaThinkingCalls]`, E.g. you need to yield all text, all `DeltaToolCalls`, all `DeltaThinkingCalls`, or all `BuiltinToolCallsReturns`, not mix them. ### FunctionStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for FunctionModel. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass class FunctionStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for [FunctionModel][pydantic_ai.models.function.FunctionModel].""" _model_name: str _iter: AsyncIterator[str | DeltaToolCalls | DeltaThinkingCalls | BuiltinToolCallsReturns] _timestamp: datetime = field(default_factory=_utils.now_utc) def __post_init__(self): self._usage += _estimate_usage([]) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 async for item in self._iter: if isinstance(item, str): response_tokens = _estimate_string_tokens(item) self._usage += usage.RequestUsage(output_tokens=response_tokens) for event in self._parts_manager.handle_text_delta(vendor_part_id='content', content=item): yield event elif isinstance(item, dict) and item: for dtc_index, delta in item.items(): if isinstance(delta, DeltaThinkingPart): if delta.content: # pragma: no branch response_tokens = _estimate_string_tokens(delta.content) self._usage += usage.RequestUsage(output_tokens=response_tokens) for event in self._parts_manager.handle_thinking_delta( vendor_part_id=dtc_index, content=delta.content, signature=delta.signature, provider_name='function' if delta.signature else None, ): yield event elif isinstance(delta, DeltaToolCall): if delta.json_args: response_tokens = _estimate_string_tokens(delta.json_args) self._usage += usage.RequestUsage(output_tokens=response_tokens) maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=dtc_index, tool_name=delta.name, args=delta.json_args, tool_call_id=delta.tool_call_id, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(delta, BuiltinToolCallPart): if content := delta.args_as_json_str(): # pragma: no branch response_tokens = _estimate_string_tokens(content) self._usage += usage.RequestUsage(output_tokens=response_tokens) yield 
self._parts_manager.handle_part(vendor_part_id=dtc_index, part=delta) elif isinstance(delta, BuiltinToolReturnPart): if content := delta.model_response_str(): # pragma: no branch response_tokens = _estimate_string_tokens(content) self._usage += usage.RequestUsage(output_tokens=response_tokens) yield self._parts_manager.handle_part(vendor_part_id=dtc_index, part=delta) else: assert_never(delta) @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> None: """Get the provider name.""" return None @property def provider_url(self) -> None: """Get the provider base URL.""" return None @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: None ``` Get the provider name. #### provider_url ```python provider_url: None ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.google` Interface that uses the [`google-genai`](https://pypi.org/project/google-genai/) package under the hood to access Google's Gemini models via both the Generative Language API and Vertex AI. ## Setup For details on how to set up authentication with this model, see [model configuration for Google](https://ai.pydantic.dev/models/google/index.md). ### LatestGoogleModelNames ```python LatestGoogleModelNames = Literal[ "gemini-flash-latest", "gemini-flash-lite-latest", "gemini-2.0-flash", "gemini-2.0-flash-lite", "gemini-2.5-flash", "gemini-2.5-flash-preview-09-2025", "gemini-2.5-flash-image", "gemini-2.5-flash-lite", "gemini-2.5-flash-lite-preview-09-2025", "gemini-2.5-pro", "gemini-3-flash-preview", "gemini-3-pro-preview", "gemini-3-pro-image-preview", ] ``` Latest Gemini models. ### GoogleModelName ```python GoogleModelName = str | LatestGoogleModelNames ``` Possible Gemini model names. Since Gemini supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Gemini API docs](https://ai.google.dev/gemini-api/docs/models/gemini#model-variations) for a full list. ### GoogleModelSettings Bases: `ModelSettings` Settings used for a Gemini model request. Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python class GoogleModelSettings(ModelSettings, total=False): """Settings used for a Gemini model request.""" # ALL FIELDS MUST BE `gemini_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. google_safety_settings: list[SafetySettingDict] """The safety settings to use for the model. See for more information. """ google_thinking_config: ThinkingConfigDict """The thinking configuration to use for the model. See for more information. """ google_labels: dict[str, str] """User-defined metadata to break down billed charges. Only supported by the Vertex AI API. See the [Gemini API docs](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls) for use cases and limitations. """ google_video_resolution: MediaResolution """The video resolution to use for the model. See for more information. """ google_cached_content: str """The name of the cached content to use for the model. See for more information. """ ``` #### google_safety_settings ```python google_safety_settings: list[SafetySettingDict] ``` The safety settings to use for the model. See for more information. 
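For example, safety settings can be supplied per agent or per run via `GoogleModelSettings`; a hedged sketch follows (the enum values come from `google.genai.types`, and the model name is just one entry from the list above).

```python
from google.genai.types import HarmBlockThreshold, HarmCategory

from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings

model = GoogleModel('gemini-2.5-flash')

agent = Agent(
    model,
    model_settings=GoogleModelSettings(
        google_safety_settings=[
            {
                # Block content categorized as hate speech at a low threshold and above.
                'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH,
                'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            }
        ],
    ),
)

result = agent.run_sync('Write a short, friendly greeting.')
print(result.output)
```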
#### google_thinking_config ```python google_thinking_config: ThinkingConfigDict ``` The thinking configuration to use for the model. See for more information. #### google_labels ```python google_labels: dict[str, str] ``` User-defined metadata to break down billed charges. Only supported by the Vertex AI API. See the [Gemini API docs](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls) for use cases and limitations. #### google_video_resolution ```python google_video_resolution: MediaResolution ``` The video resolution to use for the model. See for more information. #### google_cached_content ```python google_cached_content: str ``` The name of the cached content to use for the model. See for more information. ### GoogleModel Bases: `Model` A model that uses Gemini via `generativelanguage.googleapis.com` API. This is implemented from scratch rather than using a dedicated SDK, good API documentation is available [here](https://ai.google.dev/api). Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python @dataclass(init=False) class GoogleModel(Model): """A model that uses Gemini via `generativelanguage.googleapis.com` API. This is implemented from scratch rather than using a dedicated SDK, good API documentation is available [here](https://ai.google.dev/api). Apart from `__init__`, all methods are private or match those of the base class. """ client: Client = field(repr=False) _model_name: GoogleModelName = field(repr=False) _provider: Provider[Client] = field(repr=False) def __init__( self, model_name: GoogleModelName, *, provider: Literal['google-gla', 'google-vertex', 'gateway'] | Provider[Client] = 'google-gla', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Gemini model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'google-gla' or 'google-vertex' or an instance of `Provider[google.genai.AsyncClient]`. Defaults to 'google-gla'. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: The model settings to use. Defaults to None. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/google-vertex' if provider == 'gateway' else provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return self._provider.base_url @property def model_name(self) -> GoogleModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """Return the set of builtin tool types this model can handle.""" return frozenset({WebSearchTool, CodeExecutionTool, FileSearchTool, WebFetchTool, ImageGenerationTool}) def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters ) -> tuple[ModelSettings | None, ModelRequestParameters]: supports_native_output_with_builtin_tools = GoogleModelProfile.from_profile( self.profile ).google_supports_native_output_with_builtin_tools if model_request_parameters.builtin_tools and model_request_parameters.output_tools: if model_request_parameters.output_mode == 'auto': output_mode = 'native' if supports_native_output_with_builtin_tools else 'prompted' model_request_parameters = replace(model_request_parameters, output_mode=output_mode) else: output_mode = 'NativeOutput' if supports_native_output_with_builtin_tools else 'PromptedOutput' raise UserError( f'Google does not support output tools and built-in tools at the same time. Use `output_type={output_mode}(...)` instead.' ) return super().prepare_request(model_settings, model_request_parameters) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) model_settings = cast(GoogleModelSettings, model_settings or {}) response = await self._generate_content(messages, False, model_settings, model_request_parameters) return self._process_response(response) async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> usage.RequestUsage: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) model_settings = cast(GoogleModelSettings, model_settings or {}) contents, generation_config = await self._build_content_and_config( messages, model_settings, model_request_parameters ) # Annoyingly, the type of `GenerateContentConfigDict.get` is "partially `Unknown`" because `response_schema` includes `typing._UnionGenericAlias`, # so without this we'd need `pyright: ignore[reportUnknownMemberType]` on every line and wouldn't get type checking anyway. 
generation_config = cast(dict[str, Any], generation_config) config = CountTokensConfigDict( http_options=generation_config.get('http_options'), ) if self._provider.name != 'google-gla': # The fields are not supported by the Gemini API per https://github.com/googleapis/python-genai/blob/7e4ec284dc6e521949626f3ed54028163ef9121d/google/genai/models.py#L1195-L1214 config.update( # pragma: lax no cover system_instruction=generation_config.get('system_instruction'), tools=cast(list[ToolDict], generation_config.get('tools')), # Annoyingly, GenerationConfigDict has fewer fields than GenerateContentConfigDict, and no extra fields are allowed. generation_config=GenerationConfigDict( temperature=generation_config.get('temperature'), top_p=generation_config.get('top_p'), max_output_tokens=generation_config.get('max_output_tokens'), stop_sequences=generation_config.get('stop_sequences'), presence_penalty=generation_config.get('presence_penalty'), frequency_penalty=generation_config.get('frequency_penalty'), seed=generation_config.get('seed'), thinking_config=generation_config.get('thinking_config'), media_resolution=generation_config.get('media_resolution'), response_mime_type=generation_config.get('response_mime_type'), response_json_schema=generation_config.get('response_json_schema'), ), ) response = await self.client.aio.models.count_tokens( model=self._model_name, contents=contents, config=config, ) if response.total_tokens is None: raise UnexpectedModelBehavior( # pragma: no cover 'Total tokens missing from Gemini response', str(response) ) return usage.RequestUsage( input_tokens=response.total_tokens, ) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) model_settings = cast(GoogleModelSettings, model_settings or {}) response = await self._generate_content(messages, True, model_settings, model_request_parameters) yield await self._process_streamed_response(response, model_request_parameters) # type: ignore def _build_image_config(self, tool: ImageGenerationTool) -> ImageConfigDict: """Build ImageConfigDict from ImageGenerationTool with validation.""" image_config = ImageConfigDict() if tool.aspect_ratio is not None: image_config['aspect_ratio'] = tool.aspect_ratio if tool.size is not None: if tool.size not in _GOOGLE_IMAGE_SIZES: raise UserError( f'Google image generation only supports `size` values: {_GOOGLE_IMAGE_SIZES}. ' f'Got: {tool.size!r}. Omit `size` to use the default (1K).' ) image_config['image_size'] = tool.size if self.system == 'google-vertex': if tool.output_format is not None: if tool.output_format not in _GOOGLE_IMAGE_OUTPUT_FORMATS: raise UserError( f'Google image generation only supports `output_format` values: {_GOOGLE_IMAGE_OUTPUT_FORMATS}. ' f'Got: {tool.output_format!r}.' ) image_config['output_mime_type'] = f'image/{tool.output_format}' output_compression = tool.output_compression if output_compression is not None: if not (0 <= output_compression <= 100): raise UserError( f'Google image generation `output_compression` must be between 0 and 100. ' f'Got: {output_compression}.' ) if tool.output_format not in (None, 'jpeg'): raise UserError( f'Google image generation `output_compression` is only supported for JPEG format. 
' f'Got format: {tool.output_format!r}. Either set `output_format="jpeg"` or remove `output_compression`.' ) image_config['output_compression_quality'] = output_compression if tool.output_format is None: image_config['output_mime_type'] = 'image/jpeg' return image_config def _get_tools( self, model_request_parameters: ModelRequestParameters ) -> tuple[list[ToolDict] | None, ImageConfigDict | None]: tools: list[ToolDict] = [ ToolDict(function_declarations=[_function_declaration_from_tool(t)]) for t in model_request_parameters.tool_defs.values() ] image_config: ImageConfigDict | None = None if model_request_parameters.builtin_tools: if model_request_parameters.function_tools: raise UserError('Google does not support function tools and built-in tools at the same time.') for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): tools.append(ToolDict(google_search=GoogleSearchDict())) elif isinstance(tool, WebFetchTool): tools.append(ToolDict(url_context=UrlContextDict())) elif isinstance(tool, CodeExecutionTool): tools.append(ToolDict(code_execution=ToolCodeExecutionDict())) elif isinstance(tool, FileSearchTool): file_search_config = FileSearchDict(file_search_store_names=list(tool.file_store_ids)) tools.append(ToolDict(file_search=file_search_config)) elif isinstance(tool, ImageGenerationTool): # pragma: no branch if not self.profile.supports_image_output: raise UserError( "`ImageGenerationTool` is not supported by this model. Use a model with 'image' in the name instead." ) image_config = self._build_image_config(tool) else: # pragma: no cover raise UserError( f'`{tool.__class__.__name__}` is not supported by `GoogleModel`. If it should be, please file an issue.' ) return tools or None, image_config def _get_tool_config( self, model_request_parameters: ModelRequestParameters, tools: list[ToolDict] | None ) -> ToolConfigDict | None: if not model_request_parameters.allow_text_output and tools: names: list[str] = [] for tool in tools: for function_declaration in tool.get('function_declarations') or []: if name := function_declaration.get('name'): # pragma: no branch names.append(name) return _tool_config(names) else: return None @overload async def _generate_content( self, messages: list[ModelMessage], stream: Literal[False], model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> GenerateContentResponse: ... @overload async def _generate_content( self, messages: list[ModelMessage], stream: Literal[True], model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> Awaitable[AsyncIterator[GenerateContentResponse]]: ... 
async def _generate_content( self, messages: list[ModelMessage], stream: bool, model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> GenerateContentResponse | Awaitable[AsyncIterator[GenerateContentResponse]]: contents, config = await self._build_content_and_config(messages, model_settings, model_request_parameters) func = self.client.aio.models.generate_content_stream if stream else self.client.aio.models.generate_content try: return await func(model=self._model_name, contents=contents, config=config) # type: ignore except errors.APIError as e: if (status_code := e.code) >= 400: raise ModelHTTPError( status_code=status_code, model_name=self._model_name, body=cast(Any, e.details), # pyright: ignore[reportUnknownMemberType] ) from e raise ModelAPIError(model_name=self._model_name, message=str(e)) from e async def _build_content_and_config( self, messages: list[ModelMessage], model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> tuple[list[ContentUnionDict], GenerateContentConfigDict]: tools, image_config = self._get_tools(model_request_parameters) if model_request_parameters.function_tools and not self.profile.supports_tools: raise UserError('Tools are not supported by this model.') response_mime_type = None response_schema = None if model_request_parameters.output_mode == 'native': if model_request_parameters.function_tools: raise UserError( 'Google does not support `NativeOutput` and function tools at the same time. Use `output_type=ToolOutput(...)` instead.' ) response_mime_type = 'application/json' output_object = model_request_parameters.output_object assert output_object is not None response_schema = self._map_response_schema(output_object) elif model_request_parameters.output_mode == 'prompted' and not tools: if not self.profile.supports_json_object_output: raise UserError('JSON output is not supported by this model.') response_mime_type = 'application/json' tool_config = self._get_tool_config(model_request_parameters, tools) system_instruction, contents = await self._map_messages(messages, model_request_parameters) modalities = [Modality.TEXT.value] if self.profile.supports_image_output: modalities.append(Modality.IMAGE.value) http_options: HttpOptionsDict = { 'headers': {'Content-Type': 'application/json', 'User-Agent': get_user_agent()} } if timeout := model_settings.get('timeout'): if isinstance(timeout, int | float): http_options['timeout'] = int(1000 * timeout) else: raise UserError('Google does not support setting ModelSettings.timeout to a httpx.Timeout') config = GenerateContentConfigDict( http_options=http_options, system_instruction=system_instruction, temperature=model_settings.get('temperature'), top_p=model_settings.get('top_p'), max_output_tokens=model_settings.get('max_tokens'), stop_sequences=model_settings.get('stop_sequences'), presence_penalty=model_settings.get('presence_penalty'), frequency_penalty=model_settings.get('frequency_penalty'), seed=model_settings.get('seed'), safety_settings=model_settings.get('google_safety_settings'), thinking_config=model_settings.get('google_thinking_config'), labels=model_settings.get('google_labels'), media_resolution=model_settings.get('google_video_resolution'), cached_content=model_settings.get('google_cached_content'), tools=cast(ToolListUnionDict, tools), tool_config=tool_config, response_mime_type=response_mime_type, response_json_schema=response_schema, response_modalities=modalities, image_config=image_config, ) return contents, config def 
_process_response(self, response: GenerateContentResponse) -> ModelResponse: candidate = response.candidates[0] if response.candidates else None vendor_id = response.response_id finish_reason: FinishReason | None = None vendor_details: dict[str, Any] = {} raw_finish_reason = candidate.finish_reason if candidate else None if raw_finish_reason and candidate: # pragma: no branch vendor_details = {'finish_reason': raw_finish_reason.value} # Add safety ratings to provider details if candidate.safety_ratings: vendor_details['safety_ratings'] = [r.model_dump(by_alias=True) for r in candidate.safety_ratings] finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) if response.create_time is not None: # pragma: no branch vendor_details['timestamp'] = response.create_time if candidate is None or candidate.content is None or candidate.content.parts is None: parts = [] else: parts = candidate.content.parts or [] usage = _metadata_as_usage(response, provider=self._provider.name, provider_url=self._provider.base_url) grounding_metadata = candidate.grounding_metadata if candidate else None url_context_metadata = candidate.url_context_metadata if candidate else None return _process_response_from_parts( parts, grounding_metadata, response.model_version or self._model_name, self._provider.name, self._provider.base_url, usage, vendor_id=vendor_id, vendor_details=vendor_details or None, finish_reason=finish_reason, url_context_metadata=url_context_metadata, ) async def _process_streamed_response( self, response: AsyncIterator[GenerateContentResponse], model_request_parameters: ModelRequestParameters ) -> StreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior('Streamed response ended without content or tool calls') # pragma: no cover return GeminiStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.model_version or self._model_name, _response=peekable_response, _provider_name=self._provider.name, _provider_url=self._provider.base_url, _provider_timestamp=first_chunk.create_time, ) async def _map_messages( # noqa: C901 self, messages: list[ModelMessage], model_request_parameters: ModelRequestParameters ) -> tuple[ContentDict | None, list[ContentUnionDict]]: contents: list[ContentUnionDict] = [] system_parts: list[PartDict] = [] for m in messages: if isinstance(m, ModelRequest): message_parts: list[PartDict] = [] for part in m.parts: if isinstance(part, SystemPromptPart): system_parts.append({'text': part.content}) elif isinstance(part, UserPromptPart): message_parts.extend(await self._map_user_prompt(part)) elif isinstance(part, ToolReturnPart): message_parts.append( { 'function_response': { 'name': part.tool_name, 'response': part.model_response_object(), 'id': part.tool_call_id, } } ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: message_parts.append({'text': part.model_response()}) else: message_parts.append( { 'function_response': { 'name': part.tool_name, 'response': {'error': part.model_response()}, 'id': part.tool_call_id, } } ) else: assert_never(part) # Work around a Gemini bug where content objects containing functionResponse parts are treated as # role=model even when role=user is explicitly specified. 
# # We build `message_parts` first, then split into multiple content objects whenever we transition # between function_response and non-function_response parts. # # TODO: Remove workaround when https://github.com/pydantic/pydantic-ai/issues/4210 is resolved if message_parts: content_parts: list[PartDict] = [] for part in message_parts: if ( content_parts and 'function_response' in content_parts[-1] and 'function_response' not in part ): contents.append({'role': 'user', 'parts': content_parts}) content_parts = [] content_parts.append(part) contents.append({'role': 'user', 'parts': content_parts}) elif isinstance(m, ModelResponse): maybe_content = _content_model_response(m, self.system) if maybe_content: contents.append(maybe_content) else: assert_never(m) # Google GenAI requires at least one user part in the message, and that function call turns # come immediately after a user turn or after a function response turn. if not contents or contents[0].get('role') == 'model': # pyright: ignore[reportAttributeAccessIssue, reportUnknownMemberType] contents.insert(0, {'role': 'user', 'parts': [{'text': ''}]}) if instructions := self._get_instructions(messages, model_request_parameters): system_parts.append({'text': instructions}) system_instruction = ContentDict(role='user', parts=system_parts) if system_parts else None return system_instruction, contents async def _map_user_prompt(self, part: UserPromptPart) -> list[PartDict]: if isinstance(part.content, str): return [{'text': part.content}] else: content: list[PartDict] = [] for item in part.content: if isinstance(item, str): content.append({'text': item}) elif isinstance(item, BinaryContent): inline_data_dict: BlobDict = {'data': item.data, 'mime_type': item.media_type} part_dict: PartDict = {'inline_data': inline_data_dict} if item.vendor_metadata: part_dict['video_metadata'] = cast(VideoMetadataDict, item.vendor_metadata) content.append(part_dict) elif isinstance(item, VideoUrl) and ( item.is_youtube or (item.url.startswith('gs://') and self.system == 'google-vertex') ): # YouTube URLs work on both google-gla and google-vertex # GCS URIs (gs://...) only work on google-vertex (Vertex AI can access GCS buckets) # GCS on google-gla falls through to FileUrl which raises clear error on download attempt # Other URLs fall through to FileUrl handling (download for google-gla) # Note: force_download is not checked here, mirroring the original YouTube behavior. # GCS URIs cannot be downloaded anyway ("gs://" protocol not supported for download). 
file_data_dict: FileDataDict = {'file_uri': item.url, 'mime_type': item.media_type} part_dict: PartDict = {'file_data': file_data_dict} if item.vendor_metadata: part_dict['video_metadata'] = cast(VideoMetadataDict, item.vendor_metadata) content.append(part_dict) elif isinstance(item, FileUrl): if item.force_download or ( # google-gla does not support passing file urls directly, except for youtube videos # (see above) and files uploaded to the file API (which cannot be downloaded anyway) self.system == 'google-gla' and not item.url.startswith(r'https://generativelanguage.googleapis.com/v1beta/files') ): downloaded_item = await download_item(item, data_format='bytes') inline_data: BlobDict = { 'data': downloaded_item['data'], 'mime_type': downloaded_item['data_type'], } part_dict: PartDict = {'inline_data': inline_data} # VideoUrl is a subclass of FileUrl - include video_metadata if present if isinstance(item, VideoUrl) and item.vendor_metadata: part_dict['video_metadata'] = cast(VideoMetadataDict, item.vendor_metadata) content.append(part_dict) else: file_data_dict: FileDataDict = {'file_uri': item.url, 'mime_type': item.media_type} part_dict: PartDict = {'file_data': file_data_dict} # VideoUrl is a subclass of FileUrl - include video_metadata if present if isinstance(item, VideoUrl) and item.vendor_metadata: part_dict['video_metadata'] = cast(VideoMetadataDict, item.vendor_metadata) content.append(part_dict) # pragma: lax no cover elif isinstance(item, CachePoint): # Google doesn't support inline CachePoint markers. Google's caching requires # pre-creating cache objects via the API, then referencing them by name using # `GoogleModelSettings.google_cached_content`. See https://ai.google.dev/gemini-api/docs/caching pass else: assert_never(item) return content def _map_response_schema(self, o: OutputObjectDefinition) -> dict[str, Any]: response_schema = o.json_schema.copy() if o.name: response_schema['title'] = o.name if o.description: response_schema['description'] = o.description return response_schema ``` #### __init__ ```python __init__( model_name: GoogleModelName, *, provider: ( Literal["google-gla", "google-vertex", "gateway"] | Provider[Client] ) = "google-gla", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Gemini model. Parameters: | Name | Type | Description | Default | | ------------ | --------------------------------------------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model_name` | `GoogleModelName` | The name of the model to use. | *required* | | `provider` | \`Literal['google-gla', 'google-vertex', 'gateway'] | Provider[Client]\` | The provider to use for authentication and API access. Can be either the string 'google-gla' or 'google-vertex' or an instance of Provider[google.genai.AsyncClient]. Defaults to 'google-gla'. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | | `settings` | \`ModelSettings | None\` | The model settings to use. Defaults to None. 
| Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python def __init__( self, model_name: GoogleModelName, *, provider: Literal['google-gla', 'google-vertex', 'gateway'] | Provider[Client] = 'google-gla', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Gemini model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'google-gla' or 'google-vertex' or an instance of `Provider[google.genai.AsyncClient]`. Defaults to 'google-gla'. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: The model settings to use. Defaults to None. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/google-vertex' if provider == 'gateway' else provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: GoogleModelName ``` The model name. #### system ```python system: str ``` The model provider. #### supported_builtin_tools ```python supported_builtin_tools() -> ( frozenset[type[AbstractBuiltinTool]] ) ``` Return the set of builtin tool types this model can handle. Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """Return the set of builtin tool types this model can handle.""" return frozenset({WebSearchTool, CodeExecutionTool, FileSearchTool, WebFetchTool, ImageGenerationTool}) ``` ### GeminiStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for the Gemini model. Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python @dataclass class GeminiStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for the Gemini model.""" _model_name: GoogleModelName _response: AsyncIterator[GenerateContentResponse] _provider_name: str _provider_url: str _provider_timestamp: datetime | None = None _timestamp: datetime = field(default_factory=_utils.now_utc) _file_search_tool_call_id: str | None = field(default=None, init=False) _code_execution_tool_call_id: str | None = field(default=None, init=False) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 if self._provider_timestamp is not None: self.provider_details = {'timestamp': self._provider_timestamp} async for chunk in self._response: self._usage = _metadata_as_usage(chunk, self._provider_name, self._provider_url) if not chunk.candidates: continue # pragma: no cover candidate = chunk.candidates[0] if chunk.response_id: # pragma: no branch self.provider_response_id = chunk.response_id raw_finish_reason = candidate.finish_reason if raw_finish_reason: self.provider_details = {'finish_reason': raw_finish_reason.value} if candidate.safety_ratings: self.provider_details['safety_ratings'] = [ r.model_dump(by_alias=True) for r in candidate.safety_ratings ] self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) # Google streams the grounding metadata (including the web search queries and results) # _after_ the text that was generated using it, so it would show up out of order in the stream, # and cause issues with the logic that doesn't consider text ahead of built-in tool calls as output. 
# If that gets fixed (or we have a workaround), we can uncomment this: # web_search_call, web_search_return = _map_grounding_metadata( # candidate.grounding_metadata, self.provider_name # ) # if web_search_call and web_search_return: # yield self._parts_manager.handle_part(vendor_part_id=uuid4(), part=web_search_call) # yield self._parts_manager.handle_part( # vendor_part_id=uuid4(), part=web_search_return # ) # URL context metadata (for WebFetchTool) is streamed in the first chunk, before the text, # so we can safely yield it here web_fetch_call, web_fetch_return = _map_url_context_metadata( candidate.url_context_metadata, self.provider_name ) if web_fetch_call and web_fetch_return: yield self._parts_manager.handle_part(vendor_part_id=uuid4(), part=web_fetch_call) yield self._parts_manager.handle_part(vendor_part_id=uuid4(), part=web_fetch_return) if candidate.content is None or candidate.content.parts is None: continue parts = candidate.content.parts if not parts: continue # pragma: no cover for part in parts: provider_details: dict[str, Any] | None = None if part.thought_signature: # Per https://ai.google.dev/gemini-api/docs/function-calling?example=meeting#thought-signatures: # - Always send the thought_signature back to the model inside its original Part. # - Don't merge a Part containing a signature with one that does not. This breaks the positional context of the thought. # - Don't combine two Parts that both contain signatures, as the signature strings cannot be merged. thought_signature = base64.b64encode(part.thought_signature).decode('utf-8') provider_details = {'thought_signature': thought_signature} if part.text is not None: if len(part.text) == 0 and not provider_details: continue if part.thought: for event in self._parts_manager.handle_thinking_delta( vendor_part_id=None, content=part.text, provider_name=self.provider_name if provider_details else None, provider_details=provider_details, ): yield event else: for event in self._parts_manager.handle_text_delta( vendor_part_id=None, content=part.text, provider_name=self.provider_name if provider_details else None, provider_details=provider_details, ): yield event elif part.function_call: maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=uuid4(), tool_name=part.function_call.name, args=part.function_call.args, tool_call_id=part.function_call.id, provider_name=self.provider_name if provider_details else None, provider_details=provider_details, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif part.inline_data is not None: if part.thought: # pragma: no cover # Per https://ai.google.dev/gemini-api/docs/image-generation#thinking-process: # > The model generates up to two interim images to test composition and logic. The last image within Thinking is also the final rendered image. 
# We currently don't expose these image thoughts as they can't be represented with `ThinkingPart` continue data = part.inline_data.data mime_type = part.inline_data.mime_type assert data and mime_type, 'Inline data must have data and mime type' content = BinaryContent(data=data, media_type=mime_type) yield self._parts_manager.handle_part( vendor_part_id=uuid4(), part=FilePart( content=BinaryContent.narrow_type(content), provider_name=self.provider_name if provider_details else None, provider_details=provider_details, ), ) elif part.executable_code is not None: part_obj = self._handle_executable_code_streaming(part.executable_code) part_obj.provider_details = provider_details yield self._parts_manager.handle_part(vendor_part_id=uuid4(), part=part_obj) elif part.code_execution_result is not None: part = self._map_code_execution_result(part.code_execution_result) part.provider_details = provider_details yield self._parts_manager.handle_part(vendor_part_id=uuid4(), part=part) else: assert part.function_response is not None, f'Unexpected part: {part}' # pragma: no cover # Grounding metadata is attached to the final text chunk, so # we emit the `BuiltinToolReturnPart` after the text delta so # that the delta is properly added to the same `TextPart` as earlier chunks file_search_part = self._handle_file_search_grounding_metadata_streaming(candidate.grounding_metadata) if file_search_part is not None: yield self._parts_manager.handle_part(vendor_part_id=uuid4(), part=file_search_part) def _handle_file_search_grounding_metadata_streaming( self, grounding_metadata: GroundingMetadata | None ) -> BuiltinToolReturnPart | None: """Handle file search grounding metadata for streaming responses. Returns a BuiltinToolReturnPart if file search results are available in the grounding metadata. """ if not self._file_search_tool_call_id or not grounding_metadata: return None grounding_chunks = grounding_metadata.grounding_chunks retrieved_contexts = _extract_file_search_retrieved_contexts(grounding_chunks) if retrieved_contexts: # pragma: no branch part = BuiltinToolReturnPart( provider_name=self.provider_name, tool_name=FileSearchTool.kind, tool_call_id=self._file_search_tool_call_id, content=retrieved_contexts, ) self._file_search_tool_call_id = None return part return None # pragma: no cover def _map_code_execution_result(self, code_execution_result: CodeExecutionResult) -> BuiltinToolReturnPart: """Map code execution result to a BuiltinToolReturnPart using instance state.""" assert self._code_execution_tool_call_id is not None return _map_code_execution_result(code_execution_result, self.provider_name, self._code_execution_tool_call_id) def _handle_executable_code_streaming(self, executable_code: ExecutableCode) -> ModelResponsePart: """Handle executable code for streaming responses. Returns a BuiltinToolCallPart for file search or code execution. Sets self._code_execution_tool_call_id or self._file_search_tool_call_id as appropriate. 
""" code = executable_code.code has_file_search_tool = any( isinstance(tool, FileSearchTool) for tool in self.model_request_parameters.builtin_tools ) if code and has_file_search_tool and (file_search_query := self._extract_file_search_query(code)): self._file_search_tool_call_id = _utils.generate_tool_call_id() return BuiltinToolCallPart( provider_name=self.provider_name, tool_name=FileSearchTool.kind, tool_call_id=self._file_search_tool_call_id, args={'query': file_search_query}, ) self._code_execution_tool_call_id = _utils.generate_tool_call_id() return _map_executable_code(executable_code, self.provider_name, self._code_execution_tool_call_id) def _extract_file_search_query(self, code: str) -> str | None: """Extract the query from file_search.query() executable code. Handles escaped quotes in the query string. Example: 'print(file_search.query(query="what is the capital of France?"))' Returns: 'what is the capital of France?' """ match = _FILE_SEARCH_QUERY_PATTERN.search(code) if match: query = match.group(2) query = query.replace('\\\\', '\\').replace('\\"', '"').replace("\\'", "'") return query return None # pragma: no cover @property def model_name(self) -> GoogleModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def provider_url(self) -> str: """Get the provider base URL.""" return self._provider_url @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: GoogleModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### provider_url ```python provider_url: str ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.groq` ## Setup For details on how to set up authentication with this model, see [model configuration for Groq](https://ai.pydantic.dev/models/groq/index.md). ### ProductionGroqModelNames ```python ProductionGroqModelNames = Literal[ "llama-3.1-8b-instant", "llama-3.3-70b-versatile", "meta-llama/llama-guard-4-12b", "openai/gpt-oss-120b", "openai/gpt-oss-20b", "whisper-large-v3", "whisper-large-v3-turbo", ] ``` Production Groq models from . ### PreviewGroqModelNames ```python PreviewGroqModelNames = Literal[ "meta-llama/llama-4-maverick-17b-128e-instruct", "meta-llama/llama-4-scout-17b-16e-instruct", "meta-llama/llama-prompt-guard-2-22m", "meta-llama/llama-prompt-guard-2-86m", "moonshotai/kimi-k2-instruct-0905", "openai/gpt-oss-safeguard-20b", "playai-tts", "playai-tts-arabic", "qwen/qwen-3-32b", ] ``` Preview Groq models from . ### GroqModelName ```python GroqModelName = ( str | ProductionGroqModelNames | PreviewGroqModelNames ) ``` Possible Groq model names. Since Groq supports a variety of models and the list changes frequencly, we explicitly list the named models as of 2025-03-31 but allow any name in the type hints. See for an up to date date list of models and more details. ### GroqModelSettings Bases: `ModelSettings` Settings used for a Groq model request. Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python class GroqModelSettings(ModelSettings, total=False): """Settings used for a Groq model request.""" # ALL FIELDS MUST BE `groq_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. groq_reasoning_format: Literal['hidden', 'raw', 'parsed'] """The format of the reasoning output. 
See [the Groq docs](https://console.groq.com/docs/reasoning#reasoning-format) for more details. """ ``` #### groq_reasoning_format ```python groq_reasoning_format: Literal['hidden', 'raw', 'parsed'] ``` The format of the reasoning output. See [the Groq docs](https://console.groq.com/docs/reasoning#reasoning-format) for more details. ### GroqModel Bases: `Model` A model that uses the Groq API. Internally, this uses the [Groq Python client](https://github.com/groq/groq-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python @dataclass(init=False) class GroqModel(Model): """A model that uses the Groq API. Internally, this uses the [Groq Python client](https://github.com/groq/groq-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncGroq = field(repr=False) _model_name: GroqModelName = field(repr=False) _provider: Provider[AsyncGroq] = field(repr=False) def __init__( self, model_name: GroqModelName, *, provider: Literal['groq', 'gateway'] | Provider[AsyncGroq] = 'groq', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Groq model. Args: model_name: The name of the Groq model to use. List of model names available [here](https://console.groq.com/docs/models). provider: The provider to use for authentication and API access. Can be either the string 'groq' or an instance of `Provider[AsyncGroq]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/groq' if provider == 'gateway' else provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return str(self.client.base_url) @property def model_name(self) -> GroqModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """Return the set of builtin tool types this model can handle.""" return frozenset({WebSearchTool}) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) try: response = await self._completions_create( messages, False, cast(GroqModelSettings, model_settings or {}), model_request_parameters ) except ModelHTTPError as e: if isinstance(e.body, dict): # pragma: no branch # The Groq SDK tries to be helpful by raising an exception when generated tool arguments don't match the schema, # but we'd rather handle it ourselves so we can tell the model to retry the tool call. 
try: error = _GroqToolUseFailedError.model_validate(e.body) # pyright: ignore[reportUnknownMemberType] tool_call_part = ToolCallPart( tool_name=error.error.failed_generation.name, args=error.error.failed_generation.arguments, ) return ModelResponse( parts=[tool_call_part], model_name=e.model_name, provider_name=self._provider.name, provider_url=self.base_url, finish_reason='error', ) except ValidationError: pass raise model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, True, cast(GroqModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: GroqModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[chat.ChatCompletionChunk]: pass @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: GroqModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion: pass async def _completions_create( self, messages: list[ModelMessage], stream: bool, model_settings: GroqModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion | AsyncStream[chat.ChatCompletionChunk]: tools = self._get_tools(model_request_parameters) tools += self._get_builtin_tools(model_request_parameters) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output: tool_choice = 'required' else: tool_choice = 'auto' groq_messages = self._map_messages(messages, model_request_parameters) response_format: chat.completion_create_params.ResponseFormat | None = None if model_request_parameters.output_mode == 'native': output_object = model_request_parameters.output_object assert output_object is not None response_format = self._map_json_schema(output_object) elif ( model_request_parameters.output_mode == 'prompted' and not tools and self.profile.supports_json_object_output ): # pragma: no branch response_format = {'type': 'json_object'} try: extra_headers = model_settings.get('extra_headers', {}) extra_headers.setdefault('User-Agent', get_user_agent()) return await self.client.chat.completions.create( model=self._model_name, messages=groq_messages, n=1, parallel_tool_calls=model_settings.get('parallel_tool_calls', NOT_GIVEN), tools=tools or NOT_GIVEN, tool_choice=tool_choice or NOT_GIVEN, stop=model_settings.get('stop_sequences', NOT_GIVEN), stream=stream, response_format=response_format or NOT_GIVEN, max_tokens=model_settings.get('max_tokens', NOT_GIVEN), temperature=model_settings.get('temperature', NOT_GIVEN), top_p=model_settings.get('top_p', NOT_GIVEN), timeout=model_settings.get('timeout', NOT_GIVEN), seed=model_settings.get('seed', NOT_GIVEN), presence_penalty=model_settings.get('presence_penalty', NOT_GIVEN), reasoning_format=model_settings.get('groq_reasoning_format', NOT_GIVEN), frequency_penalty=model_settings.get('frequency_penalty', 
NOT_GIVEN), logit_bias=model_settings.get('logit_bias', NOT_GIVEN), extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise ModelAPIError(model_name=self.model_name, message=e.message) from e # pragma: no cover except APIConnectionError as e: raise ModelAPIError(model_name=self.model_name, message=e.message) from e def _process_response(self, response: chat.ChatCompletion) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" choice = response.choices[0] items: list[ModelResponsePart] = [] if choice.message.reasoning is not None: # NOTE: The `reasoning` field is only present if `groq_reasoning_format` is set to `parsed`. items.append(ThinkingPart(content=choice.message.reasoning)) if choice.message.executed_tools: for tool in choice.message.executed_tools: call_part, return_part = _map_executed_tool(tool, self.system) if call_part and return_part: # pragma: no branch items.append(call_part) items.append(return_part) if choice.message.content: # NOTE: The `` tag is only present if `groq_reasoning_format` is set to `raw`. items.extend(split_content_into_text_and_thinking(choice.message.content, self.profile.thinking_tags)) if choice.message.tool_calls is not None: for c in choice.message.tool_calls: items.append(ToolCallPart(tool_name=c.function.name, args=c.function.arguments, tool_call_id=c.id)) raw_finish_reason = choice.finish_reason provider_details: dict[str, Any] = {'finish_reason': raw_finish_reason} if response.created: # pragma: no branch provider_details['timestamp'] = number_to_datetime(response.created) finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, provider_response_id=response.id, provider_name=self._provider.name, provider_url=self.base_url, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncStream[chat.ChatCompletionChunk], model_request_parameters: ModelRequestParameters ) -> GroqStreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) return GroqStreamedResponse( model_request_parameters=model_request_parameters, _response=peekable_response, _model_name=first_chunk.model, _model_profile=self.profile, _provider_name=self._provider.name, _provider_url=self.base_url, _provider_timestamp=number_to_datetime(first_chunk.created), ) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[chat.ChatCompletionToolParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _get_builtin_tools( self, model_request_parameters: ModelRequestParameters ) -> list[chat.ChatCompletionToolParam]: tools: list[chat.ChatCompletionToolParam] = [] for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): if not GroqModelProfile.from_profile(self.profile).groq_always_has_web_search_builtin_tool: raise UserError('`WebSearchTool` is not supported by Groq') # pragma: no cover else: # pragma: no cover raise UserError( 
f'`{tool.__class__.__name__}` is not supported by `GroqModel`. If it should be, please file an issue.' ) return tools def _map_messages( self, messages: list[ModelMessage], model_request_parameters: ModelRequestParameters ) -> list[chat.ChatCompletionMessageParam]: """Just maps a `pydantic_ai.Message` to a `groq.types.ChatCompletionMessageParam`.""" groq_messages: list[chat.ChatCompletionMessageParam] = [] for message in messages: if isinstance(message, ModelRequest): groq_messages.extend(self._map_user_message(message)) elif isinstance(message, ModelResponse): texts: list[str] = [] tool_calls: list[chat.ChatCompletionMessageToolCallParam] = [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) elif isinstance(item, ThinkingPart): start_tag, end_tag = self.profile.thinking_tags texts.append('\n'.join([start_tag, item.content, end_tag])) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # These are not currently sent back pass elif isinstance(item, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. pass else: assert_never(item) message_param = chat.ChatCompletionAssistantMessageParam(role='assistant') if texts: # Note: model responses from this model should only have one text item, so the following # shouldn't merge multiple texts into one unless you switch models between runs: message_param['content'] = '\n\n'.join(texts) if tool_calls: message_param['tool_calls'] = tool_calls groq_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages, model_request_parameters): system_prompt_count = sum(1 for m in groq_messages if m.get('role') == 'system') groq_messages.insert( system_prompt_count, chat.ChatCompletionSystemMessageParam(role='system', content=instructions) ) return groq_messages @staticmethod def _map_tool_call(t: ToolCallPart) -> chat.ChatCompletionMessageToolCallParam: return chat.ChatCompletionMessageToolCallParam( id=_guard_tool_call_id(t=t), type='function', function={'name': t.tool_name, 'arguments': t.args_as_json_str()}, ) @staticmethod def _map_tool_definition(f: ToolDefinition) -> chat.ChatCompletionToolParam: return { 'type': 'function', 'function': { 'name': f.name, 'description': f.description or '', 'parameters': f.parameters_json_schema, }, } def _map_json_schema(self, o: OutputObjectDefinition) -> chat.completion_create_params.ResponseFormat: response_format_param: chat.completion_create_params.ResponseFormatResponseFormatJsonSchema = { 'type': 'json_schema', 'json_schema': { 'name': o.name or DEFAULT_OUTPUT_TOOL_NAME, 'schema': o.json_schema, 'strict': o.strict, }, } if o.description: # pragma: no branch response_format_param['json_schema']['description'] = o.description return response_format_param @classmethod def _map_user_message(cls, message: ModelRequest) -> Iterable[chat.ChatCompletionMessageParam]: for part in message.parts: if isinstance(part, SystemPromptPart): yield chat.ChatCompletionSystemMessageParam(role='system', content=part.content) elif isinstance(part, UserPromptPart): yield cls._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): # pragma: no branch if part.tool_name is None: yield 
chat.ChatCompletionUserMessageParam( # pragma: no cover role='user', content=part.model_response() ) else: yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response(), ) @staticmethod def _map_user_prompt(part: UserPromptPart) -> chat.ChatCompletionUserMessageParam: content: str | list[chat.ChatCompletionContentPartParam] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(chat.ChatCompletionContentPartTextParam(text=item, type='text')) elif isinstance(item, ImageUrl): image_url = ImageURL(url=item.url) content.append(chat.ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) elif isinstance(item, BinaryContent): if item.is_image: image_url = ImageURL(url=item.data_uri) content.append(chat.ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) else: raise RuntimeError('Only images are supported for binary content in Groq.') elif isinstance(item, DocumentUrl): # pragma: no cover raise RuntimeError('DocumentUrl is not supported in Groq.') else: # pragma: no cover raise RuntimeError(f'Unsupported content type: {type(item)}') return chat.ChatCompletionUserMessageParam(role='user', content=content) ``` #### __init__ ```python __init__( model_name: GroqModelName, *, provider: ( Literal["groq", "gateway"] | Provider[AsyncGroq] ) = "groq", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Groq model. Parameters: | Name | Type | Description | Default | | ------------ | ---------------------------- | ---------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model_name` | `GroqModelName` | The name of the Groq model to use. List of model names available here. | *required* | | `provider` | \`Literal['groq', 'gateway'] | Provider[AsyncGroq]\` | The provider to use for authentication and API access. Can be either the string 'groq' or an instance of Provider[AsyncGroq]. If not provided, a new provider will be created using the other parameters. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | | `settings` | \`ModelSettings | None\` | Model-specific settings that will be used as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python def __init__( self, model_name: GroqModelName, *, provider: Literal['groq', 'gateway'] | Provider[AsyncGroq] = 'groq', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Groq model. Args: model_name: The name of the Groq model to use. List of model names available [here](https://console.groq.com/docs/models). provider: The provider to use for authentication and API access. Can be either the string 'groq' or an instance of `Provider[AsyncGroq]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/groq' if provider == 'gateway' else provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: GroqModelName ``` The model name. #### system ```python system: str ``` The model provider. #### supported_builtin_tools ```python supported_builtin_tools() -> ( frozenset[type[AbstractBuiltinTool]] ) ``` Return the set of builtin tool types this model can handle. Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """Return the set of builtin tool types this model can handle.""" return frozenset({WebSearchTool}) ``` ### GroqStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Groq models. Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python @dataclass class GroqStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Groq models.""" _model_name: GroqModelName _model_profile: ModelProfile _response: AsyncIterable[chat.ChatCompletionChunk] _provider_name: str _provider_url: str _provider_timestamp: datetime | None = None _timestamp: datetime = field(default_factory=_utils.now_utc) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 try: executed_tool_call_id: str | None = None reasoning_index = 0 reasoning = False if self._provider_timestamp is not None: # pragma: no branch self.provider_details = {'timestamp': self._provider_timestamp} async for chunk in self._response: self._usage += _map_usage(chunk) if chunk.id: # pragma: no branch self.provider_response_id = chunk.id try: choice = chunk.choices[0] except IndexError: continue if raw_finish_reason := choice.finish_reason: self.provider_details = {**(self.provider_details or {}), 'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) if choice.delta.reasoning is not None: if not reasoning: reasoning_index += 1 reasoning = True # NOTE: The `reasoning` field is only present if `groq_reasoning_format` is set to `parsed`. 
for event in self._parts_manager.handle_thinking_delta( vendor_part_id=f'reasoning-{reasoning_index}', content=choice.delta.reasoning ): yield event else: reasoning = False if choice.delta.executed_tools: for tool in choice.delta.executed_tools: call_part, return_part = _map_executed_tool( tool, self.provider_name, streaming=True, tool_call_id=executed_tool_call_id ) if call_part: executed_tool_call_id = call_part.tool_call_id yield self._parts_manager.handle_part( vendor_part_id=f'executed_tools-{tool.index}-call', part=call_part ) if return_part: executed_tool_call_id = None yield self._parts_manager.handle_part( vendor_part_id=f'executed_tools-{tool.index}-return', part=return_part ) # Handle the text part of the response content = choice.delta.content if content: for event in self._parts_manager.handle_text_delta( vendor_part_id='content', content=content, thinking_tags=self._model_profile.thinking_tags, ignore_leading_whitespace=self._model_profile.ignore_streamed_leading_whitespace, ): yield event # Handle the tool calls for dtc in choice.delta.tool_calls or []: maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=dtc.index, tool_name=dtc.function and dtc.function.name, args=dtc.function and dtc.function.arguments, tool_call_id=dtc.id, ) if maybe_event is not None: yield maybe_event except APIError as e: if isinstance(e.body, dict): # pragma: no branch # The Groq SDK tries to be helpful by raising an exception when generated tool arguments don't match the schema, # but we'd rather handle it ourselves so we can tell the model to retry the tool call try: error = _GroqToolUseFailedInnerError.model_validate(e.body) # pyright: ignore[reportUnknownMemberType] yield self._parts_manager.handle_tool_call_part( vendor_part_id='tool_use_failed', tool_name=error.failed_generation.name, args=error.failed_generation.arguments, ) return except ValidationError as e: # pragma: no cover pass raise # pragma: no cover @property def model_name(self) -> GroqModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def provider_url(self) -> str: """Get the provider base URL.""" return self._provider_url @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: GroqModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### provider_url ```python provider_url: str ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.huggingface` ## Setup For details on how to set up authentication with this model, see [model configuration for Hugging Face](https://ai.pydantic.dev/models/huggingface/index.md). ### LatestHuggingFaceModelNames ```python LatestHuggingFaceModelNames = Literal[ "deepseek-ai/DeepSeek-R1", "meta-llama/Llama-3.3-70B-Instruct", "meta-llama/Llama-4-Maverick-17B-128E-Instruct", "meta-llama/Llama-4-Scout-17B-16E-Instruct", "Qwen/QwQ-32B", "Qwen/Qwen2.5-72B-Instruct", "Qwen/Qwen3-235B-A22B", "Qwen/Qwen3-32B", ] ``` Latest Hugging Face models. ### HuggingFaceModelName ```python HuggingFaceModelName = str | LatestHuggingFaceModelNames ``` Possible Hugging Face model names. You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending). 
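Any of the names above, or any other text-generation model ID on the Hub, can be passed straight to `HuggingFaceModel`. A minimal sketch, assuming the `huggingface` optional dependency group is installed and a Hugging Face API token is available (e.g. via the `HF_TOKEN` environment variable):

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel

# Any Hub model ID served by an inference provider works here; this one is from the list above.
model = HuggingFaceModel('Qwen/Qwen3-235B-A22B')
agent = Agent(model, instructions='Reply with a single word.')

result = agent.run_sync('What is the capital of France?')
print(result.output)
```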
### HuggingFaceModelSettings Bases: `ModelSettings` Settings used for a Hugging Face model request. Source code in `pydantic_ai_slim/pydantic_ai/models/huggingface.py` ```python class HuggingFaceModelSettings(ModelSettings, total=False): """Settings used for a Hugging Face model request.""" ``` ### HuggingFaceModel Bases: `Model` A model that uses Hugging Face Inference Providers. Internally, this uses the [HF Python client](https://github.com/huggingface/huggingface_hub) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/huggingface.py` ```python @dataclass(init=False) class HuggingFaceModel(Model): """A model that uses Hugging Face Inference Providers. Internally, this uses the [HF Python client](https://github.com/huggingface/huggingface_hub) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncInferenceClient = field(repr=False) _model_name: str = field(repr=False) _provider: Provider[AsyncInferenceClient] = field(repr=False) def __init__( self, model_name: str, *, provider: Literal['huggingface'] | Provider[AsyncInferenceClient] = 'huggingface', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Hugging Face model. Args: model_name: The name of the Model to use. You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending). provider: The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of `Provider[AsyncInferenceClient]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: """The base URL of the provider.""" return self._provider.base_url @property def model_name(self) -> HuggingFaceModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The system / model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, False, cast(HuggingFaceModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, True, cast(HuggingFaceModelSettings, model_settings or {}), model_request_parameters ) yield await self._process_streamed_response(response, model_request_parameters) @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: HuggingFaceModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncIterable[ChatCompletionStreamOutput]: ... @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: HuggingFaceModelSettings, model_request_parameters: ModelRequestParameters, ) -> ChatCompletionOutput: ... 
async def _completions_create( self, messages: list[ModelMessage], stream: bool, model_settings: HuggingFaceModelSettings, model_request_parameters: ModelRequestParameters, ) -> ChatCompletionOutput | AsyncIterable[ChatCompletionStreamOutput]: tools = self._get_tools(model_request_parameters) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output: tool_choice = 'required' else: tool_choice = 'auto' hf_messages = await self._map_messages(messages, model_request_parameters) try: return await self.client.chat.completions.create( # type: ignore model=self._model_name, messages=hf_messages, # type: ignore tools=tools, tool_choice=tool_choice or None, stream=stream, stop=model_settings.get('stop_sequences', None), temperature=model_settings.get('temperature', None), top_p=model_settings.get('top_p', None), seed=model_settings.get('seed', None), presence_penalty=model_settings.get('presence_penalty', None), frequency_penalty=model_settings.get('frequency_penalty', None), logit_bias=model_settings.get('logit_bias', None), # type: ignore logprobs=model_settings.get('logprobs', None), top_logprobs=model_settings.get('top_logprobs', None), extra_body=model_settings.get('extra_body'), # type: ignore ) except aiohttp.ClientResponseError as e: raise ModelHTTPError( status_code=e.status, model_name=self.model_name, body=e.message, ) from e except HfHubHTTPError as e: raise ModelHTTPError( status_code=e.response.status_code, model_name=self.model_name, body=e.response.content, ) from e def _process_response(self, response: ChatCompletionOutput) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" choice = response.choices[0] content = choice.message.content tool_calls = choice.message.tool_calls items: list[ModelResponsePart] = [] if content: items.extend(split_content_into_text_and_thinking(content, self.profile.thinking_tags)) if tool_calls is not None: for c in tool_calls: items.append(ToolCallPart(c.function.name, c.function.arguments, tool_call_id=c.id)) raw_finish_reason = choice.finish_reason provider_details: dict[str, Any] = {'finish_reason': raw_finish_reason} if response.created: # pragma: no branch provider_details['timestamp'] = datetime.fromtimestamp(response.created, tz=timezone.utc) finish_reason = _FINISH_REASON_MAP.get(cast(TextGenerationOutputFinishReason, raw_finish_reason), None) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, provider_response_id=response.id, provider_name=self._provider.name, provider_url=self.base_url, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncIterable[ChatCompletionStreamOutput], model_request_parameters: ModelRequestParameters ) -> StreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) return HuggingFaceStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.model, _model_profile=self.profile, _response=peekable_response, _provider_name=self._provider.name, _provider_url=self.base_url, _provider_timestamp=datetime.fromtimestamp(first_chunk.created, tz=timezone.utc), ) def _get_tools(self, 
model_request_parameters: ModelRequestParameters) -> list[ChatCompletionInputTool]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] async def _map_messages( self, messages: list[ModelMessage], model_request_parameters: ModelRequestParameters ) -> list[ChatCompletionInputMessage | ChatCompletionOutputMessage]: """Just maps a `pydantic_ai.Message` to a `huggingface_hub.ChatCompletionInputMessage`.""" hf_messages: list[ChatCompletionInputMessage | ChatCompletionOutputMessage] = [] for message in messages: if isinstance(message, ModelRequest): async for item in self._map_user_message(message): hf_messages.append(item) elif isinstance(message, ModelResponse): texts: list[str] = [] tool_calls: list[ChatCompletionInputToolCall] = [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) elif isinstance(item, ThinkingPart): start_tag, end_tag = self.profile.thinking_tags texts.append('\n'.join([start_tag, item.content, end_tag])) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # This is currently never returned from huggingface pass elif isinstance(item, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. pass else: assert_never(item) message_param = ChatCompletionInputMessage(role='assistant') # type: ignore if texts: # Note: model responses from this model should only have one text item, so the following # shouldn't merge multiple texts into one unless you switch models between runs: message_param['content'] = '\n\n'.join(texts) if tool_calls: message_param['tool_calls'] = tool_calls hf_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages, model_request_parameters): system_prompt_count = sum(1 for m in hf_messages if getattr(m, 'role', None) == 'system') hf_messages.insert(system_prompt_count, ChatCompletionInputMessage(content=instructions, role='system')) # type: ignore return hf_messages @staticmethod def _map_tool_call(t: ToolCallPart) -> ChatCompletionInputToolCall: return ChatCompletionInputToolCall.parse_obj_as_instance( # type: ignore { 'id': _guard_tool_call_id(t=t), 'type': 'function', 'function': { 'name': t.tool_name, 'arguments': t.args_as_json_str(), }, } ) @staticmethod def _map_tool_definition(f: ToolDefinition) -> ChatCompletionInputTool: tool_param: ChatCompletionInputTool = ChatCompletionInputTool.parse_obj_as_instance( # type: ignore { 'type': 'function', 'function': { 'name': f.name, 'description': f.description, 'parameters': f.parameters_json_schema, }, } ) return tool_param async def _map_user_message( self, message: ModelRequest ) -> AsyncIterable[ChatCompletionInputMessage | ChatCompletionOutputMessage]: for part in message.parts: if isinstance(part, SystemPromptPart): yield ChatCompletionInputMessage.parse_obj_as_instance({'role': 'system', 'content': part.content}) # type: ignore elif isinstance(part, UserPromptPart): yield await self._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield ChatCompletionOutputMessage.parse_obj_as_instance( # type: ignore { 'role': 'tool', 'tool_call_id': _guard_tool_call_id(t=part), 'content': part.model_response_str(), } ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield ChatCompletionInputMessage.parse_obj_as_instance( # type: ignore {'role': 'user', 'content': 
part.model_response()} ) else: yield ChatCompletionInputMessage.parse_obj_as_instance( # type: ignore { 'role': 'tool', 'tool_call_id': _guard_tool_call_id(t=part), 'content': part.model_response(), } ) else: assert_never(part) @staticmethod async def _map_user_prompt(part: UserPromptPart) -> ChatCompletionInputMessage: content: str | list[ChatCompletionInputMessage] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(ChatCompletionInputMessageChunk(type='text', text=item)) # type: ignore elif isinstance(item, ImageUrl): url = ChatCompletionInputURL(url=item.url) # type: ignore content.append(ChatCompletionInputMessageChunk(type='image_url', image_url=url)) # type: ignore elif isinstance(item, BinaryContent): if item.is_image: url = ChatCompletionInputURL(url=item.data_uri) # type: ignore content.append(ChatCompletionInputMessageChunk(type='image_url', image_url=url)) # type: ignore else: # pragma: no cover raise RuntimeError(f'Unsupported binary content type: {item.media_type}') elif isinstance(item, AudioUrl): raise NotImplementedError('AudioUrl is not supported for Hugging Face') elif isinstance(item, DocumentUrl): raise NotImplementedError('DocumentUrl is not supported for Hugging Face') elif isinstance(item, VideoUrl): raise NotImplementedError('VideoUrl is not supported for Hugging Face') elif isinstance(item, CachePoint): # Hugging Face doesn't support prompt caching via CachePoint pass else: assert_never(item) return ChatCompletionInputMessage(role='user', content=content) # type: ignore ``` #### __init__ ```python __init__( model_name: str, *, provider: ( Literal["huggingface"] | Provider[AsyncInferenceClient] ) = "huggingface", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Hugging Face model. Parameters: | Name | Type | Description | Default | | ------------ | ------------------------ | ------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `model_name` | `str` | The name of the Model to use. You can browse available models here. | *required* | | `provider` | \`Literal['huggingface'] | Provider[AsyncInferenceClient]\` | The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of Provider[AsyncInferenceClient]. If not provided, the other parameters will be used. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | | `settings` | \`ModelSettings | None\` | Model-specific settings that will be used as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/models/huggingface.py` ```python def __init__( self, model_name: str, *, provider: Literal['huggingface'] | Provider[AsyncInferenceClient] = 'huggingface', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Hugging Face model. Args: model_name: The name of the Model to use. You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending). provider: The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of `Provider[AsyncInferenceClient]`. 
If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### base_url ```python base_url: str ``` The base URL of the provider. #### model_name ```python model_name: HuggingFaceModelName ``` The model name. #### system ```python system: str ``` The system / model provider. ### HuggingFaceStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Hugging Face models. Source code in `pydantic_ai_slim/pydantic_ai/models/huggingface.py` ```python @dataclass class HuggingFaceStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Hugging Face models.""" _model_name: str _model_profile: ModelProfile _response: AsyncIterable[ChatCompletionStreamOutput] _provider_name: str _provider_url: str _provider_timestamp: datetime | None = None _timestamp: datetime = field(default_factory=_utils.now_utc) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: if self._provider_timestamp is not None: # pragma: no branch self.provider_details = {'timestamp': self._provider_timestamp} async for chunk in self._response: self._usage += _map_usage(chunk) if chunk.id: # pragma: no branch self.provider_response_id = chunk.id try: choice = chunk.choices[0] except IndexError: continue if raw_finish_reason := choice.finish_reason: self.provider_details = {**(self.provider_details or {}), 'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get( cast(TextGenerationOutputFinishReason, raw_finish_reason), None ) # Handle the text part of the response content = choice.delta.content if content: for event in self._parts_manager.handle_text_delta( vendor_part_id='content', content=content, thinking_tags=self._model_profile.thinking_tags, ignore_leading_whitespace=self._model_profile.ignore_streamed_leading_whitespace, ): yield event for dtc in choice.delta.tool_calls or []: maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=dtc.index, tool_name=dtc.function and dtc.function.name, # type: ignore args=dtc.function and dtc.function.arguments, tool_call_id=dtc.id, ) if maybe_event is not None: yield maybe_event @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def provider_url(self) -> str: """Get the provider base URL.""" return self._provider_url @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### provider_url ```python provider_url: str ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # pydantic_ai.models.instrumented ### instrument_model ```python instrument_model( model: Model, instrument: InstrumentationSettings | bool ) -> Model ``` Instrument a model with OpenTelemetry/logfire. 
Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def instrument_model(model: Model, instrument: InstrumentationSettings | bool) -> Model: """Instrument a model with OpenTelemetry/logfire.""" if instrument and not isinstance(model, InstrumentedModel): if instrument is True: instrument = InstrumentationSettings() model = InstrumentedModel(model, instrument) return model ``` ### InstrumentationSettings Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - Agent.instrument_all() - InstrumentedModel See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python @dataclass(init=False) class InstrumentationSettings: """Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - [`Agent.instrument_all()`][pydantic_ai.agent.Agent.instrument_all] - [`InstrumentedModel`][pydantic_ai.models.instrumented.InstrumentedModel] See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. """ tracer: Tracer = field(repr=False) logger: Logger = field(repr=False) event_mode: Literal['attributes', 'logs'] = 'attributes' include_binary_content: bool = True include_content: bool = True version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal['attributes', 'logs'] = 'attributes', logger_provider: LoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. logger_provider: The OpenTelemetry logger provider to use. If not provided, the global logger provider is used. Calling `logfire.configure()` sets the global logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. 
""" from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() logger_provider = logger_provider or get_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.logger = logger_provider.get_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) self.cost_histogram = self.meter.create_histogram( 'operation.cost', unit='{USD}', description='Monetary cost', ) def messages_to_otel_events( self, messages: list[ModelMessage], parameters: ModelRequestParameters | None = None ) -> list[LogRecord]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. parameters: The model request parameters. Returns: A list of OpenTelemetry events. 
""" events: list[LogRecord] = [] instructions = InstrumentedModel._get_instructions(messages, parameters) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( LogRecord( attributes={'event.name': 'gen_ai.system.message'}, body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[LogRecord] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events def messages_to_otel_messages(self, messages: list[ModelMessage]) -> list[_otel_messages.ChatMessage]: result: list[_otel_messages.ChatMessage] = [] for message in messages: if isinstance(message, ModelRequest): for is_system, group in itertools.groupby(message.parts, key=lambda p: isinstance(p, SystemPromptPart)): message_parts: list[_otel_messages.MessagePart] = [] for part in group: if hasattr(part, 'otel_message_parts'): message_parts.extend(part.otel_message_parts(self)) result.append( _otel_messages.ChatMessage(role='system' if is_system else 'user', parts=message_parts) ) elif isinstance(message, ModelResponse): # pragma: no branch otel_message = _otel_messages.OutputMessage(role='assistant', parts=message.otel_message_parts(self)) if message.finish_reason is not None: otel_message['finish_reason'] = message.finish_reason result.append(otel_message) return result def handle_messages( self, input_messages: list[ModelMessage], response: ModelResponse, system: str, span: Span, parameters: ModelRequestParameters | None = None, ): if self.version == 1: events = self.messages_to_otel_events(input_messages, parameters) for event in self.messages_to_otel_events([response], parameters): events.append( LogRecord( attributes={'event.name': 'gen_ai.choice'}, body={ 'index': 0, 'message': event.body, }, ) ) for event in events: event.attributes = { GEN_AI_SYSTEM_ATTRIBUTE: system, **(event.attributes or {}), } self._emit_events(span, events) else: output_messages = self.messages_to_otel_messages([response]) assert len(output_messages) == 1 output_message = output_messages[0] instructions = InstrumentedModel._get_instructions(input_messages, parameters) # pyright: ignore [reportPrivateUsage] system_instructions_attributes = self.system_instructions_attributes(instructions) attributes: dict[str, AttributeValue] = { 'gen_ai.input.messages': json.dumps(self.messages_to_otel_messages(input_messages)), 'gen_ai.output.messages': json.dumps([output_message]), **system_instructions_attributes, 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { 'gen_ai.input.messages': {'type': 'array'}, 'gen_ai.output.messages': {'type': 'array'}, **( {'gen_ai.system_instructions': {'type': 'array'}} if system_instructions_attributes else {} ), 'model_request_parameters': {'type': 'object'}, }, } ), } span.set_attributes(attributes) def system_instructions_attributes(self, instructions: str | None) -> dict[str, str]: if instructions and self.include_content: return { 'gen_ai.system_instructions': json.dumps([_otel_messages.TextPart(type='text', content=instructions)]), } return {} def 
_emit_events(self, span: Span, events: list[LogRecord]) -> None: if self.event_mode == 'logs': for event in events: self.logger.emit(event) else: attr_name = 'events' span.set_attributes( { attr_name: json.dumps([InstrumentedModel.event_to_dict(event) for event in events]), 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { attr_name: {'type': 'array'}, 'model_request_parameters': {'type': 'object'}, }, } ), } ) def record_metrics( self, response: ModelResponse, price_calculation: PriceCalculation | None, attributes: dict[str, AttributeValue], ): for typ in ['input', 'output']: if not (tokens := getattr(response.usage, f'{typ}_tokens', 0)): # pragma: no cover continue token_attributes = {**attributes, 'gen_ai.token.type': typ} self.tokens_histogram.record(tokens, token_attributes) if price_calculation: cost = float(getattr(price_calculation, f'{typ}_price')) self.cost_histogram.record(cost, token_attributes) ``` #### __init__ ```python __init__( *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[ 1, 2, 3 ] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal[ "attributes", "logs" ] = "attributes", logger_provider: LoggerProvider | None = None ) ``` Create instrumentation options. Parameters: | Name | Type | Description | Default | | ------------------------ | ------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `tracer_provider` | \`TracerProvider | None\` | The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling logfire.configure() sets the global tracer provider, so most users don't need this. | | `meter_provider` | \`MeterProvider | None\` | The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling logfire.configure() sets the global meter provider, so most users don't need this. | | `include_binary_content` | `bool` | Whether to include binary content in the instrumentation events. | `True` | | `include_content` | `bool` | Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. | `True` | | `version` | `Literal[1, 2, 3]` | Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters event_mode and logger_provider are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - gen_ai.system_instructions for instructions passed to the agent. 
- gen_ai.input.messages and gen_ai.output.messages on model request spans. - pydantic_ai.all_messages on agent run spans. | `DEFAULT_INSTRUMENTATION_VERSION` | | `event_mode` | `Literal['attributes', 'logs']` | The mode for emitting events in version 1. If 'attributes', events are attached to the span as attributes. If 'logs', events are emitted as OpenTelemetry log-based events. | `'attributes'` | | `logger_provider` | \`LoggerProvider | None\` | The OpenTelemetry logger provider to use. If not provided, the global logger provider is used. Calling logfire.configure() sets the global logger provider, so most users don't need this. This is only used if event_mode='logs' and version=1. | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal['attributes', 'logs'] = 'attributes', logger_provider: LoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. logger_provider: The OpenTelemetry logger provider to use. If not provided, the global logger provider is used. Calling `logfire.configure()` sets the global logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. 
""" from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() logger_provider = logger_provider or get_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.logger = logger_provider.get_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) self.cost_histogram = self.meter.create_histogram( 'operation.cost', unit='{USD}', description='Monetary cost', ) ``` #### messages_to_otel_events ```python messages_to_otel_events( messages: list[ModelMessage], parameters: ModelRequestParameters | None = None, ) -> list[LogRecord] ``` Convert a list of model messages to OpenTelemetry events. Parameters: | Name | Type | Description | Default | | ------------ | ------------------------ | ------------------------ | ----------------------------- | | `messages` | `list[ModelMessage]` | The messages to convert. | *required* | | `parameters` | \`ModelRequestParameters | None\` | The model request parameters. | Returns: | Type | Description | | ----------------- | ------------------------------- | | `list[LogRecord]` | A list of OpenTelemetry events. | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def messages_to_otel_events( self, messages: list[ModelMessage], parameters: ModelRequestParameters | None = None ) -> list[LogRecord]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. parameters: The model request parameters. Returns: A list of OpenTelemetry events. 
""" events: list[LogRecord] = [] instructions = InstrumentedModel._get_instructions(messages, parameters) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( LogRecord( attributes={'event.name': 'gen_ai.system.message'}, body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[LogRecord] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events ``` ### InstrumentedModel Bases: `WrapperModel` Model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python @dataclass(init=False) class InstrumentedModel(WrapperModel): """Model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. """ instrumentation_settings: InstrumentationSettings """Instrumentation settings for this model.""" def __init__( self, wrapped: Model | KnownModelName, options: InstrumentationSettings | None = None, ) -> None: super().__init__(wrapped) self.instrumentation_settings = options or InstrumentationSettings() async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: prepared_settings, prepared_parameters = self.wrapped.prepare_request( model_settings, model_request_parameters, ) with self._instrument(messages, prepared_settings, prepared_parameters) as finish: response = await self.wrapped.request(messages, model_settings, model_request_parameters) finish(response, prepared_parameters) return response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: prepared_settings, prepared_parameters = self.wrapped.prepare_request( model_settings, model_request_parameters, ) with self._instrument(messages, prepared_settings, prepared_parameters) as finish: response_stream: StreamedResponse | None = None try: async with self.wrapped.request_stream( messages, model_settings, model_request_parameters, run_context ) as response_stream: yield response_stream finally: if response_stream: # pragma: no branch finish(response_stream.get(), prepared_parameters) @contextmanager def _instrument( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> Iterator[Callable[[ModelResponse, ModelRequestParameters], None]]: operation = 'chat' span_name = f'{operation} {self.model_name}' # TODO Missing attributes: # - error.type: unclear if we should do something here or just always rely on span exceptions # - gen_ai.request.stop_sequences/top_k: model_settings doesn't include these attributes: dict[str, 
AttributeValue] = { 'gen_ai.operation.name': operation, **self.model_attributes(self.wrapped), **self.model_request_parameters_attributes(model_request_parameters), 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': {'model_request_parameters': {'type': 'object'}}, } ), } tool_definitions = _build_tool_definitions(model_request_parameters) if tool_definitions: attributes['gen_ai.tool.definitions'] = json.dumps(tool_definitions) if model_settings: for key in MODEL_SETTING_ATTRIBUTES: if isinstance(value := model_settings.get(key), float | int): attributes[f'gen_ai.request.{key}'] = value record_metrics: Callable[[], None] | None = None try: with self.instrumentation_settings.tracer.start_as_current_span( span_name, attributes=attributes, kind=SpanKind.CLIENT ) as span: def finish(response: ModelResponse, parameters: ModelRequestParameters): # FallbackModel updates these span attributes. attributes.update(getattr(span, 'attributes', {})) request_model = attributes[GEN_AI_REQUEST_MODEL_ATTRIBUTE] system = cast(str, attributes[GEN_AI_SYSTEM_ATTRIBUTE]) response_model = response.model_name or request_model price_calculation = None def _record_metrics(): metric_attributes = { GEN_AI_PROVIDER_NAME_ATTRIBUTE: system, # New OTel standard attribute GEN_AI_SYSTEM_ATTRIBUTE: system, # Preserved for backward compatibility (deprecated) 'gen_ai.operation.name': operation, 'gen_ai.request.model': request_model, 'gen_ai.response.model': response_model, } self.instrumentation_settings.record_metrics(response, price_calculation, metric_attributes) nonlocal record_metrics record_metrics = _record_metrics if not span.is_recording(): return self.instrumentation_settings.handle_messages(messages, response, system, span, parameters) attributes_to_set = { **response.usage.opentelemetry_attributes(), 'gen_ai.response.model': response_model, } try: price_calculation = response.cost() except LookupError: # The cost of this provider/model is unknown, which is common. pass except Exception as e: warnings.warn( f'Failed to get cost from response: {type(e).__name__}: {e}', CostCalculationFailedWarning ) else: attributes_to_set['operation.cost'] = float(price_calculation.total_price) if response.provider_response_id is not None: attributes_to_set['gen_ai.response.id'] = response.provider_response_id if response.finish_reason is not None: attributes_to_set['gen_ai.response.finish_reasons'] = [response.finish_reason] span.set_attributes(attributes_to_set) span.update_name(f'{operation} {request_model}') yield finish finally: if record_metrics: # We only want to record metrics after the span is finished, # to prevent them from being redundantly recorded in the span itself by logfire. 
record_metrics() @staticmethod def model_attributes(model: Model) -> dict[str, AttributeValue]: attributes: dict[str, AttributeValue] = { GEN_AI_PROVIDER_NAME_ATTRIBUTE: model.system, # New OTel standard attribute GEN_AI_SYSTEM_ATTRIBUTE: model.system, # Preserved for backward compatibility (deprecated) GEN_AI_REQUEST_MODEL_ATTRIBUTE: model.model_name, } if base_url := model.base_url: try: parsed = urlparse(base_url) except Exception: # pragma: no cover pass else: if parsed.hostname: # pragma: no branch attributes['server.address'] = parsed.hostname if parsed.port: # pragma: no branch attributes['server.port'] = parsed.port return attributes @staticmethod def model_request_parameters_attributes( model_request_parameters: ModelRequestParameters, ) -> dict[str, AttributeValue]: return {'model_request_parameters': json.dumps(InstrumentedModel.serialize_any(model_request_parameters))} @staticmethod def event_to_dict(event: LogRecord) -> dict[str, Any]: if not event.body: body = {} # pragma: no cover elif isinstance(event.body, Mapping): body = event.body else: body = {'body': event.body} return {**body, **(event.attributes or {})} @staticmethod def serialize_any(value: Any) -> str: try: return ANY_ADAPTER.dump_python(value, mode='json') except Exception: try: return str(value) except Exception as e: return f'Unable to serialize: {e}' ``` #### instrumentation_settings ```python instrumentation_settings: InstrumentationSettings = ( options or InstrumentationSettings() ) ``` Instrumentation settings for this model. ### CostCalculationFailedWarning Bases: `Warning` Warning raised when cost calculation fails. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python class CostCalculationFailedWarning(Warning): """Warning raised when cost calculation fails.""" ``` # pydantic_ai.models.mcp_sampling ### MCPSamplingModelSettings Bases: `ModelSettings` Settings used for an MCP Sampling model request. Source code in `pydantic_ai_slim/pydantic_ai/models/mcp_sampling.py` ```python class MCPSamplingModelSettings(ModelSettings, total=False): """Settings used for an MCP Sampling model request.""" # ALL FIELDS MUST BE `mcp_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. mcp_model_preferences: ModelPreferences """Model preferences to use for MCP Sampling.""" ``` #### mcp_model_preferences ```python mcp_model_preferences: ModelPreferences ``` Model preferences to use for MCP Sampling. ### MCPSamplingModel Bases: `Model` A model that uses MCP Sampling. [MCP Sampling](https://modelcontextprotocol.io/docs/concepts/sampling) allows an MCP server to make requests to a model by calling back to the MCP client that connected to it. Source code in `pydantic_ai_slim/pydantic_ai/models/mcp_sampling.py` ```python @dataclass class MCPSamplingModel(Model): """A model that uses MCP Sampling. [MCP Sampling](https://modelcontextprotocol.io/docs/concepts/sampling) allows an MCP server to make requests to a model by calling back to the MCP client that connected to it. """ session: ServerSession """The MCP server session to use for sampling.""" _: KW_ONLY default_max_tokens: int = 16_384 """Default max tokens to use if not set in [`ModelSettings`][pydantic_ai.settings.ModelSettings.max_tokens]. Max tokens is a required parameter for MCP Sampling, but optional on [`ModelSettings`][pydantic_ai.settings.ModelSettings], so this value is used as fallback. 
""" async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: system_prompt, sampling_messages = _mcp.map_from_pai_messages(messages) model_settings, _ = self.prepare_request(model_settings, model_request_parameters) model_settings = cast(MCPSamplingModelSettings, model_settings or {}) result = await self.session.create_message( sampling_messages, max_tokens=model_settings.get('max_tokens', self.default_max_tokens), system_prompt=system_prompt, temperature=model_settings.get('temperature'), model_preferences=model_settings.get('mcp_model_preferences'), stop_sequences=model_settings.get('stop_sequences'), ) if result.role == 'assistant': return ModelResponse( parts=[_mcp.map_from_sampling_content(result.content)], model_name=result.model, ) else: raise exceptions.UnexpectedModelBehavior( f'Unexpected result from MCP sampling, expected "assistant" role, got {result.role}.' ) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: raise NotImplementedError('MCP Sampling does not support streaming') yield @property def model_name(self) -> str: """The model name. Since the model name isn't known until the request is made, this property always returns `'mcp-sampling'`. """ return 'mcp-sampling' @property def system(self) -> str: """The system / model provider, returns `'MCP'`.""" return 'MCP' ``` #### session ```python session: ServerSession ``` The MCP server session to use for sampling. #### default_max_tokens ```python default_max_tokens: int = 16384 ``` Default max tokens to use if not set in ModelSettings. Max tokens is a required parameter for MCP Sampling, but optional on ModelSettings, so this value is used as fallback. #### model_name ```python model_name: str ``` The model name. Since the model name isn't known until the request is made, this property always returns `'mcp-sampling'`. #### system ```python system: str ``` The system / model provider, returns `'MCP'`. # `pydantic_ai.models.mistral` ## Setup For details on how to set up authentication with this model, see [model configuration for Mistral](https://ai.pydantic.dev/models/mistral/index.md). ### LatestMistralModelNames ```python LatestMistralModelNames = Literal[ "mistral-large-latest", "mistral-small-latest", "codestral-latest", "mistral-moderation-latest", ] ``` Latest Mistral models. ### MistralModelName ```python MistralModelName = str | LatestMistralModelNames ``` Possible Mistral model names. Since Mistral supports a variety of date-stamped models, we explicitly list the most popular models but allow any name in the type hints. Since [the Mistral docs](https://docs.mistral.ai/getting-started/models/models_overview/) for a full list. ### MistralModelSettings Bases: `ModelSettings` Settings used for a Mistral model request. Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python class MistralModelSettings(ModelSettings, total=False): """Settings used for a Mistral model request.""" ``` ### MistralModel Bases: `Model` A model that uses Mistral. Internally, this uses the [Mistral Python client](https://github.com/mistralai/client-python) to interact with the API. 
[API Documentation](https://docs.mistral.ai/) Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ````python @dataclass(init=False) class MistralModel(Model): """A model that uses Mistral. Internally, this uses the [Mistral Python client](https://github.com/mistralai/client-python) to interact with the API. [API Documentation](https://docs.mistral.ai/) """ client: Mistral = field(repr=False) json_mode_schema_prompt: str _model_name: MistralModelName = field(repr=False) _provider: Provider[Mistral] = field(repr=False) def __init__( self, model_name: MistralModelName, *, provider: Literal['mistral'] | Provider[Mistral] = 'mistral', profile: ModelProfileSpec | None = None, json_mode_schema_prompt: str = """Answer in JSON Object, respect the format:\n```\n{schema}\n```\n""", settings: ModelSettings | None = None, ): """Initialize a Mistral model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'mistral' or an instance of `Provider[Mistral]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. json_mode_schema_prompt: The prompt to show when the model expects a JSON object as input. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name self.json_mode_schema_prompt = json_mode_schema_prompt if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return self._provider.base_url @property def model_name(self) -> MistralModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a non-streaming request to the model from Pydantic AI call.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming request to the model from Pydantic AI call.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._stream_completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) async def _completions_create( self, messages: list[ModelMessage], model_settings: MistralModelSettings, model_request_parameters: ModelRequestParameters, ) -> MistralChatCompletionResponse: """Make a non-streaming request to the model.""" # TODO(Marcelo): We need to replace the current MistralAI client to use the beta client. 
# See https://docs.mistral.ai/agents/connectors/websearch/ to support web search. try: response = await self.client.chat.complete_async( model=str(self._model_name), messages=await self._map_messages(messages, model_request_parameters), n=1, tools=self._map_function_and_output_tools_definition(model_request_parameters) or UNSET, tool_choice=self._get_tool_choice(model_request_parameters), stream=False, max_tokens=model_settings.get('max_tokens', UNSET), temperature=model_settings.get('temperature', UNSET), top_p=model_settings.get('top_p', 1), timeout_ms=self._get_timeout_ms(model_settings.get('timeout')), random_seed=model_settings.get('seed', UNSET), stop=model_settings.get('stop_sequences', None), http_headers={'User-Agent': get_user_agent()}, ) except SDKError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise ModelAPIError(model_name=self.model_name, message=e.message) from e assert response, 'A unexpected empty response from Mistral.' return response async def _stream_completions_create( self, messages: list[ModelMessage], model_settings: MistralModelSettings, model_request_parameters: ModelRequestParameters, ) -> MistralEventStreamAsync[MistralCompletionEvent]: """Create a streaming completion request to the Mistral model.""" response: MistralEventStreamAsync[MistralCompletionEvent] | None mistral_messages = await self._map_messages(messages, model_request_parameters) # TODO(Marcelo): We need to replace the current MistralAI client to use the beta client. # See https://docs.mistral.ai/agents/connectors/websearch/ to support web search. if model_request_parameters.function_tools: # Function Calling response = await self.client.chat.stream_async( model=str(self._model_name), messages=mistral_messages, n=1, tools=self._map_function_and_output_tools_definition(model_request_parameters) or UNSET, tool_choice=self._get_tool_choice(model_request_parameters), temperature=model_settings.get('temperature', UNSET), top_p=model_settings.get('top_p', 1), max_tokens=model_settings.get('max_tokens', UNSET), timeout_ms=self._get_timeout_ms(model_settings.get('timeout')), presence_penalty=model_settings.get('presence_penalty'), frequency_penalty=model_settings.get('frequency_penalty'), stop=model_settings.get('stop_sequences', None), http_headers={'User-Agent': get_user_agent()}, ) elif model_request_parameters.output_tools: # TODO: Port to native "manual JSON" mode # Json Mode parameters_json_schemas = [tool.parameters_json_schema for tool in model_request_parameters.output_tools] user_output_format_message = self._generate_user_output_format(parameters_json_schemas) mistral_messages.append(user_output_format_message) response = await self.client.chat.stream_async( model=str(self._model_name), messages=mistral_messages, response_format={ 'type': 'json_object' }, # TODO: Should be able to use json_schema now: https://docs.mistral.ai/capabilities/structured-output/custom_structured_output/, https://github.com/mistralai/client-python/blob/bc4adf335968c8a272e1ab7da8461c9943d8e701/src/mistralai/extra/utils/response_format.py#L9 stream=True, temperature=model_settings.get('temperature', UNSET), top_p=model_settings.get('top_p', 1), max_tokens=model_settings.get('max_tokens', UNSET), timeout_ms=self._get_timeout_ms(model_settings.get('timeout')), presence_penalty=model_settings.get('presence_penalty'), frequency_penalty=model_settings.get('frequency_penalty'), stop=model_settings.get('stop_sequences', None), 
http_headers={'User-Agent': get_user_agent()}, ) else: # Stream Mode response = await self.client.chat.stream_async( model=str(self._model_name), messages=mistral_messages, stream=True, http_headers={'User-Agent': get_user_agent()}, ) assert response, 'A unexpected empty response from Mistral.' return response def _get_tool_choice(self, model_request_parameters: ModelRequestParameters) -> MistralToolChoiceEnum | None: """Get tool choice for the model. - "auto": Default mode. Model decides if it uses the tool or not. - "any": Select any tool. - "none": Prevents tool use. - "required": Forces tool use. """ if not model_request_parameters.function_tools and not model_request_parameters.output_tools: return None elif not model_request_parameters.allow_text_output: return 'required' else: return 'auto' def _map_function_and_output_tools_definition( self, model_request_parameters: ModelRequestParameters ) -> list[MistralTool] | None: """Map function and output tools to MistralTool format. Returns None if both function_tools and output_tools are empty. """ tools = [ MistralTool( function=MistralFunction( name=r.name, parameters=r.parameters_json_schema, description=r.description or '' ) ) for r in model_request_parameters.tool_defs.values() ] return tools or None def _process_response(self, response: MistralChatCompletionResponse) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" assert response.choices, 'Unexpected empty response choice.' choice = response.choices[0] content = choice.message.content tool_calls = choice.message.tool_calls parts: list[ModelResponsePart] = [] text, thinking = _map_content(content) for thought in thinking: parts.append(ThinkingPart(content=thought)) if text: parts.append(TextPart(content=text)) if isinstance(tool_calls, list): for tool_call in tool_calls: tool = self._map_mistral_to_pydantic_tool_call(tool_call=tool_call) parts.append(tool) raw_finish_reason = choice.finish_reason provider_details: dict[str, Any] = {'finish_reason': raw_finish_reason} if response.created: # pragma: no branch provider_details['timestamp'] = number_to_datetime(response.created) finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=parts, usage=_map_usage(response), model_name=response.model, provider_response_id=response.id, provider_name=self._provider.name, provider_url=self._provider.base_url, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: MistralEventStreamAsync[MistralCompletionEvent], model_request_parameters: ModelRequestParameters, ) -> StreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) return MistralStreamedResponse( model_request_parameters=model_request_parameters, _response=peekable_response, _model_name=first_chunk.data.model, _provider_name=self._provider.name, _provider_url=self._provider.base_url, _provider_timestamp=number_to_datetime(first_chunk.data.created) if first_chunk.data.created else None, ) @staticmethod def _map_mistral_to_pydantic_tool_call(tool_call: MistralToolCall) -> ToolCallPart: """Maps a MistralToolCall to a ToolCall.""" tool_call_id = tool_call.id or _generate_tool_call_id() func_call = tool_call.function return 
ToolCallPart(func_call.name, func_call.arguments, tool_call_id) @staticmethod def _map_tool_call(t: ToolCallPart) -> MistralToolCall: """Maps a pydantic-ai ToolCall to a MistralToolCall.""" return MistralToolCall( id=_utils.guard_tool_call_id(t=t), type='function', function=MistralFunctionCall(name=t.tool_name, arguments=t.args or {}), ) def _generate_user_output_format(self, schemas: list[dict[str, Any]]) -> MistralUserMessage: """Get a message with an example of the expected output format.""" examples: list[dict[str, Any]] = [] for schema in schemas: typed_dict_definition: dict[str, Any] = {} for key, value in schema.get('properties', {}).items(): typed_dict_definition[key] = self._get_python_type(value) examples.append(typed_dict_definition) example_schema = examples[0] if len(examples) == 1 else examples return MistralUserMessage(content=self.json_mode_schema_prompt.format(schema=example_schema)) @classmethod def _get_python_type(cls, value: dict[str, Any]) -> str: """Return a string representation of the Python type for a single JSON schema property. This function handles recursion for nested arrays/objects and `anyOf`. """ # 1) Handle anyOf first, because it's a different schema structure if any_of := value.get('anyOf'): # Simplistic approach: pick the first option in anyOf # (In reality, you'd possibly want to merge or union types) return f'Optional[{cls._get_python_type(any_of[0])}]' # 2) If we have a top-level "type" field value_type = value.get('type') if not value_type: # No explicit type; fallback return 'Any' # 3) Direct simple type mapping (string, integer, float, bool, None) if value_type in SIMPLE_JSON_TYPE_MAPPING and value_type != 'array' and value_type != 'object': return SIMPLE_JSON_TYPE_MAPPING[value_type] # 4) Array: Recursively get the item type if value_type == 'array': items = value.get('items', {}) return f'list[{cls._get_python_type(items)}]' # 5) Object: Check for additionalProperties if value_type == 'object': additional_properties = value.get('additionalProperties', {}) if isinstance(additional_properties, bool): return 'bool' # pragma: lax no cover additional_properties_type = additional_properties.get('type') if ( additional_properties_type in SIMPLE_JSON_TYPE_MAPPING and additional_properties_type != 'array' and additional_properties_type != 'object' ): # dict[str, bool/int/float/etc...] 
return f'dict[str, {SIMPLE_JSON_TYPE_MAPPING[additional_properties_type]}]' elif additional_properties_type == 'array': array_items = additional_properties.get('items', {}) return f'dict[str, list[{cls._get_python_type(array_items)}]]' elif additional_properties_type == 'object': # nested dictionary of unknown shape return 'dict[str, dict[str, Any]]' else: # If no additionalProperties type or something else, default to a generic dict return 'dict[str, Any]' # 6) Fallback return 'Any' @staticmethod def _get_timeout_ms(timeout: Timeout | float | None) -> int | None: """Convert a timeout to milliseconds.""" if timeout is None: return None if isinstance(timeout, float): # pragma: no cover return int(1000 * timeout) raise NotImplementedError('Timeout object is not yet supported for MistralModel.') async def _map_user_message(self, message: ModelRequest) -> AsyncIterable[MistralMessages]: for part in message.parts: if isinstance(part, SystemPromptPart): yield MistralSystemMessage(content=part.content) elif isinstance(part, UserPromptPart): yield await self._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield MistralToolMessage( tool_call_id=part.tool_call_id, content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield MistralUserMessage(content=part.model_response()) # pragma: no cover else: yield MistralToolMessage( tool_call_id=part.tool_call_id, content=part.model_response(), ) else: assert_never(part) async def _map_messages( # noqa: C901 self, messages: Sequence[ModelMessage], model_request_parameters: ModelRequestParameters ) -> list[MistralMessages]: """Just maps a `pydantic_ai.Message` to a `MistralMessage`.""" mistral_messages: list[MistralMessages] = [] for message in messages: if isinstance(message, ModelRequest): async for msg in self._map_user_message(message): mistral_messages.append(msg) elif isinstance(message, ModelResponse): content_chunks: list[MistralContentChunk] = [] thinking_chunks: list[MistralTextChunk | MistralReferenceChunk] = [] tool_calls: list[MistralToolCall] = [] for part in message.parts: if isinstance(part, TextPart): content_chunks.append(MistralTextChunk(text=part.content)) elif isinstance(part, ThinkingPart): thinking_chunks.append(MistralTextChunk(text=part.content)) elif isinstance(part, ToolCallPart): tool_calls.append(self._map_tool_call(part)) elif isinstance(part, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # This is currently never returned from mistral pass elif isinstance(part, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. pass else: assert_never(part) if thinking_chunks: content_chunks.insert(0, MistralThinkChunk(thinking=thinking_chunks)) mistral_messages.append(MistralAssistantMessage(content=content_chunks, tool_calls=tool_calls)) else: assert_never(message) if instructions := self._get_instructions(messages, model_request_parameters): system_prompt_count = sum(1 for m in mistral_messages if isinstance(m, MistralSystemMessage)) mistral_messages.insert(system_prompt_count, MistralSystemMessage(content=instructions)) # Post-process messages to insert fake assistant message after tool message if followed by user message # to work around `Unexpected role 'user' after role 'tool'` error. 
processed_messages: list[MistralMessages] = [] for i, current_message in enumerate(mistral_messages): processed_messages.append(current_message) if isinstance(current_message, MistralToolMessage) and i + 1 < len(mistral_messages): next_message = mistral_messages[i + 1] if isinstance(next_message, MistralUserMessage): # Insert a dummy assistant message processed_messages.append(MistralAssistantMessage(content=[MistralTextChunk(text='OK')])) return processed_messages async def _map_user_prompt(self, part: UserPromptPart) -> MistralUserMessage: content: str | list[MistralContentChunk] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(MistralTextChunk(text=item)) elif isinstance(item, ImageUrl): if item.force_download: downloaded = await download_item(item, data_format='base64_uri') image_url = MistralImageURL(url=downloaded['data']) content.append(MistralImageURLChunk(image_url=image_url, type='image_url')) else: content.append(MistralImageURLChunk(image_url=MistralImageURL(url=item.url))) elif isinstance(item, BinaryContent): if item.is_image: image_url = MistralImageURL(url=item.data_uri) content.append(MistralImageURLChunk(image_url=image_url, type='image_url')) elif item.media_type == 'application/pdf': content.append(MistralDocumentURLChunk(document_url=item.data_uri, type='document_url')) else: raise RuntimeError('BinaryContent other than image or PDF is not supported in Mistral.') elif isinstance(item, DocumentUrl): if item.media_type == 'application/pdf': if item.force_download: downloaded = await download_item(item, data_format='base64_uri') content.append( MistralDocumentURLChunk(document_url=downloaded['data'], type='document_url') ) else: content.append(MistralDocumentURLChunk(document_url=item.url, type='document_url')) else: raise RuntimeError('DocumentUrl other than PDF is not supported in Mistral.') elif isinstance(item, VideoUrl): raise RuntimeError('VideoUrl is not supported in Mistral.') else: # pragma: no cover raise RuntimeError(f'Unsupported content type: {type(item)}') return MistralUserMessage(content=content) ```` #### __init__ ````python __init__( model_name: MistralModelName, *, provider: ( Literal["mistral"] | Provider[Mistral] ) = "mistral", profile: ModelProfileSpec | None = None, json_mode_schema_prompt: str = "Answer in JSON Object, respect the format:\n```\n{schema}\n```\n", settings: ModelSettings | None = None ) ```` Initialize a Mistral model. Parameters: | Name | Type | Description | Default | | ------------------------- | -------------------- | ----------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model_name` | `MistralModelName` | The name of the model to use. | *required* | | `provider` | \`Literal['mistral'] | Provider[Mistral]\` | The provider to use for authentication and API access. Can be either the string 'mistral' or an instance of Provider[Mistral]. If not provided, a new provider will be created using the other parameters. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | | `json_mode_schema_prompt` | `str` | The prompt to show when the model expects a JSON object as input. 
| ```` 'Answer in JSON Object, respect the format:\n```\n{schema}\n```\n' ```` | | `settings` | \`ModelSettings | None\` | Model-specific settings that will be used as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ````python def __init__( self, model_name: MistralModelName, *, provider: Literal['mistral'] | Provider[Mistral] = 'mistral', profile: ModelProfileSpec | None = None, json_mode_schema_prompt: str = """Answer in JSON Object, respect the format:\n```\n{schema}\n```\n""", settings: ModelSettings | None = None, ): """Initialize a Mistral model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'mistral' or an instance of `Provider[Mistral]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. json_mode_schema_prompt: The prompt to show when the model expects a JSON object as input. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name self.json_mode_schema_prompt = json_mode_schema_prompt if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ```` #### model_name ```python model_name: MistralModelName ``` The model name. #### system ```python system: str ``` The model provider. #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a non-streaming request to the model from Pydantic AI call. Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a non-streaming request to the model from Pydantic AI call.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming request to the model from Pydantic AI call. 
Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming request to the model from Pydantic AI call.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._stream_completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) ``` ### MistralStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Mistral models. Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python @dataclass class MistralStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Mistral models.""" _model_name: MistralModelName _response: AsyncIterable[MistralCompletionEvent] _provider_name: str _provider_url: str _provider_timestamp: datetime | None = None _timestamp: datetime = field(default_factory=_now_utc) _delta_content: str = field(default='', init=False) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: if self._provider_timestamp is not None: # pragma: no branch self.provider_details = {'timestamp': self._provider_timestamp} chunk: MistralCompletionEvent async for chunk in self._response: self._usage += _map_usage(chunk.data) if chunk.data.id: # pragma: no branch self.provider_response_id = chunk.data.id try: choice = chunk.data.choices[0] except IndexError: continue if raw_finish_reason := choice.finish_reason: self.provider_details = {**(self.provider_details or {}), 'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) # Handle the text part of the response content = choice.delta.content text, thinking = _map_content(content) for thought in thinking: for event in self._parts_manager.handle_thinking_delta(vendor_part_id='thinking', content=thought): yield event if text: # Attempt to produce an output tool call from the received text output_tools = {c.name: c for c in self.model_request_parameters.output_tools} if output_tools: self._delta_content += text # TODO: Port to native "manual JSON" mode maybe_tool_call_part = self._try_get_output_tool_from_text(self._delta_content, output_tools) if maybe_tool_call_part: yield self._parts_manager.handle_tool_call_part( vendor_part_id='output', tool_name=maybe_tool_call_part.tool_name, args=maybe_tool_call_part.args_as_dict(), tool_call_id=maybe_tool_call_part.tool_call_id, ) else: for event in self._parts_manager.handle_text_delta(vendor_part_id='content', content=text): yield event # Handle the explicit tool calls for index, dtc in enumerate(choice.delta.tool_calls or []): # It seems that mistral just sends full tool calls, so we just use them directly, rather than building yield self._parts_manager.handle_tool_call_part( vendor_part_id=index, tool_name=dtc.function.name, args=dtc.function.arguments, tool_call_id=dtc.id ) @property def model_name(self) -> MistralModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def provider_url(self) -> str: """Get the provider 
base URL.""" return self._provider_url @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp @staticmethod def _try_get_output_tool_from_text(text: str, output_tools: dict[str, ToolDefinition]) -> ToolCallPart | None: output_json: dict[str, Any] | None = pydantic_core.from_json(text, allow_partial='trailing-strings') if output_json: for output_tool in output_tools.values(): # NOTE: Additional verification to prevent JSON validation to crash # Ensures required parameters in the JSON schema are respected, especially for stream-based return types. # Example with BaseModel and required fields. if not MistralStreamedResponse._validate_required_json_schema( output_json, output_tool.parameters_json_schema ): continue # The following part_id will be thrown away return ToolCallPart(tool_name=output_tool.name, args=output_json) @staticmethod def _validate_required_json_schema(json_dict: dict[str, Any], json_schema: dict[str, Any]) -> bool: """Validate that all required parameters in the JSON schema are present in the JSON dictionary.""" required_params = json_schema.get('required', []) properties = json_schema.get('properties', {}) for param in required_params: if param not in json_dict: return False param_schema = properties.get(param, {}) param_type = param_schema.get('type') param_items_type = param_schema.get('items', {}).get('type') if param_type == 'array' and param_items_type: if not isinstance(json_dict[param], list): return False for item in json_dict[param]: if not isinstance(item, VALID_JSON_TYPE_MAPPING[param_items_type]): return False elif param_type and not isinstance(json_dict[param], VALID_JSON_TYPE_MAPPING[param_type]): return False if isinstance(json_dict[param], dict) and 'properties' in param_schema: nested_schema = param_schema if not MistralStreamedResponse._validate_required_json_schema(json_dict[param], nested_schema): return False return True ``` #### model_name ```python model_name: MistralModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### provider_url ```python provider_url: str ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.openai` ## Setup For details on how to set up authentication with this model, see [model configuration for OpenAI](https://ai.pydantic.dev/models/openai/index.md). ### OpenAIModelName ```python OpenAIModelName = str | AllModels ``` Possible OpenAI model names. Since OpenAI supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the OpenAI docs](https://platform.openai.com/docs/models) for a full list. Using this broader type for the model name instead of the `ChatModel` definition allows this model to be used more easily with other OpenAI-compatible model types (e.g. Ollama, DeepSeek). ### MCP_SERVER_TOOL_CONNECTOR_URI_SCHEME ```python MCP_SERVER_TOOL_CONNECTOR_URI_SCHEME: Literal[ "x-openai-connector" ] = "x-openai-connector" ``` Prefix for OpenAI connector IDs. OpenAI supports either a URL or a connector ID when passing MCP configuration to a model; by using this prefix (`x-openai-connector:`) in a URL, you can pass a connector ID to a model. ### OpenAIChatModelSettings Bases: `ModelSettings` Settings used for an OpenAI model request.
Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python class OpenAIChatModelSettings(ModelSettings, total=False): """Settings used for an OpenAI model request.""" # ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. openai_reasoning_effort: ReasoningEffort """Constrains effort on reasoning for [reasoning models](https://platform.openai.com/docs/guides/reasoning). Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. """ openai_logprobs: bool """Include log probabilities in the response. For Chat models, these will be included in `ModelResponse.provider_details['logprobs']`. For Responses models, these will be included in the response output parts `TextPart.provider_details['logprobs']`. """ openai_top_logprobs: int """Include log probabilities of the top n tokens in the response.""" openai_store: bool | None """Whether or not to store the output of this request in OpenAI's systems. If `False`, OpenAI will not store the request for its own internal review or training. See [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat/create#chat-create-store).""" openai_user: str """A unique identifier representing the end-user, which can help OpenAI monitor and detect abuse. See [OpenAI's safety best practices](https://platform.openai.com/docs/guides/safety-best-practices#end-user-ids) for more details. """ openai_service_tier: Literal['auto', 'default', 'flex', 'priority'] """The service tier to use for the model request. Currently supported values are `auto`, `default`, `flex`, and `priority`. For more information, see [OpenAI's service tiers documentation](https://platform.openai.com/docs/api-reference/chat/object#chat/object-service_tier). """ openai_prediction: ChatCompletionPredictionContentParam """Enables [predictive outputs](https://platform.openai.com/docs/guides/predicted-outputs). This feature is currently only supported for some OpenAI models. """ openai_prompt_cache_key: str """Used by OpenAI to cache responses for similar requests to optimize your cache hit rates. See the [OpenAI Prompt Caching documentation](https://platform.openai.com/docs/guides/prompt-caching#how-it-works) for more information. """ openai_prompt_cache_retention: Literal['in_memory', '24h'] """The retention policy for the prompt cache. Set to 24h to enable extended prompt caching, which keeps cached prefixes active for longer, up to a maximum of 24 hours. See the [OpenAI Prompt Caching documentation](https://platform.openai.com/docs/guides/prompt-caching#how-it-works) for more information. """ openai_continuous_usage_stats: bool """When True, enables continuous usage statistics in streaming responses. When enabled, the API returns cumulative usage data with each chunk rather than only at the end. This setting correctly handles the cumulative nature of these stats by using only the final usage values rather than summing all intermediate values. See [OpenAI's streaming documentation](https://platform.openai.com/docs/api-reference/chat/create#stream_options) for more information. """ ``` #### openai_reasoning_effort ```python openai_reasoning_effort: ReasoningEffort ``` Constrains effort on reasoning for [reasoning models](https://platform.openai.com/docs/guides/reasoning). Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. 
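For example, here is a minimal sketch of passing a reasoning-effort setting alongside a generic `ModelSettings` field via `model_settings` (the model name and prompt are illustrative and assume access to an OpenAI reasoning model):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel, OpenAIChatModelSettings

# The model name is illustrative; reasoning effort only applies to reasoning models.
model = OpenAIChatModel('o3-mini')

settings = OpenAIChatModelSettings(
    max_tokens=1024,  # generic ModelSettings field
    openai_reasoning_effort='low',  # OpenAI-specific field documented above
)

agent = Agent(model, model_settings=settings)
result = agent.run_sync('Summarize the rules of chess in one sentence.')
print(result.output)
```

Because every field is `openai_`-prefixed, these settings can be merged with other providers' model settings without key collisions.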
#### openai_logprobs ```python openai_logprobs: bool ``` Include log probabilities in the response. For Chat models, these will be included in `ModelResponse.provider_details['logprobs']`. For Responses models, these will be included in the response output parts `TextPart.provider_details['logprobs']`. #### openai_top_logprobs ```python openai_top_logprobs: int ``` Include log probabilities of the top n tokens in the response. #### openai_store ```python openai_store: bool | None ``` Whether or not to store the output of this request in OpenAI's systems. If `False`, OpenAI will not store the request for its own internal review or training. See [OpenAI API reference](https://platform.openai.com/docs/api-reference/chat/create#chat-create-store). #### openai_user ```python openai_user: str ``` A unique identifier representing the end-user, which can help OpenAI monitor and detect abuse. See [OpenAI's safety best practices](https://platform.openai.com/docs/guides/safety-best-practices#end-user-ids) for more details. #### openai_service_tier ```python openai_service_tier: Literal[ "auto", "default", "flex", "priority" ] ``` The service tier to use for the model request. Currently supported values are `auto`, `default`, `flex`, and `priority`. For more information, see [OpenAI's service tiers documentation](https://platform.openai.com/docs/api-reference/chat/object#chat/object-service_tier). #### openai_prediction ```python openai_prediction: ChatCompletionPredictionContentParam ``` Enables [predictive outputs](https://platform.openai.com/docs/guides/predicted-outputs). This feature is currently only supported for some OpenAI models. #### openai_prompt_cache_key ```python openai_prompt_cache_key: str ``` Used by OpenAI to cache responses for similar requests to optimize your cache hit rates. See the [OpenAI Prompt Caching documentation](https://platform.openai.com/docs/guides/prompt-caching#how-it-works) for more information. #### openai_prompt_cache_retention ```python openai_prompt_cache_retention: Literal['in_memory', '24h'] ``` The retention policy for the prompt cache. Set to 24h to enable extended prompt caching, which keeps cached prefixes active for longer, up to a maximum of 24 hours. See the [OpenAI Prompt Caching documentation](https://platform.openai.com/docs/guides/prompt-caching#how-it-works) for more information. #### openai_continuous_usage_stats ```python openai_continuous_usage_stats: bool ``` When True, enables continuous usage statistics in streaming responses. When enabled, the API returns cumulative usage data with each chunk rather than only at the end. This setting correctly handles the cumulative nature of these stats by using only the final usage values rather than summing all intermediate values. See [OpenAI's streaming documentation](https://platform.openai.com/docs/api-reference/chat/create#stream_options) for more information. ### OpenAIModelSettings Bases: `OpenAIChatModelSettings` Deprecated Use `OpenAIChatModelSettings` instead. Deprecated alias for `OpenAIChatModelSettings`. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @deprecated('Use `OpenAIChatModelSettings` instead.') class OpenAIModelSettings(OpenAIChatModelSettings, total=False): """Deprecated alias for `OpenAIChatModelSettings`.""" ``` ### OpenAIResponsesModelSettings Bases: `OpenAIChatModelSettings` Settings used for an OpenAI Responses model request. ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. 
Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python class OpenAIResponsesModelSettings(OpenAIChatModelSettings, total=False): """Settings used for an OpenAI Responses model request. ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. """ openai_builtin_tools: Sequence[FileSearchToolParam | WebSearchToolParam | ComputerToolParam] """The provided OpenAI built-in tools to use. See [OpenAI's built-in tools](https://platform.openai.com/docs/guides/tools?api-mode=responses) for more details. """ openai_reasoning_generate_summary: Literal['detailed', 'concise'] """Deprecated alias for `openai_reasoning_summary`.""" openai_reasoning_summary: Literal['detailed', 'concise', 'auto'] """A summary of the reasoning performed by the model. This can be useful for debugging and understanding the model's reasoning process. One of `concise`, `detailed`, or `auto`. Check the [OpenAI Reasoning documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries) for more details. """ openai_send_reasoning_ids: bool """Whether to send the unique IDs of reasoning, text, and function call parts from the message history to the model. Enabled by default for reasoning models. This can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](../../message-history.md#processing-message-history). In that case, you'll want to disable this. """ openai_truncation: Literal['disabled', 'auto'] """The truncation strategy to use for the model response. It can be either: - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. """ openai_text_verbosity: Literal['low', 'medium', 'high'] """Constrains the verbosity of the model's text response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are `low`, `medium`, and `high`. """ openai_previous_response_id: Literal['auto'] | str """The ID of a previous response from the model to use as the starting point for a continued conversation. When set to `'auto'`, the request automatically uses the most recent `provider_response_id` from the message history and omits earlier messages. This enables the model to use server-side conversation state and faithfully reference previous reasoning. See the [OpenAI Responses API documentation](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) for more information. """ openai_include_code_execution_outputs: bool """Whether to include the code execution results in the response. Corresponds to the `code_interpreter_call.outputs` value of the `include` parameter in the Responses API. """ openai_include_web_search_sources: bool """Whether to include the web search results in the response. Corresponds to the `web_search_call.action.sources` value of the `include` parameter in the Responses API. """ openai_include_file_search_results: bool """Whether to include the file search results in the response. 
Corresponds to the `file_search_call.results` value of the `include` parameter in the Responses API. """ openai_include_raw_annotations: bool """Whether to include the raw annotations in `TextPart.provider_details`. When enabled, any annotations (e.g., citations from web search) will be available in the `provider_details['annotations']` field of text parts. This is opt-in since there may be overlap with native annotation support once added via https://github.com/pydantic/pydantic-ai/issues/3126. """ ``` #### openai_builtin_tools ```python openai_builtin_tools: Sequence[ FileSearchToolParam | WebSearchToolParam | ComputerToolParam ] ``` The provided OpenAI built-in tools to use. See [OpenAI's built-in tools](https://platform.openai.com/docs/guides/tools?api-mode=responses) for more details. #### openai_reasoning_generate_summary ```python openai_reasoning_generate_summary: Literal[ "detailed", "concise" ] ``` Deprecated alias for `openai_reasoning_summary`. #### openai_reasoning_summary ```python openai_reasoning_summary: Literal[ "detailed", "concise", "auto" ] ``` A summary of the reasoning performed by the model. This can be useful for debugging and understanding the model's reasoning process. One of `concise`, `detailed`, or `auto`. Check the [OpenAI Reasoning documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries) for more details. #### openai_send_reasoning_ids ```python openai_send_reasoning_ids: bool ``` Whether to send the unique IDs of reasoning, text, and function call parts from the message history to the model. Enabled by default for reasoning models. This can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](https://ai.pydantic.dev/message-history/#processing-message-history). In that case, you'll want to disable this. #### openai_truncation ```python openai_truncation: Literal['disabled', 'auto'] ``` The truncation strategy to use for the model response. It can be either: - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. #### openai_text_verbosity ```python openai_text_verbosity: Literal['low', 'medium', 'high'] ``` Constrains the verbosity of the model's text response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are `low`, `medium`, and `high`. #### openai_previous_response_id ```python openai_previous_response_id: Literal['auto'] | str ``` The ID of a previous response from the model to use as the starting point for a continued conversation. When set to `'auto'`, the request automatically uses the most recent `provider_response_id` from the message history and omits earlier messages. This enables the model to use server-side conversation state and faithfully reference previous reasoning. See the [OpenAI Responses API documentation](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) for more information. 
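As a minimal sketch of opting into server-side conversation reuse with `openai_previous_response_id='auto'` (the model name and prompts are illustrative; use a Responses-capable model you have access to):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

# 'auto' reuses the most recent provider_response_id from the message history,
# letting the Responses API keep reasoning items in server-side state.
settings = OpenAIResponsesModelSettings(openai_previous_response_id='auto')

agent = Agent(OpenAIResponsesModel('gpt-5'), model_settings=settings)

first = agent.run_sync('Pick a secret number between 1 and 10 and keep it to yourself.')
second = agent.run_sync(
    'What number did you pick?',
    message_history=first.all_messages(),
)
print(second.output)
```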
#### openai_include_code_execution_outputs ```python openai_include_code_execution_outputs: bool ``` Whether to include the code execution results in the response. Corresponds to the `code_interpreter_call.outputs` value of the `include` parameter in the Responses API. #### openai_include_web_search_sources ```python openai_include_web_search_sources: bool ``` Whether to include the web search results in the response. Corresponds to the `web_search_call.action.sources` value of the `include` parameter in the Responses API. #### openai_include_file_search_results ```python openai_include_file_search_results: bool ``` Whether to include the file search results in the response. Corresponds to the `file_search_call.results` value of the `include` parameter in the Responses API. #### openai_include_raw_annotations ```python openai_include_raw_annotations: bool ``` Whether to include the raw annotations in `TextPart.provider_details`. When enabled, any annotations (e.g., citations from web search) will be available in the `provider_details['annotations']` field of text parts. This is opt-in since there may be overlap with native annotation support once added via https://github.com/pydantic/pydantic-ai/issues/3126. ### OpenAIChatModel Bases: `Model` A model that uses the OpenAI API. Internally, this uses the [OpenAI Python client](https://github.com/openai/openai-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @dataclass(init=False) class OpenAIChatModel(Model): """A model that uses the OpenAI API. Internally, this uses the [OpenAI Python client](https://github.com/openai/openai-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncOpenAI = field(repr=False) _model_name: OpenAIModelName = field(repr=False) _provider: Provider[AsyncOpenAI] = field(repr=False) @overload def __init__( self, model_name: OpenAIModelName, *, provider: OpenAIChatCompatibleProvider | Literal[ 'openai', 'openai-chat', 'gateway', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... @deprecated('Set the `system_prompt_role` in the `OpenAIModelProfile` instead.') @overload def __init__( self, model_name: OpenAIModelName, *, provider: OpenAIChatCompatibleProvider | Literal[ 'openai', 'openai-chat', 'gateway', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, system_prompt_role: OpenAISystemPromptRole | None = None, settings: ModelSettings | None = None, ) -> None: ... def __init__( self, model_name: OpenAIModelName, *, provider: OpenAIChatCompatibleProvider | Literal[ 'openai', 'openai-chat', 'gateway', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, system_prompt_role: OpenAISystemPromptRole | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI model. Args: model_name: The name of the OpenAI model to use. List of model names available [here](https://github.com/openai/openai-python/blob/v1.54.3/src/openai/types/chat_model.py#L7) (Unfortunately, despite being ask to do so, OpenAI do not provide `.inv` files for their API). provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. system_prompt_role: The role to use for the system prompt message. 
If not provided, defaults to `'system'`. In the future, this may be inferred from the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/openai' if provider == 'gateway' else provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) if system_prompt_role is not None: self.profile = OpenAIModelProfile(openai_system_prompt_role=system_prompt_role).update(self.profile) @property def base_url(self) -> str: return str(self.client.base_url) @property def model_name(self) -> OpenAIModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """Return the set of builtin tool types this model can handle.""" return frozenset({WebSearchTool}) @cached_property def profile(self) -> ModelProfile: """The model profile. WebSearchTool is only supported if openai_chat_supports_web_search is True. """ _profile = super().profile openai_profile = OpenAIModelProfile.from_profile(_profile) if not openai_profile.openai_chat_supports_web_search: new_tools = _profile.supported_builtin_tools - {WebSearchTool} _profile = replace(_profile, supported_builtin_tools=new_tools) return _profile @property @deprecated('Set the `system_prompt_role` in the `OpenAIModelProfile` instead.') def system_prompt_role(self) -> OpenAISystemPromptRole | None: return OpenAIModelProfile.from_profile(self.profile).openai_system_prompt_role def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters]: # Check for WebSearchTool before base validation to provide a helpful error message if ( any(isinstance(tool, WebSearchTool) for tool in model_request_parameters.builtin_tools) and not OpenAIModelProfile.from_profile(self.profile).openai_chat_supports_web_search ): raise UserError( f'WebSearchTool is not supported with `OpenAIChatModel` and model {self.model_name!r}. ' f'Please use `OpenAIResponsesModel` instead.' 
) return super().prepare_request(model_settings, model_request_parameters) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, False, cast(OpenAIChatModelSettings, model_settings or {}), model_request_parameters ) # Handle ModelResponse returned directly (for content filters) if isinstance(response, ModelResponse): return response model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) model_settings_cast = cast(OpenAIChatModelSettings, model_settings or {}) response = await self._completions_create(messages, True, model_settings_cast, model_request_parameters) async with response: yield await self._process_streamed_response(response, model_request_parameters, model_settings_cast) @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: OpenAIChatModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[ChatCompletionChunk]: ... @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: OpenAIChatModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion | ModelResponse: ... 
async def _completions_create( self, messages: list[ModelMessage], stream: bool, model_settings: OpenAIChatModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion | AsyncStream[ChatCompletionChunk] | ModelResponse: tools = self._get_tools(model_request_parameters) web_search_options = self._get_web_search_options(model_request_parameters) profile = OpenAIModelProfile.from_profile(self.profile) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output and profile.openai_supports_tool_choice_required: tool_choice = 'required' else: tool_choice = 'auto' openai_messages = await self._map_messages(messages, model_request_parameters) response_format: chat.completion_create_params.ResponseFormat | None = None if model_request_parameters.output_mode == 'native': output_object = model_request_parameters.output_object assert output_object is not None response_format = self._map_json_schema(output_object) elif ( model_request_parameters.output_mode == 'prompted' and self.profile.supports_json_object_output ): # pragma: no branch response_format = {'type': 'json_object'} _drop_sampling_params_for_reasoning(profile, model_settings) _drop_unsupported_params(profile, model_settings) try: extra_headers = model_settings.get('extra_headers', {}) extra_headers.setdefault('User-Agent', get_user_agent()) # OpenAI SDK type stubs incorrectly use 'in-memory' but API requires 'in_memory', so we have to use `Any` to not hit type errors prompt_cache_retention: Any = model_settings.get('openai_prompt_cache_retention', OMIT) return await self.client.chat.completions.create( model=self.model_name, messages=openai_messages, parallel_tool_calls=model_settings.get('parallel_tool_calls', OMIT), tools=tools or OMIT, tool_choice=tool_choice or OMIT, stream=stream, stream_options=self._get_stream_options(model_settings) if stream else OMIT, stop=model_settings.get('stop_sequences', OMIT), max_completion_tokens=model_settings.get('max_tokens', OMIT), timeout=model_settings.get('timeout', NOT_GIVEN), response_format=response_format or OMIT, seed=model_settings.get('seed', OMIT), reasoning_effort=model_settings.get('openai_reasoning_effort', OMIT), user=model_settings.get('openai_user', OMIT), web_search_options=web_search_options or OMIT, service_tier=model_settings.get('openai_service_tier', OMIT), prediction=model_settings.get('openai_prediction', OMIT), temperature=model_settings.get('temperature', OMIT), top_p=model_settings.get('top_p', OMIT), presence_penalty=model_settings.get('presence_penalty', OMIT), frequency_penalty=model_settings.get('frequency_penalty', OMIT), logit_bias=model_settings.get('logit_bias', OMIT), logprobs=model_settings.get('openai_logprobs', OMIT), top_logprobs=model_settings.get('openai_top_logprobs', OMIT), store=model_settings.get('openai_store', OMIT), prompt_cache_key=model_settings.get('openai_prompt_cache_key', OMIT), prompt_cache_retention=prompt_cache_retention, extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if model_response := _check_azure_content_filter(e, self.system, self.model_name): return model_response if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover except APIConnectionError as e: raise ModelAPIError(model_name=self.model_name, message=e.message) from e def _validate_completion(self, response: 
chat.ChatCompletion) -> chat.ChatCompletion: """Hook that validates chat completions before processing. This method may be overridden by subclasses of `OpenAIChatModel` to apply custom completion validations. """ return chat.ChatCompletion.model_validate(response.model_dump()) def _process_provider_details(self, response: chat.ChatCompletion) -> dict[str, Any] | None: """Hook that response content to provider details. This method may be overridden by subclasses of `OpenAIChatModel` to apply custom mappings. """ return _map_provider_details(response.choices[0]) def _process_response(self, response: chat.ChatCompletion | str) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" # Although the OpenAI SDK claims to return a Pydantic model (`ChatCompletion`) from the chat completions function: # * it hasn't actually performed validation (presumably they're creating the model with `model_construct` or something?!) # * if the endpoint returns plain text, the return type is a string # Thus we validate it fully here. if not isinstance(response, chat.ChatCompletion): raise UnexpectedModelBehavior( f'Invalid response from {self.system} chat completions endpoint, expected JSON data' ) timestamp = _now_utc() if not response.created: response.created = int(timestamp.timestamp()) # Workaround for local Ollama which sometimes returns a `None` finish reason. if response.choices and (choice := response.choices[0]) and choice.finish_reason is None: # pyright: ignore[reportUnnecessaryComparison] choice.finish_reason = 'stop' try: response = self._validate_completion(response) except ValidationError as e: raise UnexpectedModelBehavior(f'Invalid response from {self.system} chat completions endpoint: {e}') from e choice = response.choices[0] items: list[ModelResponsePart] = [] if thinking_parts := self._process_thinking(choice.message): items.extend(thinking_parts) if choice.message.content: items.extend( (replace(part, id='content', provider_name=self.system) if isinstance(part, ThinkingPart) else part) for part in split_content_into_text_and_thinking(choice.message.content, self.profile.thinking_tags) ) if choice.message.tool_calls is not None: for c in choice.message.tool_calls: if isinstance(c, ChatCompletionMessageFunctionToolCall): part = ToolCallPart(c.function.name, c.function.arguments, tool_call_id=c.id) elif isinstance(c, ChatCompletionMessageCustomToolCall): # pragma: no cover # NOTE: Custom tool calls are not supported. # See for more details. raise RuntimeError('Custom tool calls are not supported') else: assert_never(c) part.tool_call_id = _guard_tool_call_id(part) items.append(part) provider_details = self._process_provider_details(response) if response.created: # pragma: no branch if provider_details is None: provider_details = {} provider_details['timestamp'] = number_to_datetime(response.created) return ModelResponse( parts=items, usage=self._map_usage(response), model_name=response.model, timestamp=timestamp, provider_details=provider_details or None, provider_response_id=response.id, provider_name=self._provider.name, provider_url=self._provider.base_url, finish_reason=self._map_finish_reason(choice.finish_reason), ) def _process_thinking(self, message: chat.ChatCompletionMessage) -> list[ThinkingPart] | None: """Hook that maps reasoning tokens to thinking parts. This method may be overridden by subclasses of `OpenAIChatModel` to apply custom mappings. 
""" profile = OpenAIModelProfile.from_profile(self.profile) custom_field = profile.openai_chat_thinking_field items: list[ThinkingPart] = [] # Prefer the configured custom reasoning field, if present in profile. # Fall back to built-in fields if no custom field result was found. # The `reasoning_content` field is typically present in DeepSeek and Moonshot models. # https://api-docs.deepseek.com/guides/reasoning_model # The `reasoning` field is typically present in gpt-oss via Ollama and OpenRouter. # - https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot#chat-completions-api # - https://openrouter.ai/docs/use-cases/reasoning-tokens#basic-usage-with-reasoning-tokens for field_name in (custom_field, 'reasoning', 'reasoning_content'): if not field_name: continue reasoning: str | None = getattr(message, field_name, None) if reasoning: # pragma: no branch items.append(ThinkingPart(id=field_name, content=reasoning, provider_name=self.system)) return items return items or None async def _process_streamed_response( self, response: AsyncStream[ChatCompletionChunk], model_request_parameters: ModelRequestParameters, model_settings: OpenAIChatModelSettings | None = None, ) -> OpenAIStreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) # When using Azure OpenAI and a content filter is enabled, the first chunk will contain a `''` model name, # so we set it from a later chunk in `OpenAIChatStreamedResponse`. model_name = first_chunk.model or self.model_name return self._streamed_response_cls( model_request_parameters=model_request_parameters, _model_name=model_name, _model_profile=self.profile, _response=peekable_response, _provider_name=self._provider.name, _provider_url=self._provider.base_url, _provider_timestamp=number_to_datetime(first_chunk.created) if first_chunk.created else None, _model_settings=model_settings, ) @property def _streamed_response_cls(self) -> type[OpenAIStreamedResponse]: """Returns the `StreamedResponse` type that will be used for streamed responses. This method may be overridden by subclasses of `OpenAIChatModel` to provide their own `StreamedResponse` type. """ return OpenAIStreamedResponse def _map_usage(self, response: chat.ChatCompletion) -> usage.RequestUsage: return _map_usage(response, self._provider.name, self._provider.base_url, self.model_name) def _get_stream_options(self, model_settings: OpenAIChatModelSettings) -> chat.ChatCompletionStreamOptionsParam: """Build stream_options for the API request. Returns a dict with include_usage=True and optionally continuous_usage_stats if configured. 
""" options: dict[str, bool] = {'include_usage': True} if model_settings.get('openai_continuous_usage_stats'): options['continuous_usage_stats'] = True return cast(chat.ChatCompletionStreamOptionsParam, options) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[chat.ChatCompletionToolParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _get_web_search_options(self, model_request_parameters: ModelRequestParameters) -> WebSearchOptions | None: for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): # pragma: no branch if tool.user_location: return WebSearchOptions( search_context_size=tool.search_context_size, user_location=WebSearchOptionsUserLocation( type='approximate', approximate=WebSearchOptionsUserLocationApproximate(**tool.user_location), ), ) return WebSearchOptions(search_context_size=tool.search_context_size) return None @dataclass class _MapModelResponseContext: """Context object for mapping a `ModelResponse` to OpenAI chat completion parameters. This class is designed to be subclassed to add new fields for custom logic, collecting various parts of the model response (like text and tool calls) to form a single assistant message. """ _model: OpenAIChatModel texts: list[str] = field(default_factory=list[str]) thinkings: dict[str, list[str]] = field(default_factory=dict[str, list[str]]) tool_calls: list[ChatCompletionMessageFunctionToolCallParam] = field( default_factory=list[ChatCompletionMessageFunctionToolCallParam] ) def map_assistant_message(self, message: ModelResponse) -> chat.ChatCompletionAssistantMessageParam: for item in message.parts: if isinstance(item, TextPart): self._map_response_text_part(item) elif isinstance(item, ThinkingPart): self._map_response_thinking_part(item) elif isinstance(item, ToolCallPart): self._map_response_tool_call_part(item) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover self._map_response_builtin_part(item) elif isinstance(item, FilePart): # pragma: no cover self._map_response_file_part(item) else: assert_never(item) return self._into_message_param() def _into_message_param(self) -> chat.ChatCompletionAssistantMessageParam: """Converts the collected texts and tool calls into a single OpenAI `ChatCompletionAssistantMessageParam`. This method serves as a hook that can be overridden by subclasses to implement custom logic for how collected parts are transformed into the final message parameter. Returns: An OpenAI `ChatCompletionAssistantMessageParam` object representing the assistant's response. """ message_param = chat.ChatCompletionAssistantMessageParam(role='assistant') # Note: model responses from this model should only have one text item, so the following # shouldn't merge multiple texts into one unless you switch models between runs: if self.thinkings: for field_name, contents in self.thinkings.items(): message_param[field_name] = '\n\n'.join(contents) if self.texts: message_param['content'] = '\n\n'.join(self.texts) else: message_param['content'] = None if self.tool_calls: message_param['tool_calls'] = self.tool_calls return message_param def _map_response_text_part(self, item: TextPart) -> None: """Maps a `TextPart` to the response context. This method serves as a hook that can be overridden by subclasses to implement custom logic for handling text parts. 
""" self.texts.append(item.content) def _map_response_thinking_part(self, item: ThinkingPart) -> None: """Maps a `ThinkingPart` to the response context. This method serves as a hook that can be overridden by subclasses to implement custom logic for handling thinking parts. """ profile = OpenAIModelProfile.from_profile(self._model.profile) include_method = profile.openai_chat_send_back_thinking_parts # Auto-detect: if thinking came from a custom field and from the same provider, use field mode # id='content' means it came from tags in content, not a custom field if include_method == 'auto': # Check if thinking came from a custom field from the same provider custom_field = profile.openai_chat_thinking_field matches_custom_field = (not custom_field) or (item.id == custom_field) if ( item.id and item.id != 'content' and item.provider_name == self._model.system and matches_custom_field ): # Store both content and field name for later use in _into_message_param self.thinkings.setdefault(item.id, []).append(item.content) else: # Fall back to tags mode start_tag, end_tag = self._model.profile.thinking_tags self.texts.append('\n'.join([start_tag, item.content, end_tag])) elif include_method == 'tags': start_tag, end_tag = self._model.profile.thinking_tags self.texts.append('\n'.join([start_tag, item.content, end_tag])) elif include_method == 'field': field = profile.openai_chat_thinking_field if field: # pragma: no branch self.thinkings.setdefault(field, []).append(item.content) def _map_response_tool_call_part(self, item: ToolCallPart) -> None: """Maps a `ToolCallPart` to the response context. This method serves as a hook that can be overridden by subclasses to implement custom logic for handling tool call parts. """ self.tool_calls.append(self._model._map_tool_call(item)) def _map_response_builtin_part(self, item: BuiltinToolCallPart | BuiltinToolReturnPart) -> None: """Maps a built-in tool call or return part to the response context. This method serves as a hook that can be overridden by subclasses to implement custom logic for handling built-in tool parts. """ # OpenAI doesn't return built-in tool calls pass def _map_response_file_part(self, item: FilePart) -> None: """Maps a `FilePart` to the response context. This method serves as a hook that can be overridden by subclasses to implement custom logic for handling file parts. """ # Files generated by models are not sent back to models that don't themselves generate files. pass def _map_model_response(self, message: ModelResponse) -> chat.ChatCompletionMessageParam: """Hook that determines how `ModelResponse` is mapped into `ChatCompletionMessageParam` objects before sending. Subclasses of `OpenAIChatModel` may override this method to provide their own mapping logic. """ return self._MapModelResponseContext(self).map_assistant_message(message) def _map_finish_reason( self, key: Literal['stop', 'length', 'tool_calls', 'content_filter', 'function_call'] ) -> FinishReason | None: """Hooks that maps a finish reason key to a [FinishReason][pydantic_ai.messages.FinishReason]. This method may be overridden by subclasses of `OpenAIChatModel` to accommodate custom keys. 
""" return _CHAT_FINISH_REASON_MAP.get(key) async def _map_messages( self, messages: Sequence[ModelMessage], model_request_parameters: ModelRequestParameters ) -> list[chat.ChatCompletionMessageParam]: """Just maps a `pydantic_ai.Message` to a `openai.types.ChatCompletionMessageParam`.""" openai_messages: list[chat.ChatCompletionMessageParam] = [] for message in messages: if isinstance(message, ModelRequest): async for item in self._map_user_message(message): openai_messages.append(item) elif isinstance(message, ModelResponse): openai_messages.append(self._map_model_response(message)) else: assert_never(message) if instructions := self._get_instructions(messages, model_request_parameters): system_prompt_count = sum(1 for m in openai_messages if m.get('role') == 'system') openai_messages.insert( system_prompt_count, chat.ChatCompletionSystemMessageParam(content=instructions, role='system') ) return openai_messages @staticmethod def _map_tool_call(t: ToolCallPart) -> ChatCompletionMessageFunctionToolCallParam: return ChatCompletionMessageFunctionToolCallParam( id=_guard_tool_call_id(t=t), type='function', function={'name': t.tool_name, 'arguments': t.args_as_json_str()}, ) def _map_json_schema(self, o: OutputObjectDefinition) -> chat.completion_create_params.ResponseFormat: response_format_param: chat.completion_create_params.ResponseFormatJSONSchema = { # pyright: ignore[reportPrivateImportUsage] 'type': 'json_schema', 'json_schema': {'name': o.name or DEFAULT_OUTPUT_TOOL_NAME, 'schema': o.json_schema}, } if o.description: response_format_param['json_schema']['description'] = o.description if OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition: # pragma: no branch response_format_param['json_schema']['strict'] = o.strict return response_format_param def _map_tool_definition(self, f: ToolDefinition) -> chat.ChatCompletionToolParam: tool_param: chat.ChatCompletionToolParam = { 'type': 'function', 'function': { 'name': f.name, 'description': f.description or '', 'parameters': f.parameters_json_schema, }, } if f.strict and OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition: tool_param['function']['strict'] = f.strict return tool_param async def _map_user_message(self, message: ModelRequest) -> AsyncIterable[chat.ChatCompletionMessageParam]: for part in message.parts: if isinstance(part, SystemPromptPart): system_prompt_role = OpenAIModelProfile.from_profile(self.profile).openai_system_prompt_role if system_prompt_role == 'developer': yield chat.ChatCompletionDeveloperMessageParam(role='developer', content=part.content) elif system_prompt_role == 'user': yield chat.ChatCompletionUserMessageParam(role='user', content=part.content) else: yield chat.ChatCompletionSystemMessageParam(role='system', content=part.content) elif isinstance(part, UserPromptPart): yield await self._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield chat.ChatCompletionUserMessageParam(role='user', content=part.model_response()) else: yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response(), ) else: assert_never(part) async def _map_user_prompt(self, part: UserPromptPart) -> chat.ChatCompletionUserMessageParam: # noqa: C901 profile = 
OpenAIModelProfile.from_profile(self.profile) content: str | list[ChatCompletionContentPartParam] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(ChatCompletionContentPartTextParam(text=item, type='text')) elif isinstance(item, ImageUrl): image_url: ImageURL = {'url': item.url} if metadata := item.vendor_metadata: image_url['detail'] = metadata.get('detail', 'auto') if item.force_download: image_content = await download_item(item, data_format='base64_uri', type_format='extension') image_url['url'] = image_content['data'] content.append(ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) elif isinstance(item, BinaryContent): if self._is_text_like_media_type(item.media_type): # Inline text-like binary content as a text block content.append( self._inline_text_file_part( item.data.decode('utf-8'), media_type=item.media_type, identifier=item.identifier, ) ) elif item.is_image: image_url = ImageURL(url=item.data_uri) if metadata := item.vendor_metadata: image_url['detail'] = metadata.get('detail', 'auto') content.append(ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) elif item.is_audio: assert item.format in ('wav', 'mp3') if profile.openai_chat_audio_input_encoding == 'uri': audio = InputAudio(data=item.data_uri, format=item.format) else: audio = InputAudio(data=item.base64, format=item.format) content.append(ChatCompletionContentPartInputAudioParam(input_audio=audio, type='input_audio')) elif item.is_document: content.append( File( file=FileFile( file_data=item.data_uri, filename=f'filename.{item.format}', ), type='file', ) ) else: # pragma: no cover raise RuntimeError(f'Unsupported binary content type: {item.media_type}') elif isinstance(item, AudioUrl): data_format = 'base64_uri' if profile.openai_chat_audio_input_encoding == 'uri' else 'base64' downloaded_item = await download_item(item, data_format=data_format, type_format='extension') assert downloaded_item['data_type'] in ( 'wav', 'mp3', ), f'Unsupported audio format: {downloaded_item["data_type"]}' audio = InputAudio(data=downloaded_item['data'], format=downloaded_item['data_type']) content.append(ChatCompletionContentPartInputAudioParam(input_audio=audio, type='input_audio')) elif isinstance(item, DocumentUrl): # OpenAI Chat API's FileFile only supports base64-encoded data, not URLs. # Some providers (e.g., OpenRouter) support URLs via the profile flag. 
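                # As an illustrative sketch (hypothetical URL): a DocumentUrl(url='https://example.com/report.pdf')
                # takes one of the three branches below: it is passed through as a file URL when the profile
                # supports file URLs and download isn't forced, downloaded and inlined as a text block when its
                # media type is text-like, or otherwise downloaded and embedded as a base64 data URI in a `file`
                # content part.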
if not item.force_download and profile.openai_chat_supports_file_urls: content.append( File( file=FileFile( file_data=item.url, filename=f'filename.{item.format}', ), type='file', ) ) elif self._is_text_like_media_type(item.media_type): downloaded_text = await download_item(item, data_format='text') content.append( self._inline_text_file_part( downloaded_text['data'], media_type=item.media_type, identifier=item.identifier, ) ) else: downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension') content.append( File( file=FileFile( file_data=downloaded_item['data'], filename=f'filename.{downloaded_item["data_type"]}', ), type='file', ) ) elif isinstance(item, VideoUrl): # pragma: no cover raise NotImplementedError('VideoUrl is not supported for OpenAI') elif isinstance(item, CachePoint): # OpenAI doesn't support prompt caching via CachePoint, so we filter it out pass else: assert_never(item) return chat.ChatCompletionUserMessageParam(role='user', content=content) @staticmethod def _is_text_like_media_type(media_type: str) -> bool: return ( media_type.startswith('text/') or media_type == 'application/json' or media_type.endswith('+json') or media_type == 'application/xml' or media_type.endswith('+xml') or media_type in ('application/x-yaml', 'application/yaml') ) @staticmethod def _inline_text_file_part(text: str, *, media_type: str, identifier: str) -> ChatCompletionContentPartTextParam: text = '\n'.join( [ f'-----BEGIN FILE id="{identifier}" type="{media_type}"-----', text, f'-----END FILE id="{identifier}"-----', ] ) return ChatCompletionContentPartTextParam(text=text, type='text') ```

#### __init__

```python __init__( model_name: OpenAIModelName, *, provider: ( OpenAIChatCompatibleProvider | Literal["openai", "openai-chat", "gateway"] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ```
```python __init__( model_name: OpenAIModelName, *, provider: ( OpenAIChatCompatibleProvider | Literal["openai", "openai-chat", "gateway"] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, system_prompt_role: ( OpenAISystemPromptRole | None ) = None, settings: ModelSettings | None = None ) -> None ```
```python __init__( model_name: OpenAIModelName, *, provider: ( OpenAIChatCompatibleProvider | Literal["openai", "openai-chat", "gateway"] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, system_prompt_role: ( OpenAISystemPromptRole | None ) = None, settings: ModelSettings | None = None ) ```

Initialize an OpenAI model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `model_name` | `OpenAIModelName` | The name of the OpenAI model to use. List of model names available [here](https://github.com/openai/openai-python/blob/v1.54.3/src/openai/types/chat_model.py#L7) (Unfortunately, despite being asked to do so, OpenAI do not provide `.inv` files for their API). | *required* |
| `provider` | `OpenAIChatCompatibleProvider \| Literal['openai', 'openai-chat', 'gateway'] \| Provider[AsyncOpenAI]` | The provider to use. Defaults to `'openai'`. | `'openai'` |
| `profile` | `ModelProfileSpec \| None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` |
| `system_prompt_role` | `OpenAISystemPromptRole \| None` | The role to use for the system prompt message. If not provided, defaults to `'system'`. In the future, this may be inferred from the model name. | `None` |
| `settings` | `ModelSettings \| None` | Default model settings for this model instance. | `None` |

Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py`

```python def __init__( self, model_name: OpenAIModelName, *, provider: OpenAIChatCompatibleProvider | Literal[ 'openai', 'openai-chat', 'gateway', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, system_prompt_role: OpenAISystemPromptRole | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI model. Args: model_name: The name of the OpenAI model to use. List of model names available [here](https://github.com/openai/openai-python/blob/v1.54.3/src/openai/types/chat_model.py#L7) (Unfortunately, despite being asked to do so, OpenAI do not provide `.inv` files for their API). provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. system_prompt_role: The role to use for the system prompt message. If not provided, defaults to `'system'`. In the future, this may be inferred from the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/openai' if provider == 'gateway' else provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) if system_prompt_role is not None: self.profile = OpenAIModelProfile(openai_system_prompt_role=system_prompt_role).update(self.profile) ```

#### model_name

```python model_name: OpenAIModelName ```

The model name.

#### system

```python system: str ```

The model provider.

#### supported_builtin_tools

```python supported_builtin_tools() -> ( frozenset[type[AbstractBuiltinTool]] ) ```

Return the set of builtin tool types this model can handle.

Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py`

```python @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """Return the set of builtin tool types this model can handle.""" return frozenset({WebSearchTool}) ```

#### profile

```python profile: ModelProfile ```

The model profile. `WebSearchTool` is only supported if `openai_chat_supports_web_search` is `True`.

### OpenAIModel

Bases: `OpenAIChatModel`

Deprecated

`OpenAIModel` was renamed to `OpenAIChatModel` to clearly distinguish it from `OpenAIResponsesModel`, which uses OpenAI's newer Responses API. Use `OpenAIResponsesModel` unless you're using an OpenAI Chat Completions-compatible API, or require a feature that the Responses API doesn't support yet, like audio.

Deprecated alias for `OpenAIChatModel`.

Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py`

```python @deprecated( '`OpenAIModel` was renamed to `OpenAIChatModel` to clearly distinguish it from `OpenAIResponsesModel` which ' "uses OpenAI's newer Responses API. Use that unless you're using an OpenAI Chat Completions-compatible API, or " "require a feature that the Responses API doesn't support yet like audio." ) @dataclass(init=False) class OpenAIModel(OpenAIChatModel): """Deprecated alias for `OpenAIChatModel`.""" ```

### OpenAIResponsesModel

Bases: `Model`

A model that uses the OpenAI Responses API.
The [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses) is the new API for OpenAI models. If you are interested in the differences between the Responses API and the Chat Completions API, see the [OpenAI API docs](https://platform.openai.com/docs/guides/responses-vs-chat-completions). Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @dataclass(init=False) class OpenAIResponsesModel(Model): """A model that uses the OpenAI Responses API. The [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses) is the new API for OpenAI models. If you are interested in the differences between the Responses API and the Chat Completions API, see the [OpenAI API docs](https://platform.openai.com/docs/guides/responses-vs-chat-completions). """ client: AsyncOpenAI = field(repr=False) _model_name: OpenAIModelName = field(repr=False) _provider: Provider[AsyncOpenAI] = field(repr=False) def __init__( self, model_name: OpenAIModelName, *, provider: OpenAIResponsesCompatibleProvider | Literal[ 'openai', 'gateway', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI Responses model. Args: model_name: The name of the OpenAI model to use. provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/openai' if provider == 'gateway' else provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return str(self.client.base_url) @property def model_name(self) -> OpenAIModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """Return the set of builtin tool types this model can handle.""" return frozenset({WebSearchTool, CodeExecutionTool, FileSearchTool, MCPServerTool, ImageGenerationTool}) async def request( self, messages: list[ModelRequest | ModelResponse], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._responses_create( messages, False, cast(OpenAIResponsesModelSettings, model_settings or {}), model_request_parameters ) # Handle ModelResponse if isinstance(response, ModelResponse): return response return self._process_response( response, cast(OpenAIResponsesModelSettings, model_settings or {}), model_request_parameters ) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._responses_create( messages, True, cast(OpenAIResponsesModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await 
self._process_streamed_response( response, cast(OpenAIResponsesModelSettings, model_settings or {}), model_request_parameters ) def _process_response( # noqa: C901 self, response: responses.Response, model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" items: list[ModelResponsePart] = [] for item in response.output: if isinstance(item, responses.ResponseReasoningItem): signature = item.encrypted_content # Handle raw CoT content from gpt-oss models provider_details: dict[str, Any] = {} raw_content: list[str] | None = [c.text for c in item.content] if item.content else None if raw_content: provider_details['raw_content'] = raw_content if item.summary: for summary in item.summary: # We use the same id for all summaries so that we can merge them on the round trip. items.append( ThinkingPart( content=summary.text, id=item.id, signature=signature, provider_name=self.system, provider_details=provider_details or None, ) ) # We only need to store the signature and raw_content once. signature = None provider_details = {} elif signature or provider_details: items.append( ThinkingPart( content='', id=item.id, signature=signature, provider_name=self.system, provider_details=provider_details or None, ) ) elif isinstance(item, responses.ResponseOutputMessage): for content in item.content: if isinstance(content, responses.ResponseOutputText): # pragma: no branch part_provider_details: dict[str, Any] | None = None if content.logprobs: part_provider_details = {'logprobs': _map_logprobs(content.logprobs)} if model_settings.get('openai_include_raw_annotations') and content.annotations: part_provider_details = part_provider_details or {} part_provider_details['annotations'] = responses_output_text_annotations_ta.dump_python( list(content.annotations), warnings=False ) items.append( TextPart( content.text, id=item.id, provider_name=self.system, provider_details=part_provider_details, ) ) elif isinstance(item, responses.ResponseFunctionToolCall): items.append( ToolCallPart( item.name, item.arguments, tool_call_id=item.call_id, id=item.id, provider_name=self.system ) ) elif isinstance(item, responses.ResponseCodeInterpreterToolCall): call_part, return_part, file_parts = _map_code_interpreter_tool_call(item, self.system) items.append(call_part) if file_parts: items.extend(file_parts) items.append(return_part) elif isinstance(item, responses.ResponseFunctionWebSearch): call_part, return_part = _map_web_search_tool_call(item, self.system) items.append(call_part) items.append(return_part) elif isinstance(item, responses.response_output_item.ImageGenerationCall): call_part, return_part, file_part = _map_image_generation_tool_call(item, self.system) items.append(call_part) if file_part: # pragma: no branch items.append(file_part) items.append(return_part) elif isinstance(item, responses.ResponseComputerToolCall): # pragma: no cover # Pydantic AI doesn't yet support the ComputerUse built-in tool pass elif isinstance(item, responses.ResponseCustomToolCall): # pragma: no cover # Support is being implemented in https://github.com/pydantic/pydantic-ai/pull/2572 pass elif isinstance(item, responses.response_output_item.LocalShellCall): # pragma: no cover # Pydantic AI doesn't yet support the `codex-mini-latest` LocalShell built-in tool pass elif isinstance(item, responses.ResponseFileSearchToolCall): call_part, return_part = _map_file_search_tool_call(item, self.system) 
items.append(call_part) items.append(return_part) elif isinstance(item, responses.response_output_item.McpCall): call_part, return_part = _map_mcp_call(item, self.system) items.append(call_part) items.append(return_part) elif isinstance(item, responses.response_output_item.McpListTools): call_part, return_part = _map_mcp_list_tools(item, self.system) items.append(call_part) items.append(return_part) elif isinstance(item, responses.response_output_item.McpApprovalRequest): # pragma: no cover # Pydantic AI doesn't yet support McpApprovalRequest (explicit tool usage approval) pass finish_reason: FinishReason | None = None provider_details: dict[str, Any] = {} raw_finish_reason = details.reason if (details := response.incomplete_details) else response.status if raw_finish_reason: provider_details['finish_reason'] = raw_finish_reason finish_reason = _RESPONSES_FINISH_REASON_MAP.get(raw_finish_reason) if response.created_at: # pragma: no branch provider_details['timestamp'] = number_to_datetime(response.created_at) return ModelResponse( parts=items, usage=_map_usage(response, self._provider.name, self._provider.base_url, self.model_name), model_name=response.model, provider_response_id=response.id, timestamp=_now_utc(), provider_name=self._provider.name, provider_url=self._provider.base_url, finish_reason=finish_reason, provider_details=provider_details or None, ) async def _process_streamed_response( self, response: AsyncStream[responses.ResponseStreamEvent], model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> OpenAIResponsesStreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): # pragma: no cover raise UnexpectedModelBehavior('Streamed response ended without content or tool calls') assert isinstance(first_chunk, responses.ResponseCreatedEvent) return OpenAIResponsesStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.response.model, _model_settings=model_settings, _response=peekable_response, _provider_name=self._provider.name, _provider_url=self._provider.base_url, _provider_timestamp=number_to_datetime(first_chunk.response.created_at) if first_chunk.response.created_at else None, ) @overload async def _responses_create( self, messages: list[ModelRequest | ModelResponse], stream: Literal[False], model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> responses.Response: ... @overload async def _responses_create( self, messages: list[ModelRequest | ModelResponse], stream: Literal[True], model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[responses.ResponseStreamEvent]: ... 
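    # Overview of the request assembly below: builtin tools, any `openai_builtin_tools` from settings, and
    # function tools are combined into a single `tools` list; `tool_choice` is left unset when there are no
    # tools, set to 'required' when text output isn't allowed and the profile supports it, and 'auto' otherwise;
    # an 'auto' `openai_previous_response_id` is resolved by trimming the message history; structured output is
    # requested via `text.format`; and the `include` list is built from profile capabilities and settings.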
async def _responses_create( # noqa: C901 self, messages: list[ModelRequest | ModelResponse], stream: bool, model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> responses.Response | AsyncStream[responses.ResponseStreamEvent] | ModelResponse: tools = ( self._get_builtin_tools(model_request_parameters) + list(model_settings.get('openai_builtin_tools', [])) + self._get_tools(model_request_parameters) ) profile = OpenAIModelProfile.from_profile(self.profile) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output and profile.openai_supports_tool_choice_required: tool_choice = 'required' else: tool_choice = 'auto' previous_response_id = model_settings.get('openai_previous_response_id') if previous_response_id == 'auto': previous_response_id, messages = self._get_previous_response_id_and_new_messages(messages) instructions, openai_messages = await self._map_messages(messages, model_settings, model_request_parameters) reasoning = self._get_reasoning(model_settings) text: responses.ResponseTextConfigParam | None = None if model_request_parameters.output_mode == 'native': output_object = model_request_parameters.output_object assert output_object is not None text = {'format': self._map_json_schema(output_object)} elif ( model_request_parameters.output_mode == 'prompted' and self.profile.supports_json_object_output ): # pragma: no branch text = {'format': {'type': 'json_object'}} # Without this trick, we'd hit this error: # > Response input messages must contain the word 'json' in some form to use 'text.format' of type 'json_object'. # Apparently they're only checking input messages for "JSON", not instructions. assert isinstance(instructions, str) system_prompt_count = sum(1 for m in openai_messages if m.get('role') == 'system') openai_messages.insert( system_prompt_count, responses.EasyInputMessageParam(role='system', content=instructions) ) instructions = OMIT if verbosity := model_settings.get('openai_text_verbosity'): text = text or {} text['verbosity'] = verbosity _drop_sampling_params_for_reasoning(profile, model_settings) _drop_unsupported_params(profile, model_settings) include: list[responses.ResponseIncludable] = [] if profile.openai_supports_encrypted_reasoning_content: include.append('reasoning.encrypted_content') if model_settings.get('openai_include_code_execution_outputs'): include.append('code_interpreter_call.outputs') if model_settings.get('openai_include_web_search_sources'): include.append('web_search_call.action.sources') if model_settings.get('openai_include_file_search_results'): include.append('file_search_call.results') if model_settings.get('openai_logprobs'): include.append('message.output_text.logprobs') # When there are no input messages and we're not reusing a previous response, # the OpenAI API will reject a request without any input, # even if there are instructions. # To avoid this provide an explicit empty user message. 
if not openai_messages and not previous_response_id: openai_messages.append( responses.EasyInputMessageParam( role='user', content='', ) ) try: extra_headers = model_settings.get('extra_headers', {}) extra_headers.setdefault('User-Agent', get_user_agent()) # OpenAI SDK type stubs incorrectly use 'in-memory' but API requires 'in_memory', so we have to use `Any` to not hit type errors prompt_cache_retention: Any = model_settings.get('openai_prompt_cache_retention', OMIT) return await self.client.responses.create( input=openai_messages, model=self.model_name, instructions=instructions, parallel_tool_calls=model_settings.get('parallel_tool_calls', OMIT), tools=tools or OMIT, tool_choice=tool_choice or OMIT, max_output_tokens=model_settings.get('max_tokens', OMIT), stream=stream, temperature=model_settings.get('temperature', OMIT), top_p=model_settings.get('top_p', OMIT), truncation=model_settings.get('openai_truncation', OMIT), timeout=model_settings.get('timeout', NOT_GIVEN), service_tier=model_settings.get('openai_service_tier', OMIT), previous_response_id=previous_response_id or OMIT, top_logprobs=model_settings.get('openai_top_logprobs', OMIT), store=model_settings.get('openai_store', OMIT), reasoning=reasoning, user=model_settings.get('openai_user', OMIT), text=text or OMIT, include=include or OMIT, prompt_cache_key=model_settings.get('openai_prompt_cache_key', OMIT), prompt_cache_retention=prompt_cache_retention, extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if model_response := _check_azure_content_filter(e, self.system, self.model_name): return model_response if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover except APIConnectionError as e: raise ModelAPIError(model_name=self.model_name, message=e.message) from e def _get_reasoning(self, model_settings: OpenAIResponsesModelSettings) -> Reasoning | Omit: reasoning_effort = model_settings.get('openai_reasoning_effort', None) reasoning_summary = model_settings.get('openai_reasoning_summary', None) reasoning_generate_summary = model_settings.get('openai_reasoning_generate_summary', None) if reasoning_summary and reasoning_generate_summary: # pragma: no cover raise ValueError('`openai_reasoning_summary` and `openai_reasoning_generate_summary` cannot both be set.') if reasoning_generate_summary is not None: # pragma: no cover warnings.warn( '`openai_reasoning_generate_summary` is deprecated, use `openai_reasoning_summary` instead', DeprecationWarning, ) reasoning_summary = reasoning_generate_summary reasoning: Reasoning = {} if reasoning_effort: reasoning['effort'] = reasoning_effort if reasoning_summary: reasoning['summary'] = reasoning_summary return reasoning or OMIT def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[responses.FunctionToolParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _get_builtin_tools(self, model_request_parameters: ModelRequestParameters) -> list[responses.ToolParam]: tools: list[responses.ToolParam] = [] has_image_generating_tool = False for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): web_search_tool = responses.WebSearchToolParam( type='web_search', search_context_size=tool.search_context_size ) if tool.user_location: web_search_tool['user_location'] = responses.web_search_tool_param.UserLocation( type='approximate', 
**tool.user_location ) if tool.allowed_domains: web_search_tool['filters'] = responses.web_search_tool_param.Filters( allowed_domains=tool.allowed_domains ) tools.append(web_search_tool) elif isinstance(tool, FileSearchTool): file_search_tool = cast( responses.FileSearchToolParam, {'type': 'file_search', 'vector_store_ids': list(tool.file_store_ids)}, ) tools.append(file_search_tool) elif isinstance(tool, CodeExecutionTool): has_image_generating_tool = True tools.append({'type': 'code_interpreter', 'container': {'type': 'auto'}}) elif isinstance(tool, MCPServerTool): mcp_tool = responses.tool_param.Mcp( type='mcp', server_label=tool.id, require_approval='never', ) if tool.authorization_token: # pragma: no branch mcp_tool['authorization'] = tool.authorization_token if tool.allowed_tools is not None: # pragma: no branch mcp_tool['allowed_tools'] = tool.allowed_tools if tool.description: # pragma: no branch mcp_tool['server_description'] = tool.description if tool.headers: # pragma: no branch mcp_tool['headers'] = tool.headers if tool.url.startswith(MCP_SERVER_TOOL_CONNECTOR_URI_SCHEME + ':'): _, connector_id = tool.url.split(':', maxsplit=1) mcp_tool['connector_id'] = connector_id # pyright: ignore[reportGeneralTypeIssues] else: mcp_tool['server_url'] = tool.url tools.append(mcp_tool) elif isinstance(tool, ImageGenerationTool): # pragma: no branch has_image_generating_tool = True size = _resolve_openai_image_generation_size(tool) output_compression = tool.output_compression if tool.output_compression is not None else 100 tools.append( responses.tool_param.ImageGeneration( type='image_generation', background=tool.background, input_fidelity=tool.input_fidelity, moderation=tool.moderation, output_compression=output_compression, output_format=tool.output_format or 'png', partial_images=tool.partial_images, quality=tool.quality, size=size, ) ) else: raise UserError( # pragma: no cover f'`{tool.__class__.__name__}` is not supported by `OpenAIResponsesModel`. If it should be, please file an issue.' ) if model_request_parameters.allow_image_output and not has_image_generating_tool: tools.append({'type': 'image_generation'}) return tools def _map_tool_definition(self, f: ToolDefinition) -> responses.FunctionToolParam: return { 'name': f.name, 'parameters': f.parameters_json_schema, 'type': 'function', 'description': f.description, 'strict': bool( f.strict and OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition ), } def _get_previous_response_id_and_new_messages( self, messages: list[ModelMessage] ) -> tuple[str | None, list[ModelMessage]]: # When `openai_previous_response_id` is set to 'auto', the most recent # `provider_response_id` from the message history is selected and all # earlier messages are omitted. This allows the OpenAI SDK to reuse # server-side history for efficiency. The returned tuple contains the # `previous_response_id` (if found) and the trimmed list of messages. 
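        # Illustrative example: given [request_1, response_1(provider_response_id='resp_abc'), request_2],
        # where response_1 came from this provider, this returns ('resp_abc', [request_2]).
        # If no matching response is found, or no messages follow it, (None, messages) is returned unchanged.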
previous_response_id = None trimmed_messages: list[ModelMessage] = [] for m in reversed(messages): if isinstance(m, ModelResponse) and m.provider_name == self.system: previous_response_id = m.provider_response_id break else: trimmed_messages.append(m) if previous_response_id and trimmed_messages: return previous_response_id, list(reversed(trimmed_messages)) else: return None, messages async def _map_messages( # noqa: C901 self, messages: list[ModelMessage], model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> tuple[str | Omit, list[responses.ResponseInputItemParam]]: """Maps a `pydantic_ai.Message` to a `openai.types.responses.ResponseInputParam` i.e. the OpenAI Responses API input format. For `ThinkingParts`, this method: - Sends `signature` back as `encrypted_content` (for official OpenAI reasoning) - Sends `content` back as `summary` text - Sends `provider_details['raw_content']` back as `content` items (for gpt-oss raw CoT) Raw CoT is sent back to improve model performance in multi-turn conversations. """ profile = OpenAIModelProfile.from_profile(self.profile) send_item_ids = model_settings.get( 'openai_send_reasoning_ids', profile.openai_supports_encrypted_reasoning_content ) openai_messages: list[responses.ResponseInputItemParam] = [] for message in messages: if isinstance(message, ModelRequest): for part in message.parts: if isinstance(part, SystemPromptPart): openai_messages.append(responses.EasyInputMessageParam(role='system', content=part.content)) elif isinstance(part, UserPromptPart): openai_messages.append(await self._map_user_prompt(part)) elif isinstance(part, ToolReturnPart): call_id = _guard_tool_call_id(t=part) call_id, _ = _split_combined_tool_call_id(call_id) item = FunctionCallOutput( type='function_call_output', call_id=call_id, output=part.model_response_str(), ) openai_messages.append(item) elif isinstance(part, RetryPromptPart): if part.tool_name is None: openai_messages.append( Message(role='user', content=[{'type': 'input_text', 'text': part.model_response()}]) ) else: call_id = _guard_tool_call_id(t=part) call_id, _ = _split_combined_tool_call_id(call_id) item = FunctionCallOutput( type='function_call_output', call_id=call_id, output=part.model_response(), ) openai_messages.append(item) else: assert_never(part) elif isinstance(message, ModelResponse): message_item: responses.ResponseOutputMessageParam | None = None reasoning_item: responses.ResponseReasoningItemParam | None = None web_search_item: responses.ResponseFunctionWebSearchParam | None = None file_search_item: responses.ResponseFileSearchToolCallParam | None = None code_interpreter_item: responses.ResponseCodeInterpreterToolCallParam | None = None for item in message.parts: should_send_item_id = send_item_ids and ( item.provider_name == self.system or (item.provider_name is None and message.provider_name == self.system) ) if isinstance(item, TextPart): if item.id and should_send_item_id: if message_item is None or message_item['id'] != item.id: # pragma: no branch message_item = responses.ResponseOutputMessageParam( role='assistant', id=item.id, content=[], type='message', status='completed', ) openai_messages.append(message_item) message_item['content'] = [ *message_item['content'], responses.ResponseOutputTextParam( text=item.content, type='output_text', annotations=[] ), ] else: openai_messages.append( responses.EasyInputMessageParam(role='assistant', content=item.content) ) elif isinstance(item, ToolCallPart): call_id = _guard_tool_call_id(t=item) 
call_id, id = _split_combined_tool_call_id(call_id) id = id or item.id param = responses.ResponseFunctionToolCallParam( name=item.tool_name, arguments=item.args_as_json_str(), call_id=call_id, type='function_call', ) if profile.openai_responses_requires_function_call_status_none: param['status'] = None # type: ignore[reportGeneralTypeIssues] if id and should_send_item_id: # pragma: no branch param['id'] = id openai_messages.append(param) elif isinstance(item, BuiltinToolCallPart): if should_send_item_id: # pragma: no branch if ( item.tool_name == CodeExecutionTool.kind and item.tool_call_id and (args := item.args_as_dict()) and (container_id := args.get('container_id')) ): code_interpreter_item = responses.ResponseCodeInterpreterToolCallParam( id=item.tool_call_id, code=args.get('code'), container_id=container_id, outputs=None, # These can be read server-side status='completed', type='code_interpreter_call', ) openai_messages.append(code_interpreter_item) elif ( item.tool_name == WebSearchTool.kind and item.tool_call_id and (args := item.args_as_dict()) ): # We need to exclude None values because of https://github.com/pydantic/pydantic-ai/issues/3653 args = {k: v for k, v in args.items() if v is not None} web_search_item = responses.ResponseFunctionWebSearchParam( id=item.tool_call_id, action=cast(responses.response_function_web_search_param.Action, args), status='completed', type='web_search_call', ) openai_messages.append(web_search_item) elif ( # pragma: no cover item.tool_name == FileSearchTool.kind and item.tool_call_id and (args := item.args_as_dict()) ): file_search_item = cast( responses.ResponseFileSearchToolCallParam, { 'id': item.tool_call_id, 'queries': args.get('queries', []), 'status': 'completed', 'type': 'file_search_call', }, ) openai_messages.append(file_search_item) elif item.tool_name == ImageGenerationTool.kind and item.tool_call_id: # The cast is necessary because of https://github.com/openai/openai-python/issues/2648 image_generation_item = cast( responses.response_input_item_param.ImageGenerationCall, { 'id': item.tool_call_id, 'type': 'image_generation_call', }, ) openai_messages.append(image_generation_item) elif ( # pragma: no branch item.tool_name.startswith(MCPServerTool.kind) and item.tool_call_id and (server_id := item.tool_name.split(':', 1)[1]) and (args := item.args_as_dict()) and (action := args.get('action')) ): if action == 'list_tools': mcp_list_tools_item = responses.response_input_item_param.McpListTools( id=item.tool_call_id, type='mcp_list_tools', server_label=server_id, tools=[], # These can be read server-side ) openai_messages.append(mcp_list_tools_item) elif ( # pragma: no branch action == 'call_tool' and (tool_name := args.get('tool_name')) and (tool_args := args.get('tool_args')) ): mcp_call_item = responses.response_input_item_param.McpCall( id=item.tool_call_id, server_label=server_id, name=tool_name, arguments=to_json(tool_args).decode(), error=None, # These can be read server-side output=None, # These can be read server-side type='mcp_call', ) openai_messages.append(mcp_call_item) elif isinstance(item, BuiltinToolReturnPart): if should_send_item_id: # pragma: no branch content_is_dict = isinstance(item.content, dict) status = cast(dict[str, Any], item.content).get('status') if content_is_dict else None kind_to_item = { CodeExecutionTool.kind: code_interpreter_item, WebSearchTool.kind: web_search_item, FileSearchTool.kind: file_search_item, } if status and (builtin_item := kind_to_item.get(item.tool_name)) is not None: 
builtin_item['status'] = status elif item.tool_name == ImageGenerationTool.kind: # Image generation result does not need to be sent back, just the `id` off of `BuiltinToolCallPart`. pass elif item.tool_name.startswith(MCPServerTool.kind): # pragma: no branch # MCP call result does not need to be sent back, just the fields off of `BuiltinToolCallPart`. pass elif isinstance(item, FilePart): # This was generated by the `ImageGenerationTool` or `CodeExecutionTool`, # and does not need to be sent back separately from the corresponding `BuiltinToolReturnPart`. # If `send_item_ids` is false, we won't send the `BuiltinToolReturnPart`, but OpenAI does not have a type for files from the assistant. pass elif isinstance(item, ThinkingPart): # Get raw CoT content from provider_details if present and from this provider raw_content: list[str] | None = None if item.provider_name == self.system: raw_content = (item.provider_details or {}).get('raw_content') if item.id and (should_send_item_id or raw_content): signature: str | None = None if ( item.signature and item.provider_name == self.system and profile.openai_supports_encrypted_reasoning_content ): signature = item.signature if (reasoning_item is None or reasoning_item['id'] != item.id) and ( signature or item.content or raw_content ): # pragma: no branch reasoning_item = responses.ResponseReasoningItemParam( id=item.id, summary=[], encrypted_content=signature, type='reasoning', ) openai_messages.append(reasoning_item) if item.content: # The check above guarantees that `reasoning_item` is not None assert reasoning_item is not None reasoning_item['summary'] = [ *reasoning_item['summary'], ReasoningSummary(text=item.content, type='summary_text'), ] if raw_content: # Send raw CoT back assert reasoning_item is not None reasoning_item['content'] = [ ReasoningContent(text=text, type='reasoning_text') for text in raw_content ] else: start_tag, end_tag = profile.thinking_tags openai_messages.append( responses.EasyInputMessageParam( role='assistant', content='\n'.join([start_tag, item.content, end_tag]) ) ) else: assert_never(item) else: assert_never(message) instructions = self._get_instructions(messages, model_request_parameters) or OMIT return instructions, openai_messages def _map_json_schema(self, o: OutputObjectDefinition) -> responses.ResponseFormatTextJSONSchemaConfigParam: response_format_param: responses.ResponseFormatTextJSONSchemaConfigParam = { 'type': 'json_schema', 'name': o.name or DEFAULT_OUTPUT_TOOL_NAME, 'schema': o.json_schema, } if o.description: response_format_param['description'] = o.description if OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition: # pragma: no branch response_format_param['strict'] = o.strict return response_format_param @staticmethod async def _map_user_prompt(part: UserPromptPart) -> responses.EasyInputMessageParam: # noqa: C901 content: str | list[responses.ResponseInputContentParam] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(responses.ResponseInputTextParam(text=item, type='input_text')) elif isinstance(item, BinaryContent): if item.is_image: detail: Literal['auto', 'low', 'high'] = 'auto' if metadata := item.vendor_metadata: detail = cast( Literal['auto', 'low', 'high'], metadata.get('detail', 'auto'), ) content.append( responses.ResponseInputImageParam( image_url=item.data_uri, type='input_image', detail=detail, ) ) elif item.is_document: content.append( 
responses.ResponseInputFileParam( type='input_file', file_data=item.data_uri, # NOTE: Type wise it's not necessary to include the filename, but it's required by the # API itself. If we add empty string, the server sends a 500 error - which OpenAI needs # to fix. In any case, we add a placeholder name. filename=f'filename.{item.format}', ) ) elif item.is_audio: raise NotImplementedError('Audio as binary content is not supported for OpenAI Responses API.') else: # pragma: no cover raise RuntimeError(f'Unsupported binary content type: {item.media_type}') elif isinstance(item, ImageUrl): detail: Literal['auto', 'low', 'high'] = 'auto' image_url = item.url if metadata := item.vendor_metadata: detail = cast(Literal['auto', 'low', 'high'], metadata.get('detail', 'auto')) if item.force_download: downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension') image_url = downloaded_item['data'] content.append( responses.ResponseInputImageParam( image_url=image_url, type='input_image', detail=detail, ) ) elif isinstance(item, AudioUrl | DocumentUrl): if item.force_download: downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension') content.append( responses.ResponseInputFileParam( type='input_file', file_data=downloaded_item['data'], filename=f'filename.{downloaded_item["data_type"]}', ) ) else: content.append( responses.ResponseInputFileParam( type='input_file', file_url=item.url, ) ) elif isinstance(item, VideoUrl): # pragma: no cover raise NotImplementedError('VideoUrl is not supported for OpenAI.') elif isinstance(item, CachePoint): # OpenAI doesn't support prompt caching via CachePoint, so we filter it out pass else: assert_never(item) return responses.EasyInputMessageParam(role='user', content=content) ```

#### __init__

```python __init__( model_name: OpenAIModelName, *, provider: ( OpenAIResponsesCompatibleProvider | Literal["openai", "gateway"] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ```

Initialize an OpenAI Responses model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `model_name` | `OpenAIModelName` | The name of the OpenAI model to use. | *required* |
| `provider` | `OpenAIResponsesCompatibleProvider \| Literal['openai', 'gateway'] \| Provider[AsyncOpenAI]` | The provider to use. Defaults to `'openai'`. | `'openai'` |
| `profile` | `ModelProfileSpec \| None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` |
| `settings` | `ModelSettings \| None` | Default model settings for this model instance. | `None` |

Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py`

```python def __init__( self, model_name: OpenAIModelName, *, provider: OpenAIResponsesCompatibleProvider | Literal[ 'openai', 'gateway', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI Responses model. Args: model_name: The name of the OpenAI model to use. provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Default model settings for this model instance.
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider('gateway/openai' if provider == 'gateway' else provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: OpenAIModelName ``` The model name. #### system ```python system: str ``` The model provider. #### supported_builtin_tools ```python supported_builtin_tools() -> ( frozenset[type[AbstractBuiltinTool]] ) ``` Return the set of builtin tool types this model can handle. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """Return the set of builtin tool types this model can handle.""" return frozenset({WebSearchTool, CodeExecutionTool, FileSearchTool, MCPServerTool, ImageGenerationTool}) ``` ### OpenAIStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for OpenAI models. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @dataclass class OpenAIStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for OpenAI models.""" _model_name: OpenAIModelName _model_profile: ModelProfile _response: AsyncIterable[ChatCompletionChunk] _provider_name: str _provider_url: str _provider_timestamp: datetime | None = None _timestamp: datetime = field(default_factory=_now_utc) _model_settings: OpenAIChatModelSettings | None = None async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: if self._provider_timestamp is not None: # pragma: no branch self.provider_details = {'timestamp': self._provider_timestamp} async for chunk in self._validate_response(): chunk_usage = self._map_usage(chunk) if self._model_settings and self._model_settings.get('openai_continuous_usage_stats'): # When continuous_usage_stats is enabled, each chunk contains cumulative usage, # so we replace rather than increment to avoid double-counting. self._usage = chunk_usage else: self._usage += chunk_usage if chunk.id: # pragma: no branch self.provider_response_id = chunk.id if chunk.model: self._model_name = chunk.model try: choice = chunk.choices[0] except IndexError: continue # When using Azure OpenAI and an async content filter is enabled, the openai SDK can return None deltas. if choice.delta is None: # pyright: ignore[reportUnnecessaryComparison] continue if raw_finish_reason := choice.finish_reason: self.finish_reason = self._map_finish_reason(raw_finish_reason) if provider_details := self._map_provider_details(chunk): # pragma: no branch self.provider_details = {**(self.provider_details or {}), **provider_details} for event in self._map_part_delta(choice): yield event def _validate_response(self) -> AsyncIterable[ChatCompletionChunk]: """Hook that validates incoming chunks. This method may be overridden by subclasses of `OpenAIStreamedResponse` to apply custom chunk validations. By default, this is a no-op since `ChatCompletionChunk` is already validated. """ return self._response def _map_part_delta(self, choice: chat_completion_chunk.Choice) -> Iterable[ModelResponseStreamEvent]: """Hook that determines the sequence of mappings that will be called to produce events. This method may be overridden by subclasses of `OpenAIStreamResponse` to customize the mapping. 
""" return itertools.chain( self._map_thinking_delta(choice), self._map_text_delta(choice), self._map_tool_call_delta(choice) ) def _map_thinking_delta(self, choice: chat_completion_chunk.Choice) -> Iterable[ModelResponseStreamEvent]: """Hook that maps thinking delta content to events. This method may be overridden by subclasses of `OpenAIStreamResponse` to customize the mapping. """ profile = OpenAIModelProfile.from_profile(self._model_profile) custom_field = profile.openai_chat_thinking_field # Prefer the configured custom reasoning field, if present in profile. # Fall back to built-in fields if no custom field result was found. # The `reasoning_content` field is typically present in DeepSeek and Moonshot models. # https://api-docs.deepseek.com/guides/reasoning_model # The `reasoning` field is typically present in gpt-oss via Ollama and OpenRouter. # - https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot#chat-completions-api # - https://openrouter.ai/docs/use-cases/reasoning-tokens#basic-usage-with-reasoning-tokens for field_name in (custom_field, 'reasoning', 'reasoning_content'): if not field_name: continue reasoning: str | None = getattr(choice.delta, field_name, None) if reasoning: # pragma: no branch yield from self._parts_manager.handle_thinking_delta( vendor_part_id=field_name, id=field_name, content=reasoning, provider_name=self.provider_name, ) break def _map_text_delta(self, choice: chat_completion_chunk.Choice) -> Iterable[ModelResponseStreamEvent]: """Hook that maps text delta content to events. This method may be overridden by subclasses of `OpenAIStreamResponse` to customize the mapping. """ # Handle the text part of the response content = choice.delta.content if content: for event in self._parts_manager.handle_text_delta( vendor_part_id='content', content=content, thinking_tags=self._model_profile.thinking_tags, ignore_leading_whitespace=self._model_profile.ignore_streamed_leading_whitespace, ): if isinstance(event, PartStartEvent) and isinstance(event.part, ThinkingPart): event.part.id = 'content' event.part.provider_name = self.provider_name yield event def _map_tool_call_delta(self, choice: chat_completion_chunk.Choice) -> Iterable[ModelResponseStreamEvent]: """Hook that maps tool call delta content to events. This method may be overridden by subclasses of `OpenAIStreamResponse` to customize the mapping. """ for dtc in choice.delta.tool_calls or []: maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=dtc.index, tool_name=dtc.function and dtc.function.name, args=dtc.function and dtc.function.arguments, tool_call_id=dtc.id, ) if maybe_event is not None: yield maybe_event def _map_provider_details(self, chunk: ChatCompletionChunk) -> dict[str, Any] | None: """Hook that generates the provider details from chunk content. This method may be overridden by subclasses of `OpenAIStreamResponse` to customize the provider details. """ return _map_provider_details(chunk.choices[0]) def _map_usage(self, response: ChatCompletionChunk) -> usage.RequestUsage: return _map_usage(response, self._provider_name, self._provider_url, self.model_name) def _map_finish_reason( self, key: Literal['stop', 'length', 'tool_calls', 'content_filter', 'function_call'] ) -> FinishReason | None: """Hooks that maps a finish reason key to a [FinishReason](pydantic_ai.messages.FinishReason). This method may be overridden by subclasses of `OpenAIChatModel` to accommodate custom keys. 
""" return _CHAT_FINISH_REASON_MAP.get(key) @property def model_name(self) -> OpenAIModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def provider_url(self) -> str: """Get the provider base URL.""" return self._provider_url @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: OpenAIModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### provider_url ```python provider_url: str ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. ### OpenAIResponsesStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for OpenAI Responses API. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @dataclass class OpenAIResponsesStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for OpenAI Responses API.""" _model_name: OpenAIModelName _model_settings: OpenAIResponsesModelSettings _response: AsyncIterable[responses.ResponseStreamEvent] _provider_name: str _provider_url: str _provider_timestamp: datetime | None = None _timestamp: datetime = field(default_factory=_now_utc) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 # Track annotations by item_id and content_index _annotations_by_item: dict[str, list[Any]] = {} if self._provider_timestamp is not None: # pragma: no branch self.provider_details = {'timestamp': self._provider_timestamp} async for chunk in self._response: # NOTE: You can inspect the builtin tools used checking the `ResponseCompletedEvent`. 
if isinstance(chunk, responses.ResponseCompletedEvent): self._usage += self._map_usage(chunk.response) raw_finish_reason = ( details.reason if (details := chunk.response.incomplete_details) else chunk.response.status ) if raw_finish_reason: # pragma: no branch self.provider_details = {**(self.provider_details or {}), 'finish_reason': raw_finish_reason} self.finish_reason = _RESPONSES_FINISH_REASON_MAP.get(raw_finish_reason) elif isinstance(chunk, responses.ResponseContentPartAddedEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseContentPartDoneEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseCreatedEvent): if chunk.response.id: # pragma: no branch self.provider_response_id = chunk.response.id elif isinstance(chunk, responses.ResponseFailedEvent): # pragma: no cover self._usage += self._map_usage(chunk.response) elif isinstance(chunk, responses.ResponseFunctionCallArgumentsDeltaEvent): maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=chunk.item_id, args=chunk.delta, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(chunk, responses.ResponseFunctionCallArgumentsDoneEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseIncompleteEvent): # pragma: no cover self._usage += self._map_usage(chunk.response) elif isinstance(chunk, responses.ResponseInProgressEvent): self._usage += self._map_usage(chunk.response) elif isinstance(chunk, responses.ResponseOutputItemAddedEvent): if isinstance(chunk.item, responses.ResponseFunctionToolCall): yield self._parts_manager.handle_tool_call_part( vendor_part_id=chunk.item.id, tool_name=chunk.item.name, args=chunk.item.arguments, tool_call_id=chunk.item.call_id, id=chunk.item.id, provider_name=self.provider_name, ) elif isinstance(chunk.item, responses.ResponseReasoningItem): pass elif isinstance(chunk.item, responses.ResponseOutputMessage): pass elif isinstance(chunk.item, responses.ResponseFunctionWebSearch): call_part, _ = _map_web_search_tool_call(chunk.item, self.provider_name) yield self._parts_manager.handle_part( vendor_part_id=f'{chunk.item.id}-call', part=replace(call_part, args=None) ) elif isinstance(chunk.item, responses.ResponseFileSearchToolCall): call_part, _ = _map_file_search_tool_call(chunk.item, self.provider_name) yield self._parts_manager.handle_part( vendor_part_id=f'{chunk.item.id}-call', part=replace(call_part, args=None) ) elif isinstance(chunk.item, responses.ResponseCodeInterpreterToolCall): call_part, _, _ = _map_code_interpreter_tool_call(chunk.item, self.provider_name) args_json = call_part.args_as_json_str() # Drop the final `"}` so that we can add code deltas args_json_delta = args_json[:-2] assert args_json_delta.endswith('"code":"'), f'Expected {args_json_delta!r} to end in `"code":"`' yield self._parts_manager.handle_part( vendor_part_id=f'{chunk.item.id}-call', part=replace(call_part, args=None) ) maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=f'{chunk.item.id}-call', args=args_json_delta, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(chunk.item, responses.response_output_item.ImageGenerationCall): call_part, _, _ = _map_image_generation_tool_call(chunk.item, self.provider_name) yield self._parts_manager.handle_part(vendor_part_id=f'{chunk.item.id}-call', part=call_part) elif isinstance(chunk.item, responses.response_output_item.McpCall): call_part, _ = _map_mcp_call(chunk.item, 
self.provider_name) args_json = call_part.args_as_json_str() # Drop the final `{}}` so that we can add tool args deltas args_json_delta = args_json[:-3] assert args_json_delta.endswith('"tool_args":'), ( f'Expected {args_json_delta!r} to end in `"tool_args":"`' ) yield self._parts_manager.handle_part( vendor_part_id=f'{chunk.item.id}-call', part=replace(call_part, args=None) ) maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=f'{chunk.item.id}-call', args=args_json_delta, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(chunk.item, responses.response_output_item.McpListTools): call_part, _ = _map_mcp_list_tools(chunk.item, self.provider_name) yield self._parts_manager.handle_part(vendor_part_id=f'{chunk.item.id}-call', part=call_part) else: warnings.warn( # pragma: no cover f'Handling of this item type is not yet implemented. Please report on our GitHub: {chunk}', UserWarning, ) elif isinstance(chunk, responses.ResponseOutputItemDoneEvent): if isinstance(chunk.item, responses.ResponseReasoningItem): if signature := chunk.item.encrypted_content: # pragma: no branch # Add the signature to the part corresponding to the first summary/raw CoT for event in self._parts_manager.handle_thinking_delta( vendor_part_id=chunk.item.id, id=chunk.item.id, signature=signature, provider_name=self.provider_name, ): yield event elif isinstance(chunk.item, responses.ResponseCodeInterpreterToolCall): _, return_part, file_parts = _map_code_interpreter_tool_call(chunk.item, self.provider_name) for i, file_part in enumerate(file_parts): yield self._parts_manager.handle_part( vendor_part_id=f'{chunk.item.id}-file-{i}', part=file_part ) yield self._parts_manager.handle_part(vendor_part_id=f'{chunk.item.id}-return', part=return_part) elif isinstance(chunk.item, responses.ResponseFunctionWebSearch): call_part, return_part = _map_web_search_tool_call(chunk.item, self.provider_name) maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=f'{chunk.item.id}-call', args=call_part.args, ) if maybe_event is not None: # pragma: no branch yield maybe_event yield self._parts_manager.handle_part(vendor_part_id=f'{chunk.item.id}-return', part=return_part) elif isinstance(chunk.item, responses.ResponseFileSearchToolCall): call_part, return_part = _map_file_search_tool_call(chunk.item, self.provider_name) maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=f'{chunk.item.id}-call', args=call_part.args, ) if maybe_event is not None: # pragma: no branch yield maybe_event yield self._parts_manager.handle_part(vendor_part_id=f'{chunk.item.id}-return', part=return_part) elif isinstance(chunk.item, responses.response_output_item.ImageGenerationCall): _, return_part, file_part = _map_image_generation_tool_call(chunk.item, self.provider_name) if file_part: # pragma: no branch yield self._parts_manager.handle_part(vendor_part_id=f'{chunk.item.id}-file', part=file_part) yield self._parts_manager.handle_part(vendor_part_id=f'{chunk.item.id}-return', part=return_part) elif isinstance(chunk.item, responses.response_output_item.McpCall): _, return_part = _map_mcp_call(chunk.item, self.provider_name) yield self._parts_manager.handle_part(vendor_part_id=f'{chunk.item.id}-return', part=return_part) elif isinstance(chunk.item, responses.response_output_item.McpListTools): _, return_part = _map_mcp_list_tools(chunk.item, self.provider_name) yield self._parts_manager.handle_part(vendor_part_id=f'{chunk.item.id}-return', part=return_part) elif 
isinstance(chunk, responses.ResponseReasoningSummaryPartAddedEvent): # Use same vendor_part_id as raw CoT for first summary (index 0) so they merge into one ThinkingPart vendor_id = chunk.item_id if chunk.summary_index == 0 else f'{chunk.item_id}-{chunk.summary_index}' for event in self._parts_manager.handle_thinking_delta( vendor_part_id=vendor_id, content=chunk.part.text, id=chunk.item_id, provider_name=self.provider_name, ): yield event elif isinstance(chunk, responses.ResponseReasoningSummaryPartDoneEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseReasoningSummaryTextDoneEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseReasoningSummaryTextDeltaEvent): # Use same vendor_part_id as raw CoT for first summary (index 0) so they merge into one ThinkingPart vendor_id = chunk.item_id if chunk.summary_index == 0 else f'{chunk.item_id}-{chunk.summary_index}' for event in self._parts_manager.handle_thinking_delta( vendor_part_id=vendor_id, content=chunk.delta, id=chunk.item_id, provider_name=self.provider_name, ): yield event elif isinstance(chunk, responses.ResponseReasoningTextDeltaEvent): # Handle raw CoT from gpt-oss models using callback pattern for event in self._parts_manager.handle_thinking_delta( vendor_part_id=chunk.item_id, id=chunk.item_id, provider_name=self.provider_name, provider_details=_make_raw_content_updater(chunk.delta, chunk.content_index), ): yield event elif isinstance(chunk, responses.ResponseReasoningTextDoneEvent): pass # content already accumulated via delta events elif isinstance(chunk, responses.ResponseOutputTextAnnotationAddedEvent): # Collect annotations if the setting is enabled if self._model_settings.get('openai_include_raw_annotations'): _annotations_by_item.setdefault(chunk.item_id, []).append(chunk.annotation) elif isinstance(chunk, responses.ResponseTextDeltaEvent): for event in self._parts_manager.handle_text_delta( vendor_part_id=chunk.item_id, content=chunk.delta, id=chunk.item_id, provider_name=self.provider_name, ): yield event elif isinstance(chunk, responses.ResponseTextDoneEvent): # Add annotations to provider_details if available provider_details: dict[str, Any] = {} annotations = _annotations_by_item.get(chunk.item_id) if annotations: provider_details['annotations'] = responses_output_text_annotations_ta.dump_python( list(annotations), warnings=False ) if provider_details: for event in self._parts_manager.handle_text_delta( vendor_part_id=chunk.item_id, content='', provider_name=self.provider_name, provider_details=provider_details, ): yield event elif isinstance(chunk, responses.ResponseWebSearchCallInProgressEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseWebSearchCallSearchingEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseWebSearchCallCompletedEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseAudioDeltaEvent): # pragma: lax no cover pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseCodeInterpreterCallCodeDeltaEvent): json_args_delta = to_json(chunk.delta).decode()[1:-1] # Drop the surrounding `"` maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=f'{chunk.item_id}-call', args=json_args_delta, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(chunk, responses.ResponseCodeInterpreterCallCodeDoneEvent): maybe_event = 
self._parts_manager.handle_tool_call_delta( vendor_part_id=f'{chunk.item_id}-call', args='"}', ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(chunk, responses.ResponseCodeInterpreterCallCompletedEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseCodeInterpreterCallInProgressEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseCodeInterpreterCallInterpretingEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseImageGenCallCompletedEvent): # pragma: no cover pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseImageGenCallGeneratingEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseImageGenCallInProgressEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseImageGenCallPartialImageEvent): # Not present on the type, but present on the actual object. # See https://github.com/openai/openai-python/issues/2649 output_format = getattr(chunk, 'output_format', 'png') file_part = FilePart( content=BinaryImage( data=base64.b64decode(chunk.partial_image_b64), media_type=f'image/{output_format}', ), id=chunk.item_id, ) yield self._parts_manager.handle_part(vendor_part_id=f'{chunk.item_id}-file', part=file_part) elif isinstance(chunk, responses.ResponseMcpCallArgumentsDoneEvent): maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=f'{chunk.item_id}-call', args='}', ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(chunk, responses.ResponseMcpCallArgumentsDeltaEvent): maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=f'{chunk.item_id}-call', args=chunk.delta, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(chunk, responses.ResponseMcpListToolsInProgressEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseMcpListToolsCompletedEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseMcpListToolsFailedEvent): # pragma: no cover pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseMcpCallInProgressEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseMcpCallFailedEvent): # pragma: no cover pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseMcpCallCompletedEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseFileSearchCallCompletedEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseFileSearchCallSearchingEvent): pass # there's nothing we need to do here elif isinstance(chunk, responses.ResponseFileSearchCallInProgressEvent): pass # there's nothing we need to do here else: # pragma: no cover warnings.warn( f'Handling of this event type is not yet implemented. 
Please report on our GitHub: {chunk}', UserWarning, ) def _map_usage(self, response: responses.Response) -> usage.RequestUsage: return _map_usage(response, self._provider_name, self._provider_url, self.model_name) @property def model_name(self) -> OpenAIModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def provider_url(self) -> str: """Get the provider base URL.""" return self._provider_url @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: OpenAIModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### provider_url ```python provider_url: str ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.openrouter` ## Setup For details on how to set up authentication with this model, see [model configuration for OpenRouter](https://ai.pydantic.dev/models/openrouter/index.md). ### KnownOpenRouterProviders ```python KnownOpenRouterProviders = Literal[ "z-ai", "cerebras", "venice", "moonshotai", "morph", "stealth", "wandb", "klusterai", "openai", "sambanova", "amazon-bedrock", "mistral", "nextbit", "atoma", "ai21", "minimax", "baseten", "anthropic", "featherless", "groq", "lambda", "azure", "ncompass", "deepseek", "hyperbolic", "crusoe", "cohere", "mancer", "avian", "perplexity", "novita", "siliconflow", "switchpoint", "xai", "inflection", "fireworks", "deepinfra", "inference-net", "inception", "atlas-cloud", "nvidia", "alibaba", "friendli", "infermatic", "targon", "ubicloud", "aion-labs", "liquid", "nineteen", "cloudflare", "nebius", "chutes", "enfer", "crofai", "open-inference", "phala", "gmicloud", "meta", "relace", "parasail", "together", "google-ai-studio", "google-vertex", ] ``` Known providers in the OpenRouter marketplace ### OpenRouterProviderName ```python OpenRouterProviderName = str | KnownOpenRouterProviders ``` Possible OpenRouter provider names. Since OpenRouter is constantly updating their list of providers, we explicitly list some known providers but allow any name in the type hints. See [the OpenRouter API](https://openrouter.ai/docs/api-reference/list-available-providers) for a full list. ### OpenRouterTransforms ```python OpenRouterTransforms = Literal['middle-out'] ``` Available messages transforms for OpenRouter models with limited token windows. Currently only supports 'middle-out', but is expected to grow in the future. ### OpenRouterProviderConfig Bases: `TypedDict` Represents the 'Provider' object from the OpenRouter API. Source code in `pydantic_ai_slim/pydantic_ai/models/openrouter.py` ```python class OpenRouterProviderConfig(TypedDict, total=False): """Represents the 'Provider' object from the OpenRouter API.""" order: list[OpenRouterProviderName] """List of provider slugs to try in order (e.g. ["anthropic", "openai"]). [See details](https://openrouter.ai/docs/features/provider-routing#ordering-specific-providers)""" allow_fallbacks: bool """Whether to allow backup providers when the primary is unavailable. [See details](https://openrouter.ai/docs/features/provider-routing#disabling-fallbacks)""" require_parameters: bool """Only use providers that support all parameters in your request.""" data_collection: Literal['allow', 'deny'] """Control whether to use providers that may store data. 
[See details](https://openrouter.ai/docs/features/provider-routing#requiring-providers-to-comply-with-data-policies)""" zdr: bool """Restrict routing to only ZDR (Zero Data Retention) endpoints. [See details](https://openrouter.ai/docs/features/provider-routing#zero-data-retention-enforcement)""" only: list[OpenRouterProviderName] """List of provider slugs to allow for this request. [See details](https://openrouter.ai/docs/features/provider-routing#allowing-only-specific-providers)""" ignore: list[str] """List of provider slugs to skip for this request. [See details](https://openrouter.ai/docs/features/provider-routing#ignoring-providers)""" quantizations: list[Literal['int4', 'int8', 'fp4', 'fp6', 'fp8', 'fp16', 'bf16', 'fp32', 'unknown']] """List of quantization levels to filter by (e.g. ["int4", "int8"]). [See details](https://openrouter.ai/docs/features/provider-routing#quantization)""" sort: Literal['price', 'throughput', 'latency'] """Sort providers by price or throughput. (e.g. "price" or "throughput"). [See details](https://openrouter.ai/docs/features/provider-routing#provider-sorting)""" max_price: _OpenRouterMaxPrice """The maximum pricing you want to pay for this request. [See details](https://openrouter.ai/docs/features/provider-routing#max-price)""" ``` #### order ```python order: list[OpenRouterProviderName] ``` List of provider slugs to try in order (e.g. ["anthropic", "openai"]). [See details](https://openrouter.ai/docs/features/provider-routing#ordering-specific-providers) #### allow_fallbacks ```python allow_fallbacks: bool ``` Whether to allow backup providers when the primary is unavailable. [See details](https://openrouter.ai/docs/features/provider-routing#disabling-fallbacks) #### require_parameters ```python require_parameters: bool ``` Only use providers that support all parameters in your request. #### data_collection ```python data_collection: Literal['allow', 'deny'] ``` Control whether to use providers that may store data. [See details](https://openrouter.ai/docs/features/provider-routing#requiring-providers-to-comply-with-data-policies) #### zdr ```python zdr: bool ``` Restrict routing to only ZDR (Zero Data Retention) endpoints. [See details](https://openrouter.ai/docs/features/provider-routing#zero-data-retention-enforcement) #### only ```python only: list[OpenRouterProviderName] ``` List of provider slugs to allow for this request. [See details](https://openrouter.ai/docs/features/provider-routing#allowing-only-specific-providers) #### ignore ```python ignore: list[str] ``` List of provider slugs to skip for this request. [See details](https://openrouter.ai/docs/features/provider-routing#ignoring-providers) #### quantizations ```python quantizations: list[ Literal[ "int4", "int8", "fp4", "fp6", "fp8", "fp16", "bf16", "fp32", "unknown", ] ] ``` List of quantization levels to filter by (e.g. ["int4", "int8"]). [See details](https://openrouter.ai/docs/features/provider-routing#quantization) #### sort ```python sort: Literal['price', 'throughput', 'latency'] ``` Sort providers by price or throughput. (e.g. "price" or "throughput"). [See details](https://openrouter.ai/docs/features/provider-routing#provider-sorting) #### max_price ```python max_price: _OpenRouterMaxPrice ``` The maximum pricing you want to pay for this request. [See details](https://openrouter.ai/docs/features/provider-routing#max-price) ### OpenRouterReasoning Bases: `TypedDict` Configuration for reasoning tokens in OpenRouter requests. 
Reasoning tokens allow models to show their step-by-step thinking process. You can configure this using either OpenAI-style effort levels or Anthropic-style token limits, but not both simultaneously. Source code in `pydantic_ai_slim/pydantic_ai/models/openrouter.py` ```python class OpenRouterReasoning(TypedDict, total=False): """Configuration for reasoning tokens in OpenRouter requests. Reasoning tokens allow models to show their step-by-step thinking process. You can configure this using either OpenAI-style effort levels or Anthropic-style token limits, but not both simultaneously. """ effort: Literal['high', 'medium', 'low'] """OpenAI-style reasoning effort level. Cannot be used with max_tokens.""" max_tokens: int """Anthropic-style specific token limit for reasoning. Cannot be used with effort.""" exclude: bool """Whether to exclude reasoning tokens from the response. Default is False. All models support this.""" enabled: bool """Whether to enable reasoning with default parameters. Default is inferred from effort or max_tokens.""" ``` #### effort ```python effort: Literal['high', 'medium', 'low'] ``` OpenAI-style reasoning effort level. Cannot be used with max_tokens. #### max_tokens ```python max_tokens: int ``` Anthropic-style specific token limit for reasoning. Cannot be used with effort. #### exclude ```python exclude: bool ``` Whether to exclude reasoning tokens from the response. Default is False. All models support this. #### enabled ```python enabled: bool ``` Whether to enable reasoning with default parameters. Default is inferred from effort or max_tokens. ### OpenRouterUsageConfig Bases: `TypedDict` Configuration for OpenRouter usage. Source code in `pydantic_ai_slim/pydantic_ai/models/openrouter.py` ```python class OpenRouterUsageConfig(TypedDict, total=False): """Configuration for OpenRouter usage.""" include: bool ``` ### OpenRouterModelSettings Bases: `ModelSettings` Settings used for an OpenRouter model request. Source code in `pydantic_ai_slim/pydantic_ai/models/openrouter.py` ```python class OpenRouterModelSettings(ModelSettings, total=False): """Settings used for an OpenRouter model request.""" # ALL FIELDS MUST BE `openrouter_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. openrouter_models: list[str] """A list of fallback models. These models will be tried, in order, if the main model returns an error. [See details](https://openrouter.ai/docs/features/model-routing#the-models-parameter) """ openrouter_provider: OpenRouterProviderConfig """OpenRouter routes requests to the best available providers for your model. By default, requests are load balanced across the top providers to maximize uptime. You can customize how your requests are routed using the provider object. [See more](https://openrouter.ai/docs/features/provider-routing)""" openrouter_preset: str """Presets allow you to separate your LLM configuration from your code. Create and manage presets through the OpenRouter web application to control provider routing, model selection, system prompts, and other parameters, then reference them in OpenRouter API requests. [See more](https://openrouter.ai/docs/features/presets)""" openrouter_transforms: list[OpenRouterTransforms] """To help with prompts that exceed the maximum context size of a model. Transforms work by removing or truncating messages from the middle of the prompt, until the prompt fits within the model's context window. 
[See more](https://openrouter.ai/docs/features/message-transforms) """ openrouter_reasoning: OpenRouterReasoning """To control the reasoning tokens in the request. The reasoning config object consolidates settings for controlling reasoning strength across different models. [See more](https://openrouter.ai/docs/use-cases/reasoning-tokens) """ openrouter_usage: OpenRouterUsageConfig """To control the usage of the model. The usage config object consolidates settings for enabling detailed usage information. [See more](https://openrouter.ai/docs/use-cases/usage-accounting) """ ``` #### openrouter_models ```python openrouter_models: list[str] ``` A list of fallback models. These models will be tried, in order, if the main model returns an error. [See details](https://openrouter.ai/docs/features/model-routing#the-models-parameter) #### openrouter_provider ```python openrouter_provider: OpenRouterProviderConfig ``` OpenRouter routes requests to the best available providers for your model. By default, requests are load balanced across the top providers to maximize uptime. You can customize how your requests are routed using the provider object. [See more](https://openrouter.ai/docs/features/provider-routing) #### openrouter_preset ```python openrouter_preset: str ``` Presets allow you to separate your LLM configuration from your code. Create and manage presets through the OpenRouter web application to control provider routing, model selection, system prompts, and other parameters, then reference them in OpenRouter API requests. [See more](https://openrouter.ai/docs/features/presets) #### openrouter_transforms ```python openrouter_transforms: list[OpenRouterTransforms] ``` To help with prompts that exceed the maximum context size of a model. Transforms work by removing or truncating messages from the middle of the prompt, until the prompt fits within the model's context window. [See more](https://openrouter.ai/docs/features/message-transforms) #### openrouter_reasoning ```python openrouter_reasoning: OpenRouterReasoning ``` To control the reasoning tokens in the request. The reasoning config object consolidates settings for controlling reasoning strength across different models. [See more](https://openrouter.ai/docs/use-cases/reasoning-tokens) #### openrouter_usage ```python openrouter_usage: OpenRouterUsageConfig ``` To control the usage of the model. The usage config object consolidates settings for enabling detailed usage information. [See more](https://openrouter.ai/docs/use-cases/usage-accounting) ### OpenRouterModel Bases: `OpenAIChatModel` Extends OpenAIModel to capture extra metadata for Openrouter. Source code in `pydantic_ai_slim/pydantic_ai/models/openrouter.py` ```python class OpenRouterModel(OpenAIChatModel): """Extends OpenAIModel to capture extra metadata for Openrouter.""" def __init__( self, model_name: str, *, provider: Literal['openrouter'] | Provider[AsyncOpenAI] = 'openrouter', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenRouter model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. If not provided, a new provider will be created with the default settings. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. 
""" super().__init__(model_name, provider=provider or OpenRouterProvider(), profile=profile, settings=settings) @override def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters]: merged_settings, customized_parameters = super().prepare_request(model_settings, model_request_parameters) new_settings = _openrouter_settings_to_openai_settings(cast(OpenRouterModelSettings, merged_settings or {})) return new_settings, customized_parameters @override def _validate_completion(self, response: chat.ChatCompletion) -> _OpenRouterChatCompletion: response = _OpenRouterChatCompletion.model_validate(response.model_dump()) if error := response.error: raise ModelHTTPError(status_code=error.code, model_name=response.model, body=error.message) return response @override def _process_thinking(self, message: chat.ChatCompletionMessage) -> list[ThinkingPart] | None: assert isinstance(message, _OpenRouterCompletionMessage) if reasoning_details := message.reasoning_details: return [_from_reasoning_detail(detail) for detail in reasoning_details] else: return super()._process_thinking(message) @override def _process_provider_details(self, response: chat.ChatCompletion) -> dict[str, Any] | None: assert isinstance(response, _OpenRouterChatCompletion) provider_details = super()._process_provider_details(response) or {} provider_details.update(_map_openrouter_provider_details(response)) return provider_details or None @dataclass class _MapModelResponseContext(OpenAIChatModel._MapModelResponseContext): # type: ignore[reportPrivateUsage] reasoning_details: list[dict[str, Any]] = field(default_factory=list[dict[str, Any]]) def _into_message_param(self) -> chat.ChatCompletionAssistantMessageParam: message_param = super()._into_message_param() if self.reasoning_details: message_param['reasoning_details'] = self.reasoning_details # type: ignore[reportGeneralTypeIssues] return message_param @override def _map_response_thinking_part(self, item: ThinkingPart) -> None: assert isinstance(self._model, OpenRouterModel) if item.provider_name == self._model.system: if reasoning_detail := _into_reasoning_detail(item): # pragma: lax no cover self.reasoning_details.append(reasoning_detail.model_dump()) else: # pragma: lax no cover super()._map_response_thinking_part(item) @property @override def _streamed_response_cls(self): return OpenRouterStreamedResponse @override def _map_finish_reason( # type: ignore[reportIncompatibleMethodOverride] self, key: Literal['stop', 'length', 'tool_calls', 'content_filter', 'error'] ) -> FinishReason | None: return _CHAT_FINISH_REASON_MAP.get(key) ``` #### __init__ ```python __init__( model_name: str, *, provider: ( Literal["openrouter"] | Provider[AsyncOpenAI] ) = "openrouter", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize an OpenRouter model. Parameters: | Name | Type | Description | Default | | ------------ | ----------------------- | ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------- | | `model_name` | `str` | The name of the model to use. | *required* | | `provider` | \`Literal['openrouter'] | Provider[AsyncOpenAI]\` | The provider to use for authentication and API access. If not provided, a new provider will be created with the default settings. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. 
Defaults to a profile picked by the provider based on the model name. | | `settings` | \`ModelSettings | None\` | Model-specific settings that will be used as defaults for this model. | Source code in `pydantic_ai_slim/pydantic_ai/models/openrouter.py` ```python def __init__( self, model_name: str, *, provider: Literal['openrouter'] | Provider[AsyncOpenAI] = 'openrouter', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenRouter model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. If not provided, a new provider will be created with the default settings. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ super().__init__(model_name, provider=provider or OpenRouterProvider(), profile=profile, settings=settings) ``` ### OpenRouterStreamedResponse Bases: `OpenAIStreamedResponse` Implementation of `StreamedResponse` for OpenRouter models. Source code in `pydantic_ai_slim/pydantic_ai/models/openrouter.py` ```python @dataclass class OpenRouterStreamedResponse(OpenAIStreamedResponse): """Implementation of `StreamedResponse` for OpenRouter models.""" @override async def _validate_response(self): try: async for chunk in self._response: yield _OpenRouterChatCompletionChunk.model_validate(chunk.model_dump()) except APIError as e: error = _OpenRouterError.model_validate(e.body) raise ModelHTTPError(status_code=error.code, model_name=self._model_name, body=error.message) @override def _map_thinking_delta(self, choice: chat_completion_chunk.Choice) -> Iterable[ModelResponseStreamEvent]: assert isinstance(choice, _OpenRouterChunkChoice) if reasoning_details := choice.delta.reasoning_details: for i, detail in enumerate(reasoning_details): thinking_part = _from_reasoning_detail(detail) # Use unique vendor_part_id for each reasoning detail type to prevent # different detail types (e.g., reasoning.text, reasoning.encrypted) # from being incorrectly merged into a single ThinkingPart. # This is required for Gemini 3 Pro which returns multiple reasoning # detail types that must be preserved separately for thought_signature handling. vendor_id = f'reasoning_detail_{detail.type}_{i}' yield from self._parts_manager.handle_thinking_delta( vendor_part_id=vendor_id, id=thinking_part.id, content=thinking_part.content, signature=thinking_part.signature, provider_name=self._provider_name, provider_details=thinking_part.provider_details, ) else: return super()._map_thinking_delta(choice) @override def _map_provider_details(self, chunk: chat.ChatCompletionChunk) -> dict[str, Any] | None: assert isinstance(chunk, _OpenRouterChatCompletionChunk) provider_details = super()._map_provider_details(chunk) or {} provider_details.update(_map_openrouter_provider_details(chunk)) return provider_details or None @override def _map_finish_reason( # type: ignore[reportIncompatibleMethodOverride] self, key: Literal['stop', 'length', 'tool_calls', 'content_filter', 'error'] ) -> FinishReason | None: return _CHAT_FINISH_REASON_MAP.get(key) ``` # `pydantic_ai.models.outlines` ## Setup For details on how to set up this model, see [model configuration for Outlines](https://ai.pydantic.dev/models/outlines/index.md). ### OutlinesModel Bases: `Model` A model that relies on the Outlines library to run non API-based models. 
Source code in `pydantic_ai_slim/pydantic_ai/models/outlines.py` ```python @dataclass(init=False) class OutlinesModel(Model): """A model that relies on the Outlines library to run non API-based models.""" def __init__( self, model: OutlinesBaseModel | OutlinesAsyncBaseModel, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Outlines model. Args: model: The Outlines model used for the model. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. """ self.model: OutlinesBaseModel | OutlinesAsyncBaseModel = model self._model_name: str = 'outlines-model' if isinstance(provider, str): provider = infer_provider(provider) super().__init__(settings=settings, profile=profile or provider.model_profile) @classmethod def from_transformers( cls, hf_model: transformers.modeling_utils.PreTrainedModel, hf_tokenizer_or_processor: transformers.tokenization_utils.PreTrainedTokenizer | transformers.processing_utils.ProcessorMixin, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Create an Outlines model from a Hugging Face model and tokenizer. Args: hf_model: The Hugging Face PreTrainedModel or any model that is compatible with the `transformers` API. hf_tokenizer_or_processor: Either a HuggingFace `PreTrainedTokenizer` or any tokenizer that is compatible with the `transformers` API, or a HuggingFace processor inheriting from `ProcessorMixin`. If a tokenizer is provided, a regular model will be used, while if you provide a processor, it will be a multimodal model. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. """ outlines_model: OutlinesBaseModel = from_transformers(hf_model, hf_tokenizer_or_processor) return cls(outlines_model, provider=provider, profile=profile, settings=settings) @classmethod def from_llamacpp( # pragma: lax no cover cls, llama_model: llama_cpp.Llama, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Create an Outlines model from a LlamaCpp model. Args: llama_model: The llama_cpp.Llama model to use. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. 
""" outlines_model: OutlinesBaseModel = from_llamacpp(llama_model) return cls(outlines_model, provider=provider, profile=profile, settings=settings) @classmethod def from_mlxlm( # pragma: no cover cls, mlx_model: nn.Module, # pyright: ignore[reportUnknownParameterType, reportUnknownMemberType] mlx_tokenizer: transformers.tokenization_utils.PreTrainedTokenizer, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Create an Outlines model from a MLXLM model. Args: mlx_model: The nn.Module model to use. mlx_tokenizer: The PreTrainedTokenizer to use. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. """ outlines_model: OutlinesBaseModel = from_mlxlm(mlx_model, mlx_tokenizer) # pyright: ignore[reportUnknownArgumentType] return cls(outlines_model, provider=provider, profile=profile, settings=settings) @classmethod def from_sglang( cls, base_url: str, api_key: str | None = None, model_name: str | None = None, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Create an Outlines model to send requests to an SGLang server. Args: base_url: The url of the SGLang server. api_key: The API key to use for authenticating requests to the SGLang server. model_name: The name of the model to use. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. """ try: from openai import AsyncOpenAI except ImportError as _import_error: raise ImportError( 'Please install `openai` to use the Outlines SGLang model, ' 'you can use the `openai` optional group — `pip install "pydantic-ai-slim[openai]"`' ) from _import_error openai_client = AsyncOpenAI(base_url=base_url, api_key=api_key) outlines_model: OutlinesBaseModel | OutlinesAsyncBaseModel = from_sglang(openai_client, model_name) return cls(outlines_model, provider=provider, profile=profile, settings=settings) @classmethod def from_vllm_offline( # pragma: no cover cls, vllm_model: Any, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Create an Outlines model from a vLLM offline inference model. Args: vllm_model: The vllm.LLM local model to use. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. 
""" outlines_model: OutlinesBaseModel | OutlinesAsyncBaseModel = from_vllm_offline(vllm_model) return cls(outlines_model, provider=provider, profile=profile, settings=settings) @property def model_name(self) -> str: return self._model_name @property def system(self) -> str: return 'outlines' async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) """Make a request to the model.""" prompt, output_type, inference_kwargs = await self._build_generation_arguments( messages, model_settings, model_request_parameters ) # Async is available for SgLang response: str if isinstance(self.model, OutlinesAsyncBaseModel): response = await self.model(prompt, output_type, None, **inference_kwargs) else: response = self.model(prompt, output_type, None, **inference_kwargs) return self._process_response(response) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) prompt, output_type, inference_kwargs = await self._build_generation_arguments( messages, model_settings, model_request_parameters ) # Async is available for SgLang if isinstance(self.model, OutlinesAsyncBaseModel): response = self.model.stream(prompt, output_type, None, **inference_kwargs) yield await self._process_streamed_response(response, model_request_parameters) else: # pragma: lax no cover response = self.model.stream(prompt, output_type, None, **inference_kwargs) async def async_response(): for chunk in response: yield chunk yield await self._process_streamed_response(async_response(), model_request_parameters) async def _build_generation_arguments( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[Chat, JsonSchema | None, dict[str, Any]]: """Build the generation arguments for the model.""" # the builtin_tool check now happens in `Model.prepare_request()` if model_request_parameters.function_tools or model_request_parameters.output_tools: raise UserError('Outlines does not support function tools yet.') if model_request_parameters.output_object: output_type = JsonSchema(model_request_parameters.output_object.json_schema) else: output_type = None prompt = await self._format_prompt(messages, model_request_parameters) inference_kwargs = self.format_inference_kwargs(model_settings) return prompt, output_type, inference_kwargs def format_inference_kwargs(self, model_settings: ModelSettings | None) -> dict[str, Any]: """Format the model settings for the inference kwargs.""" settings_dict: dict[str, Any] = dict(model_settings) if model_settings else {} if isinstance(self.model, Transformers): settings_dict = self._format_transformers_inference_kwargs(settings_dict) elif isinstance(self.model, LlamaCpp): # pragma: lax no cover settings_dict = self._format_llama_cpp_inference_kwargs(settings_dict) elif isinstance(self.model, MLXLM): # pragma: no cover settings_dict = self._format_mlxlm_inference_kwargs(settings_dict) elif isinstance(self.model, SGLang | AsyncSGLang): settings_dict = self._format_sglang_inference_kwargs(settings_dict) elif 
isinstance(self.model, VLLMOffline): # pragma: no cover settings_dict = self._format_vllm_offline_inference_kwargs(settings_dict) extra_body = settings_dict.pop('extra_body', {}) settings_dict.update(extra_body) return settings_dict def _format_transformers_inference_kwargs(self, model_settings: dict[str, Any]) -> dict[str, Any]: """Select the model settings supported by the Transformers model.""" supported_args = [ 'max_tokens', 'temperature', 'top_p', 'logit_bias', 'extra_body', ] filtered_settings = {k: model_settings[k] for k in supported_args if k in model_settings} return filtered_settings def _format_llama_cpp_inference_kwargs( # pragma: lax no cover self, model_settings: dict[str, Any] ) -> dict[str, Any]: """Select the model settings supported by the LlamaCpp model.""" supported_args = [ 'max_tokens', 'temperature', 'top_p', 'seed', 'presence_penalty', 'frequency_penalty', 'logit_bias', 'extra_body', ] filtered_settings = {k: model_settings[k] for k in supported_args if k in model_settings} return filtered_settings def _format_mlxlm_inference_kwargs( # pragma: no cover self, model_settings: dict[str, Any] ) -> dict[str, Any]: """Select the model settings supported by the MLXLM model.""" supported_args = [ 'extra_body', ] filtered_settings = {k: model_settings[k] for k in supported_args if k in model_settings} return filtered_settings def _format_sglang_inference_kwargs(self, model_settings: dict[str, Any]) -> dict[str, Any]: """Select the model settings supported by the SGLang model.""" supported_args = [ 'max_tokens', 'temperature', 'top_p', 'presence_penalty', 'frequency_penalty', 'extra_body', ] filtered_settings = {k: model_settings[k] for k in supported_args if k in model_settings} return filtered_settings def _format_vllm_offline_inference_kwargs( # pragma: no cover self, model_settings: dict[str, Any] ) -> dict[str, Any]: """Select the model settings supported by the vLLMOffline model.""" from vllm.sampling_params import ( # pyright: ignore[reportMissingImports] SamplingParams, # pyright: ignore[reportUnknownVariableType] ) supported_args = [ 'max_tokens', 'temperature', 'top_p', 'seed', 'presence_penalty', 'frequency_penalty', 'logit_bias', 'extra_body', ] # The arguments that are part of the fields of `ModelSettings` must be put in a `SamplingParams` object and # provided through the `sampling_params` argument to vLLM sampling_params = model_settings.get('extra_body', {}).pop('sampling_params', SamplingParams()) for key in supported_args: setattr(sampling_params, key, model_settings.get(key, None)) filtered_settings = { 'sampling_params': sampling_params, **model_settings.get('extra_body', {}), } return filtered_settings async def _format_prompt( # noqa: C901 self, messages: list[ModelMessage], model_request_parameters: ModelRequestParameters ) -> Chat: """Turn the model messages into an Outlines Chat instance.""" chat = Chat() if instructions := self._get_instructions(messages, model_request_parameters): chat.add_system_message(instructions) for message in messages: if isinstance(message, ModelRequest): for part in message.parts: if isinstance(part, SystemPromptPart): chat.add_system_message(part.content) elif isinstance(part, UserPromptPart): if isinstance(part.content, str): chat.add_user_message(part.content) elif isinstance(part.content, Sequence): outlines_input: Sequence[str | Image] = [] for item in part.content: if isinstance(item, str): outlines_input.append(item) elif isinstance(item, ImageUrl): image_content: DownloadedItem[bytes] = await download_item( item, 
data_format='bytes', type_format='mime' ) image = self._create_PIL_image(image_content['data'], image_content['data_type']) outlines_input.append(Image(image)) elif isinstance(item, BinaryContent) and item.is_image: image = self._create_PIL_image(item.data, item.media_type) outlines_input.append(Image(image)) else: raise UserError( 'Each element of the content sequence must be a string, an `ImageUrl`' + ' or a `BinaryImage`.' ) chat.add_user_message(outlines_input) else: assert_never(part.content) elif isinstance(part, RetryPromptPart): chat.add_user_message(part.model_response()) elif isinstance(part, ToolReturnPart): raise UserError('Tool calls are not supported for Outlines models yet.') else: assert_never(part) elif isinstance(message, ModelResponse): text_parts: list[str] = [] image_parts: list[Image] = [] for part in message.parts: if isinstance(part, TextPart): text_parts.append(part.content) elif isinstance(part, ThinkingPart): # NOTE: We don't send ThinkingPart to the providers yet. pass elif isinstance(part, ToolCallPart | BuiltinToolCallPart | BuiltinToolReturnPart): raise UserError('Tool calls are not supported for Outlines models yet.') elif isinstance(part, FilePart): if isinstance(part.content, BinaryContent) and part.content.is_image: image = self._create_PIL_image(part.content.data, part.content.media_type) image_parts.append(Image(image)) else: raise UserError( 'File parts other than `BinaryImage` are not supported for Outlines models yet.' ) else: assert_never(part) if len(text_parts) == 1 and len(image_parts) == 0: chat.add_assistant_message(text_parts[0]) else: chat.add_assistant_message([*text_parts, *image_parts]) else: assert_never(message) return chat def _create_PIL_image(self, data: bytes, data_type: str) -> PILImage.Image: """Create a PIL Image from the data and data type.""" image = PILImage.open(io.BytesIO(data)) image.format = data_type.split('/')[-1] return image def _process_response(self, response: str) -> ModelResponse: """Turn the Outlines text response into a Pydantic AI model response instance.""" return ModelResponse( parts=cast( list[ModelResponsePart], split_content_into_text_and_thinking(response, self.profile.thinking_tags) ), ) async def _process_streamed_response( self, response: AsyncIterable[str], model_request_parameters: ModelRequestParameters ) -> StreamedResponse: """Turn the Outlines text response into a Pydantic AI streamed response instance.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): # pragma: no cover raise UnexpectedModelBehavior('Streamed response ended without content or tool calls') return OutlinesStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self._model_name, _model_profile=self.profile, _response=peekable_response, _provider_name='outlines', ) ``` #### __init__ ```python __init__( model: Model | AsyncModel, *, provider: ( Literal["outlines"] | Provider[Model] ) = "outlines", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize an Outlines model. Parameters: | Name | Type | Description | Default | | ---------- | --------------------- | ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `model` | \`Model | AsyncModel\` | The Outlines model used for the model. 
| | `provider` | \`Literal['outlines'] | Provider[Model]\` | The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of Provider[OutlinesBaseModel]. If not provided, the other parameters will be used. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider. | | `settings` | \`ModelSettings | None\` | Default model settings for this model instance. | Source code in `pydantic_ai_slim/pydantic_ai/models/outlines.py` ```python def __init__( self, model: OutlinesBaseModel | OutlinesAsyncBaseModel, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Outlines model. Args: model: The Outlines model used for the model. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. """ self.model: OutlinesBaseModel | OutlinesAsyncBaseModel = model self._model_name: str = 'outlines-model' if isinstance(provider, str): provider = infer_provider(provider) super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### from_transformers ```python from_transformers( hf_model: PreTrainedModel, hf_tokenizer_or_processor: ( PreTrainedTokenizer | ProcessorMixin ), *, provider: ( Literal["outlines"] | Provider[Model] ) = "outlines", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Create an Outlines model from a Hugging Face model and tokenizer. Parameters: | Name | Type | Description | Default | | --------------------------- | --------------------- | ------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `hf_model` | `PreTrainedModel` | The Hugging Face PreTrainedModel or any model that is compatible with the transformers API. | *required* | | `hf_tokenizer_or_processor` | \`PreTrainedTokenizer | ProcessorMixin\` | Either a HuggingFace PreTrainedTokenizer or any tokenizer that is compatible with the transformers API, or a HuggingFace processor inheriting from ProcessorMixin. If a tokenizer is provided, a regular model will be used, while if you provide a processor, it will be a multimodal model. | | `provider` | \`Literal['outlines'] | Provider[Model]\` | The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of Provider[OutlinesBaseModel]. If not provided, the other parameters will be used. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider. | | `settings` | \`ModelSettings | None\` | Default model settings for this model instance. 
| Source code in `pydantic_ai_slim/pydantic_ai/models/outlines.py` ```python @classmethod def from_transformers( cls, hf_model: transformers.modeling_utils.PreTrainedModel, hf_tokenizer_or_processor: transformers.tokenization_utils.PreTrainedTokenizer | transformers.processing_utils.ProcessorMixin, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Create an Outlines model from a Hugging Face model and tokenizer. Args: hf_model: The Hugging Face PreTrainedModel or any model that is compatible with the `transformers` API. hf_tokenizer_or_processor: Either a HuggingFace `PreTrainedTokenizer` or any tokenizer that is compatible with the `transformers` API, or a HuggingFace processor inheriting from `ProcessorMixin`. If a tokenizer is provided, a regular model will be used, while if you provide a processor, it will be a multimodal model. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. """ outlines_model: OutlinesBaseModel = from_transformers(hf_model, hf_tokenizer_or_processor) return cls(outlines_model, provider=provider, profile=profile, settings=settings) ``` #### from_llamacpp ```python from_llamacpp( llama_model: Llama, *, provider: ( Literal["outlines"] | Provider[Model] ) = "outlines", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Create an Outlines model from a LlamaCpp model. Parameters: | Name | Type | Description | Default | | ------------- | --------------------- | --------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `llama_model` | `Llama` | The llama_cpp.Llama model to use. | *required* | | `provider` | \`Literal['outlines'] | Provider[Model]\` | The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of Provider[OutlinesBaseModel]. If not provided, the other parameters will be used. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider. | | `settings` | \`ModelSettings | None\` | Default model settings for this model instance. | Source code in `pydantic_ai_slim/pydantic_ai/models/outlines.py` ```python @classmethod def from_llamacpp( # pragma: lax no cover cls, llama_model: llama_cpp.Llama, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Create an Outlines model from a LlamaCpp model. Args: llama_model: The llama_cpp.Llama model to use. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. 
""" outlines_model: OutlinesBaseModel = from_llamacpp(llama_model) return cls(outlines_model, provider=provider, profile=profile, settings=settings) ``` #### from_mlxlm ```python from_mlxlm( mlx_model: Module, mlx_tokenizer: PreTrainedTokenizer, *, provider: ( Literal["outlines"] | Provider[Model] ) = "outlines", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Create an Outlines model from a MLXLM model. Parameters: | Name | Type | Description | Default | | --------------- | --------------------- | ------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `mlx_model` | `Module` | The nn.Module model to use. | *required* | | `mlx_tokenizer` | `PreTrainedTokenizer` | The PreTrainedTokenizer to use. | *required* | | `provider` | \`Literal['outlines'] | Provider[Model]\` | The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of Provider[OutlinesBaseModel]. If not provided, the other parameters will be used. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider. | | `settings` | \`ModelSettings | None\` | Default model settings for this model instance. | Source code in `pydantic_ai_slim/pydantic_ai/models/outlines.py` ```python @classmethod def from_mlxlm( # pragma: no cover cls, mlx_model: nn.Module, # pyright: ignore[reportUnknownParameterType, reportUnknownMemberType] mlx_tokenizer: transformers.tokenization_utils.PreTrainedTokenizer, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Create an Outlines model from a MLXLM model. Args: mlx_model: The nn.Module model to use. mlx_tokenizer: The PreTrainedTokenizer to use. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. """ outlines_model: OutlinesBaseModel = from_mlxlm(mlx_model, mlx_tokenizer) # pyright: ignore[reportUnknownArgumentType] return cls(outlines_model, provider=provider, profile=profile, settings=settings) ``` #### from_sglang ```python from_sglang( base_url: str, api_key: str | None = None, model_name: str | None = None, *, provider: ( Literal["outlines"] | Provider[Model] ) = "outlines", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Create an Outlines model to send requests to an SGLang server. Parameters: | Name | Type | Description | Default | | ------------ | --------------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `base_url` | `str` | The url of the SGLang server. | *required* | | `api_key` | \`str | None\` | The API key to use for authenticating requests to the SGLang server. | | `model_name` | \`str | None\` | The name of the model to use. | | `provider` | \`Literal['outlines'] | Provider[Model]\` | The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of Provider[OutlinesBaseModel]. 
If not provided, the other parameters will be used. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider. | | `settings` | \`ModelSettings | None\` | Default model settings for this model instance. | Source code in `pydantic_ai_slim/pydantic_ai/models/outlines.py` ```python @classmethod def from_sglang( cls, base_url: str, api_key: str | None = None, model_name: str | None = None, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Create an Outlines model to send requests to an SGLang server. Args: base_url: The url of the SGLang server. api_key: The API key to use for authenticating requests to the SGLang server. model_name: The name of the model to use. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. """ try: from openai import AsyncOpenAI except ImportError as _import_error: raise ImportError( 'Please install `openai` to use the Outlines SGLang model, ' 'you can use the `openai` optional group — `pip install "pydantic-ai-slim[openai]"`' ) from _import_error openai_client = AsyncOpenAI(base_url=base_url, api_key=api_key) outlines_model: OutlinesBaseModel | OutlinesAsyncBaseModel = from_sglang(openai_client, model_name) return cls(outlines_model, provider=provider, profile=profile, settings=settings) ``` #### from_vllm_offline ```python from_vllm_offline( vllm_model: Any, *, provider: ( Literal["outlines"] | Provider[Model] ) = "outlines", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Create an Outlines model from a vLLM offline inference model. Parameters: | Name | Type | Description | Default | | ------------ | --------------------- | -------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `vllm_model` | `Any` | The vllm.LLM local model to use. | *required* | | `provider` | \`Literal['outlines'] | Provider[Model]\` | The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of Provider[OutlinesBaseModel]. If not provided, the other parameters will be used. | | `profile` | \`ModelProfileSpec | None\` | The model profile to use. Defaults to a profile picked by the provider. | | `settings` | \`ModelSettings | None\` | Default model settings for this model instance. | Source code in `pydantic_ai_slim/pydantic_ai/models/outlines.py` ```python @classmethod def from_vllm_offline( # pragma: no cover cls, vllm_model: Any, *, provider: Literal['outlines'] | Provider[OutlinesBaseModel] = 'outlines', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Create an Outlines model from a vLLM offline inference model. Args: vllm_model: The vllm.LLM local model to use. provider: The provider to use for OutlinesModel. Can be either the string 'outlines' or an instance of `Provider[OutlinesBaseModel]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider. settings: Default model settings for this model instance. 
""" outlines_model: OutlinesBaseModel | OutlinesAsyncBaseModel = from_vllm_offline(vllm_model) return cls(outlines_model, provider=provider, profile=profile, settings=settings) ``` #### format_inference_kwargs ```python format_inference_kwargs( model_settings: ModelSettings | None, ) -> dict[str, Any] ``` Format the model settings for the inference kwargs. Source code in `pydantic_ai_slim/pydantic_ai/models/outlines.py` ```python def format_inference_kwargs(self, model_settings: ModelSettings | None) -> dict[str, Any]: """Format the model settings for the inference kwargs.""" settings_dict: dict[str, Any] = dict(model_settings) if model_settings else {} if isinstance(self.model, Transformers): settings_dict = self._format_transformers_inference_kwargs(settings_dict) elif isinstance(self.model, LlamaCpp): # pragma: lax no cover settings_dict = self._format_llama_cpp_inference_kwargs(settings_dict) elif isinstance(self.model, MLXLM): # pragma: no cover settings_dict = self._format_mlxlm_inference_kwargs(settings_dict) elif isinstance(self.model, SGLang | AsyncSGLang): settings_dict = self._format_sglang_inference_kwargs(settings_dict) elif isinstance(self.model, VLLMOffline): # pragma: no cover settings_dict = self._format_vllm_offline_inference_kwargs(settings_dict) extra_body = settings_dict.pop('extra_body', {}) settings_dict.update(extra_body) return settings_dict ``` ### OutlinesStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Outlines models. Source code in `pydantic_ai_slim/pydantic_ai/models/outlines.py` ```python @dataclass class OutlinesStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Outlines models.""" _model_name: str _model_profile: ModelProfile _response: AsyncIterable[str] _provider_name: str _provider_url: str | None = None _timestamp: datetime = field(default_factory=_utils.now_utc) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: async for content in self._response: for event in self._parts_manager.handle_text_delta( vendor_part_id='content', content=content, thinking_tags=self._model_profile.thinking_tags, ignore_leading_whitespace=self._model_profile.ignore_streamed_leading_whitespace, ): yield event @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def provider_url(self) -> str | None: """Get the provider base URL.""" return self._provider_url @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### provider_url ```python provider_url: str | None ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.test` Utility model for quickly testing apps built with Pydantic AI. 
Here's a minimal example: [Learn about Gateway](https://ai.pydantic.dev/gateway) test_model_usage.py ```python from pydantic_ai import Agent from pydantic_ai.models.test import TestModel my_agent = Agent('gateway/openai:gpt-5.2', instructions='...') async def test_my_agent(): """Unit test for my_agent, to be run by pytest.""" m = TestModel() with my_agent.override(model=m): result = await my_agent.run('Testing my agent...') assert result.output == 'success (no tool calls)' assert m.last_model_request_parameters.function_tools == [] ``` test_model_usage.py ```python from pydantic_ai import Agent from pydantic_ai.models.test import TestModel my_agent = Agent('openai:gpt-5.2', instructions='...') async def test_my_agent(): """Unit test for my_agent, to be run by pytest.""" m = TestModel() with my_agent.override(model=m): result = await my_agent.run('Testing my agent...') assert result.output == 'success (no tool calls)' assert m.last_model_request_parameters.function_tools == [] ``` See [Unit testing with `TestModel`](https://ai.pydantic.dev/testing/#unit-testing-with-testmodel) for detailed documentation. ### TestModel Bases: `Model` A model specifically for testing purposes. This will (by default) call all tools in the agent, then return a tool response if possible, otherwise a plain response. How useful this model is will vary significantly. Apart from `__init__` derived by the `dataclass` decorator, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/test.py` ```python @dataclass(init=False) class TestModel(Model): """A model specifically for testing purposes. This will (by default) call all tools in the agent, then return a tool response if possible, otherwise a plain response. How useful this model is will vary significantly. Apart from `__init__` derived by the `dataclass` decorator, all methods are private or match those of the base class. """ # NOTE: Avoid test discovery by pytest. __test__ = False call_tools: list[str] | Literal['all'] = 'all' """List of tools to call. If `'all'`, all tools will be called.""" custom_output_text: str | None = None """If set, this text is returned as the final output.""" custom_output_args: Any | None = None """If set, these args will be passed to the output tool.""" seed: int = 0 """Seed for generating random data.""" last_model_request_parameters: ModelRequestParameters | None = field(default=None, init=False) """The last ModelRequestParameters passed to the model in a request. The ModelRequestParameters contains information about the function and output tools available during request handling. This is set when a request is made, so will reflect the function tools from the last step of the last run. 
""" _model_name: str = field(default='test', repr=False) _system: str = field(default='test', repr=False) def __init__( self, *, call_tools: list[str] | Literal['all'] = 'all', custom_output_text: str | None = None, custom_output_args: Any | None = None, seed: int = 0, model_name: str = 'test', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize TestModel with optional settings and profile.""" self.call_tools = call_tools self.custom_output_text = custom_output_text self.custom_output_args = custom_output_args self.seed = seed self.last_model_request_parameters = None self._model_name = model_name self._system = 'test' super().__init__(settings=settings, profile=profile) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) self.last_model_request_parameters = model_request_parameters model_response = self._request(messages, model_settings, model_request_parameters) model_response.usage = _estimate_usage([*messages, model_response]) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) self.last_model_request_parameters = model_request_parameters model_response = self._request(messages, model_settings, model_request_parameters) yield TestStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self._model_name, _structured_response=model_response, _messages=messages, _provider_name=self._system, ) @property def model_name(self) -> str: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._system @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """TestModel supports all builtin tools for testing flexibility.""" return SUPPORTED_BUILTIN_TOOLS def gen_tool_args(self, tool_def: ToolDefinition) -> Any: return _JsonSchemaTestData(tool_def.parameters_json_schema, self.seed).generate() def _get_tool_calls(self, model_request_parameters: ModelRequestParameters) -> list[tuple[str, ToolDefinition]]: if self.call_tools == 'all': return [(r.name, r) for r in model_request_parameters.function_tools] else: function_tools_lookup = {t.name: t for t in model_request_parameters.function_tools} tools_to_call = (function_tools_lookup[name] for name in self.call_tools) return [(r.name, r) for r in tools_to_call] def _get_output(self, model_request_parameters: ModelRequestParameters) -> _WrappedTextOutput | _WrappedToolOutput: if self.custom_output_text is not None: assert model_request_parameters.output_mode != 'tool', ( 'Plain response not allowed, but `custom_output_text` is set.' ) assert self.custom_output_args is None, 'Cannot set both `custom_output_text` and `custom_output_args`.' return _WrappedTextOutput(self.custom_output_text) elif self.custom_output_args is not None: assert model_request_parameters.output_tools is not None, ( 'No output tools provided, but `custom_output_args` is set.' 
) output_tool = model_request_parameters.output_tools[0] if k := output_tool.outer_typed_dict_key: return _WrappedToolOutput({k: self.custom_output_args}) else: return _WrappedToolOutput(self.custom_output_args) elif model_request_parameters.allow_text_output: return _WrappedTextOutput(None) elif model_request_parameters.output_tools: return _WrappedToolOutput(None) else: return _WrappedTextOutput(None) def _request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: if model_request_parameters.builtin_tools: raise UserError('TestModel does not support built-in tools') tool_calls = self._get_tool_calls(model_request_parameters) output_wrapper = self._get_output(model_request_parameters) output_tools = model_request_parameters.output_tools # if there are tools, the first thing we want to do is call all of them if tool_calls and not any(isinstance(m, ModelResponse) for m in messages): return ModelResponse( parts=[ ToolCallPart(name, self.gen_tool_args(args), tool_call_id=f'pyd_ai_tool_call_id__{name}') for name, args in tool_calls ], model_name=self._model_name, ) if messages: # pragma: no branch last_message = messages[-1] assert isinstance(last_message, ModelRequest), 'Expected last message to be a `ModelRequest`.' # check if there are any retry prompts, if so retry them new_retry_names = {p.tool_name for p in last_message.parts if isinstance(p, RetryPromptPart)} if new_retry_names: # Handle retries for both function tools and output tools # Check function tools first retry_parts: list[ModelResponsePart] = [ ToolCallPart(name, self.gen_tool_args(args)) for name, args in tool_calls if name in new_retry_names ] # Check output tools if output_tools: retry_parts.extend( [ ToolCallPart( tool.name, output_wrapper.value if isinstance(output_wrapper, _WrappedToolOutput) and output_wrapper.value is not None else self.gen_tool_args(tool), tool_call_id=f'pyd_ai_tool_call_id__{tool.name}', ) for tool in output_tools if tool.name in new_retry_names ] ) return ModelResponse(parts=retry_parts, model_name=self._model_name) if isinstance(output_wrapper, _WrappedTextOutput): if (response_text := output_wrapper.value) is None: # build up details of tool responses output: dict[str, Any] = {} for message in messages: if isinstance(message, ModelRequest): for part in message.parts: if isinstance(part, ToolReturnPart): output[part.tool_name] = part.content if output: return ModelResponse( parts=[TextPart(pydantic_core.to_json(output).decode())], model_name=self._model_name ) else: return ModelResponse(parts=[TextPart('success (no tool calls)')], model_name=self._model_name) else: return ModelResponse(parts=[TextPart(response_text)], model_name=self._model_name) else: assert output_tools, 'No output tools provided' custom_output_args = output_wrapper.value output_tool = output_tools[self.seed % len(output_tools)] if custom_output_args is not None: return ModelResponse( parts=[ ToolCallPart( output_tool.name, custom_output_args, tool_call_id=f'pyd_ai_tool_call_id__{output_tool.name}', ) ], model_name=self._model_name, ) else: response_args = self.gen_tool_args(output_tool) return ModelResponse( parts=[ ToolCallPart( output_tool.name, response_args, tool_call_id=f'pyd_ai_tool_call_id__{output_tool.name}', ) ], model_name=self._model_name, ) ``` #### __init__ ```python __init__( *, call_tools: list[str] | Literal["all"] = "all", custom_output_text: str | None = None, custom_output_args: Any | None = None, seed: int = 0, 
model_name: str = "test", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize TestModel with optional settings and profile. Source code in `pydantic_ai_slim/pydantic_ai/models/test.py` ```python def __init__( self, *, call_tools: list[str] | Literal['all'] = 'all', custom_output_text: str | None = None, custom_output_args: Any | None = None, seed: int = 0, model_name: str = 'test', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize TestModel with optional settings and profile.""" self.call_tools = call_tools self.custom_output_text = custom_output_text self.custom_output_args = custom_output_args self.seed = seed self.last_model_request_parameters = None self._model_name = model_name self._system = 'test' super().__init__(settings=settings, profile=profile) ``` #### call_tools ```python call_tools: list[str] | Literal['all'] = call_tools ``` List of tools to call. If `'all'`, all tools will be called. #### custom_output_text ```python custom_output_text: str | None = custom_output_text ``` If set, this text is returned as the final output. #### custom_output_args ```python custom_output_args: Any | None = custom_output_args ``` If set, these args will be passed to the output tool. #### seed ```python seed: int = seed ``` Seed for generating random data. #### last_model_request_parameters ```python last_model_request_parameters: ( ModelRequestParameters | None ) = None ``` The last ModelRequestParameters passed to the model in a request. The ModelRequestParameters contains information about the function and output tools available during request handling. This is set when a request is made, so will reflect the function tools from the last step of the last run. #### model_name ```python model_name: str ``` The model name. #### system ```python system: str ``` The model provider. #### supported_builtin_tools ```python supported_builtin_tools() -> ( frozenset[type[AbstractBuiltinTool]] ) ``` TestModel supports all builtin tools for testing flexibility. Source code in `pydantic_ai_slim/pydantic_ai/models/test.py` ```python @classmethod def supported_builtin_tools(cls) -> frozenset[type[AbstractBuiltinTool]]: """TestModel supports all builtin tools for testing flexibility.""" return SUPPORTED_BUILTIN_TOOLS ``` ### TestStreamedResponse Bases: `StreamedResponse` A structured response that streams test data. 
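You normally don't construct `TestStreamedResponse` yourself; it's produced when an agent streams from a `TestModel`. A minimal sketch of what that can look like in a test — the agent, prompt, and assertion here are illustrative, not taken from the source below:

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent(instructions='Be concise.')


async def test_streamed_output():
    """Stream from TestModel; text arrives word by word via TestStreamedResponse."""
    with agent.override(model=TestModel(custom_output_text='hello world')):
        async with agent.run_stream('Say something') as result:
            chunks = [chunk async for chunk in result.stream_text(delta=True)]
    assert ''.join(chunks) == 'hello world'
```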
Source code in `pydantic_ai_slim/pydantic_ai/models/test.py` ```python @dataclass class TestStreamedResponse(StreamedResponse): """A structured response that streams test data.""" _model_name: str _structured_response: ModelResponse _messages: InitVar[Iterable[ModelMessage]] _provider_name: str _provider_url: str | None = None _timestamp: datetime = field(default_factory=_utils.now_utc, init=False) def __post_init__(self, _messages: Iterable[ModelMessage]): self._usage = _estimate_usage(_messages) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: for i, part in enumerate(self._structured_response.parts): if isinstance(part, TextPart): text = part.content *words, last_word = text.split(' ') words = [f'{word} ' for word in words] words.append(last_word) if len(words) == 1 and len(text) > 2: mid = len(text) // 2 words = [text[:mid], text[mid:]] self._usage += _get_string_usage('') for event in self._parts_manager.handle_text_delta(vendor_part_id=i, content=''): yield event for word in words: self._usage += _get_string_usage(word) for event in self._parts_manager.handle_text_delta(vendor_part_id=i, content=word): yield event elif isinstance(part, ToolCallPart): yield self._parts_manager.handle_tool_call_part( vendor_part_id=i, tool_name=part.tool_name, args=part.args, tool_call_id=part.tool_call_id ) elif isinstance(part, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # NOTE: These parts are not generated by TestModel, but we need to handle them for type checking assert False, f'Unexpected part type in TestModel: {type(part).__name__}' elif isinstance(part, ThinkingPart): # pragma: no cover # NOTE: There's no way to reach this part of the code, since we don't generate ThinkingPart on TestModel. assert False, "This should be unreachable — we don't generate ThinkingPart on TestModel." elif isinstance(part, FilePart): # pragma: no cover # NOTE: There's no way to reach this part of the code, since we don't generate FilePart on TestModel. assert False, "This should be unreachable — we don't generate FilePart on TestModel." else: assert_never(part) @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def provider_url(self) -> str | None: """Get the provider base URL.""" return self._provider_url @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### provider_url ```python provider_url: str | None ``` Get the provider base URL. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # pydantic_ai.models.wrapper ### WrapperModel Bases: `Model` Model which wraps another model. Does nothing on its own, used as a base class. Source code in `pydantic_ai_slim/pydantic_ai/models/wrapper.py` ```python @dataclass(init=False) class WrapperModel(Model): """Model which wraps another model. Does nothing on its own, used as a base class. 
""" wrapped: Model """The underlying model being wrapped.""" def __init__(self, wrapped: Model | KnownModelName): super().__init__() self.wrapped = infer_model(wrapped) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: return await self.wrapped.request(messages, model_settings, model_request_parameters) async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage: return await self.wrapped.count_tokens(messages, model_settings, model_request_parameters) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: async with self.wrapped.request_stream( messages, model_settings, model_request_parameters, run_context ) as response_stream: yield response_stream def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters: return self.wrapped.customize_request_parameters(model_request_parameters) # pragma: no cover def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters]: return self.wrapped.prepare_request(model_settings, model_request_parameters) @property def model_name(self) -> str: return self.wrapped.model_name @property def system(self) -> str: return self.wrapped.system @cached_property def profile(self) -> ModelProfile: return self.wrapped.profile @property def settings(self) -> ModelSettings | None: """Get the settings from the wrapped model.""" return self.wrapped.settings def __getattr__(self, item: str): return getattr(self.wrapped, item) ``` #### wrapped ```python wrapped: Model = infer_model(wrapped) ``` The underlying model being wrapped. #### settings ```python settings: ModelSettings | None ``` Get the settings from the wrapped model. # `pydantic_ai.models.xai` ## Setup For details on how to set up authentication with this model, see [model configuration for xAI](https://ai.pydantic.dev/models/xai/index.md). xAI model implementation using [xAI SDK](https://github.com/xai-org/xai-sdk-python). ### XaiModelName ```python XaiModelName = str | ChatModel ``` Possible xAI model names. ### XaiModelSettings Bases: `ModelSettings` Settings specific to xAI models. See [xAI SDK documentation](https://docs.x.ai/docs) for more details on these parameters. Source code in `pydantic_ai_slim/pydantic_ai/models/xai.py` ```python class XaiModelSettings(ModelSettings, total=False): """Settings specific to xAI models. See [xAI SDK documentation](https://docs.x.ai/docs) for more details on these parameters. """ xai_logprobs: bool """Whether to return log probabilities of the output tokens or not.""" xai_top_logprobs: int """An integer between 0 and 20 specifying the number of most likely tokens to return at each position.""" xai_user: str """A unique identifier representing your end-user, which can help xAI to monitor and detect abuse.""" xai_store_messages: bool """Whether to store messages on xAI's servers for conversation continuity.""" xai_previous_response_id: str """The ID of the previous response to continue the conversation.""" xai_include_encrypted_content: bool """Whether to include the encrypted content in the response. 
Corresponds to the `use_encrypted_content` value of the model settings in the Responses API. """ xai_include_code_execution_output: bool """Whether to include the code execution results in the response. Corresponds to the `code_interpreter_call.outputs` value of the `include` parameter in the Responses API. """ xai_include_web_search_output: bool """Whether to include the web search results in the response. Corresponds to the `web_search_call.action.sources` value of the `include` parameter in the Responses API. """ xai_include_inline_citations: bool """Whether to include inline citations in the response. Corresponds to the `inline_citations` option in the xAI `include` parameter. """ xai_include_mcp_output: bool """Whether to include the MCP results in the response. Corresponds to the `mcp_call.outputs` value of the `include` parameter in the Responses API. """ ``` #### xai_logprobs ```python xai_logprobs: bool ``` Whether to return log probabilities of the output tokens or not. #### xai_top_logprobs ```python xai_top_logprobs: int ``` An integer between 0 and 20 specifying the number of most likely tokens to return at each position. #### xai_user ```python xai_user: str ``` A unique identifier representing your end-user, which can help xAI to monitor and detect abuse. #### xai_store_messages ```python xai_store_messages: bool ``` Whether to store messages on xAI's servers for conversation continuity. #### xai_previous_response_id ```python xai_previous_response_id: str ``` The ID of the previous response to continue the conversation. #### xai_include_encrypted_content ```python xai_include_encrypted_content: bool ``` Whether to include the encrypted content in the response. Corresponds to the `use_encrypted_content` value of the model settings in the Responses API. #### xai_include_code_execution_output ```python xai_include_code_execution_output: bool ``` Whether to include the code execution results in the response. Corresponds to the `code_interpreter_call.outputs` value of the `include` parameter in the Responses API. #### xai_include_web_search_output ```python xai_include_web_search_output: bool ``` Whether to include the web search results in the response. Corresponds to the `web_search_call.action.sources` value of the `include` parameter in the Responses API. #### xai_include_inline_citations ```python xai_include_inline_citations: bool ``` Whether to include inline citations in the response. Corresponds to the `inline_citations` option in the xAI `include` parameter. #### xai_include_mcp_output ```python xai_include_mcp_output: bool ``` Whether to include the MCP results in the response. Corresponds to the `mcp_call.outputs` value of the `include` parameter in the Responses API. ### XaiModel Bases: `Model` A model that uses the xAI SDK to interact with xAI models. Source code in `pydantic_ai_slim/pydantic_ai/models/xai.py` ```python class XaiModel(Model): """A model that uses the xAI SDK to interact with xAI models.""" _model_name: str _provider: Provider[AsyncClient] def __init__( self, model_name: XaiModelName, *, provider: Literal['xai'] | Provider[AsyncClient] = 'xai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize the xAI model. Args: model_name: The name of the xAI model to use (e.g., "grok-4-1-fast-non-reasoning") provider: The provider to use for API calls. Defaults to `'xai'`. profile: Optional model profile specification. Defaults to a profile picked by the provider based on the model name. 
settings: Optional model settings. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile(model_name)) @property def model_name(self) -> str: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return 'xai' @classmethod def supported_builtin_tools(cls) -> frozenset[type]: """Return the set of builtin tool types this model can handle.""" return frozenset({WebSearchTool, CodeExecutionTool, MCPServerTool}) async def _map_messages( self, messages: list[ModelMessage], model_request_parameters: ModelRequestParameters, ) -> list[chat_types.chat_pb2.Message]: """Convert pydantic_ai messages to xAI SDK messages.""" xai_messages: list[chat_types.chat_pb2.Message] = [] # xAI expects tool results in the same order as tool calls. # # Pydantic AI doesn't guarantee tool-result part ordering, so we track # tool call order as we walk message history and reorder tool results. pending_tool_call_ids: list[str] = [] for message in messages: if isinstance(message, ModelRequest): mapped_request_parts = await self._map_request_parts( message.parts, pending_tool_call_ids, ) xai_messages.extend(mapped_request_parts) elif isinstance(message, ModelResponse): xai_messages.extend(self._map_response_parts(message.parts)) pending_tool_call_ids.extend( part.tool_call_id for part in message.parts if isinstance(part, ToolCallPart) and part.tool_call_id ) else: assert_never(message) # Insert instructions as a system message after existing system messages if present if instructions := self._get_instructions(messages, model_request_parameters): system_prompt_count = sum(1 for m in xai_messages if m.role == chat_types.chat_pb2.MessageRole.ROLE_SYSTEM) xai_messages.insert(system_prompt_count, system(instructions)) return xai_messages async def _map_request_parts( self, parts: Sequence[ModelRequestPart], pending_tool_call_ids: list[str], ) -> list[chat_types.chat_pb2.Message]: """Map ModelRequest parts to xAI messages.""" xai_messages: list[chat_types.chat_pb2.Message] = [] tool_results: list[ToolReturnPart | RetryPromptPart] = [] for part in parts: if isinstance(part, SystemPromptPart): xai_messages.append(system(part.content)) elif isinstance(part, UserPromptPart): if user_msg := await self._map_user_prompt(part): xai_messages.append(user_msg) elif isinstance(part, ToolReturnPart): tool_results.append(part) elif isinstance(part, RetryPromptPart): if part.tool_name is None: xai_messages.append(user(part.model_response())) else: tool_results.append(part) else: assert_never(part) # Sort tool results by requested order, then emit if tool_results: order = {id: i for i, id in enumerate(pending_tool_call_ids)} tool_results.sort(key=lambda p: order.get(p.tool_call_id, float('inf'))) for part in tool_results: text = part.model_response_str() if isinstance(part, ToolReturnPart) else part.model_response() xai_messages.append(tool_result(text)) return xai_messages def _map_response_parts(self, parts: Sequence[ModelResponsePart]) -> list[chat_types.chat_pb2.Message]: """Map ModelResponse parts to xAI assistant messages (one message per part).""" messages: list[chat_types.chat_pb2.Message] = [] # Track builtin tool calls by tool_call_id to update their status with return parts builtin_calls: dict[str, chat_types.chat_pb2.ToolCall] = {} for item in parts: if isinstance(item, TextPart): 
messages.append(assistant(item.content)) elif isinstance(item, ThinkingPart): if (thinking_msg := self._map_thinking_part(item)) is not None: messages.append(thinking_msg) elif isinstance(item, ToolCallPart): client_side_tool_call = self._map_tool_call(item) self._append_tool_call(messages, client_side_tool_call) elif isinstance(item, BuiltinToolCallPart): builtin_call = self._map_builtin_tool_call_part(item) if item.provider_name == self.system and builtin_call: self._append_tool_call(messages, builtin_call) # Track specific tool calls for status updates # Note: tool_call_id is always truthy here since _map_builtin_tool_call_part # returns None when tool_call_id is empty if item.tool_call_id: # pragma: no branch builtin_calls[item.tool_call_id] = builtin_call elif isinstance(item, BuiltinToolReturnPart): if ( item.provider_name == self.system and item.tool_call_id and (details := item.provider_details) is not None and details.get('status') == 'failed' and (call := builtin_calls.get(item.tool_call_id)) ): call.status = chat_types.chat_pb2.TOOL_CALL_STATUS_FAILED if error_msg := details.get('error'): call.error_message = str(error_msg) elif isinstance(item, FilePart): # Files generated by models (e.g., from CodeExecutionTool) are not sent back pass else: assert_never(item) return messages @staticmethod def _append_tool_call(messages: list[chat_types.chat_pb2.Message], tool_call: chat_types.chat_pb2.ToolCall) -> None: """Append a tool call to the most recent tool-call assistant message, or create a new one. We keep tool calls grouped to avoid generating one assistant message per tool call. """ if messages and messages[-1].tool_calls: messages[-1].tool_calls.append(tool_call) else: msg = assistant('') msg.tool_calls.append(tool_call) messages.append(msg) def _map_thinking_part(self, item: ThinkingPart) -> chat_types.chat_pb2.Message | None: """Map a `ThinkingPart` into a single xAI assistant message. 
- Native xAI thinking (with optional signature) is sent via `reasoning_content`/`encrypted_content` - Non-xAI (or non-native) thinking is preserved by wrapping in the model profile's thinking tags """ if item.provider_name == self.system and (item.content or item.signature): msg = assistant('') if item.content: msg.reasoning_content = item.content if item.signature: msg.encrypted_content = item.signature return msg elif item.content: start_tag, end_tag = self.profile.thinking_tags return assistant('\n'.join([start_tag, item.content, end_tag])) else: return None def _map_tool_call(self, tool_call_part: ToolCallPart) -> chat_types.chat_pb2.ToolCall: """Map a ToolCallPart to an xAI SDK ToolCall.""" return chat_types.chat_pb2.ToolCall( id=tool_call_part.tool_call_id, type=chat_types.chat_pb2.TOOL_CALL_TYPE_CLIENT_SIDE_TOOL, status=chat_types.chat_pb2.TOOL_CALL_STATUS_COMPLETED, function=chat_types.chat_pb2.FunctionCall( name=tool_call_part.tool_name, arguments=tool_call_part.args_as_json_str(), ), ) def _map_builtin_tool_call_part(self, item: BuiltinToolCallPart) -> chat_types.chat_pb2.ToolCall | None: """Map a BuiltinToolCallPart to an xAI SDK ToolCall with appropriate type and status.""" if not item.tool_call_id: return None if item.tool_name == CodeExecutionTool.kind: return chat_types.chat_pb2.ToolCall( id=item.tool_call_id, type=chat_types.chat_pb2.TOOL_CALL_TYPE_CODE_EXECUTION_TOOL, status=chat_types.chat_pb2.TOOL_CALL_STATUS_COMPLETED, function=chat_types.chat_pb2.FunctionCall( name=CodeExecutionTool.kind, arguments=item.args_as_json_str(), ), ) elif item.tool_name == WebSearchTool.kind: return chat_types.chat_pb2.ToolCall( id=item.tool_call_id, type=chat_types.chat_pb2.TOOL_CALL_TYPE_WEB_SEARCH_TOOL, status=chat_types.chat_pb2.TOOL_CALL_STATUS_COMPLETED, function=chat_types.chat_pb2.FunctionCall( name=WebSearchTool.kind, arguments=item.args_as_json_str(), ), ) elif item.tool_name.startswith(MCPServerTool.kind): # Extract server label from tool_name (format: 'mcp_server:server_label') server_label = item.tool_name.split(':', 1)[1] if ':' in item.tool_name else item.tool_name args_dict = item.args_as_dict() or {} # Extract tool_name and tool_args from the structured args (matches OpenAI/Anthropic pattern) actual_tool_name = args_dict.get('tool_name', '') tool_args = args_dict.get('tool_args', {}) # Construct the full function name in xAI's format: 'server_label.tool_name' function_name = f'{server_label}.{actual_tool_name}' if actual_tool_name else server_label return chat_types.chat_pb2.ToolCall( id=item.tool_call_id, type=chat_types.chat_pb2.TOOL_CALL_TYPE_MCP_TOOL, status=chat_types.chat_pb2.TOOL_CALL_STATUS_COMPLETED, function=chat_types.chat_pb2.FunctionCall( name=function_name, arguments=json.dumps(tool_args), ), ) return None async def _upload_file_to_xai(self, data: bytes, filename: str) -> str: """Upload a file to xAI files API and return the file ID. Args: data: The file content as bytes filename: The filename to use for the upload Returns: The file ID from xAI """ uploaded_file = await self._provider.client.files.upload(data, filename=filename) return uploaded_file.id async def _map_user_prompt(self, part: UserPromptPart) -> chat_types.chat_pb2.Message | None: # noqa: C901 """Map a UserPromptPart to an xAI user message.""" if isinstance(part.content, str): return user(part.content) # Handle complex content (images, text, etc.) 
content_items: list[chat_types.Content] = [] for item in part.content: if isinstance(item, str): content_items.append(item) elif isinstance(item, ImageUrl): # Get detail from vendor_metadata if available detail: chat_types.ImageDetail = 'auto' if item.vendor_metadata and 'detail' in item.vendor_metadata: detail = item.vendor_metadata['detail'] image_url = item.url if item.force_download: downloaded = await download_item(item, data_format='base64_uri', type_format='extension') image_url = downloaded['data'] content_items.append(image(image_url, detail=detail)) elif isinstance(item, BinaryContent): if item.is_image: # Convert binary content to data URI and use image() image_detail: chat_types.ImageDetail = 'auto' if item.vendor_metadata and 'detail' in item.vendor_metadata: image_detail = item.vendor_metadata['detail'] content_items.append(image(item.data_uri, detail=image_detail)) elif item.is_audio: raise NotImplementedError('AudioUrl/BinaryContent with audio is not supported by xAI SDK') elif item.is_document: # Upload document to xAI files API and reference it filename = item.identifier or f'document.{item.format}' file_id = await self._upload_file_to_xai(item.data, filename) content_items.append(file(file_id)) else: raise RuntimeError(f'Unsupported binary content type: {item.media_type}') elif isinstance(item, AudioUrl): raise NotImplementedError('AudioUrl is not supported by xAI SDK') elif isinstance(item, DocumentUrl): # Download and upload to xAI files API downloaded = await download_item(item, data_format='bytes') filename = item.identifier or 'document' # Add extension if data_type is available from download if 'data_type' in downloaded and downloaded['data_type']: filename = f'{filename}.{downloaded["data_type"]}' file_id = await self._upload_file_to_xai(downloaded['data'], filename) content_items.append(file(file_id)) elif isinstance(item, VideoUrl): raise NotImplementedError('VideoUrl is not supported by xAI SDK') elif isinstance(item, CachePoint): # xAI doesn't support prompt caching via CachePoint, so we filter it out pass else: assert_never(item) if content_items: return user(*content_items) return None async def _create_chat( self, messages: list[ModelMessage], model_settings: XaiModelSettings, model_request_parameters: ModelRequestParameters, ) -> Any: """Create an xAI chat instance with common setup for both request and stream. Returns: The xAI SDK chat object, ready to call .sample() or .stream() on. 
""" # Convert messages to xAI format xai_messages = await self._map_messages(messages, model_request_parameters) # Convert tools: combine built-in (server-side) tools and custom (client-side) tools tools: list[chat_types.chat_pb2.Tool] = [] if model_request_parameters.builtin_tools: tools.extend(_get_builtin_tools(model_request_parameters)) if model_request_parameters.tool_defs: tools.extend(_map_tools(model_request_parameters)) tools_param = tools if tools else None # Set tool_choice based on whether tools are available and text output is allowed profile = GrokModelProfile.from_profile(self.profile) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output and profile.grok_supports_tool_choice_required: tool_choice = 'required' else: tool_choice = 'auto' # Set response_format based on the output_mode response_format: chat_pb2.ResponseFormat | None = None if model_request_parameters.output_mode == 'native': output_object = model_request_parameters.output_object assert output_object is not None response_format = _map_json_schema(output_object) elif ( model_request_parameters.output_mode == 'prompted' and not tools and profile.supports_json_object_output ): # pragma: no branch response_format = _map_json_object() # Map model settings to xAI SDK parameters xai_settings = _map_model_settings(model_settings) # Populate use_encrypted_content and include based on model settings include: list[chat_pb2.IncludeOption] = [] use_encrypted_content = model_settings.get('xai_include_encrypted_content') or False if model_settings.get('xai_include_code_execution_output'): include.append(chat_pb2.IncludeOption.INCLUDE_OPTION_CODE_EXECUTION_CALL_OUTPUT) if model_settings.get('xai_include_web_search_output'): include.append(chat_pb2.IncludeOption.INCLUDE_OPTION_WEB_SEARCH_CALL_OUTPUT) if model_settings.get('xai_include_inline_citations'): include.append(chat_pb2.IncludeOption.INCLUDE_OPTION_INLINE_CITATIONS) # x_search not yet supported # collections_search not yet supported (could be mapped to file search) if model_settings.get('xai_include_mcp_output'): include.append(chat_pb2.IncludeOption.INCLUDE_OPTION_MCP_CALL_OUTPUT) # Create and return chat instance return self._provider.client.chat.create( model=self._model_name, messages=xai_messages, tools=tools_param, tool_choice=tool_choice, response_format=response_format, use_encrypted_content=use_encrypted_content, include=include, **xai_settings, ) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a request to the xAI model.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) chat = await self._create_chat(messages, cast(XaiModelSettings, model_settings or {}), model_request_parameters) response = await chat.sample() return self._process_response(response) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming request to the xAI model.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) chat = await self._create_chat(messages, cast(XaiModelSettings, model_settings or {}), 
model_request_parameters) response_stream = chat.stream() yield await self._process_streamed_response(response_stream, model_request_parameters) def _process_response(self, response: chat_types.Response) -> ModelResponse: """Convert xAI SDK response to pydantic_ai ModelResponse. Processes response.proto.outputs to extract (in order): - ThinkingPart: For reasoning/thinking content - TextPart: For text content - ToolCallPart: For client-side tool calls - BuiltinToolCallPart + BuiltinToolReturnPart: For server-side (builtin) tool calls """ parts: list[ModelResponsePart] = [] outputs = response.proto.outputs for output in outputs: message = output.message # Add reasoning/thinking content if present if message.reasoning_content or message.encrypted_content: signature = message.encrypted_content or None parts.append( ThinkingPart( content=message.reasoning_content or '', signature=signature, provider_name=self.system if signature else None, ) ) # Add text content from assistant messages if message.content and message.role == chat_types.chat_pb2.MessageRole.ROLE_ASSISTANT: part_provider_details: dict[str, Any] | None = None if output.logprobs and output.logprobs.content: part_provider_details = {'logprobs': _map_logprobs(output.logprobs)} parts.append(TextPart(content=message.content, provider_details=part_provider_details)) # Process tool calls in this output for tool_call in message.tool_calls: tool_result_content = _get_tool_result_content(message.content) _, part = _create_tool_call_part( tool_call, tool_result_content, self.system, message_role=message.role, ) parts.append(part) # Convert usage with detailed token information usage = _extract_usage(response, self._model_name, self._provider.name, self._provider.base_url) # Map finish reason. # # The xAI SDK exposes `response.finish_reason` as a *string* for the overall response, but in # multi-output responses (e.g. server-side tools) it can reflect an intermediate TOOL_CALLS # output rather than the final STOP output. We derive the finish reason from the final output # when available. if outputs: last_reason = outputs[-1].finish_reason finish_reason = _FINISH_REASON_PROTO_MAP.get(last_reason, 'stop') else: # pragma: no cover finish_reason = _FINISH_REASON_MAP.get(response.finish_reason, 'stop') return ModelResponse( parts=parts, usage=usage, model_name=self._model_name, timestamp=response.created, provider_name=self.system, provider_url=self._provider.base_url, provider_response_id=response.id, finish_reason=finish_reason, ) async def _process_streamed_response( self, response: AsyncIterator[tuple[chat_types.Response, Any]], model_request_parameters: ModelRequestParameters, ) -> 'XaiStreamedResponse': """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_item = await peekable_response.peek() if isinstance(first_item, _utils.Unset): raise UnexpectedModelBehavior('Streamed response ended without content or tool calls') first_response, _ = first_item return XaiStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self._model_name, _response=peekable_response, _timestamp=first_response.created, _provider=self._provider, ) ``` #### __init__ ```python __init__( model_name: XaiModelName, *, provider: ( Literal["xai"] | Provider[AsyncClient] ) = "xai", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize the xAI model. 
Parameters: | Name | Type | Description | Default | | ------------ | ------------------ | ---------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------- | | `model_name` | `XaiModelName` | The name of the xAI model to use (e.g., "grok-4-1-fast-non-reasoning") | *required* | | `provider` | \`Literal['xai'] | Provider[AsyncClient]\` | The provider to use for API calls. Defaults to 'xai'. | | `profile` | \`ModelProfileSpec | None\` | Optional model profile specification. Defaults to a profile picked by the provider based on the model name. | | `settings` | \`ModelSettings | None\` | Optional model settings. | Source code in `pydantic_ai_slim/pydantic_ai/models/xai.py` ```python def __init__( self, model_name: XaiModelName, *, provider: Literal['xai'] | Provider[AsyncClient] = 'xai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize the xAI model. Args: model_name: The name of the xAI model to use (e.g., "grok-4-1-fast-non-reasoning") provider: The provider to use for API calls. Defaults to `'xai'`. profile: Optional model profile specification. Defaults to a profile picked by the provider based on the model name. settings: Optional model settings. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile(model_name)) ``` #### model_name ```python model_name: str ``` The model name. #### system ```python system: str ``` The model provider. #### supported_builtin_tools ```python supported_builtin_tools() -> frozenset[type] ``` Return the set of builtin tool types this model can handle. Source code in `pydantic_ai_slim/pydantic_ai/models/xai.py` ```python @classmethod def supported_builtin_tools(cls) -> frozenset[type]: """Return the set of builtin tool types this model can handle.""" return frozenset({WebSearchTool, CodeExecutionTool, MCPServerTool}) ``` #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a request to the xAI model. Source code in `pydantic_ai_slim/pydantic_ai/models/xai.py` ```python async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a request to the xAI model.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) chat = await self._create_chat(messages, cast(XaiModelSettings, model_settings or {}), model_request_parameters) response = await chat.sample() return self._process_response(response) ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming request to the xAI model. 
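In practice `request` and `request_stream` are driven by an `Agent` run rather than called directly. A minimal, illustrative sketch of wiring `XaiModel` into an agent — the model name is the one used as an example above, the `xai_user` value is made up, and xAI credentials are assumed to be configured:

```python
from pydantic_ai import Agent
from pydantic_ai.models.xai import XaiModel, XaiModelSettings

model = XaiModel('grok-4-1-fast-non-reasoning')
# xai_user is an illustrative end-user identifier, see XaiModelSettings above.
agent = Agent(model, model_settings=XaiModelSettings(xai_user='user-123'))

result = agent.run_sync('What is the capital of France?')
print(result.output)
```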
Source code in `pydantic_ai_slim/pydantic_ai/models/xai.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming request to the xAI model.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) chat = await self._create_chat(messages, cast(XaiModelSettings, model_settings or {}), model_request_parameters) response_stream = chat.stream() yield await self._process_streamed_response(response_stream, model_request_parameters) ``` ### XaiStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for xAI SDK. Source code in `pydantic_ai_slim/pydantic_ai/models/xai.py` ```python @dataclass class XaiStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for xAI SDK.""" _model_name: str _response: _utils.PeekableAsyncStream[tuple[chat_types.Response, chat_types.Chunk]] _timestamp: datetime _provider: Provider[AsyncClient] @property def system(self) -> str: """The model provider system name.""" return self._provider.name @property def provider_url(self) -> str: """Get the provider base URL.""" return self._provider.base_url def _update_response_state(self, response: chat_types.Response) -> None: """Update response state including usage, response ID, and finish reason.""" # Update usage (SDK Response always provides a usage object) self._usage = _extract_usage(response, self._model_name, self._provider.name, self._provider.base_url) # Set provider response ID (only set once) if response.id and self.provider_response_id is None: self.provider_response_id = response.id # Handle finish reason (SDK Response always provides a finish_reason) self.finish_reason = _FINISH_REASON_MAP.get(response.finish_reason, 'stop') def _collect_reasoning_events( self, *, response: chat_types.Response, prev_reasoning_content: str, prev_encrypted_content: str, ) -> tuple[str, str, list[ModelResponseStreamEvent]]: """Collect thinking/reasoning events and return updated previous values. Note: xAI exposes reasoning via the accumulated Response object (not the per-chunk delta), so we compute deltas ourselves to avoid re-emitting the entire accumulated content on every chunk. """ events: list[ModelResponseStreamEvent] = [] if response.reasoning_content and response.reasoning_content != prev_reasoning_content: if response.reasoning_content.startswith(prev_reasoning_content): reasoning_delta = response.reasoning_content[len(prev_reasoning_content) :] else: reasoning_delta = response.reasoning_content prev_reasoning_content = response.reasoning_content if reasoning_delta: # pragma: no branch events.extend( self._parts_manager.handle_thinking_delta( vendor_part_id='reasoning', content=reasoning_delta, # Only set provider_name when we have an encrypted signature to send back. 
provider_name=self.system if response.encrypted_content else None, ) ) if response.encrypted_content and response.encrypted_content != prev_encrypted_content: prev_encrypted_content = response.encrypted_content events.extend( self._parts_manager.handle_thinking_delta( vendor_part_id='reasoning', signature=response.encrypted_content, provider_name=self.system, ) ) return prev_reasoning_content, prev_encrypted_content, events def _handle_server_side_tool_call( self, *, tool_call: chat_pb2.ToolCall, delta: chat_pb2.Delta, seen_tool_call_ids: set[str], seen_tool_return_ids: set[str], last_tool_return_content: dict[str, dict[str, Any] | str | None], ) -> Iterator[ModelResponseStreamEvent]: """Handle a single server-side tool call delta, yielding stream events.""" builtin_tool_name = _get_builtin_tool_name(tool_call) if delta.role == chat_pb2.MessageRole.ROLE_ASSISTANT: # Emit the call part once per tool_call_id. if tool_call.id in seen_tool_call_ids: return seen_tool_call_ids.add(tool_call.id) if builtin_tool_name.startswith(MCPServerTool.kind): parsed_args = _build_mcp_tool_call_args(tool_call) else: parsed_args = _parse_tool_args(tool_call.function.arguments) call_part = BuiltinToolCallPart( tool_name=builtin_tool_name, args=parsed_args, tool_call_id=tool_call.id, provider_name=self.system ) yield self._parts_manager.handle_part(vendor_part_id=tool_call.id, part=call_part) return if delta.role == chat_pb2.MessageRole.ROLE_TOOL: # Emit the return part once per tool_call_id. return_vendor_id = f'{tool_call.id}_return' tool_result_content = _get_tool_result_content(delta.content) if return_vendor_id in seen_tool_return_ids and tool_result_content == last_tool_return_content.get( return_vendor_id ): return seen_tool_return_ids.add(return_vendor_id) last_tool_return_content[return_vendor_id] = tool_result_content return_part = BuiltinToolReturnPart( tool_name=builtin_tool_name, content=tool_result_content, tool_call_id=tool_call.id, provider_name=self.system, ) yield self._parts_manager.handle_part(vendor_part_id=return_vendor_id, part=return_part) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: """Iterate over streaming events from xAI SDK.""" # Local state to avoid re-emmiting duplicate events. prev_reasoning_content = '' prev_encrypted_content = '' seen_tool_call_ids: set[str] = set() seen_tool_return_ids: set[str] = set() last_tool_return_content: dict[str, dict[str, Any] | str | None] = {} # Track previous tool call args to compute deltas (like we do for reasoning content). prev_tool_call_args: dict[str, str] = {} async for response, chunk in self._response: self._update_response_state(response) prev_reasoning_content, prev_encrypted_content, reasoning_events = self._collect_reasoning_events( response=response, prev_reasoning_content=prev_reasoning_content, prev_encrypted_content=prev_encrypted_content, ) for event in reasoning_events: yield event # Handle text content (property filters for ROLE_ASSISTANT) if chunk.content: for event in self._parts_manager.handle_text_delta( vendor_part_id='content', content=chunk.content, ): yield event # Handle tool calls/tool results from *this chunk*. # # Important: xAI SDK `Response` is an accumulated view; `response.tool_calls` includes tool calls from # previous chunks. Iterating over it would re-emit tool calls repeatedly. Instead, we read tool calls # from the chunk's deltas which represent what changed in this frame. 
for output_chunk in chunk.proto.outputs: delta = output_chunk.delta if not delta.tool_calls: continue for tool_call in delta.tool_calls: if not tool_call.function.name: continue if tool_call.type != chat_pb2.ToolCallType.TOOL_CALL_TYPE_CLIENT_SIDE_TOOL: for event in self._handle_server_side_tool_call( tool_call=tool_call, delta=delta, seen_tool_call_ids=seen_tool_call_ids, seen_tool_return_ids=seen_tool_return_ids, last_tool_return_content=last_tool_return_content, ): yield event else: # Client-side tools: emit args as deltas so UI adapters receive PartDeltaEvents # (not repeated PartStartEvents). Use accumulated args from response.tool_calls # and compute the delta like we do for reasoning content. accumulated = next((tc for tc in response.tool_calls if tc.id == tool_call.id), None) accumulated_args = ( accumulated.function.arguments if accumulated is not None and accumulated.function.arguments else tool_call.function.arguments ) prev_args = prev_tool_call_args.get(tool_call.id, '') is_new_tool_call = tool_call.id not in prev_tool_call_args args_changed = accumulated_args != prev_args if is_new_tool_call or args_changed: # Compute delta: if accumulated starts with prev, extract the new portion. if accumulated_args.startswith(prev_args): args_delta = accumulated_args[len(prev_args) :] or None else: args_delta = accumulated_args or None prev_tool_call_args[tool_call.id] = accumulated_args maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=tool_call.id, # Only pass tool_name on the first call; it would be appended otherwise. tool_name=tool_call.function.name if is_new_tool_call else None, args=args_delta, tool_call_id=tool_call.id, ) if maybe_event is not None: # pragma: no branch yield maybe_event @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """The model provider.""" return self.system @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### system ```python system: str ``` The model provider system name. #### provider_url ```python provider_url: str ``` Get the provider base URL. #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` The model provider. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_evals.dataset` Dataset management for pydantic evals. This module provides functionality for creating, loading, saving, and evaluating datasets of test cases. Each case must have inputs, and can optionally have a name, expected output, metadata, and case-specific evaluators. Datasets can be loaded from and saved to YAML or JSON files, and can be evaluated against a task function to produce an evaluation report. ### InputsT ```python InputsT = TypeVar('InputsT', default=Any) ``` Generic type for the inputs to the task being evaluated. ### OutputT ```python OutputT = TypeVar('OutputT', default=Any) ``` Generic type for the expected output of the task being evaluated. ### MetadataT ```python MetadataT = TypeVar('MetadataT', default=Any) ``` Generic type for the metadata associated with the task being evaluated. ### DEFAULT_DATASET_PATH ```python DEFAULT_DATASET_PATH = './test_cases.yaml' ``` Default path for saving/loading datasets. 
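For a quick sense of how datasets are saved and loaded (a minimal sketch using the `Case` and `Dataset` classes documented below; the path simply matches the value of `DEFAULT_DATASET_PATH`):

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(
    cases=[Case(name='simple', inputs={'x': 1}, expected_output=2)],
)

# Write the dataset as YAML, then read it back.
dataset.to_file('./test_cases.yaml')
reloaded = Dataset.from_file('./test_cases.yaml')
```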
### DEFAULT_SCHEMA_PATH_TEMPLATE ```python DEFAULT_SCHEMA_PATH_TEMPLATE = './{stem}_schema.json' ``` Default template for schema file paths, where {stem} is replaced with the dataset filename stem. ### Case Bases: `Generic[InputsT, OutputT, MetadataT]` A single row of a Dataset. Each case represents a single test scenario with inputs to test. A case may optionally specify a name, expected outputs to compare against, and arbitrary metadata. Cases can also have their own specific evaluators which are run in addition to dataset-level evaluators. Example: ```python from pydantic_evals import Case case = Case( name='Simple addition', inputs={'a': 1, 'b': 2}, expected_output=3, metadata={'description': 'Tests basic addition'}, ) ``` Source code in `pydantic_evals/pydantic_evals/dataset.py` ````python @dataclass(init=False) class Case(Generic[InputsT, OutputT, MetadataT]): """A single row of a [`Dataset`][pydantic_evals.Dataset]. Each case represents a single test scenario with inputs to test. A case may optionally specify a name, expected outputs to compare against, and arbitrary metadata. Cases can also have their own specific evaluators which are run in addition to dataset-level evaluators. Example: ```python from pydantic_evals import Case case = Case( name='Simple addition', inputs={'a': 1, 'b': 2}, expected_output=3, metadata={'description': 'Tests basic addition'}, ) ``` """ name: str | None """Name of the case. This is used to identify the case in the report and can be used to filter cases.""" inputs: InputsT """Inputs to the task. This is the input to the task that will be evaluated.""" metadata: MetadataT | None = None """Metadata to be used in the evaluation. This can be used to provide additional information about the case to the evaluators. """ expected_output: OutputT | None = None """Expected output of the task. This is the expected output of the task that will be evaluated.""" evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = field( default_factory=list[Evaluator[InputsT, OutputT, MetadataT]] ) """Evaluators to be used just on this case.""" def __init__( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ): """Initialize a new test case. Args: name: Optional name for the case. If not provided, a generic name will be assigned when added to a dataset. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: Optional expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators. """ # Note: `evaluators` must be a tuple instead of Sequence due to misbehavior with pyright's generic parameter # inference if it has type `Sequence` self.name = name self.inputs = inputs self.metadata = metadata self.expected_output = expected_output self.evaluators = list(evaluators) ```` #### __init__ ```python __init__( *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[ Evaluator[InputsT, OutputT, MetadataT], ... ] = () ) ``` Initialize a new test case. 
Parameters: | Name | Type | Description | Default | | ----------------- | ---------------------------------------------------- | ------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- | | `name` | \`str | None\` | Optional name for the case. If not provided, a generic name will be assigned when added to a dataset. | | `inputs` | `InputsT` | The inputs to the task being evaluated. | *required* | | `metadata` | \`MetadataT | None\` | Optional metadata for the case, which can be used by evaluators. | | `expected_output` | \`OutputT | None\` | Optional expected output of the task, used for comparison in evaluators. | | `evaluators` | `tuple[Evaluator[InputsT, OutputT, MetadataT], ...]` | Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def __init__( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ): """Initialize a new test case. Args: name: Optional name for the case. If not provided, a generic name will be assigned when added to a dataset. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: Optional expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators. """ # Note: `evaluators` must be a tuple instead of Sequence due to misbehavior with pyright's generic parameter # inference if it has type `Sequence` self.name = name self.inputs = inputs self.metadata = metadata self.expected_output = expected_output self.evaluators = list(evaluators) ``` #### name ```python name: str | None = name ``` Name of the case. This is used to identify the case in the report and can be used to filter cases. #### inputs ```python inputs: InputsT = inputs ``` Inputs to the task. This is the input to the task that will be evaluated. #### metadata ```python metadata: MetadataT | None = metadata ``` Metadata to be used in the evaluation. This can be used to provide additional information about the case to the evaluators. #### expected_output ```python expected_output: OutputT | None = expected_output ``` Expected output of the task. This is the expected output of the task that will be evaluated. #### evaluators ```python evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = ( list(evaluators) ) ``` Evaluators to be used just on this case. ### Dataset Bases: `BaseModel`, `Generic[InputsT, OutputT, MetadataT]` A dataset of test cases. Datasets allow you to organize a collection of test cases and evaluate them against a task function. They can be loaded from and saved to YAML or JSON files, and can have dataset-level evaluators that apply to all cases. 
Example: ```python # Create a dataset with two test cases from dataclasses import dataclass from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output dataset = Dataset( cases=[ Case(name='test1', inputs={'text': 'Hello'}, expected_output='HELLO'), Case(name='test2', inputs={'text': 'World'}, expected_output='WORLD'), ], evaluators=[ExactMatch()], ) # Evaluate the dataset against a task function async def uppercase(inputs: dict) -> str: return inputs['text'].upper() async def main(): report = await dataset.evaluate(uppercase) report.print() ''' Evaluation Summary: uppercase ┏━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ test1 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ test2 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └──────────┴────────────┴──────────┘ ''' ``` Source code in `pydantic_evals/pydantic_evals/dataset.py` ````python class Dataset(BaseModel, Generic[InputsT, OutputT, MetadataT], extra='forbid', arbitrary_types_allowed=True): """A dataset of test [cases][pydantic_evals.Case]. Datasets allow you to organize a collection of test cases and evaluate them against a task function. They can be loaded from and saved to YAML or JSON files, and can have dataset-level evaluators that apply to all cases. Example: ```python # Create a dataset with two test cases from dataclasses import dataclass from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output dataset = Dataset( cases=[ Case(name='test1', inputs={'text': 'Hello'}, expected_output='HELLO'), Case(name='test2', inputs={'text': 'World'}, expected_output='WORLD'), ], evaluators=[ExactMatch()], ) # Evaluate the dataset against a task function async def uppercase(inputs: dict) -> str: return inputs['text'].upper() async def main(): report = await dataset.evaluate(uppercase) report.print() ''' Evaluation Summary: uppercase ┏━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ test1 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ test2 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └──────────┴────────────┴──────────┘ ''' ``` """ name: str | None = None """Optional name of the dataset.""" cases: list[Case[InputsT, OutputT, MetadataT]] """List of test cases in the dataset.""" evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = [] """List of evaluators to be used on all cases in the dataset.""" def __init__( self, *, name: str | None = None, cases: Sequence[Case[InputsT, OutputT, MetadataT]], evaluators: Sequence[Evaluator[InputsT, OutputT, MetadataT]] = (), ): """Initialize a new dataset with test cases and optional evaluators. Args: name: Optional name for the dataset. cases: Sequence of test cases to include in the dataset. evaluators: Optional sequence of evaluators to apply to all cases in the dataset. 
""" case_names = set[str]() for case in cases: if case.name is None: continue if case.name in case_names: raise ValueError(f'Duplicate case name: {case.name!r}') case_names.add(case.name) super().__init__( name=name, cases=cases, evaluators=list(evaluators), ) # TODO in v2: Make everything not required keyword-only async def evaluate( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, *, task_name: str | None = None, metadata: dict[str, Any] | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. name: The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task_name will be used; if that is not specified, the name of the task function is used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to `True`. retry_task: Optional retry configuration for the task execution. retry_evaluators: Optional retry configuration for evaluator execution. task_name: Optional override to the name of the task being executed, otherwise the name of the task function will be used. metadata: Optional dict of experiment metadata. Returns: A report containing the results of the evaluation. 
""" task_name = task_name or get_unwrapped_function_name(task) name = name or task_name total_cases = len(self.cases) progress_bar = Progress() if progress else None limiter = anyio.Semaphore(max_concurrency) if max_concurrency is not None else AsyncExitStack() extra_attributes: dict[str, Any] = {'gen_ai.operation.name': 'experiment'} if metadata is not None: extra_attributes['metadata'] = metadata with ( logfire_span( 'evaluate {name}', name=name, task_name=task_name, dataset_name=self.name, n_cases=len(self.cases), **extra_attributes, ) as eval_span, progress_bar or nullcontext(), ): task_id = progress_bar.add_task(f'Evaluating {task_name}', total=total_cases) if progress_bar else None async def _handle_case(case: Case[InputsT, OutputT, MetadataT], report_case_name: str): async with limiter: result = await _run_task_and_evaluators( task, case, report_case_name, self.evaluators, retry_task, retry_evaluators ) if progress_bar and task_id is not None: # pragma: no branch progress_bar.update(task_id, advance=1) return result if (context := eval_span.context) is None: # pragma: no cover trace_id = None span_id = None else: trace_id = f'{context.trace_id:032x}' span_id = f'{context.span_id:016x}' cases_and_failures = await task_group_gather( [ lambda case=case, i=i: _handle_case(case, case.name or f'Case {i}') for i, case in enumerate(self.cases, 1) ] ) cases: list[ReportCase] = [] failures: list[ReportCaseFailure] = [] for item in cases_and_failures: if isinstance(item, ReportCase): cases.append(item) else: failures.append(item) report = EvaluationReport( name=name, cases=cases, failures=failures, experiment_metadata=metadata, span_id=span_id, trace_id=trace_id, ) full_experiment_metadata: dict[str, Any] = {'n_cases': len(self.cases)} if metadata is not None: full_experiment_metadata['metadata'] = metadata if (averages := report.averages()) is not None: full_experiment_metadata['averages'] = averages if averages.assertions is not None: eval_span.set_attribute('assertion_pass_rate', averages.assertions) eval_span.set_attribute('logfire.experiment.metadata', full_experiment_metadata) return report def evaluate_sync( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, *, task_name: str | None = None, metadata: dict[str, Any] | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This is a synchronous wrapper around [`evaluate`][pydantic_evals.dataset.Dataset.evaluate] provided for convenience. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. name: The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task_name will be used; if that is not specified, the name of the task function is used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to `True`. retry_task: Optional retry configuration for the task execution. retry_evaluators: Optional retry configuration for evaluator execution. task_name: Optional override to the name of the task being executed, otherwise the name of the task function will be used. metadata: Optional dict of experiment metadata. 
Returns: A report containing the results of the evaluation. """ return get_event_loop().run_until_complete( self.evaluate( task, name=name, max_concurrency=max_concurrency, progress=progress, retry_task=retry_task, retry_evaluators=retry_evaluators, task_name=task_name, metadata=metadata, ) ) def add_case( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ) -> None: """Adds a case to the dataset. This is a convenience method for creating a [`Case`][pydantic_evals.Case] and adding it to the dataset. Args: name: Optional name for the case. If not provided, a generic name will be assigned. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: The expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case, in addition to dataset-level evaluators. """ if name in {case.name for case in self.cases}: raise ValueError(f'Duplicate case name: {name!r}') case = Case[InputsT, OutputT, MetadataT]( name=name, inputs=inputs, metadata=metadata, expected_output=expected_output, evaluators=evaluators, ) self.cases.append(case) def add_evaluator( self, evaluator: Evaluator[InputsT, OutputT, MetadataT], specific_case: str | None = None, ) -> None: """Adds an evaluator to the dataset or a specific case. Args: evaluator: The evaluator to add. specific_case: If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset. Raises: ValueError: If `specific_case` is provided but no case with that name exists in the dataset. """ if specific_case is None: self.evaluators.append(evaluator) else: # If this is too slow, we could try to add a case lookup dict. # Note that if we do that, we'd need to make the cases list private to prevent modification. added = False for case in self.cases: if case.name == specific_case: case.evaluators.append(evaluator) added = True if not added: raise ValueError(f'Case {specific_case!r} not found in the dataset') @classmethod @functools.cache def _params(cls) -> tuple[type[InputsT], type[OutputT], type[MetadataT]]: """Get the type parameters for the Dataset class. Returns: A tuple of (InputsT, OutputT, MetadataT) types. """ for c in cls.__mro__: metadata = getattr(c, '__pydantic_generic_metadata__', {}) if len(args := (metadata.get('args', ()) or getattr(c, '__args__', ()))) == 3: # pragma: no branch return args else: # pragma: no cover warnings.warn( f'Could not determine the generic parameters for {cls}; using `Any` for each.' f' You should explicitly set the generic parameters via `Dataset[MyInputs, MyOutput, MyMetadata]`' f' when serializing or deserializing.', UserWarning, ) return Any, Any, Any # type: ignore @classmethod def from_file( cls, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Load a dataset from a file. Args: path: Path to the file to load. fmt: Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. Returns: A new Dataset instance loaded from the file. 
Raises: ValidationError: If the file cannot be parsed as a valid dataset. ValueError: If the format cannot be inferred from the file extension. """ path = Path(path) fmt = cls._infer_fmt(path, fmt) raw = Path(path).read_text(encoding='utf-8') try: return cls.from_text(raw, fmt=fmt, custom_evaluator_types=custom_evaluator_types, default_name=path.stem) except ValidationError as e: # pragma: no cover raise ValueError(f'{path} contains data that does not match the schema for {cls.__name__}:\n{e}.') from e @classmethod def from_text( cls, contents: str, fmt: Literal['yaml', 'json'] = 'yaml', custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), *, default_name: str | None = None, ) -> Self: """Load a dataset from a string. Args: contents: The string content to parse. fmt: Format of the content. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. default_name: Default name of the dataset, to be used if not specified in the serialized contents. Returns: A new Dataset instance parsed from the string. Raises: ValidationError: If the content cannot be parsed as a valid dataset. """ if fmt == 'yaml': loaded = yaml.safe_load(contents) return cls.from_dict(loaded, custom_evaluator_types, default_name=default_name) else: dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate_json(contents) return cls._from_dataset_model(dataset_model, custom_evaluator_types, default_name) @classmethod def from_dict( cls, data: dict[str, Any], custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), *, default_name: str | None = None, ) -> Self: """Load a dataset from a dictionary. Args: data: Dictionary representation of the dataset. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. default_name: Default name of the dataset, to be used if not specified in the data. Returns: A new Dataset instance created from the dictionary. Raises: ValidationError: If the dictionary cannot be converted to a valid dataset. """ dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate(data) return cls._from_dataset_model(dataset_model, custom_evaluator_types, default_name) @classmethod def _from_dataset_model( cls, dataset_model: _DatasetModel[InputsT, OutputT, MetadataT], custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), default_name: str | None = None, ) -> Self: """Create a Dataset from a _DatasetModel. Args: dataset_model: The _DatasetModel to convert. custom_evaluator_types: Custom evaluator classes to register for deserialization. default_name: Default name of the dataset, to be used if the value is `None` in the provided model. Returns: A new Dataset instance created from the _DatasetModel. 
""" registry = _get_registry(custom_evaluator_types) cases: list[Case[InputsT, OutputT, MetadataT]] = [] errors: list[ValueError] = [] dataset_evaluators: list[Evaluator] = [] for spec in dataset_model.evaluators: try: dataset_evaluator = _load_evaluator_from_registry(registry, None, spec) except ValueError as e: errors.append(e) continue dataset_evaluators.append(dataset_evaluator) for row in dataset_model.cases: evaluators: list[Evaluator] = [] for spec in row.evaluators: try: evaluator = _load_evaluator_from_registry(registry, row.name, spec) except ValueError as e: errors.append(e) continue evaluators.append(evaluator) row = Case[InputsT, OutputT, MetadataT]( name=row.name, inputs=row.inputs, metadata=row.metadata, expected_output=row.expected_output, ) row.evaluators = evaluators cases.append(row) if errors: raise ExceptionGroup(f'{len(errors)} error(s) loading evaluators from registry', errors[:3]) result = cls(name=dataset_model.name, cases=cases) if result.name is None: result.name = default_name result.evaluators = dataset_evaluators return result def to_file( self, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, schema_path: Path | str | None = DEFAULT_SCHEMA_PATH_TEMPLATE, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ): """Save the dataset to a file. Args: path: Path to save the dataset to. fmt: Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. schema_path: Path to save the JSON schema to. If None, no schema will be saved. Can be a string template with {stem} which will be replaced with the dataset filename stem. custom_evaluator_types: Custom evaluator classes to include in the schema. """ path = Path(path) fmt = self._infer_fmt(path, fmt) schema_ref: str | None = None if schema_path is not None: # pragma: no branch if isinstance(schema_path, str): # pragma: no branch schema_path = Path(schema_path.format(stem=path.stem)) if not schema_path.is_absolute(): schema_ref = str(schema_path) schema_path = path.parent / schema_path elif schema_path.is_relative_to(path): # pragma: no cover schema_ref = str(_get_relative_path_reference(schema_path, path)) else: # pragma: no cover schema_ref = str(schema_path) self._save_schema(schema_path, custom_evaluator_types) context: dict[str, Any] = {'use_short_form': True} if fmt == 'yaml': dumped_data = self.model_dump(mode='json', by_alias=True, context=context) content = yaml.dump(dumped_data, sort_keys=False) if schema_ref: # pragma: no branch yaml_language_server_line = f'{_YAML_SCHEMA_LINE_PREFIX}{schema_ref}' content = f'{yaml_language_server_line}\n{content}' path.write_text(content, encoding='utf-8') else: context['$schema'] = schema_ref json_data = self.model_dump_json(indent=2, by_alias=True, context=context) path.write_text(json_data + '\n', encoding='utf-8') @classmethod def model_json_schema_with_evaluators( cls, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> dict[str, Any]: """Generate a JSON schema for this dataset type, including evaluator details. This is useful for generating a schema that can be used to validate YAML-format dataset files. Args: custom_evaluator_types: Custom evaluator classes to include in the schema. Returns: A dictionary representing the JSON schema. 
""" # Note: this function could maybe be simplified now that Evaluators are always dataclasses registry = _get_registry(custom_evaluator_types) evaluator_schema_types: list[Any] = [] for name, evaluator_class in registry.items(): type_hints = _typing_extra.get_function_type_hints(evaluator_class) type_hints.pop('return', None) required_type_hints: dict[str, Any] = {} for p in inspect.signature(evaluator_class).parameters.values(): type_hints.setdefault(p.name, Any) if p.default is not p.empty: type_hints[p.name] = NotRequired[type_hints[p.name]] else: required_type_hints[p.name] = type_hints[p.name] def _make_typed_dict(cls_name_prefix: str, fields: dict[str, Any]) -> Any: td = TypedDict(f'{cls_name_prefix}_{name}', fields) # pyright: ignore[reportArgumentType] config = ConfigDict(extra='forbid', arbitrary_types_allowed=True) # TODO: Replace with pydantic.with_config once pydantic 2.11 is the min supported version td.__pydantic_config__ = config # pyright: ignore[reportAttributeAccessIssue] return td # Shortest form: just the call name if len(type_hints) == 0 or not required_type_hints: evaluator_schema_types.append(Literal[name]) # Short form: can be called with only one parameter if len(type_hints) == 1: [type_hint_type] = type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) elif len(required_type_hints) == 1: # pragma: no branch [type_hint_type] = required_type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) # Long form: multiple parameters, possibly required if len(type_hints) > 1: params_td = _make_typed_dict('evaluator_params', type_hints) evaluator_schema_types.append(_make_typed_dict('evaluator', {name: params_td})) in_type, out_type, meta_type = cls._params() # Note: we shadow the `Case` and `Dataset` class names here to generate a clean JSON schema class Case(BaseModel, extra='forbid'): # pyright: ignore[reportUnusedClass] # this _is_ used below, but pyright doesn't seem to notice.. name: str | None = None inputs: in_type # pyright: ignore[reportInvalidTypeForm] metadata: meta_type | None = None # pyright: ignore[reportInvalidTypeForm] expected_output: out_type | None = None # pyright: ignore[reportInvalidTypeForm] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa: UP007 class Dataset(BaseModel, extra='forbid'): name: str | None = None cases: list[Case] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa: UP007 json_schema = Dataset.model_json_schema() # See `_add_json_schema` below, since `$schema` is added to the JSON, it has to be supported in the JSON json_schema['properties']['$schema'] = {'type': 'string'} return json_schema @classmethod def _save_schema( cls, path: Path | str, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = () ): """Save the JSON schema for this dataset type to a file. Args: path: Path to save the schema to. custom_evaluator_types: Custom evaluator classes to include in the schema. 
""" path = Path(path) json_schema = cls.model_json_schema_with_evaluators(custom_evaluator_types) schema_content = to_json(json_schema, indent=2).decode() + '\n' if not path.exists() or path.read_text(encoding='utf-8') != schema_content: # pragma: no branch path.write_text(schema_content, encoding='utf-8') @classmethod @functools.cache def _serialization_type(cls) -> type[_DatasetModel[InputsT, OutputT, MetadataT]]: """Get the serialization type for this dataset class. Returns: A _DatasetModel type with the same generic parameters as this Dataset class. """ input_type, output_type, metadata_type = cls._params() return _DatasetModel[input_type, output_type, metadata_type] @classmethod def _infer_fmt(cls, path: Path, fmt: Literal['yaml', 'json'] | None) -> Literal['yaml', 'json']: """Infer the format to use for a file based on its extension. Args: path: The path to infer the format for. fmt: The explicitly provided format, if any. Returns: The inferred format ('yaml' or 'json'). Raises: ValueError: If the format cannot be inferred from the file extension. """ if fmt is not None: return fmt suffix = path.suffix.lower() if suffix in {'.yaml', '.yml'}: return 'yaml' elif suffix == '.json': return 'json' raise ValueError( f'Could not infer format for filename {path.name!r}. Use the `fmt` argument to specify the format.' ) @model_serializer(mode='wrap') def _add_json_schema(self, nxt: SerializerFunctionWrapHandler, info: SerializationInfo) -> dict[str, Any]: """Add the JSON schema path to the serialized output. See for context, that seems to be the nearest there is to a spec for this. """ context = cast(dict[str, Any] | None, info.context) if isinstance(context, dict) and (schema := context.get('$schema')): return {'$schema': schema} | nxt(self) else: return nxt(self) ```` #### name ```python name: str | None = None ``` Optional name of the dataset. #### cases ```python cases: list[Case[InputsT, OutputT, MetadataT]] ``` List of test cases in the dataset. #### evaluators ```python evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = ( [] ) ``` List of evaluators to be used on all cases in the dataset. #### __init__ ```python __init__( *, name: str | None = None, cases: Sequence[Case[InputsT, OutputT, MetadataT]], evaluators: Sequence[ Evaluator[InputsT, OutputT, MetadataT] ] = () ) ``` Initialize a new dataset with test cases and optional evaluators. Parameters: | Name | Type | Description | Default | | ------------ | -------------------------------------------------- | --------------------------------------------------------------------- | ------------------------------ | | `name` | \`str | None\` | Optional name for the dataset. | | `cases` | `Sequence[Case[InputsT, OutputT, MetadataT]]` | Sequence of test cases to include in the dataset. | *required* | | `evaluators` | `Sequence[Evaluator[InputsT, OutputT, MetadataT]]` | Optional sequence of evaluators to apply to all cases in the dataset. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def __init__( self, *, name: str | None = None, cases: Sequence[Case[InputsT, OutputT, MetadataT]], evaluators: Sequence[Evaluator[InputsT, OutputT, MetadataT]] = (), ): """Initialize a new dataset with test cases and optional evaluators. Args: name: Optional name for the dataset. cases: Sequence of test cases to include in the dataset. evaluators: Optional sequence of evaluators to apply to all cases in the dataset. 
""" case_names = set[str]() for case in cases: if case.name is None: continue if case.name in case_names: raise ValueError(f'Duplicate case name: {case.name!r}') case_names.add(case.name) super().__init__( name=name, cases=cases, evaluators=list(evaluators), ) ``` #### evaluate ```python evaluate( task: ( Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT] ), name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, *, task_name: str | None = None, metadata: dict[str, Any] | None = None ) -> EvaluationReport[InputsT, OutputT, MetadataT] ``` Evaluates the test cases in the dataset using the given task. This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified. Parameters: | Name | Type | Description | Default | | ------------------ | ------------------------------------------- | -------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `task` | \`Callable\[[InputsT], Awaitable[OutputT]\] | Callable\[[InputsT], OutputT\]\` | The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. | | `name` | \`str | None\` | The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task_name will be used; if that is not specified, the name of the task function is used. | | `max_concurrency` | \`int | None\` | The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. | | `progress` | `bool` | Whether to show a progress bar for the evaluation. Defaults to True. | `True` | | `retry_task` | \`RetryConfig | None\` | Optional retry configuration for the task execution. | | `retry_evaluators` | \`RetryConfig | None\` | Optional retry configuration for evaluator execution. | | `task_name` | \`str | None\` | Optional override to the name of the task being executed, otherwise the name of the task function will be used. | | `metadata` | \`dict[str, Any] | None\` | Optional dict of experiment metadata. | Returns: | Type | Description | | ----------------------------------------------- | -------------------------------------------------- | | `EvaluationReport[InputsT, OutputT, MetadataT]` | A report containing the results of the evaluation. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python async def evaluate( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, *, task_name: str | None = None, metadata: dict[str, Any] | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. 
name: The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task_name will be used; if that is not specified, the name of the task function is used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to `True`. retry_task: Optional retry configuration for the task execution. retry_evaluators: Optional retry configuration for evaluator execution. task_name: Optional override to the name of the task being executed, otherwise the name of the task function will be used. metadata: Optional dict of experiment metadata. Returns: A report containing the results of the evaluation. """ task_name = task_name or get_unwrapped_function_name(task) name = name or task_name total_cases = len(self.cases) progress_bar = Progress() if progress else None limiter = anyio.Semaphore(max_concurrency) if max_concurrency is not None else AsyncExitStack() extra_attributes: dict[str, Any] = {'gen_ai.operation.name': 'experiment'} if metadata is not None: extra_attributes['metadata'] = metadata with ( logfire_span( 'evaluate {name}', name=name, task_name=task_name, dataset_name=self.name, n_cases=len(self.cases), **extra_attributes, ) as eval_span, progress_bar or nullcontext(), ): task_id = progress_bar.add_task(f'Evaluating {task_name}', total=total_cases) if progress_bar else None async def _handle_case(case: Case[InputsT, OutputT, MetadataT], report_case_name: str): async with limiter: result = await _run_task_and_evaluators( task, case, report_case_name, self.evaluators, retry_task, retry_evaluators ) if progress_bar and task_id is not None: # pragma: no branch progress_bar.update(task_id, advance=1) return result if (context := eval_span.context) is None: # pragma: no cover trace_id = None span_id = None else: trace_id = f'{context.trace_id:032x}' span_id = f'{context.span_id:016x}' cases_and_failures = await task_group_gather( [ lambda case=case, i=i: _handle_case(case, case.name or f'Case {i}') for i, case in enumerate(self.cases, 1) ] ) cases: list[ReportCase] = [] failures: list[ReportCaseFailure] = [] for item in cases_and_failures: if isinstance(item, ReportCase): cases.append(item) else: failures.append(item) report = EvaluationReport( name=name, cases=cases, failures=failures, experiment_metadata=metadata, span_id=span_id, trace_id=trace_id, ) full_experiment_metadata: dict[str, Any] = {'n_cases': len(self.cases)} if metadata is not None: full_experiment_metadata['metadata'] = metadata if (averages := report.averages()) is not None: full_experiment_metadata['averages'] = averages if averages.assertions is not None: eval_span.set_attribute('assertion_pass_rate', averages.assertions) eval_span.set_attribute('logfire.experiment.metadata', full_experiment_metadata) return report ``` #### evaluate_sync ```python evaluate_sync( task: ( Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT] ), name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, *, task_name: str | None = None, metadata: dict[str, Any] | None = None ) -> EvaluationReport[InputsT, OutputT, MetadataT] ``` Evaluates the test cases in the dataset using the given task. This is a synchronous wrapper around evaluate provided for convenience. 
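For example, a minimal sketch of running an evaluation from synchronous code (the `shout` task function here is purely illustrative):

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(
    cases=[Case(name='greeting', inputs='hello', expected_output='HELLO')],
)


def shout(text: str) -> str:
    """Illustrative task function; synchronous callables are accepted."""
    return text.upper()


report = dataset.evaluate_sync(shout, max_concurrency=2)
report.print()
```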
Parameters: | Name | Type | Description | Default | | ------------------ | ------------------------------------------- | -------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `task` | \`Callable\[[InputsT], Awaitable[OutputT]\] | Callable\[[InputsT], OutputT\]\` | The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. | | `name` | \`str | None\` | The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task_name will be used; if that is not specified, the name of the task function is used. | | `max_concurrency` | \`int | None\` | The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. | | `progress` | `bool` | Whether to show a progress bar for the evaluation. Defaults to True. | `True` | | `retry_task` | \`RetryConfig | None\` | Optional retry configuration for the task execution. | | `retry_evaluators` | \`RetryConfig | None\` | Optional retry configuration for evaluator execution. | | `task_name` | \`str | None\` | Optional override to the name of the task being executed, otherwise the name of the task function will be used. | | `metadata` | \`dict[str, Any] | None\` | Optional dict of experiment metadata. | Returns: | Type | Description | | ----------------------------------------------- | -------------------------------------------------- | | `EvaluationReport[InputsT, OutputT, MetadataT]` | A report containing the results of the evaluation. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def evaluate_sync( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, *, task_name: str | None = None, metadata: dict[str, Any] | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This is a synchronous wrapper around [`evaluate`][pydantic_evals.dataset.Dataset.evaluate] provided for convenience. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. name: The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task_name will be used; if that is not specified, the name of the task function is used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to `True`. retry_task: Optional retry configuration for the task execution. retry_evaluators: Optional retry configuration for evaluator execution. task_name: Optional override to the name of the task being executed, otherwise the name of the task function will be used. metadata: Optional dict of experiment metadata. Returns: A report containing the results of the evaluation. 
""" return get_event_loop().run_until_complete( self.evaluate( task, name=name, max_concurrency=max_concurrency, progress=progress, retry_task=retry_task, retry_evaluators=retry_evaluators, task_name=task_name, metadata=metadata, ) ) ``` #### add_case ```python add_case( *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[ Evaluator[InputsT, OutputT, MetadataT], ... ] = () ) -> None ``` Adds a case to the dataset. This is a convenience method for creating a Case and adding it to the dataset. Parameters: | Name | Type | Description | Default | | ----------------- | ---------------------------------------------------- | ----------------------------------------------------------------------------------- | ----------------------------------------------------------------------------- | | `name` | \`str | None\` | Optional name for the case. If not provided, a generic name will be assigned. | | `inputs` | `InputsT` | The inputs to the task being evaluated. | *required* | | `metadata` | \`MetadataT | None\` | Optional metadata for the case, which can be used by evaluators. | | `expected_output` | \`OutputT | None\` | The expected output of the task, used for comparison in evaluators. | | `evaluators` | `tuple[Evaluator[InputsT, OutputT, MetadataT], ...]` | Tuple of evaluators specific to this case, in addition to dataset-level evaluators. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def add_case( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ) -> None: """Adds a case to the dataset. This is a convenience method for creating a [`Case`][pydantic_evals.Case] and adding it to the dataset. Args: name: Optional name for the case. If not provided, a generic name will be assigned. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: The expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case, in addition to dataset-level evaluators. """ if name in {case.name for case in self.cases}: raise ValueError(f'Duplicate case name: {name!r}') case = Case[InputsT, OutputT, MetadataT]( name=name, inputs=inputs, metadata=metadata, expected_output=expected_output, evaluators=evaluators, ) self.cases.append(case) ``` #### add_evaluator ```python add_evaluator( evaluator: Evaluator[InputsT, OutputT, MetadataT], specific_case: str | None = None, ) -> None ``` Adds an evaluator to the dataset or a specific case. Parameters: | Name | Type | Description | Default | | --------------- | ---------------------------------------- | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- | | `evaluator` | `Evaluator[InputsT, OutputT, MetadataT]` | The evaluator to add. | *required* | | `specific_case` | \`str | None\` | If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset. | Raises: | Type | Description | | ------------ | ------------------------------------------------------------------------------ | | `ValueError` | If specific_case is provided but no case with that name exists in the dataset. 
| Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def add_evaluator( self, evaluator: Evaluator[InputsT, OutputT, MetadataT], specific_case: str | None = None, ) -> None: """Adds an evaluator to the dataset or a specific case. Args: evaluator: The evaluator to add. specific_case: If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset. Raises: ValueError: If `specific_case` is provided but no case with that name exists in the dataset. """ if specific_case is None: self.evaluators.append(evaluator) else: # If this is too slow, we could try to add a case lookup dict. # Note that if we do that, we'd need to make the cases list private to prevent modification. added = False for case in self.cases: if case.name == specific_case: case.evaluators.append(evaluator) added = True if not added: raise ValueError(f'Case {specific_case!r} not found in the dataset') ``` #### from_file ```python from_file( path: Path | str, fmt: Literal["yaml", "json"] | None = None, custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) -> Self ``` Load a dataset from a file. Parameters: | Name | Type | Description | Default | | ------------------------ | -------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------ | | `path` | \`Path | str\` | Path to the file to load. | | `fmt` | \`Literal['yaml', 'json'] | None\` | Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. | `()` | Returns: | Type | Description | | ------ | -------------------------------------------- | | `Self` | A new Dataset instance loaded from the file. | Raises: | Type | Description | | ----------------- | --------------------------------------------------------- | | `ValidationError` | If the file cannot be parsed as a valid dataset. | | `ValueError` | If the format cannot be inferred from the file extension. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def from_file( cls, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Load a dataset from a file. Args: path: Path to the file to load. fmt: Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. Returns: A new Dataset instance loaded from the file. Raises: ValidationError: If the file cannot be parsed as a valid dataset. ValueError: If the format cannot be inferred from the file extension. 
""" path = Path(path) fmt = cls._infer_fmt(path, fmt) raw = Path(path).read_text(encoding='utf-8') try: return cls.from_text(raw, fmt=fmt, custom_evaluator_types=custom_evaluator_types, default_name=path.stem) except ValidationError as e: # pragma: no cover raise ValueError(f'{path} contains data that does not match the schema for {cls.__name__}:\n{e}.') from e ``` #### from_text ```python from_text( contents: str, fmt: Literal["yaml", "json"] = "yaml", custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), *, default_name: str | None = None ) -> Self ``` Load a dataset from a string. Parameters: | Name | Type | Description | Default | | ------------------------ | -------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------ | | `contents` | `str` | The string content to parse. | *required* | | `fmt` | `Literal['yaml', 'json']` | Format of the content. Must be either 'yaml' or 'json'. | `'yaml'` | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. | `()` | | `default_name` | \`str | None\` | Default name of the dataset, to be used if not specified in the serialized contents. | Returns: | Type | Description | | ------ | ---------------------------------------------- | | `Self` | A new Dataset instance parsed from the string. | Raises: | Type | Description | | ----------------- | --------------------------------------------------- | | `ValidationError` | If the content cannot be parsed as a valid dataset. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def from_text( cls, contents: str, fmt: Literal['yaml', 'json'] = 'yaml', custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), *, default_name: str | None = None, ) -> Self: """Load a dataset from a string. Args: contents: The string content to parse. fmt: Format of the content. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. default_name: Default name of the dataset, to be used if not specified in the serialized contents. Returns: A new Dataset instance parsed from the string. Raises: ValidationError: If the content cannot be parsed as a valid dataset. """ if fmt == 'yaml': loaded = yaml.safe_load(contents) return cls.from_dict(loaded, custom_evaluator_types, default_name=default_name) else: dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate_json(contents) return cls._from_dataset_model(dataset_model, custom_evaluator_types, default_name) ``` #### from_dict ```python from_dict( data: dict[str, Any], custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), *, default_name: str | None = None ) -> Self ``` Load a dataset from a dictionary. 
Parameters: | Name | Type | Description | Default | | ------------------------ | -------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------- | | `data` | `dict[str, Any]` | Dictionary representation of the dataset. | *required* | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. | `()` | | `default_name` | \`str | None\` | Default name of the dataset, to be used if not specified in the data. | Returns: | Type | Description | | ------ | --------------------------------------------------- | | `Self` | A new Dataset instance created from the dictionary. | Raises: | Type | Description | | ----------------- | --------------------------------------------------------- | | `ValidationError` | If the dictionary cannot be converted to a valid dataset. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def from_dict( cls, data: dict[str, Any], custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), *, default_name: str | None = None, ) -> Self: """Load a dataset from a dictionary. Args: data: Dictionary representation of the dataset. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. default_name: Default name of the dataset, to be used if not specified in the data. Returns: A new Dataset instance created from the dictionary. Raises: ValidationError: If the dictionary cannot be converted to a valid dataset. """ dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate(data) return cls._from_dataset_model(dataset_model, custom_evaluator_types, default_name) ``` #### to_file ```python to_file( path: Path | str, fmt: Literal["yaml", "json"] | None = None, schema_path: ( Path | str | None ) = DEFAULT_SCHEMA_PATH_TEMPLATE, custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) ``` Save the dataset to a file. Parameters: | Name | Type | Description | Default | | ------------------------ | -------------------------------------------------------- | -------------------------------------------------- | ------------------------------------------------------------------------------------------------------------- | | `path` | \`Path | str\` | Path to save the dataset to. | | `fmt` | \`Literal['yaml', 'json'] | None\` | Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. | | `schema_path` | \`Path | str | None\` | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to include in the schema. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def to_file( self, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, schema_path: Path | str | None = DEFAULT_SCHEMA_PATH_TEMPLATE, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ): """Save the dataset to a file. Args: path: Path to save the dataset to. fmt: Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. schema_path: Path to save the JSON schema to. 
If None, no schema will be saved. Can be a string template with {stem} which will be replaced with the dataset filename stem. custom_evaluator_types: Custom evaluator classes to include in the schema. """ path = Path(path) fmt = self._infer_fmt(path, fmt) schema_ref: str | None = None if schema_path is not None: # pragma: no branch if isinstance(schema_path, str): # pragma: no branch schema_path = Path(schema_path.format(stem=path.stem)) if not schema_path.is_absolute(): schema_ref = str(schema_path) schema_path = path.parent / schema_path elif schema_path.is_relative_to(path): # pragma: no cover schema_ref = str(_get_relative_path_reference(schema_path, path)) else: # pragma: no cover schema_ref = str(schema_path) self._save_schema(schema_path, custom_evaluator_types) context: dict[str, Any] = {'use_short_form': True} if fmt == 'yaml': dumped_data = self.model_dump(mode='json', by_alias=True, context=context) content = yaml.dump(dumped_data, sort_keys=False) if schema_ref: # pragma: no branch yaml_language_server_line = f'{_YAML_SCHEMA_LINE_PREFIX}{schema_ref}' content = f'{yaml_language_server_line}\n{content}' path.write_text(content, encoding='utf-8') else: context['$schema'] = schema_ref json_data = self.model_dump_json(indent=2, by_alias=True, context=context) path.write_text(json_data + '\n', encoding='utf-8') ``` #### model_json_schema_with_evaluators ```python model_json_schema_with_evaluators( custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) -> dict[str, Any] ``` Generate a JSON schema for this dataset type, including evaluator details. This is useful for generating a schema that can be used to validate YAML-format dataset files. Parameters: | Name | Type | Description | Default | | ------------------------ | -------------------------------------------------------- | -------------------------------------------------- | ------- | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to include in the schema. | `()` | Returns: | Type | Description | | ---------------- | ------------------------------------------ | | `dict[str, Any]` | A dictionary representing the JSON schema. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def model_json_schema_with_evaluators( cls, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> dict[str, Any]: """Generate a JSON schema for this dataset type, including evaluator details. This is useful for generating a schema that can be used to validate YAML-format dataset files. Args: custom_evaluator_types: Custom evaluator classes to include in the schema. Returns: A dictionary representing the JSON schema. 
""" # Note: this function could maybe be simplified now that Evaluators are always dataclasses registry = _get_registry(custom_evaluator_types) evaluator_schema_types: list[Any] = [] for name, evaluator_class in registry.items(): type_hints = _typing_extra.get_function_type_hints(evaluator_class) type_hints.pop('return', None) required_type_hints: dict[str, Any] = {} for p in inspect.signature(evaluator_class).parameters.values(): type_hints.setdefault(p.name, Any) if p.default is not p.empty: type_hints[p.name] = NotRequired[type_hints[p.name]] else: required_type_hints[p.name] = type_hints[p.name] def _make_typed_dict(cls_name_prefix: str, fields: dict[str, Any]) -> Any: td = TypedDict(f'{cls_name_prefix}_{name}', fields) # pyright: ignore[reportArgumentType] config = ConfigDict(extra='forbid', arbitrary_types_allowed=True) # TODO: Replace with pydantic.with_config once pydantic 2.11 is the min supported version td.__pydantic_config__ = config # pyright: ignore[reportAttributeAccessIssue] return td # Shortest form: just the call name if len(type_hints) == 0 or not required_type_hints: evaluator_schema_types.append(Literal[name]) # Short form: can be called with only one parameter if len(type_hints) == 1: [type_hint_type] = type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) elif len(required_type_hints) == 1: # pragma: no branch [type_hint_type] = required_type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) # Long form: multiple parameters, possibly required if len(type_hints) > 1: params_td = _make_typed_dict('evaluator_params', type_hints) evaluator_schema_types.append(_make_typed_dict('evaluator', {name: params_td})) in_type, out_type, meta_type = cls._params() # Note: we shadow the `Case` and `Dataset` class names here to generate a clean JSON schema class Case(BaseModel, extra='forbid'): # pyright: ignore[reportUnusedClass] # this _is_ used below, but pyright doesn't seem to notice.. name: str | None = None inputs: in_type # pyright: ignore[reportInvalidTypeForm] metadata: meta_type | None = None # pyright: ignore[reportInvalidTypeForm] expected_output: out_type | None = None # pyright: ignore[reportInvalidTypeForm] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa: UP007 class Dataset(BaseModel, extra='forbid'): name: str | None = None cases: list[Case] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa: UP007 json_schema = Dataset.model_json_schema() # See `_add_json_schema` below, since `$schema` is added to the JSON, it has to be supported in the JSON json_schema['properties']['$schema'] = {'type': 'string'} return json_schema ``` ### set_eval_attribute ```python set_eval_attribute(name: str, value: Any) -> None ``` Set an attribute on the current task run. Parameters: | Name | Type | Description | Default | | ------- | ----- | --------------------------- | ---------- | | `name` | `str` | The name of the attribute. | *required* | | `value` | `Any` | The value of the attribute. | *required* | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def set_eval_attribute(name: str, value: Any) -> None: """Set an attribute on the current task run. Args: name: The name of the attribute. value: The value of the attribute. 
""" current_case = _CURRENT_TASK_RUN.get() if current_case is not None: # pragma: no branch current_case.record_attribute(name, value) ``` ### increment_eval_metric ```python increment_eval_metric( name: str, amount: int | float ) -> None ``` Increment a metric on the current task run. Parameters: | Name | Type | Description | Default | | -------- | ----- | ----------------------- | --------------------------- | | `name` | `str` | The name of the metric. | *required* | | `amount` | \`int | float\` | The amount to increment by. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def increment_eval_metric(name: str, amount: int | float) -> None: """Increment a metric on the current task run. Args: name: The name of the metric. amount: The amount to increment by. """ current_case = _CURRENT_TASK_RUN.get() if current_case is not None: # pragma: no branch current_case.increment_metric(name, amount) ``` # `pydantic_evals.evaluators` ### Contains Bases: `Evaluator[object, object, object]` Check if the output contains the expected output. For strings, checks if expected_output is a substring of output. For lists/tuples, checks if expected_output is in output. For dicts, checks if all key-value pairs in expected_output are in output. Note: case_sensitive only applies when both the value and output are strings. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class Contains(Evaluator[object, object, object]): """Check if the output contains the expected output. For strings, checks if expected_output is a substring of output. For lists/tuples, checks if expected_output is in output. For dicts, checks if all key-value pairs in expected_output are in output. Note: case_sensitive only applies when both the value and output are strings. 
""" value: Any case_sensitive: bool = True as_strings: bool = False evaluation_name: str | None = field(default=None) def evaluate( self, ctx: EvaluatorContext[object, object, object], ) -> EvaluationReason: # Convert objects to strings if requested failure_reason: str | None = None as_strings = self.as_strings or (isinstance(self.value, str) and isinstance(ctx.output, str)) if as_strings: output_str = str(ctx.output) expected_str = str(self.value) if not self.case_sensitive: output_str = output_str.lower() expected_str = expected_str.lower() failure_reason: str | None = None if expected_str not in output_str: output_trunc = _truncated_repr(output_str, max_length=100) expected_trunc = _truncated_repr(expected_str, max_length=100) failure_reason = f'Output string {output_trunc} does not contain expected string {expected_trunc}' return EvaluationReason(value=failure_reason is None, reason=failure_reason) try: # Handle different collection types if isinstance(ctx.output, dict): if isinstance(self.value, dict): # Cast to Any to avoid type checking issues output_dict = cast(dict[Any, Any], ctx.output) # pyright: ignore[reportUnknownMemberType] expected_dict = cast(dict[Any, Any], self.value) # pyright: ignore[reportUnknownMemberType] for k in expected_dict: if k not in output_dict: k_trunc = _truncated_repr(k, max_length=30) failure_reason = f'Output dictionary does not contain expected key {k_trunc}' break elif output_dict[k] != expected_dict[k]: k_trunc = _truncated_repr(k, max_length=30) output_v_trunc = _truncated_repr(output_dict[k], max_length=100) expected_v_trunc = _truncated_repr(expected_dict[k], max_length=100) failure_reason = f'Output dictionary has different value for key {k_trunc}: {output_v_trunc} != {expected_v_trunc}' break else: if self.value not in ctx.output: # pyright: ignore[reportUnknownMemberType] output_trunc = _truncated_repr(ctx.output, max_length=200) # pyright: ignore[reportUnknownMemberType] failure_reason = f'Output {output_trunc} does not contain provided value as a key' elif self.value not in ctx.output: # pyright: ignore[reportOperatorIssue] # will be handled by except block output_trunc = _truncated_repr(ctx.output, max_length=200) failure_reason = f'Output {output_trunc} does not contain provided value' except (TypeError, ValueError) as e: failure_reason = f'Containment check failed: {e}' return EvaluationReason(value=failure_reason is None, reason=failure_reason) ``` ### Equals Bases: `Evaluator[object, object, object]` Check if the output exactly equals the provided value. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class Equals(Evaluator[object, object, object]): """Check if the output exactly equals the provided value.""" value: Any evaluation_name: str | None = field(default=None) def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> bool: return ctx.output == self.value ``` ### EqualsExpected Bases: `Evaluator[object, object, object]` Check if the output exactly equals the expected output. 
Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class EqualsExpected(Evaluator[object, object, object]): """Check if the output exactly equals the expected output.""" evaluation_name: str | None = field(default=None) def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> bool | dict[str, bool]: if ctx.expected_output is None: return {} # Only compare if expected output is provided return ctx.output == ctx.expected_output ``` ### HasMatchingSpan Bases: `Evaluator[object, object, object]` Check if the span tree contains a span that matches the specified query. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class HasMatchingSpan(Evaluator[object, object, object]): """Check if the span tree contains a span that matches the specified query.""" query: SpanQuery evaluation_name: str | None = field(default=None) def evaluate( self, ctx: EvaluatorContext[object, object, object], ) -> bool: return ctx.span_tree.any(self.query) ``` ### IsInstance Bases: `Evaluator[object, object, object]` Check if the output is an instance of a type with the given name. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class IsInstance(Evaluator[object, object, object]): """Check if the output is an instance of a type with the given name.""" type_name: str evaluation_name: str | None = field(default=None) def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> EvaluationReason: output = ctx.output for cls in type(output).__mro__: if cls.__name__ == self.type_name or cls.__qualname__ == self.type_name: return EvaluationReason(value=True) reason = f'output is of type {type(output).__name__}' if type(output).__qualname__ != type(output).__name__: reason += f' (qualname: {type(output).__qualname__})' return EvaluationReason(value=False, reason=reason) ``` ### LLMJudge Bases: `Evaluator[object, object, object]` Judge whether the output of a language model meets the criteria of a provided rubric. If you do not specify a model, it uses the default model for judging. This starts as 'openai:gpt-5.2', but can be overridden by calling set_default_judge_model. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class LLMJudge(Evaluator[object, object, object]): """Judge whether the output of a language model meets the criteria of a provided rubric. If you do not specify a model, it uses the default model for judging. This starts as 'openai:gpt-5.2', but can be overridden by calling [`set_default_judge_model`][pydantic_evals.evaluators.llm_as_a_judge.set_default_judge_model]. 
""" rubric: str model: models.Model | models.KnownModelName | str | None = None include_input: bool = False include_expected_output: bool = False model_settings: ModelSettings | None = None score: OutputConfig | Literal[False] = False assertion: OutputConfig | Literal[False] = field(default_factory=lambda: OutputConfig(include_reason=True)) async def evaluate( self, ctx: EvaluatorContext[object, object, object], ) -> EvaluatorOutput: if self.include_input: if self.include_expected_output: from .llm_as_a_judge import judge_input_output_expected grading_output = await judge_input_output_expected( ctx.inputs, ctx.output, ctx.expected_output, self.rubric, self.model, self.model_settings ) else: from .llm_as_a_judge import judge_input_output grading_output = await judge_input_output( ctx.inputs, ctx.output, self.rubric, self.model, self.model_settings ) else: if self.include_expected_output: from .llm_as_a_judge import judge_output_expected grading_output = await judge_output_expected( ctx.output, ctx.expected_output, self.rubric, self.model, self.model_settings ) else: from .llm_as_a_judge import judge_output grading_output = await judge_output(ctx.output, self.rubric, self.model, self.model_settings) output: dict[str, EvaluationScalar | EvaluationReason] = {} include_both = self.score is not False and self.assertion is not False evaluation_name = self.get_default_evaluation_name() if self.score is not False: default_name = f'{evaluation_name}_score' if include_both else evaluation_name _update_combined_output(output, grading_output.score, grading_output.reason, self.score, default_name) if self.assertion is not False: default_name = f'{evaluation_name}_pass' if include_both else evaluation_name _update_combined_output(output, grading_output.pass_, grading_output.reason, self.assertion, default_name) return output def build_serialization_arguments(self): result = super().build_serialization_arguments() # always serialize the model as a string when present; use its name if it's a KnownModelName if (model := result.get('model')) and isinstance(model, models.Model): # pragma: no branch result['model'] = f'{model.system}:{model.model_name}' # Note: this may lead to confusion if you try to serialize-then-deserialize with a custom model. # I expect that is rare enough to be worth not solving yet, but common enough that we probably will want to # solve it eventually. I'm imagining some kind of model registry, but don't want to work out the details yet. return result ``` ### MaxDuration Bases: `Evaluator[object, object, object]` Check if the execution time is under the specified maximum. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class MaxDuration(Evaluator[object, object, object]): """Check if the execution time is under the specified maximum.""" seconds: float | timedelta def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> bool: duration = timedelta(seconds=ctx.duration) seconds = self.seconds if not isinstance(seconds, timedelta): seconds = timedelta(seconds=seconds) return duration <= seconds ``` ### OutputConfig Bases: `TypedDict` Configuration for the score and assertion outputs of the LLMJudge evaluator. 
Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python class OutputConfig(TypedDict, total=False): """Configuration for the score and assertion outputs of the LLMJudge evaluator.""" evaluation_name: str include_reason: bool ``` ### EvaluatorContext Bases: `Generic[InputsT, OutputT, MetadataT]` Context for evaluating a task execution. An instance of this class is the sole input to all Evaluators. It contains all the information needed to evaluate the task execution, including inputs, outputs, metadata, and telemetry data. Evaluators use this context to access the task inputs, actual output, expected output, and other information when evaluating the result of the task execution. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: # Use the context to access task inputs, outputs, and expected outputs return ctx.output == ctx.expected_output ``` Source code in `pydantic_evals/pydantic_evals/evaluators/context.py` ````python @dataclass(kw_only=True) class EvaluatorContext(Generic[InputsT, OutputT, MetadataT]): """Context for evaluating a task execution. An instance of this class is the sole input to all Evaluators. It contains all the information needed to evaluate the task execution, including inputs, outputs, metadata, and telemetry data. Evaluators use this context to access the task inputs, actual output, expected output, and other information when evaluating the result of the task execution. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: # Use the context to access task inputs, outputs, and expected outputs return ctx.output == ctx.expected_output ``` """ name: str | None """The name of the case.""" inputs: InputsT """The inputs provided to the task for this case.""" metadata: MetadataT | None """Metadata associated with the case, if provided. May be None if no metadata was specified.""" expected_output: OutputT | None """The expected output for the case, if provided. May be None if no expected output was specified.""" output: OutputT """The actual output produced by the task for this case.""" duration: float """The duration of the task run for this case.""" _span_tree: SpanTree | SpanTreeRecordingError = field(repr=False) """The span tree for the task run for this case. This will be `None` if `logfire.configure` has not been called. """ attributes: dict[str, Any] """Attributes associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.set_eval_attribute` in any code executed during the evaluation task.""" metrics: dict[str, int | float] """Metrics associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.increment_eval_metric` in any code executed during the evaluation task.""" @property def span_tree(self) -> SpanTree: """Get the `SpanTree` for this task execution. The span tree is a graph where each node corresponds to an OpenTelemetry span recorded during the task execution, including timing information and any custom spans created during execution. Returns: The span tree for the task execution. Raises: SpanTreeRecordingError: If spans were not captured during execution of the task, e.g. due to not having the necessary dependencies installed. 
""" if isinstance(self._span_tree, SpanTreeRecordingError): # In this case, there was a reason we couldn't record the SpanTree. We raise that now raise self._span_tree return self._span_tree ```` #### name ```python name: str | None ``` The name of the case. #### inputs ```python inputs: InputsT ``` The inputs provided to the task for this case. #### metadata ```python metadata: MetadataT | None ``` Metadata associated with the case, if provided. May be None if no metadata was specified. #### expected_output ```python expected_output: OutputT | None ``` The expected output for the case, if provided. May be None if no expected output was specified. #### output ```python output: OutputT ``` The actual output produced by the task for this case. #### duration ```python duration: float ``` The duration of the task run for this case. #### attributes ```python attributes: dict[str, Any] ``` Attributes associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.set_eval_attribute` in any code executed during the evaluation task. #### metrics ```python metrics: dict[str, int | float] ``` Metrics associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.increment_eval_metric` in any code executed during the evaluation task. #### span_tree ```python span_tree: SpanTree ``` Get the `SpanTree` for this task execution. The span tree is a graph where each node corresponds to an OpenTelemetry span recorded during the task execution, including timing information and any custom spans created during execution. Returns: | Type | Description | | ---------- | ------------------------------------- | | `SpanTree` | The span tree for the task execution. | Raises: | Type | Description | | ------------------------ | --------------------------------------------------------------------------------------------------------------------- | | `SpanTreeRecordingError` | If spans were not captured during execution of the task, e.g. due to not having the necessary dependencies installed. | ### EvaluationReason The result of running an evaluator with an optional explanation. Contains a scalar value and an optional "reason" explaining the value. Parameters: | Name | Type | Description | Default | | -------- | ------------------ | ------------------------------------------------------------------------- | ------------------------------------------------- | | `value` | `EvaluationScalar` | The scalar result of the evaluation (boolean, integer, float, or string). | *required* | | `reason` | \`str | None\` | An optional explanation of the evaluation result. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @dataclass class EvaluationReason: """The result of running an evaluator with an optional explanation. Contains a scalar value and an optional "reason" explaining the value. Args: value: The scalar result of the evaluation (boolean, integer, float, or string). reason: An optional explanation of the evaluation result. """ value: EvaluationScalar reason: str | None = None ``` ### EvaluationResult Bases: `Generic[EvaluationScalarT]` The details of an individual evaluation result. Contains the name, value, reason, and source evaluator for a single evaluation. Parameters: | Name | Type | Description | Default | | -------- | ------------------- | ---------------------------------------------------- | ------------------------------------------------- | | `name` | `str` | The name of the evaluation. 
| *required* | | `value` | `EvaluationScalarT` | The scalar result of the evaluation. | *required* | | `reason` | \`str | None\` | An optional explanation of the evaluation result. | | `source` | `EvaluatorSpec` | The spec of the evaluator that produced this result. | *required* | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @dataclass class EvaluationResult(Generic[EvaluationScalarT]): """The details of an individual evaluation result. Contains the name, value, reason, and source evaluator for a single evaluation. Args: name: The name of the evaluation. value: The scalar result of the evaluation. reason: An optional explanation of the evaluation result. source: The spec of the evaluator that produced this result. """ name: str value: EvaluationScalarT reason: str | None source: EvaluatorSpec def downcast(self, *value_types: type[T]) -> EvaluationResult[T] | None: """Attempt to downcast this result to a more specific type. Args: *value_types: The types to check the value against. Returns: A downcast version of this result if the value is an instance of one of the given types, otherwise None. """ # Check if value matches any of the target types, handling bool as a special case for value_type in value_types: if isinstance(self.value, value_type): # Only match bool with explicit bool type if isinstance(self.value, bool) and value_type is not bool: continue return cast(EvaluationResult[T], self) return None ``` #### downcast ```python downcast( *value_types: type[T], ) -> EvaluationResult[T] | None ``` Attempt to downcast this result to a more specific type. Parameters: | Name | Type | Description | Default | | -------------- | --------- | ------------------------------------- | ------- | | `*value_types` | `type[T]` | The types to check the value against. | `()` | Returns: | Type | Description | | --------------------- | ----------- | | \`EvaluationResult[T] | None\` | | \`EvaluationResult[T] | None\` | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def downcast(self, *value_types: type[T]) -> EvaluationResult[T] | None: """Attempt to downcast this result to a more specific type. Args: *value_types: The types to check the value against. Returns: A downcast version of this result if the value is an instance of one of the given types, otherwise None. """ # Check if value matches any of the target types, handling bool as a special case for value_type in value_types: if isinstance(self.value, value_type): # Only match bool with explicit bool type if isinstance(self.value, bool) and value_type is not bool: continue return cast(EvaluationResult[T], self) return None ``` ### Evaluator Bases: `Generic[InputsT, OutputT, MetadataT]` Base class for all evaluators. Evaluators can assess the performance of a task in a variety of ways, as a function of the EvaluatorContext. Subclasses must implement the `evaluate` method. Note it can be defined with either `def` or `async def`. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output ``` Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ````python @dataclass(repr=False) class Evaluator(Generic[InputsT, OutputT, MetadataT], metaclass=_StrictABCMeta): """Base class for all evaluators. 
Evaluators can assess the performance of a task in a variety of ways, as a function of the EvaluatorContext. Subclasses must implement the `evaluate` method. Note it can be defined with either `def` or `async def`. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output ``` """ __pydantic_config__ = ConfigDict(arbitrary_types_allowed=True) @classmethod def get_serialization_name(cls) -> str: """Return the 'name' of this Evaluator to use during serialization. Returns: The name of the Evaluator, which is typically the class name. """ return cls.__name__ @classmethod @deprecated('`name` has been renamed, use `get_serialization_name` instead.') def name(cls) -> str: """`name` has been renamed, use `get_serialization_name` instead.""" return cls.get_serialization_name() def get_default_evaluation_name(self) -> str: """Return the default name to use in reports for the output of this evaluator. By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used. This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information. Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results. """ evaluation_name = getattr(self, 'evaluation_name', None) if isinstance(evaluation_name, str): # If the evaluator has an attribute `name` of type string, use that return evaluation_name return self.get_serialization_name() @abstractmethod def evaluate( self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT] ) -> EvaluatorOutput | Awaitable[EvaluatorOutput]: # pragma: no cover """Evaluate the task output in the given context. This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable[EvaluatorOutput]. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. Can be returned either synchronously or as an awaitable for asynchronous evaluation. """ raise NotImplementedError('You must implement `evaluate`.') def evaluate_sync(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator synchronously, handling both sync and async implementations. This method ensures synchronous execution by running any async evaluate implementation to completion using run_until_complete. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ output = self.evaluate(ctx) if inspect.iscoroutine(output): # pragma: no cover return get_event_loop().run_until_complete(output) else: return cast(EvaluatorOutput, output) async def evaluate_async(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator asynchronously, handling both sync and async implementations. This method ensures asynchronous execution by properly awaiting any async evaluate implementation. 
For synchronous implementations, it returns the result directly. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ # Note: If self.evaluate is synchronous, but you need to prevent this from blocking, override this method with: # return await anyio.to_thread.run_sync(self.evaluate, ctx) output = self.evaluate(ctx) if inspect.iscoroutine(output): return await output else: return cast(EvaluatorOutput, output) @model_serializer(mode='plain') def serialize(self, info: SerializationInfo) -> Any: """Serialize this Evaluator to a JSON-serializable form. Returns: A JSON-serializable representation of this evaluator as an EvaluatorSpec. """ return to_jsonable_python( self.as_spec(), context=info.context, serialize_unknown=True, ) def as_spec(self) -> EvaluatorSpec: raw_arguments = self.build_serialization_arguments() arguments: None | tuple[Any,] | dict[str, Any] if len(raw_arguments) == 0: arguments = None elif len(raw_arguments) == 1: arguments = (next(iter(raw_arguments.values())),) else: arguments = raw_arguments return EvaluatorSpec(name=self.get_serialization_name(), arguments=arguments) def build_serialization_arguments(self) -> dict[str, Any]: """Build the arguments for serialization. Evaluators are serialized for inclusion as the "source" in an `EvaluationResult`. If you want to modify how the evaluator is serialized for that or other purposes, you can override this method. Returns: A dictionary of arguments to be used during serialization. """ raw_arguments: dict[str, Any] = {} for field in fields(self): value = getattr(self, field.name) # always exclude defaults: if field.default is not MISSING: if value == field.default: continue if field.default_factory is not MISSING: if value == field.default_factory(): # pragma: no branch continue raw_arguments[field.name] = value return raw_arguments __repr__ = _utils.dataclasses_no_defaults_repr ```` #### get_serialization_name ```python get_serialization_name() -> str ``` Return the 'name' of this Evaluator to use during serialization. Returns: | Type | Description | | ----- | ------------------------------------------------------------- | | `str` | The name of the Evaluator, which is typically the class name. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @classmethod def get_serialization_name(cls) -> str: """Return the 'name' of this Evaluator to use during serialization. Returns: The name of the Evaluator, which is typically the class name. """ return cls.__name__ ``` #### name ```python name() -> str ``` Deprecated `name` has been renamed, use `get_serialization_name` instead. `name` has been renamed, use `get_serialization_name` instead. Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @classmethod @deprecated('`name` has been renamed, use `get_serialization_name` instead.') def name(cls) -> str: """`name` has been renamed, use `get_serialization_name` instead.""" return cls.get_serialization_name() ``` #### get_default_evaluation_name ```python get_default_evaluation_name() -> str ``` Return the default name to use in reports for the output of this evaluator. By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used. 
This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information. Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results. Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def get_default_evaluation_name(self) -> str: """Return the default name to use in reports for the output of this evaluator. By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used. This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information. Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results. """ evaluation_name = getattr(self, 'evaluation_name', None) if isinstance(evaluation_name, str): # If the evaluator has an attribute `name` of type string, use that return evaluation_name return self.get_serialization_name() ``` #### evaluate ```python evaluate( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorOutput | Awaitable[EvaluatorOutput] ``` Evaluate the task output in the given context. This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable[EvaluatorOutput]. Parameters: | Name | Type | Description | Default | | ----- | ----------------------------------------------- | ------------------------------------------------------------------------ | ---------- | | `ctx` | `EvaluatorContext[InputsT, OutputT, MetadataT]` | The context containing the inputs, outputs, and metadata for evaluation. | *required* | Returns: | Type | Description | | ----------------- | ---------------------------- | | \`EvaluatorOutput | Awaitable[EvaluatorOutput]\` | | \`EvaluatorOutput | Awaitable[EvaluatorOutput]\` | | \`EvaluatorOutput | Awaitable[EvaluatorOutput]\` | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @abstractmethod def evaluate( self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT] ) -> EvaluatorOutput | Awaitable[EvaluatorOutput]: # pragma: no cover """Evaluate the task output in the given context. This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable[EvaluatorOutput]. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. Can be returned either synchronously or as an awaitable for asynchronous evaluation. """ raise NotImplementedError('You must implement `evaluate`.') ``` #### evaluate_sync ```python evaluate_sync( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorOutput ``` Run the evaluator synchronously, handling both sync and async implementations. This method ensures synchronous execution by running any async evaluate implementation to completion using run_until_complete. 
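As a minimal sketch (hypothetical evaluators), `evaluate` may be implemented with either `def` or `async def`; `evaluate_sync` returns the result of a synchronous implementation directly and runs an asynchronous one to completion:

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext


@dataclass
class NonEmpty(Evaluator):
    # Synchronous implementation: evaluate_sync returns this result directly.
    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return bool(ctx.output)


@dataclass
class MatchesExpected(Evaluator):
    # Asynchronous implementation: evaluate_sync runs the coroutine to completion
    # on the current event loop.
    async def evaluate(self, ctx: EvaluatorContext) -> bool:
        return ctx.output == ctx.expected_output
```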
Parameters: | Name | Type | Description | Default | | ----- | ----------------------------------------------- | ------------------------------------------------------------------------ | ---------- | | `ctx` | `EvaluatorContext[InputsT, OutputT, MetadataT]` | The context containing the inputs, outputs, and metadata for evaluation. | *required* | Returns: | Type | Description | | ----------------- | ------------------------------------------------------------------------------------- | | `EvaluatorOutput` | The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping | | `EvaluatorOutput` | of evaluation names to either of those. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def evaluate_sync(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator synchronously, handling both sync and async implementations. This method ensures synchronous execution by running any async evaluate implementation to completion using run_until_complete. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ output = self.evaluate(ctx) if inspect.iscoroutine(output): # pragma: no cover return get_event_loop().run_until_complete(output) else: return cast(EvaluatorOutput, output) ``` #### evaluate_async ```python evaluate_async( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorOutput ``` Run the evaluator asynchronously, handling both sync and async implementations. This method ensures asynchronous execution by properly awaiting any async evaluate implementation. For synchronous implementations, it returns the result directly. Parameters: | Name | Type | Description | Default | | ----- | ----------------------------------------------- | ------------------------------------------------------------------------ | ---------- | | `ctx` | `EvaluatorContext[InputsT, OutputT, MetadataT]` | The context containing the inputs, outputs, and metadata for evaluation. | *required* | Returns: | Type | Description | | ----------------- | ------------------------------------------------------------------------------------- | | `EvaluatorOutput` | The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping | | `EvaluatorOutput` | of evaluation names to either of those. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python async def evaluate_async(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator asynchronously, handling both sync and async implementations. This method ensures asynchronous execution by properly awaiting any async evaluate implementation. For synchronous implementations, it returns the result directly. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. 
""" # Note: If self.evaluate is synchronous, but you need to prevent this from blocking, override this method with: # return await anyio.to_thread.run_sync(self.evaluate, ctx) output = self.evaluate(ctx) if inspect.iscoroutine(output): return await output else: return cast(EvaluatorOutput, output) ``` #### serialize ```python serialize(info: SerializationInfo) -> Any ``` Serialize this Evaluator to a JSON-serializable form. Returns: | Type | Description | | ----- | ------------------------------------------------------------------------- | | `Any` | A JSON-serializable representation of this evaluator as an EvaluatorSpec. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @model_serializer(mode='plain') def serialize(self, info: SerializationInfo) -> Any: """Serialize this Evaluator to a JSON-serializable form. Returns: A JSON-serializable representation of this evaluator as an EvaluatorSpec. """ return to_jsonable_python( self.as_spec(), context=info.context, serialize_unknown=True, ) ``` #### build_serialization_arguments ```python build_serialization_arguments() -> dict[str, Any] ``` Build the arguments for serialization. Evaluators are serialized for inclusion as the "source" in an `EvaluationResult`. If you want to modify how the evaluator is serialized for that or other purposes, you can override this method. Returns: | Type | Description | | ---------------- | ---------------------------------------------------------- | | `dict[str, Any]` | A dictionary of arguments to be used during serialization. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def build_serialization_arguments(self) -> dict[str, Any]: """Build the arguments for serialization. Evaluators are serialized for inclusion as the "source" in an `EvaluationResult`. If you want to modify how the evaluator is serialized for that or other purposes, you can override this method. Returns: A dictionary of arguments to be used during serialization. """ raw_arguments: dict[str, Any] = {} for field in fields(self): value = getattr(self, field.name) # always exclude defaults: if field.default is not MISSING: if value == field.default: continue if field.default_factory is not MISSING: if value == field.default_factory(): # pragma: no branch continue raw_arguments[field.name] = value return raw_arguments ``` ### EvaluatorFailure Represents a failure raised during the execution of an evaluator. Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @dataclass class EvaluatorFailure: """Represents a failure raised during the execution of an evaluator.""" name: str error_message: str error_stacktrace: str source: EvaluatorSpec ``` ### EvaluatorOutput ```python EvaluatorOutput = ( EvaluationScalar | EvaluationReason | Mapping[str, EvaluationScalar | EvaluationReason] ) ``` Type for the output of an evaluator, which can be a scalar, an EvaluationReason, or a mapping of names to either. ### EvaluatorSpec Bases: `BaseModel` The specification of an evaluator to be run. This class is used to represent evaluators in a serializable format, supporting various short forms for convenience when defining evaluators in YAML or JSON dataset files. 
In particular, each of the following forms is supported for specifying an evaluator with name `MyEvaluator`: * `'MyEvaluator'` - Just the (string) name of the Evaluator subclass is used if its `__init__` takes no arguments * `{'MyEvaluator': first_arg}` - A single argument is passed as the first positional argument to `MyEvaluator.__init__` * `{'MyEvaluator': {k1: v1, k2: v2}}` - Multiple kwargs are passed to `MyEvaluator.__init__` Source code in `pydantic_evals/pydantic_evals/evaluators/spec.py` ```python class EvaluatorSpec(BaseModel): """The specification of an evaluator to be run. This class is used to represent evaluators in a serializable format, supporting various short forms for convenience when defining evaluators in YAML or JSON dataset files. In particular, each of the following forms is supported for specifying an evaluator with name `MyEvaluator`: * `'MyEvaluator'` - Just the (string) name of the Evaluator subclass is used if its `__init__` takes no arguments * `{'MyEvaluator': first_arg}` - A single argument is passed as the first positional argument to `MyEvaluator.__init__` * `{'MyEvaluator': {k1: v1, k2: v2}}` - Multiple kwargs are passed to `MyEvaluator.__init__` """ name: str """The name of the evaluator class; should be the value returned by `EvaluatorClass.get_serialization_name()`""" arguments: None | tuple[Any] | dict[str, Any] """The arguments to pass to the evaluator's constructor. Can be None (no arguments), a tuple (a single positional argument), or a dict (keyword arguments). """ @property def args(self) -> tuple[Any, ...]: """Get the positional arguments for the evaluator. Returns: A tuple of positional arguments if arguments is a tuple, otherwise an empty tuple. """ if isinstance(self.arguments, tuple): return self.arguments return () @property def kwargs(self) -> dict[str, Any]: """Get the keyword arguments for the evaluator. Returns: A dictionary of keyword arguments if arguments is a dict, otherwise an empty dict. """ if isinstance(self.arguments, dict): return self.arguments return {} @model_validator(mode='wrap') @classmethod def deserialize(cls, value: Any, handler: ModelWrapValidatorHandler[EvaluatorSpec]) -> EvaluatorSpec: """Deserialize an EvaluatorSpec from various formats. This validator handles the various short forms of evaluator specifications, converting them to a consistent EvaluatorSpec instance. Args: value: The value to deserialize. handler: The validator handler. Returns: The deserialized EvaluatorSpec. Raises: ValidationError: If the value cannot be deserialized. """ try: result = handler(value) return result except ValidationError as exc: try: deserialized = _SerializedEvaluatorSpec.model_validate(value) except ValidationError: raise exc # raise the original error return deserialized.to_evaluator_spec() @model_serializer(mode='wrap') def serialize(self, handler: SerializerFunctionWrapHandler, info: SerializationInfo) -> Any: """Serialize using the appropriate short-form if possible. 
Returns: The serialized evaluator specification, using the shortest form possible: - Just the name if there are no arguments - {name: first_arg} if there's a single positional argument - {name: {kwargs}} if there are multiple (keyword) arguments """ if isinstance(info.context, dict) and info.context.get('use_short_form'): # pyright: ignore[reportUnknownMemberType] if self.arguments is None: return self.name elif isinstance(self.arguments, tuple): return {self.name: self.arguments[0]} else: return {self.name: self.arguments} else: return handler(self) ``` #### name ```python name: str ``` The name of the evaluator class; should be the value returned by `EvaluatorClass.get_serialization_name()` #### arguments ```python arguments: None | tuple[Any] | dict[str, Any] ``` The arguments to pass to the evaluator's constructor. Can be None (no arguments), a tuple (a single positional argument), or a dict (keyword arguments). #### args ```python args: tuple[Any, ...] ``` Get the positional arguments for the evaluator. Returns: | Type | Description | | ----------------- | ---------------------------------------------------------------------------------- | | `tuple[Any, ...]` | A tuple of positional arguments if arguments is a tuple, otherwise an empty tuple. | #### kwargs ```python kwargs: dict[str, Any] ``` Get the keyword arguments for the evaluator. Returns: | Type | Description | | ---------------- | ---------------------------------------------------------------------------------- | | `dict[str, Any]` | A dictionary of keyword arguments if arguments is a dict, otherwise an empty dict. | #### deserialize ```python deserialize( value: Any, handler: ModelWrapValidatorHandler[EvaluatorSpec], ) -> EvaluatorSpec ``` Deserialize an EvaluatorSpec from various formats. This validator handles the various short forms of evaluator specifications, converting them to a consistent EvaluatorSpec instance. Parameters: | Name | Type | Description | Default | | --------- | ------------------------------------------ | ------------------------- | ---------- | | `value` | `Any` | The value to deserialize. | *required* | | `handler` | `ModelWrapValidatorHandler[EvaluatorSpec]` | The validator handler. | *required* | Returns: | Type | Description | | --------------- | ------------------------------- | | `EvaluatorSpec` | The deserialized EvaluatorSpec. | Raises: | Type | Description | | ----------------- | ------------------------------------ | | `ValidationError` | If the value cannot be deserialized. | Source code in `pydantic_evals/pydantic_evals/evaluators/spec.py` ```python @model_validator(mode='wrap') @classmethod def deserialize(cls, value: Any, handler: ModelWrapValidatorHandler[EvaluatorSpec]) -> EvaluatorSpec: """Deserialize an EvaluatorSpec from various formats. This validator handles the various short forms of evaluator specifications, converting them to a consistent EvaluatorSpec instance. Args: value: The value to deserialize. handler: The validator handler. Returns: The deserialized EvaluatorSpec. Raises: ValidationError: If the value cannot be deserialized. """ try: result = handler(value) return result except ValidationError as exc: try: deserialized = _SerializedEvaluatorSpec.model_validate(value) except ValidationError: raise exc # raise the original error return deserialized.to_evaluator_spec() ``` #### serialize ```python serialize( handler: SerializerFunctionWrapHandler, info: SerializationInfo, ) -> Any ``` Serialize using the appropriate short-form if possible. 
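For example, the short forms listed above for `EvaluatorSpec` correspond to values like the following in a serialized dataset (a sketch using evaluators documented elsewhere in this reference, expressed here as Python literals):

```python
# Short forms as they would appear in a YAML/JSON dataset file:
evaluators = [
    'EqualsExpected',                                        # __init__ takes no arguments
    {'IsInstance': 'str'},                                   # a single positional argument
    {'Contains': {'value': 'hi', 'case_sensitive': False}},  # multiple keyword arguments
]
```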
Returns: | Type | Description | | ----- | ------------------------------------------------------------------------- | | `Any` | The serialized evaluator specification, using the shortest form possible: | | `Any` | Just the name if there are no arguments | | `Any` | {name: first_arg} if there's a single positional argument | | `Any` | {name: {kwargs}} if there are multiple (keyword) arguments | Source code in `pydantic_evals/pydantic_evals/evaluators/spec.py` ```python @model_serializer(mode='wrap') def serialize(self, handler: SerializerFunctionWrapHandler, info: SerializationInfo) -> Any: """Serialize using the appropriate short-form if possible. Returns: The serialized evaluator specification, using the shortest form possible: - Just the name if there are no arguments - {name: first_arg} if there's a single positional argument - {name: {kwargs}} if there are multiple (keyword) arguments """ if isinstance(info.context, dict) and info.context.get('use_short_form'): # pyright: ignore[reportUnknownMemberType] if self.arguments is None: return self.name elif isinstance(self.arguments, tuple): return {self.name: self.arguments[0]} else: return {self.name: self.arguments} else: return handler(self) ``` ### GradingOutput Bases: `BaseModel` The output of a grading operation. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python class GradingOutput(BaseModel, populate_by_name=True): """The output of a grading operation.""" reason: str pass_: bool = Field(validation_alias='pass', serialization_alias='pass') score: float ``` ### judge_output ```python judge_output( output: Any, rubric: str, model: Model | KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_output( output: Any, rubric: str, model: models.Model | models.KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. """ user_prompt = _build_prompt(output=output, rubric=rubric) return ( await _judge_output_agent.run(user_prompt, model=model or _default_model, model_settings=model_settings) ).output ``` ### judge_input_output ```python judge_input_output( inputs: Any, output: Any, rubric: str, model: Model | KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_input_output( inputs: Any, output: Any, rubric: str, model: models.Model | models.KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. 
The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. """ user_prompt = _build_prompt(inputs=inputs, output=output, rubric=rubric) return ( await _judge_input_output_agent.run(user_prompt, model=model or _default_model, model_settings=model_settings) ).output ``` ### judge_input_output_expected ```python judge_input_output_expected( inputs: Any, output: Any, expected_output: Any, rubric: str, model: Model | KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_input_output_expected( inputs: Any, output: Any, expected_output: Any, rubric: str, model: models.Model | models.KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. """ user_prompt = _build_prompt(inputs=inputs, output=output, rubric=rubric, expected_output=expected_output) return ( await _judge_input_output_expected_agent.run( user_prompt, model=model or _default_model, model_settings=model_settings ) ).output ``` ### judge_output_expected ```python judge_output_expected( output: Any, expected_output: Any, rubric: str, model: Model | KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the expected output, output, and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_output_expected( output: Any, expected_output: Any, rubric: str, model: models.Model | models.KnownModelName | str | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on the expected output, output, and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-5.2', but this can be changed using the `set_default_judge_model` function. """ user_prompt = _build_prompt(output=output, rubric=rubric, expected_output=expected_output) return ( await _judge_output_expected_agent.run( user_prompt, model=model or _default_model, model_settings=model_settings ) ).output ``` ### set_default_judge_model ```python set_default_judge_model( model: Model | KnownModelName, ) -> None ``` Set the default model used for judging. This model is used if `None` is passed to the `model` argument of `judge_output` and `judge_input_output`. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python def set_default_judge_model(model: models.Model | models.KnownModelName) -> None: """Set the default model used for judging. This model is used if `None` is passed to the `model` argument of `judge_output` and `judge_input_output`. 
""" global _default_model _default_model = model ``` # `pydantic_evals.generation` Utilities for generating example datasets for pydantic_evals. This module provides functions for generating sample datasets for testing and examples, using LLMs to create realistic test data with proper structure. ### InputsT ```python InputsT = TypeVar('InputsT', default=Any) ``` Generic type for the inputs to the task being evaluated. ### OutputT ```python OutputT = TypeVar('OutputT', default=Any) ``` Generic type for the expected output of the task being evaluated. ### MetadataT ```python MetadataT = TypeVar('MetadataT', default=Any) ``` Generic type for the metadata associated with the task being evaluated. ### generate_dataset ```python generate_dataset( *, dataset_type: type[ Dataset[InputsT, OutputT, MetadataT] ], path: Path | str | None = None, custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), model: Model | KnownModelName = "openai:gpt-5.2", n_examples: int = 3, extra_instructions: str | None = None ) -> Dataset[InputsT, OutputT, MetadataT] ``` Use an LLM to generate a dataset of test cases, each consisting of input, expected output, and metadata. This function creates a properly structured dataset with the specified input, output, and metadata types. It uses an LLM to attempt to generate realistic test cases that conform to the types' schemas. Parameters: | Name | Type | Description | Default | | ------------------------ | -------------------------------------------------------- | ------------------------------------------------------------------------------------ | -------------------------------------------------------------------------- | | `path` | \`Path | str | None\` | | `dataset_type` | `type[Dataset[InputsT, OutputT, MetadataT]]` | The type of dataset to generate, with the desired input, output, and metadata types. | *required* | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Optional sequence of custom evaluator classes to include in the schema. | `()` | | `model` | \`Model | KnownModelName\` | The Pydantic AI model to use for generation. Defaults to 'openai:gpt-5.2'. | | `n_examples` | `int` | Number of examples to generate. Defaults to 3. | `3` | | `extra_instructions` | \`str | None\` | Optional additional instructions to provide to the LLM. | Returns: | Type | Description | | -------------------------------------- | --------------------------------------------------------------- | | `Dataset[InputsT, OutputT, MetadataT]` | A properly structured Dataset object with generated test cases. | Raises: | Type | Description | | ----------------- | ---------------------------------------------------------- | | `ValidationError` | If the LLM's response cannot be parsed as a valid dataset. | Source code in `pydantic_evals/pydantic_evals/generation.py` ```python async def generate_dataset( *, dataset_type: type[Dataset[InputsT, OutputT, MetadataT]], path: Path | str | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), model: models.Model | models.KnownModelName = 'openai:gpt-5.2', n_examples: int = 3, extra_instructions: str | None = None, ) -> Dataset[InputsT, OutputT, MetadataT]: """Use an LLM to generate a dataset of test cases, each consisting of input, expected output, and metadata. This function creates a properly structured dataset with the specified input, output, and metadata types. 
It uses an LLM to attempt to generate realistic test cases that conform to the types' schemas. Args: path: Optional path to save the generated dataset. If provided, the dataset will be saved to this location. dataset_type: The type of dataset to generate, with the desired input, output, and metadata types. custom_evaluator_types: Optional sequence of custom evaluator classes to include in the schema. model: The Pydantic AI model to use for generation. Defaults to 'openai:gpt-5.2'. n_examples: Number of examples to generate. Defaults to 3. extra_instructions: Optional additional instructions to provide to the LLM. Returns: A properly structured Dataset object with generated test cases. Raises: ValidationError: If the LLM's response cannot be parsed as a valid dataset. """ output_schema = dataset_type.model_json_schema_with_evaluators(custom_evaluator_types) # TODO: Use `output_type=StructuredDict(output_schema)` (and `from_dict` below) once https://github.com/pydantic/pydantic/issues/12145 # is fixed and `StructuredDict` no longer needs to use `InlineDefsJsonSchemaTransformer`. agent = Agent( model, system_prompt=( f'Generate an object that is in compliance with this JSON schema:\n{output_schema}\n\n' f'Include {n_examples} example cases.' ' You must not include any characters in your response before the opening { of the JSON object, or after the closing }.' ), output_type=str, retries=1, ) result = await agent.run(extra_instructions or 'Please generate the object.') output = strip_markdown_fences(result.output) try: result = dataset_type.from_text(output, fmt='json', custom_evaluator_types=custom_evaluator_types) except ValidationError as e: # pragma: no cover print(f'Raw response from model:\n{result.output}') raise e if path is not None: result.to_file(path, custom_evaluator_types=custom_evaluator_types) # pragma: no cover return result ``` # `pydantic_evals.otel` ### SpanNode A node in the span tree; provides references to parents/children for easy traversal and queries. 
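To give a feel for the API before the source listing, here is a small hand-rolled sketch: it builds two nodes directly (normally they come from finished OpenTelemetry spans), links them, and runs a couple of queries. The span names and the `tool_name` attribute are made up for illustration, and the import assumes `SpanNode` is exposed from `pydantic_evals.otel` as documented on this page.

```python
from datetime import datetime, timedelta, timezone

from pydantic_evals.otel import SpanNode

start = datetime(2025, 1, 1, tzinfo=timezone.utc)

# Build a tiny two-node tree by hand; trace/span IDs are plain ints.
root = SpanNode(
    name='agent run',
    trace_id=1,
    span_id=1,
    parent_span_id=None,
    start_timestamp=start,
    end_timestamp=start + timedelta(seconds=2),
    attributes={},
)
child = SpanNode(
    name='tool call',
    trace_id=1,
    span_id=2,
    parent_span_id=1,
    start_timestamp=start,
    end_timestamp=start + timedelta(seconds=1),
    attributes={'tool_name': 'search'},
)
root.add_child(child)

# Queries accept either a `SpanQuery` dict (documented below) or a plain predicate.
assert root.any_descendant({'name_contains': 'tool'})
assert root.first_child(lambda node: node.duration <= timedelta(seconds=1)) is child
```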
Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python @dataclass(repr=False, kw_only=True) class SpanNode: """A node in the span tree; provides references to parents/children for easy traversal and queries.""" name: str trace_id: int span_id: int parent_span_id: int | None start_timestamp: datetime end_timestamp: datetime attributes: dict[str, AttributeValue] @property def duration(self) -> timedelta: """Return the span's duration as a timedelta, or None if start/end not set.""" return self.end_timestamp - self.start_timestamp @property def children(self) -> list[SpanNode]: return list(self.children_by_id.values()) @property def descendants(self) -> list[SpanNode]: """Return all descendants of this node in DFS order.""" return self.find_descendants(lambda _: True) @property def ancestors(self) -> list[SpanNode]: """Return all ancestors of this node.""" return self.find_ancestors(lambda _: True) @property def node_key(self) -> str: return f'{self.trace_id:032x}:{self.span_id:016x}' @property def parent_node_key(self) -> str | None: return None if self.parent_span_id is None else f'{self.trace_id:032x}:{self.parent_span_id:016x}' # ------------------------------------------------------------------------- # Construction # ------------------------------------------------------------------------- def __post_init__(self): self.parent: SpanNode | None = None self.children_by_id: dict[str, SpanNode] = {} @staticmethod def from_readable_span(span: ReadableSpan) -> SpanNode: assert span.context is not None, 'Span has no context' assert span.start_time is not None, 'Span has no start time' assert span.end_time is not None, 'Span has no end time' return SpanNode( name=span.name, trace_id=span.context.trace_id, span_id=span.context.span_id, parent_span_id=span.parent.span_id if span.parent else None, start_timestamp=datetime.fromtimestamp(span.start_time / 1e9, tz=timezone.utc), end_timestamp=datetime.fromtimestamp(span.end_time / 1e9, tz=timezone.utc), attributes=dict(span.attributes or {}), ) def add_child(self, child: SpanNode) -> None: """Attach a child node to this node's list of children.""" assert child.trace_id == self.trace_id, f"traces don't match: {child.trace_id:032x} != {self.trace_id:032x}" assert child.parent_span_id == self.span_id, ( f'parent span mismatch: {child.parent_span_id:016x} != {self.span_id:016x}' ) self.children_by_id[child.node_key] = child child.parent = self # ------------------------------------------------------------------------- # Child queries # ------------------------------------------------------------------------- def find_children(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Return all immediate children that satisfy the given predicate.""" return list(self._filter_children(predicate)) def first_child(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Return the first immediate child that satisfies the given predicate, or None if none match.""" return next(self._filter_children(predicate), None) def any_child(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if there is at least one child that satisfies the predicate.""" return self.first_child(predicate) is not None def _filter_children(self, predicate: SpanQuery | SpanPredicate) -> Iterator[SpanNode]: return (child for child in self.children if child.matches(predicate)) # ------------------------------------------------------------------------- # Descendant queries (DFS) # 
------------------------------------------------------------------------- def find_descendants( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all descendant nodes that satisfy the given predicate in DFS order.""" return list(self._filter_descendants(predicate, stop_recursing_when)) def first_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """DFS: Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match.""" return next(self._filter_descendants(predicate, stop_recursing_when), None) def any_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns `True` if there is at least one descendant that satisfies the predicate.""" return self.first_descendant(predicate, stop_recursing_when) is not None def _filter_descendants( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None ) -> Iterator[SpanNode]: stack = list(self.children) while stack: node = stack.pop() if node.matches(predicate): yield node if stop_recursing_when is not None and node.matches(stop_recursing_when): continue stack.extend(node.children) # ------------------------------------------------------------------------- # Ancestor queries (DFS "up" the chain) # ------------------------------------------------------------------------- def find_ancestors( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all ancestors that satisfy the given predicate.""" return list(self._filter_ancestors(predicate, stop_recursing_when)) def first_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """Return the closest ancestor that satisfies the given predicate, or `None` if none match.""" return next(self._filter_ancestors(predicate, stop_recursing_when), None) def any_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns True if any ancestor satisfies the predicate.""" return self.first_ancestor(predicate, stop_recursing_when) is not None def _filter_ancestors( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None ) -> Iterator[SpanNode]: node = self.parent while node: if node.matches(predicate): yield node if stop_recursing_when is not None and node.matches(stop_recursing_when): break node = node.parent # ------------------------------------------------------------------------- # Query matching # ------------------------------------------------------------------------- def matches(self, query: SpanQuery | SpanPredicate) -> bool: """Check if the span node matches the query conditions or predicate.""" if callable(query): return query(self) return self._matches_query(query) def _matches_query(self, query: SpanQuery) -> bool: # noqa: C901 """Check if the span matches the query conditions.""" # Logical combinations if or_ := query.get('or_'): if len(query) > 1: raise ValueError("Cannot combine 'or_' conditions with other conditions at the same level") return any(self._matches_query(q) for q in or_) if not_ := query.get('not_'): if self._matches_query(not_): return False if and_ := query.get('and_'): results = 
[self._matches_query(q) for q in and_] if not all(results): return False # At this point, all existing ANDs and no existing ORs have passed, so it comes down to this condition # Name conditions if (name_equals := query.get('name_equals')) and self.name != name_equals: return False if (name_contains := query.get('name_contains')) and name_contains not in self.name: return False if (name_matches_regex := query.get('name_matches_regex')) and not re.match(name_matches_regex, self.name): return False # Attribute conditions if (has_attributes := query.get('has_attributes')) and not all( self.attributes.get(key) == value for key, value in has_attributes.items() ): return False if (has_attributes_keys := query.get('has_attribute_keys')) and not all( key in self.attributes for key in has_attributes_keys ): return False # Timing conditions if (min_duration := query.get('min_duration')) is not None: if not isinstance(min_duration, timedelta): min_duration = timedelta(seconds=min_duration) if self.duration < min_duration: return False if (max_duration := query.get('max_duration')) is not None: if not isinstance(max_duration, timedelta): max_duration = timedelta(seconds=max_duration) if self.duration > max_duration: return False # Children conditions if (min_child_count := query.get('min_child_count')) and len(self.children) < min_child_count: return False if (max_child_count := query.get('max_child_count')) and len(self.children) > max_child_count: return False if (some_child_has := query.get('some_child_has')) and not any( child._matches_query(some_child_has) for child in self.children ): return False if (all_children_have := query.get('all_children_have')) and not all( child._matches_query(all_children_have) for child in self.children ): return False if (no_child_has := query.get('no_child_has')) and any( child._matches_query(no_child_has) for child in self.children ): return False # Descendant conditions # The following local functions with cache decorators are used to avoid repeatedly evaluating these properties @cache def descendants(): return self.descendants @cache def pruned_descendants(): stop_recursing_when = query.get('stop_recursing_when') return ( self._filter_descendants(lambda _: True, stop_recursing_when) if stop_recursing_when else descendants() ) if (min_descendant_count := query.get('min_descendant_count')) and len(descendants()) < min_descendant_count: return False if (max_descendant_count := query.get('max_descendant_count')) and len(descendants()) > max_descendant_count: return False if (some_descendant_has := query.get('some_descendant_has')) and not any( descendant._matches_query(some_descendant_has) for descendant in pruned_descendants() ): return False if (all_descendants_have := query.get('all_descendants_have')) and not all( descendant._matches_query(all_descendants_have) for descendant in pruned_descendants() ): return False if (no_descendant_has := query.get('no_descendant_has')) and any( descendant._matches_query(no_descendant_has) for descendant in pruned_descendants() ): return False # Ancestor conditions # The following local functions with cache decorators are used to avoid repeatedly evaluating these properties @cache def ancestors(): return self.ancestors @cache def pruned_ancestors(): stop_recursing_when = query.get('stop_recursing_when') return self._filter_ancestors(lambda _: True, stop_recursing_when) if stop_recursing_when else ancestors() if (min_depth := query.get('min_depth')) and len(ancestors()) < min_depth: return False if (max_depth := 
query.get('max_depth')) and len(ancestors()) > max_depth: return False if (some_ancestor_has := query.get('some_ancestor_has')) and not any( ancestor._matches_query(some_ancestor_has) for ancestor in pruned_ancestors() ): return False if (all_ancestors_have := query.get('all_ancestors_have')) and not all( ancestor._matches_query(all_ancestors_have) for ancestor in pruned_ancestors() ): return False if (no_ancestor_has := query.get('no_ancestor_has')) and any( ancestor._matches_query(no_ancestor_has) for ancestor in pruned_ancestors() ): return False return True # ------------------------------------------------------------------------- # String representation # ------------------------------------------------------------------------- def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the node. Optionally includes children, trace_id, span_id, start_timestamp, and duration. """ first_line_parts = [f'<SpanNode name={self.name!r}'] if include_trace_id: first_line_parts.append(f"trace_id='{self.trace_id:032x}'") if include_span_id: first_line_parts.append(f"span_id='{self.span_id:016x}'") if include_start_timestamp: first_line_parts.append(f'start_timestamp={self.start_timestamp.isoformat()!r}') if include_duration: first_line_parts.append(f"duration='{self.duration}'") extra_lines: list[str] = [] if include_children and self.children: first_line_parts.append('>') for child in self.children: extra_lines.append( indent( child.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) ) extra_lines.append('</SpanNode>') else: if self.children: first_line_parts.append('children=...') first_line_parts.append('/>') return '\n'.join([' '.join(first_line_parts), *extra_lines]) def __str__(self) -> str: if self.children: return f"<SpanNode name={self.name!r} span_id='{self.span_id:016x}'>...</SpanNode>" else: return f"<SpanNode name={self.name!r} span_id='{self.span_id:016x}' />" def __repr__(self) -> str: return self.repr_xml() ``` #### duration ```python duration: timedelta ``` Return the span's duration as a timedelta, or None if start/end not set. #### descendants ```python descendants: list[SpanNode] ``` Return all descendants of this node in DFS order. #### ancestors ```python ancestors: list[SpanNode] ``` Return all ancestors of this node. #### add_child ```python add_child(child: SpanNode) -> None ``` Attach a child node to this node's list of children. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def add_child(self, child: SpanNode) -> None: """Attach a child node to this node's list of children.""" assert child.trace_id == self.trace_id, f"traces don't match: {child.trace_id:032x} != {self.trace_id:032x}" assert child.parent_span_id == self.span_id, ( f'parent span mismatch: {child.parent_span_id:016x} != {self.span_id:016x}' ) self.children_by_id[child.node_key] = child child.parent = self ``` #### find_children ```python find_children( predicate: SpanQuery | SpanPredicate, ) -> list[SpanNode] ``` Return all immediate children that satisfy the given predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find_children(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Return all immediate children that satisfy the given predicate.""" return list(self._filter_children(predicate)) ``` #### first_child ```python first_child( predicate: SpanQuery | SpanPredicate, ) -> SpanNode | None ``` Return the first immediate child that satisfies the given predicate, or None if none match.
Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first_child(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Return the first immediate child that satisfies the given predicate, or None if none match.""" return next(self._filter_children(predicate), None) ``` #### any_child ```python any_child(predicate: SpanQuery | SpanPredicate) -> bool ``` Returns True if there is at least one child that satisfies the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any_child(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if there is at least one child that satisfies the predicate.""" return self.first_child(predicate) is not None ``` #### find_descendants ```python find_descendants( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> list[SpanNode] ``` Return all descendant nodes that satisfy the given predicate in DFS order. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find_descendants( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all descendant nodes that satisfy the given predicate in DFS order.""" return list(self._filter_descendants(predicate, stop_recursing_when)) ``` #### first_descendant ```python first_descendant( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> SpanNode | None ``` DFS: Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """DFS: Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match.""" return next(self._filter_descendants(predicate, stop_recursing_when), None) ``` #### any_descendant ```python any_descendant( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> bool ``` Returns `True` if there is at least one descendant that satisfies the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns `True` if there is at least one descendant that satisfies the predicate.""" return self.first_descendant(predicate, stop_recursing_when) is not None ``` #### find_ancestors ```python find_ancestors( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> list[SpanNode] ``` Return all ancestors that satisfy the given predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find_ancestors( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all ancestors that satisfy the given predicate.""" return list(self._filter_ancestors(predicate, stop_recursing_when)) ``` #### first_ancestor ```python first_ancestor( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> SpanNode | None ``` Return the closest ancestor that satisfies the given predicate, or `None` if none match. 
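The `stop_recursing_when` argument prunes the walk: once a node matches it, nothing beyond that node is examined. Continuing the hand-built `root`/`child`/`start` sketch from above (still illustrative, not from the library docs):

```python
# Add a third level below `child` so there is an ancestor chain to walk.
grandchild = SpanNode(
    name='http request',
    trace_id=1,
    span_id=3,
    parent_span_id=2,
    start_timestamp=start,
    end_timestamp=start + timedelta(milliseconds=100),
    attributes={},
)
child.add_child(grandchild)

# Without pruning, the closest matching ancestor of `grandchild` is the root span...
assert grandchild.first_ancestor({'name_contains': 'agent'}) is root
# ...but the walk stops early once a node matches `stop_recursing_when`, so nothing is found here.
assert grandchild.first_ancestor({'name_contains': 'agent'}, stop_recursing_when={'name_contains': 'tool'}) is None
```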
Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """Return the closest ancestor that satisfies the given predicate, or `None` if none match.""" return next(self._filter_ancestors(predicate, stop_recursing_when), None) ``` #### any_ancestor ```python any_ancestor( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> bool ``` Returns True if any ancestor satisfies the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns True if any ancestor satisfies the predicate.""" return self.first_ancestor(predicate, stop_recursing_when) is not None ``` #### matches ```python matches(query: SpanQuery | SpanPredicate) -> bool ``` Check if the span node matches the query conditions or predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def matches(self, query: SpanQuery | SpanPredicate) -> bool: """Check if the span node matches the query conditions or predicate.""" if callable(query): return query(self) return self._matches_query(query) ``` #### repr_xml ```python repr_xml( include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str ``` Return an XML-like string representation of the node. Optionally includes children, trace_id, span_id, start_timestamp, and duration. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the node. Optionally includes children, trace_id, span_id, start_timestamp, and duration. """ first_line_parts = [f'<SpanNode name={self.name!r}'] if include_trace_id: first_line_parts.append(f"trace_id='{self.trace_id:032x}'") if include_span_id: first_line_parts.append(f"span_id='{self.span_id:016x}'") if include_start_timestamp: first_line_parts.append(f'start_timestamp={self.start_timestamp.isoformat()!r}') if include_duration: first_line_parts.append(f"duration='{self.duration}'") extra_lines: list[str] = [] if include_children and self.children: first_line_parts.append('>') for child in self.children: extra_lines.append( indent( child.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) ) extra_lines.append('</SpanNode>') else: if self.children: first_line_parts.append('children=...') first_line_parts.append('/>') return '\n'.join([' '.join(first_line_parts), *extra_lines]) ``` ### SpanQuery Bases: `TypedDict` A serializable query for filtering SpanNodes based on various conditions. All fields are optional and combined with AND logic by default. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python class SpanQuery(TypedDict, total=False): """A serializable query for filtering SpanNodes based on various conditions. All fields are optional and combined with AND logic by default. """ # These fields are ordered to match the implementation of SpanNode.matches_query for easy review.
# * Individual span conditions come first because these are generally the cheapest to evaluate # * Logical combinations come next because they may just be combinations of individual span conditions # * Related-span conditions come last because they may require the most work to evaluate # Individual span conditions ## Name conditions name_equals: str name_contains: str name_matches_regex: str # regex pattern ## Attribute conditions has_attributes: dict[str, Any] has_attribute_keys: list[str] ## Timing conditions min_duration: timedelta | float max_duration: timedelta | float # Logical combinations of conditions not_: SpanQuery and_: list[SpanQuery] or_: list[SpanQuery] # Child conditions min_child_count: int max_child_count: int some_child_has: SpanQuery all_children_have: SpanQuery no_child_has: SpanQuery # Recursive conditions stop_recursing_when: SpanQuery """If present, stop recursing through ancestors or descendants at nodes that match this condition.""" ## Descendant conditions min_descendant_count: int max_descendant_count: int some_descendant_has: SpanQuery all_descendants_have: SpanQuery no_descendant_has: SpanQuery ## Ancestor conditions min_depth: int # depth is equivalent to ancestor count; roots have depth 0 max_depth: int some_ancestor_has: SpanQuery all_ancestors_have: SpanQuery no_ancestor_has: SpanQuery ``` #### stop_recursing_when ```python stop_recursing_when: SpanQuery ``` If present, stop recursing through ancestors or descendants at nodes that match this condition. ### SpanTree A container that builds a hierarchy of SpanNode objects from a list of finished spans. You can then search or iterate the tree to make your assertions (using DFS for traversal). Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python @dataclass(repr=False, kw_only=True) class SpanTree: """A container that builds a hierarchy of SpanNode objects from a list of finished spans. You can then search or iterate the tree to make your assertions (using DFS for traversal). """ roots: list[SpanNode] = field(default_factory=list[SpanNode]) nodes_by_id: dict[str, SpanNode] = field(default_factory=dict[str, SpanNode]) # ------------------------------------------------------------------------- # Construction # ------------------------------------------------------------------------- def __post_init__(self): self._rebuild_tree() def add_spans(self, spans: list[SpanNode]) -> None: """Add a list of spans to the tree, rebuilding the tree structure.""" for span in spans: self.nodes_by_id[span.node_key] = span self._rebuild_tree() def add_readable_spans(self, readable_spans: list[ReadableSpan]): self.add_spans([SpanNode.from_readable_span(span) for span in readable_spans]) def _rebuild_tree(self): # Ensure spans are ordered by start_timestamp so that roots and children end up in the right order nodes = list(self.nodes_by_id.values()) nodes.sort(key=lambda node: node.start_timestamp or datetime.min) self.nodes_by_id = {node.node_key: node for node in nodes} # Build the parent/child relationships for node in self.nodes_by_id.values(): parent_node_key = node.parent_node_key if parent_node_key is not None: parent_node = self.nodes_by_id.get(parent_node_key) if parent_node is not None: parent_node.add_child(node) # Determine the roots # A node is a "root" if its parent is None or if its parent's span_id is not in the current set of spans. 
self.roots = [] for node in self.nodes_by_id.values(): parent_node_key = node.parent_node_key if parent_node_key is None or parent_node_key not in self.nodes_by_id: self.roots.append(node) # ------------------------------------------------------------------------- # Node filtering and iteration # ------------------------------------------------------------------------- def find(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order.""" return list(self._filter(predicate)) def first(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Find the first node that matches a predicate, scanning from each root in DFS order. Returns `None` if not found.""" return next(self._filter(predicate), None) def any(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if any node in the tree matches the predicate.""" return self.first(predicate) is not None def _filter(self, predicate: SpanQuery | SpanPredicate) -> Iterator[SpanNode]: for node in self: if node.matches(predicate): yield node def __iter__(self) -> Iterator[SpanNode]: """Return an iterator over all nodes in the tree.""" return iter(self.nodes_by_id.values()) # ------------------------------------------------------------------------- # String representation # ------------------------------------------------------------------------- def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the tree, optionally including children, trace_id, span_id, duration, and timestamps.""" if not self.roots: return '<SpanTree />' repr_parts = [ '<SpanTree>', *[ indent( root.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) for root in self.roots ], '</SpanTree>', ] return '\n'.join(repr_parts) def __str__(self): return f'<SpanTree num_roots={len(self.roots)} total_spans={len(self.nodes_by_id)} />' def __repr__(self): return self.repr_xml() ``` #### add_spans ```python add_spans(spans: list[SpanNode]) -> None ``` Add a list of spans to the tree, rebuilding the tree structure. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def add_spans(self, spans: list[SpanNode]) -> None: """Add a list of spans to the tree, rebuilding the tree structure.""" for span in spans: self.nodes_by_id[span.node_key] = span self._rebuild_tree() ``` #### find ```python find( predicate: SpanQuery | SpanPredicate, ) -> list[SpanNode] ``` Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order.""" return list(self._filter(predicate)) ``` #### first ```python first( predicate: SpanQuery | SpanPredicate, ) -> SpanNode | None ``` Find the first node that matches a predicate, scanning from each root in DFS order. Returns `None` if not found. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Find the first node that matches a predicate, scanning from each root in DFS order.
Returns `None` if not found.""" return next(self._filter(predicate), None) ``` #### any ```python any(predicate: SpanQuery | SpanPredicate) -> bool ``` Returns True if any node in the tree matches the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if any node in the tree matches the predicate.""" return self.first(predicate) is not None ``` #### __iter__ ```python __iter__() -> Iterator[SpanNode] ``` Return an iterator over all nodes in the tree. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def __iter__(self) -> Iterator[SpanNode]: """Return an iterator over all nodes in the tree.""" return iter(self.nodes_by_id.values()) ``` #### repr_xml ```python repr_xml( include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str ``` Return an XML-like string representation of the tree, optionally including children, trace_id, span_id, duration, and timestamps. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the tree, optionally including children, trace_id, span_id, duration, and timestamps.""" if not self.roots: return '<SpanTree />' repr_parts = [ '<SpanTree>', *[ indent( root.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) for root in self.roots ], '</SpanTree>', ] return '\n'.join(repr_parts) ``` # `pydantic_evals.reporting` ### ReportCase Bases: `Generic[InputsT, OutputT, MetadataT]` A single case in an evaluation report. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class ReportCase(Generic[InputsT, OutputT, MetadataT]): """A single case in an evaluation report.""" name: str """The name of the [case][pydantic_evals.Case].""" inputs: InputsT """The inputs to the task, from [`Case.inputs`][pydantic_evals.dataset.Case.inputs].""" metadata: MetadataT | None """Any metadata associated with the case, from [`Case.metadata`][pydantic_evals.dataset.Case.metadata].""" expected_output: OutputT | None """The expected output of the task, from [`Case.expected_output`][pydantic_evals.dataset.Case.expected_output].""" output: OutputT """The output of the task execution.""" metrics: dict[str, float | int] attributes: dict[str, Any] scores: dict[str, EvaluationResult[int | float]] labels: dict[str, EvaluationResult[str]] assertions: dict[str, EvaluationResult[bool]] task_duration: float total_duration: float # includes evaluator execution time trace_id: str | None = None """The trace ID of the case span.""" span_id: str | None = None """The span ID of the case span.""" evaluator_failures: list[EvaluatorFailure] = field(default_factory=list[EvaluatorFailure]) ``` #### name ```python name: str ``` The name of the case. #### inputs ```python inputs: InputsT ``` The inputs to the task, from Case.inputs. #### metadata ```python metadata: MetadataT | None ``` Any metadata associated with the case, from Case.metadata.
#### expected_output ```python expected_output: OutputT | None ``` The expected output of the task, from Case.expected_output. #### output ```python output: OutputT ``` The output of the task execution. #### trace_id ```python trace_id: str | None = None ``` The trace ID of the case span. #### span_id ```python span_id: str | None = None ``` The span ID of the case span. ### ReportCaseFailure Bases: `Generic[InputsT, OutputT, MetadataT]` A single case in an evaluation report that failed due to an error during task execution. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class ReportCaseFailure(Generic[InputsT, OutputT, MetadataT]): """A single case in an evaluation report that failed due to an error during task execution.""" name: str """The name of the [case][pydantic_evals.Case].""" inputs: InputsT """The inputs to the task, from [`Case.inputs`][pydantic_evals.dataset.Case.inputs].""" metadata: MetadataT | None """Any metadata associated with the case, from [`Case.metadata`][pydantic_evals.dataset.Case.metadata].""" expected_output: OutputT | None """The expected output of the task, from [`Case.expected_output`][pydantic_evals.dataset.Case.expected_output].""" error_message: str """The message of the exception that caused the failure.""" error_stacktrace: str """The stacktrace of the exception that caused the failure.""" trace_id: str | None = None """The trace ID of the case span.""" span_id: str | None = None """The span ID of the case span.""" ``` #### name ```python name: str ``` The name of the case. #### inputs ```python inputs: InputsT ``` The inputs to the task, from Case.inputs. #### metadata ```python metadata: MetadataT | None ``` Any metadata associated with the case, from Case.metadata. #### expected_output ```python expected_output: OutputT | None ``` The expected output of the task, from Case.expected_output. #### error_message ```python error_message: str ``` The message of the exception that caused the failure. #### error_stacktrace ```python error_stacktrace: str ``` The stacktrace of the exception that caused the failure. #### trace_id ```python trace_id: str | None = None ``` The trace ID of the case span. #### span_id ```python span_id: str | None = None ``` The span ID of the case span. ### ReportCaseAggregate Bases: `BaseModel` A synthetic case that summarizes a set of cases. 
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python class ReportCaseAggregate(BaseModel): """A synthetic case that summarizes a set of cases.""" name: str scores: dict[str, float | int] labels: dict[str, dict[str, float]] metrics: dict[str, float | int] assertions: float | None task_duration: float total_duration: float @staticmethod def average(cases: list[ReportCase]) -> ReportCaseAggregate: """Produce a synthetic "summary" case by averaging quantitative attributes.""" num_cases = len(cases) if num_cases == 0: return ReportCaseAggregate( name='Averages', scores={}, labels={}, metrics={}, assertions=None, task_duration=0.0, total_duration=0.0, ) def _scores_averages(scores_by_name: list[dict[str, int | float | bool]]) -> dict[str, float]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, float] = defaultdict(float) for sbn in scores_by_name: for name, score in sbn.items(): counts_by_name[name] += 1 sums_by_name[name] += score return {name: sums_by_name[name] / counts_by_name[name] for name in sums_by_name} def _labels_averages(labels_by_name: list[dict[str, str]]) -> dict[str, dict[str, float]]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float)) for lbn in labels_by_name: for name, label in lbn.items(): counts_by_name[name] += 1 sums_by_name[name][label] += 1 return { name: {value: count / counts_by_name[name] for value, count in sums_by_name[name].items()} for name in sums_by_name } average_task_duration = sum(case.task_duration for case in cases) / num_cases average_total_duration = sum(case.total_duration for case in cases) / num_cases # average_assertions: dict[str, float] = _scores_averages([{k: v.value for k, v in case.scores.items()} for case in cases]) average_scores: dict[str, float] = _scores_averages( [{k: v.value for k, v in case.scores.items()} for case in cases] ) average_labels: dict[str, dict[str, float]] = _labels_averages( [{k: v.value for k, v in case.labels.items()} for case in cases] ) average_metrics: dict[str, float] = _scores_averages([case.metrics for case in cases]) average_assertions: float | None = None n_assertions = sum(len(case.assertions) for case in cases) if n_assertions > 0: n_passing = sum(1 for case in cases for assertion in case.assertions.values() if assertion.value) average_assertions = n_passing / n_assertions return ReportCaseAggregate( name='Averages', scores=average_scores, labels=average_labels, metrics=average_metrics, assertions=average_assertions, task_duration=average_task_duration, total_duration=average_total_duration, ) ``` #### average ```python average(cases: list[ReportCase]) -> ReportCaseAggregate ``` Produce a synthetic "summary" case by averaging quantitative attributes. 
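To make the aggregation rule concrete, here is a small standalone sketch (independent of the classes above) of the label-averaging step: for each label name it reports, per value, the fraction of cases that produced that value. It mirrors the `_labels_averages` helper visible in the source below.

```python
from collections import defaultdict


def label_fractions(labels_by_case: list[dict[str, str]]) -> dict[str, dict[str, float]]:
    # For each label name: value -> fraction of the cases (that set the label) with that value.
    counts: dict[str, int] = defaultdict(int)
    value_counts: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for case_labels in labels_by_case:
        for name, value in case_labels.items():
            counts[name] += 1
            value_counts[name][value] += 1
    return {name: {v: c / counts[name] for v, c in value_counts[name].items()} for name in value_counts}


print(label_fractions([{'tone': 'formal'}, {'tone': 'casual'}, {'tone': 'formal'}]))
# -> {'tone': {'formal': 0.666..., 'casual': 0.333...}} (approximately)
```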
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @staticmethod def average(cases: list[ReportCase]) -> ReportCaseAggregate: """Produce a synthetic "summary" case by averaging quantitative attributes.""" num_cases = len(cases) if num_cases == 0: return ReportCaseAggregate( name='Averages', scores={}, labels={}, metrics={}, assertions=None, task_duration=0.0, total_duration=0.0, ) def _scores_averages(scores_by_name: list[dict[str, int | float | bool]]) -> dict[str, float]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, float] = defaultdict(float) for sbn in scores_by_name: for name, score in sbn.items(): counts_by_name[name] += 1 sums_by_name[name] += score return {name: sums_by_name[name] / counts_by_name[name] for name in sums_by_name} def _labels_averages(labels_by_name: list[dict[str, str]]) -> dict[str, dict[str, float]]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float)) for lbn in labels_by_name: for name, label in lbn.items(): counts_by_name[name] += 1 sums_by_name[name][label] += 1 return { name: {value: count / counts_by_name[name] for value, count in sums_by_name[name].items()} for name in sums_by_name } average_task_duration = sum(case.task_duration for case in cases) / num_cases average_total_duration = sum(case.total_duration for case in cases) / num_cases # average_assertions: dict[str, float] = _scores_averages([{k: v.value for k, v in case.scores.items()} for case in cases]) average_scores: dict[str, float] = _scores_averages( [{k: v.value for k, v in case.scores.items()} for case in cases] ) average_labels: dict[str, dict[str, float]] = _labels_averages( [{k: v.value for k, v in case.labels.items()} for case in cases] ) average_metrics: dict[str, float] = _scores_averages([case.metrics for case in cases]) average_assertions: float | None = None n_assertions = sum(len(case.assertions) for case in cases) if n_assertions > 0: n_passing = sum(1 for case in cases for assertion in case.assertions.values() if assertion.value) average_assertions = n_passing / n_assertions return ReportCaseAggregate( name='Averages', scores=average_scores, labels=average_labels, metrics=average_metrics, assertions=average_assertions, task_duration=average_task_duration, total_duration=average_total_duration, ) ``` ### EvaluationReport Bases: `Generic[InputsT, OutputT, MetadataT]` A report of the results of evaluating a model on a set of cases. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class EvaluationReport(Generic[InputsT, OutputT, MetadataT]): """A report of the results of evaluating a model on a set of cases.""" name: str """The name of the report.""" cases: list[ReportCase[InputsT, OutputT, MetadataT]] """The cases in the report.""" failures: list[ReportCaseFailure[InputsT, OutputT, MetadataT]] = field( default_factory=list[ReportCaseFailure[InputsT, OutputT, MetadataT]] ) """The failures in the report. 
These are cases where task execution raised an exception.""" experiment_metadata: dict[str, Any] | None = None """Metadata associated with the specific experiment represented by this report.""" trace_id: str | None = None """The trace ID of the evaluation.""" span_id: str | None = None """The span ID of the evaluation.""" def averages(self) -> ReportCaseAggregate | None: if self.cases: return ReportCaseAggregate.average(self.cases) return None def render( self, width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> str: """Render this report to a nicely-formatted string, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. """ io_file = StringIO() console = Console(width=width, file=io_file) self.print( width=width, baseline=baseline, console=console, include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_errors=include_errors, include_error_stacktrace=include_error_stacktrace, include_evaluator_failures=include_evaluator_failures, input_config=input_config, metadata_config=metadata_config, output_config=output_config, score_configs=score_configs, label_configs=label_configs, metric_configs=metric_configs, duration_config=duration_config, include_reasons=include_reasons, ) return io_file.getvalue() def print( self, width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, console: Console | None = None, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> None: """Print this report to the console, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. 
""" if console is None: # pragma: no branch console = Console(width=width) metadata_panel = self._metadata_panel(baseline=baseline) renderable: RenderableType = self.console_table( baseline=baseline, include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_evaluator_failures=include_evaluator_failures, input_config=input_config, metadata_config=metadata_config, output_config=output_config, score_configs=score_configs, label_configs=label_configs, metric_configs=metric_configs, duration_config=duration_config, include_reasons=include_reasons, with_title=not metadata_panel, ) # Wrap table with experiment metadata panel if present if metadata_panel: renderable = Group(metadata_panel, renderable) console.print(renderable) if include_errors and self.failures: # pragma: no cover failures_table = self.failures_table( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_error_message=True, include_error_stacktrace=include_error_stacktrace, input_config=input_config, metadata_config=metadata_config, ) console.print(failures_table, style='red') # TODO(DavidM): in v2, change the return type here to RenderableType def console_table( self, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, with_title: bool = True, ) -> Table: """Return a table containing the data from this report. If a baseline is provided, returns a diff between this report and the baseline report. Optionally include input and output details. 
""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_error_message=False, include_error_stacktrace=False, include_evaluator_failures=include_evaluator_failures, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=output_config or _DEFAULT_VALUE_CONFIG, score_configs=score_configs or {}, label_configs=label_configs or {}, metric_configs=metric_configs or {}, duration_config=duration_config or _DEFAULT_DURATION_CONFIG, include_reasons=include_reasons, ) if baseline is None: return renderer.build_table(self, with_title=with_title) else: return renderer.build_diff_table(self, baseline, with_title=with_title) def _metadata_panel( self, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None ) -> RenderableType | None: """Wrap a table with an experiment metadata panel if metadata exists. Args: table: The table to wrap baseline: Optional baseline report for diff metadata Returns: Either the table unchanged or a Group with Panel and Table """ if baseline is None: # Single report - show metadata if present if self.experiment_metadata: metadata_text = Text() items = list(self.experiment_metadata.items()) for i, (key, value) in enumerate(items): metadata_text.append(f'{key}: {value}', style='dim') if i < len(items) - 1: metadata_text.append('\n') return Panel( metadata_text, title=f'Evaluation Summary: {self.name}', title_align='left', border_style='dim', padding=(0, 1), expand=False, ) else: # Diff report - show metadata diff if either has metadata if self.experiment_metadata or baseline.experiment_metadata: diff_name = baseline.name if baseline.name == self.name else f'{baseline.name} → {self.name}' metadata_text = Text() lines_styles: list[tuple[str, str]] = [] if baseline.experiment_metadata and self.experiment_metadata: # Collect all keys from both all_keys = sorted(set(baseline.experiment_metadata.keys()) | set(self.experiment_metadata.keys())) for key in all_keys: baseline_val = baseline.experiment_metadata.get(key) report_val = self.experiment_metadata.get(key) if baseline_val == report_val: lines_styles.append((f'{key}: {report_val}', 'dim')) elif baseline_val is None: lines_styles.append((f'+ {key}: {report_val}', 'green')) elif report_val is None: lines_styles.append((f'- {key}: {baseline_val}', 'red')) else: lines_styles.append((f'{key}: {baseline_val} → {report_val}', 'yellow')) elif self.experiment_metadata: lines_styles = [(f'+ {k}: {v}', 'green') for k, v in self.experiment_metadata.items()] else: # baseline.experiment_metadata only assert baseline.experiment_metadata is not None lines_styles = [(f'- {k}: {v}', 'red') for k, v in baseline.experiment_metadata.items()] for i, (line, style) in enumerate(lines_styles): metadata_text.append(line, style=style) if i < len(lines_styles) - 1: metadata_text.append('\n') return Panel( metadata_text, title=f'Evaluation Diff: {diff_name}', title_align='left', border_style='dim', padding=(0, 1), expand=False, ) return None # TODO(DavidM): in v2, change the return type here to RenderableType def failures_table( self, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_error_message: bool = 
True, include_error_stacktrace: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, ) -> Table: """Return a table containing the failures in this report.""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=False, include_durations=False, include_total_duration=False, include_removed_cases=False, include_averages=False, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=_DEFAULT_VALUE_CONFIG, score_configs={}, label_configs={}, metric_configs={}, duration_config=_DEFAULT_DURATION_CONFIG, include_reasons=False, include_error_message=include_error_message, include_error_stacktrace=include_error_stacktrace, include_evaluator_failures=False, # Not applicable for failures table ) return renderer.build_failures_table(self) def __str__(self) -> str: # pragma: lax no cover """Return a string representation of the report.""" return self.render() ``` #### name ```python name: str ``` The name of the report. #### cases ```python cases: list[ReportCase[InputsT, OutputT, MetadataT]] ``` The cases in the report. #### failures ```python failures: list[ ReportCaseFailure[InputsT, OutputT, MetadataT] ] = field( default_factory=list[ ReportCaseFailure[InputsT, OutputT, MetadataT] ] ) ``` The failures in the report. These are cases where task execution raised an exception. #### experiment_metadata ```python experiment_metadata: dict[str, Any] | None = None ``` Metadata associated with the specific experiment represented by this report. #### trace_id ```python trace_id: str | None = None ``` The trace ID of the evaluation. #### span_id ```python span_id: str | None = None ``` The span ID of the evaluation. #### render ```python render( width: int | None = None, baseline: ( EvaluationReport[InputsT, OutputT, MetadataT] | None ) = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: ( dict[str, RenderNumberConfig] | None ) = None, label_configs: ( dict[str, RenderValueConfig] | None ) = None, metric_configs: ( dict[str, RenderNumberConfig] | None ) = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False ) -> str ``` Render this report to a nicely-formatted string, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. 
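As a usage sketch (assuming the `Case` and `Dataset` API from the dataset portion of these docs): an `EvaluationReport` typically comes back from `Dataset.evaluate_sync`, after which `print` writes the formatted table to a console and `render` returns the same table as a string.

```python
from pydantic_evals import Case, Dataset

dataset = Dataset(cases=[Case(name='greeting', inputs='hello', expected_output='hello world')])


async def task(inputs: str) -> str:
    return inputs + ' world'


report = dataset.evaluate_sync(task)  # an EvaluationReport

report.print(include_input=True, include_output=True)  # print the table to the console
text = report.render(width=120)  # or capture the same table as a string
```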
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def render( self, width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> str: """Render this report to a nicely-formatted string, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. """ io_file = StringIO() console = Console(width=width, file=io_file) self.print( width=width, baseline=baseline, console=console, include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_errors=include_errors, include_error_stacktrace=include_error_stacktrace, include_evaluator_failures=include_evaluator_failures, input_config=input_config, metadata_config=metadata_config, output_config=output_config, score_configs=score_configs, label_configs=label_configs, metric_configs=metric_configs, duration_config=duration_config, include_reasons=include_reasons, ) return io_file.getvalue() ``` #### print ```python print( width: int | None = None, baseline: ( EvaluationReport[InputsT, OutputT, MetadataT] | None ) = None, *, console: Console | None = None, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: ( dict[str, RenderNumberConfig] | None ) = None, label_configs: ( dict[str, RenderValueConfig] | None ) = None, metric_configs: ( dict[str, RenderNumberConfig] | None ) = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False ) -> None ``` Print this report to the console, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. 
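Continuing from the sketch above, `print` writes the same table directly to a console, and passing a `baseline` report switches to diff rendering. Here `baseline_report` is assumed to be an `EvaluationReport` kept from an earlier run of the same dataset:

```python
# Plain report, printed to the current terminal.
report.print(include_input=True, include_averages=True)

# Diff against an earlier run: matching case IDs are compared cell by cell,
# added/removed cases are flagged, and the averages row is diffed too.
report.print(
    baseline=baseline_report,
    include_output=True,
    include_removed_cases=True,
)
```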
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def print( self, width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, console: Console | None = None, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> None: """Print this report to the console, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. """ if console is None: # pragma: no branch console = Console(width=width) metadata_panel = self._metadata_panel(baseline=baseline) renderable: RenderableType = self.console_table( baseline=baseline, include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_evaluator_failures=include_evaluator_failures, input_config=input_config, metadata_config=metadata_config, output_config=output_config, score_configs=score_configs, label_configs=label_configs, metric_configs=metric_configs, duration_config=duration_config, include_reasons=include_reasons, with_title=not metadata_panel, ) # Wrap table with experiment metadata panel if present if metadata_panel: renderable = Group(metadata_panel, renderable) console.print(renderable) if include_errors and self.failures: # pragma: no cover failures_table = self.failures_table( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_error_message=True, include_error_stacktrace=include_error_stacktrace, input_config=input_config, metadata_config=metadata_config, ) console.print(failures_table, style='red') ``` #### console_table ```python console_table( baseline: ( EvaluationReport[InputsT, OutputT, MetadataT] | None ) = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: ( dict[str, RenderNumberConfig] | None ) = None, label_configs: ( dict[str, RenderValueConfig] | None ) = None, metric_configs: ( dict[str, RenderNumberConfig] | None ) = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, with_title: bool = True ) -> Table ``` Return a table containing the data from this report. 
If a baseline is provided, returns a diff between this report and the baseline report. Optionally include input and output details. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def console_table( self, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, with_title: bool = True, ) -> Table: """Return a table containing the data from this report. If a baseline is provided, returns a diff between this report and the baseline report. Optionally include input and output details. """ renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_error_message=False, include_error_stacktrace=False, include_evaluator_failures=include_evaluator_failures, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=output_config or _DEFAULT_VALUE_CONFIG, score_configs=score_configs or {}, label_configs=label_configs or {}, metric_configs=metric_configs or {}, duration_config=duration_config or _DEFAULT_DURATION_CONFIG, include_reasons=include_reasons, ) if baseline is None: return renderer.build_table(self, with_title=with_title) else: return renderer.build_diff_table(self, baseline, with_title=with_title) ``` #### failures_table ```python failures_table( *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_error_message: bool = True, include_error_stacktrace: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None ) -> Table ``` Return a table containing the failures in this report. 
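Because `console_table` and `failures_table` return Rich `Table` objects, they can be printed with your own `Console`. A short sketch, again assuming `report` comes from an earlier evaluation run:

```python
from rich.console import Console

console = Console(width=140)

# Main results table; omit the built-in title so it can sit under a custom header.
console.print('[bold]Nightly eval run[/bold]')
console.print(report.console_table(include_input=True, include_durations=True, with_title=False))

# Cases whose task raised an exception, with error messages but without stack traces.
if report.failures:
    console.print(report.failures_table(include_error_stacktrace=False), style='red')
```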
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def failures_table( self, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_error_message: bool = True, include_error_stacktrace: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, ) -> Table: """Return a table containing the failures in this report.""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=False, include_durations=False, include_total_duration=False, include_removed_cases=False, include_averages=False, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=_DEFAULT_VALUE_CONFIG, score_configs={}, label_configs={}, metric_configs={}, duration_config=_DEFAULT_DURATION_CONFIG, include_reasons=False, include_error_message=include_error_message, include_error_stacktrace=include_error_stacktrace, include_evaluator_failures=False, # Not applicable for failures table ) return renderer.build_failures_table(self) ``` #### __str__ ```python __str__() -> str ``` Return a string representation of the report. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def __str__(self) -> str: # pragma: lax no cover """Return a string representation of the report.""" return self.render() ``` ### RenderValueConfig Bases: `TypedDict` A configuration for rendering values in an Evaluation report. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python class RenderValueConfig(TypedDict, total=False): """A configuration for rendering values in an Evaluation report.""" value_formatter: str | Callable[[Any], str] diff_checker: Callable[[Any, Any], bool] | None diff_formatter: Callable[[Any, Any], str | None] | None diff_style: str ``` ### RenderNumberConfig Bases: `TypedDict` A configuration for rendering a particular score or metric in an Evaluation report. See the implementation of `_RenderNumber` for more clarity on how these parameters affect the rendering. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python class RenderNumberConfig(TypedDict, total=False): """A configuration for rendering a particular score or metric in an Evaluation report. See the implementation of `_RenderNumber` for more clarity on how these parameters affect the rendering. """ value_formatter: str | Callable[[float | int], str] """The logic to use for formatting values. * If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures. * You can also use a custom string format spec, e.g. '{:.3f}' * You can also use a custom function, e.g. lambda x: f'{x:.3f}' """ diff_formatter: str | Callable[[float | int, float | int], str | None] | None """The logic to use for formatting details about the diff. The strings produced by the value_formatter will always be included in the reports, but the diff_formatter is used to produce additional text about the difference between the old and new values, such as the absolute or relative difference. * If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures, and will include the percentage change. * You can also use a custom string format spec, e.g.
'{:+.3f}' * You can also use a custom function, e.g. lambda x: f'{x:+.3f}'. If this function returns None, no extra diff text will be added. * You can also use None to never generate extra diff text. """ diff_atol: float """The absolute tolerance for considering a difference "significant". A difference is "significant" if `abs(new - old) < self.diff_atol + self.diff_rtol * abs(old)`. If a difference is not significant, it will not have the diff styles applied. Note that we still show both the rendered before and after values in the diff any time they differ, even if the difference is not significant. (If the rendered values are exactly the same, we only show the value once.) If not provided, use 1e-6. """ diff_rtol: float """The relative tolerance for considering a difference "significant". See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use 0.001 if all values are ints, otherwise 0.05. """ diff_increase_style: str """The style to apply to diffed values that have a significant increase. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use green for scores and red for metrics. You can also use arbitrary `rich` styles, such as "bold red". """ diff_decrease_style: str """The style to apply to diffed values that have significant decrease. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use red for scores and green for metrics. You can also use arbitrary `rich` styles, such as "bold red". """ ``` #### value_formatter ```python value_formatter: str | Callable[[float | int], str] ``` The logic to use for formatting values. - If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures. - You can also use a custom string format spec, e.g. '{:.3f}' - You can also use a custom function, e.g. lambda x: f'{x:.3f}' #### diff_formatter ```python diff_formatter: ( str | Callable[[float | int, float | int], str | None] | None ) ``` The logic to use for formatting details about the diff. The strings produced by the value_formatter will always be included in the reports, but the diff_formatter is used to produce additional text about the difference between the old and new values, such as the absolute or relative difference. - If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures, and will include the percentage change. - You can also use a custom string format spec, e.g. '{:+.3f}' - You can also use a custom function, e.g. lambda x: f'{x:+.3f}'. If this function returns None, no extra diff text will be added. - You can also use None to never generate extra diff text. #### diff_atol ```python diff_atol: float ``` The absolute tolerance for considering a difference "significant". A difference is "significant" if `abs(new - old) < self.diff_atol + self.diff_rtol * abs(old)`. If a difference is not significant, it will not have the diff styles applied. Note that we still show both the rendered before and after values in the diff any time they differ, even if the difference is not significant. (If the rendered values are exactly the same, we only show the value once.) If not provided, use 1e-6. #### diff_rtol ```python diff_rtol: float ``` The relative tolerance for considering a difference "significant". 
See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use 0.001 if all values are ints, otherwise 0.05. #### diff_increase_style ```python diff_increase_style: str ``` The style to apply to diffed values that have a significant increase. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use green for scores and red for metrics. You can also use arbitrary `rich` styles, such as "bold red". #### diff_decrease_style ```python diff_decrease_style: str ``` The style to apply to diffed values that have significant decrease. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use red for scores and green for metrics. You can also use arbitrary `rich` styles, such as "bold red". ### ReportCaseRenderer Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class ReportCaseRenderer: include_input: bool include_metadata: bool include_expected_output: bool include_output: bool include_scores: bool include_labels: bool include_metrics: bool include_assertions: bool include_reasons: bool include_durations: bool include_total_duration: bool include_error_message: bool include_error_stacktrace: bool include_evaluator_failures: bool input_renderer: _ValueRenderer metadata_renderer: _ValueRenderer output_renderer: _ValueRenderer score_renderers: Mapping[str, _NumberRenderer] label_renderers: Mapping[str, _ValueRenderer] metric_renderers: Mapping[str, _NumberRenderer] duration_renderer: _NumberRenderer # TODO(DavidM): in v2, change the return type here to RenderableType def build_base_table(self, title: str) -> Table: """Build and return a Rich Table for the diff output.""" table = Table(title=title, show_lines=True) table.add_column('Case ID', style='bold') if self.include_input: table.add_column('Inputs', overflow='fold') if self.include_metadata: table.add_column('Metadata', overflow='fold') if self.include_expected_output: table.add_column('Expected Output', overflow='fold') if self.include_output: table.add_column('Outputs', overflow='fold') if self.include_scores: table.add_column('Scores', overflow='fold') if self.include_labels: table.add_column('Labels', overflow='fold') if self.include_metrics: table.add_column('Metrics', overflow='fold') if self.include_assertions: table.add_column('Assertions', overflow='fold') if self.include_evaluator_failures: table.add_column('Evaluator Failures', overflow='fold') if self.include_durations: table.add_column('Durations' if self.include_total_duration else 'Duration', justify='right') return table # TODO(DavidM): in v2, change the return type here to RenderableType def build_failures_table(self, title: str) -> Table: """Build and return a Rich Table for the failures output.""" table = Table(title=title, show_lines=True) table.add_column('Case ID', style='bold') if self.include_input: table.add_column('Inputs', overflow='fold') if self.include_metadata: table.add_column('Metadata', overflow='fold') if self.include_expected_output: table.add_column('Expected Output', overflow='fold') if self.include_error_message: table.add_column('Error Message', overflow='fold') if self.include_error_stacktrace: table.add_column('Error Stacktrace', overflow='fold') return table def build_row(self, case: ReportCase) -> list[str]: """Build a table row for a single case.""" row = [case.name] if self.include_input: 
row.append(self.input_renderer.render_value(None, case.inputs) or EMPTY_CELL_STR) if self.include_metadata: row.append(self.metadata_renderer.render_value(None, case.metadata) or EMPTY_CELL_STR) if self.include_expected_output: row.append(self.output_renderer.render_value(None, case.expected_output) or EMPTY_CELL_STR) if self.include_output: row.append(self.output_renderer.render_value(None, case.output) or EMPTY_CELL_STR) if self.include_scores: row.append(self._render_dict({k: v for k, v in case.scores.items()}, self.score_renderers)) if self.include_labels: row.append(self._render_dict({k: v for k, v in case.labels.items()}, self.label_renderers)) if self.include_metrics: row.append(self._render_dict(case.metrics, self.metric_renderers)) if self.include_assertions: row.append(self._render_assertions(list(case.assertions.values()))) if self.include_evaluator_failures: row.append(self._render_evaluator_failures(case.evaluator_failures)) if self.include_durations: row.append(self._render_durations(case)) return row def build_aggregate_row(self, aggregate: ReportCaseAggregate) -> list[str]: """Build a table row for an aggregated case.""" row = [f'[b i]{aggregate.name}[/]'] if self.include_input: row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_metadata: row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_expected_output: row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_output: row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_scores: row.append(self._render_dict(aggregate.scores, self.score_renderers)) if self.include_labels: row.append(self._render_dict(aggregate.labels, self.label_renderers)) if self.include_metrics: row.append(self._render_dict(aggregate.metrics, self.metric_renderers)) if self.include_assertions: row.append(self._render_aggregate_assertions(aggregate.assertions)) if self.include_evaluator_failures: row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_durations: row.append(self._render_durations(aggregate)) return row def build_diff_row( self, new_case: ReportCase, baseline: ReportCase, ) -> list[str]: """Build a table row for a given case ID.""" assert baseline.name == new_case.name, 'This should only be called for matching case IDs' row = [baseline.name] if self.include_input: # pragma: no branch input_diff = self.input_renderer.render_diff(None, baseline.inputs, new_case.inputs) or EMPTY_CELL_STR row.append(input_diff) if self.include_metadata: # pragma: no branch metadata_diff = ( self.metadata_renderer.render_diff(None, baseline.metadata, new_case.metadata) or EMPTY_CELL_STR ) row.append(metadata_diff) if self.include_expected_output: # pragma: no branch expected_output_diff = ( self.output_renderer.render_diff(None, baseline.expected_output, new_case.expected_output) or EMPTY_CELL_STR ) row.append(expected_output_diff) if self.include_output: # pragma: no branch output_diff = self.output_renderer.render_diff(None, baseline.output, new_case.output) or EMPTY_CELL_STR row.append(output_diff) if self.include_scores: # pragma: no branch scores_diff = self._render_dicts_diff( {k: v.value for k, v in baseline.scores.items()}, {k: v.value for k, v in new_case.scores.items()}, self.score_renderers, ) row.append(scores_diff) if self.include_labels: # pragma: no branch labels_diff = self._render_dicts_diff( {k: v.value for k, v in baseline.labels.items()}, {k: v.value for k, v in new_case.labels.items()}, self.label_renderers, ) row.append(labels_diff) if self.include_metrics: # pragma: no branch metrics_diff = self._render_dicts_diff(baseline.metrics, 
new_case.metrics, self.metric_renderers) row.append(metrics_diff) if self.include_assertions: # pragma: no branch assertions_diff = self._render_assertions_diff( list(baseline.assertions.values()), list(new_case.assertions.values()) ) row.append(assertions_diff) if self.include_evaluator_failures: # pragma: no branch evaluator_failures_diff = self._render_evaluator_failures_diff( baseline.evaluator_failures, new_case.evaluator_failures ) row.append(evaluator_failures_diff) if self.include_durations: # pragma: no branch durations_diff = self._render_durations_diff(baseline, new_case) row.append(durations_diff) return row def build_diff_aggregate_row( self, new: ReportCaseAggregate, baseline: ReportCaseAggregate, ) -> list[str]: """Build a table row for a given case ID.""" assert baseline.name == new.name, 'This should only be called for aggregates with matching names' row = [f'[b i]{baseline.name}[/]'] if self.include_input: # pragma: no branch row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_metadata: # pragma: no branch row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_expected_output: # pragma: no branch row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_output: # pragma: no branch row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_scores: # pragma: no branch scores_diff = self._render_dicts_diff(baseline.scores, new.scores, self.score_renderers) row.append(scores_diff) if self.include_labels: # pragma: no branch labels_diff = self._render_dicts_diff(baseline.labels, new.labels, self.label_renderers) row.append(labels_diff) if self.include_metrics: # pragma: no branch metrics_diff = self._render_dicts_diff(baseline.metrics, new.metrics, self.metric_renderers) row.append(metrics_diff) if self.include_assertions: # pragma: no branch assertions_diff = self._render_aggregate_assertions_diff(baseline.assertions, new.assertions) row.append(assertions_diff) if self.include_evaluator_failures: # pragma: no branch row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_durations: # pragma: no branch durations_diff = self._render_durations_diff(baseline, new) row.append(durations_diff) return row def build_failure_row(self, case: ReportCaseFailure) -> list[str]: """Build a table row for a single case failure.""" row = [case.name] if self.include_input: row.append(self.input_renderer.render_value(None, case.inputs) or EMPTY_CELL_STR) if self.include_metadata: row.append(self.metadata_renderer.render_value(None, case.metadata) or EMPTY_CELL_STR) if self.include_expected_output: row.append(self.output_renderer.render_value(None, case.expected_output) or EMPTY_CELL_STR) if self.include_error_message: row.append(case.error_message or EMPTY_CELL_STR) if self.include_error_stacktrace: row.append(case.error_stacktrace or EMPTY_CELL_STR) return row def _render_durations(self, case: ReportCase | ReportCaseAggregate) -> str: """Build the diff string for a duration value.""" case_durations: dict[str, float] = {'task': case.task_duration} if self.include_total_duration: case_durations['total'] = case.total_duration return self._render_dict( case_durations, {'task': self.duration_renderer, 'total': self.duration_renderer}, include_names=self.include_total_duration, ) def _render_durations_diff( self, base_case: ReportCase | ReportCaseAggregate, new_case: ReportCase | ReportCaseAggregate, ) -> str: """Build the diff string for a duration value.""" base_case_durations: dict[str, float] = {'task': base_case.task_duration} new_case_durations: dict[str, float] = {'task': new_case.task_duration} if 
self.include_total_duration: # pragma: no branch base_case_durations['total'] = base_case.total_duration new_case_durations['total'] = new_case.total_duration return self._render_dicts_diff( base_case_durations, new_case_durations, {'task': self.duration_renderer, 'total': self.duration_renderer}, include_names=self.include_total_duration, ) @staticmethod def _render_dicts_diff( baseline_dict: dict[str, T], new_dict: dict[str, T], renderers: Mapping[str, _AbstractRenderer[T]], *, include_names: bool = True, ) -> str: keys: set[str] = set() keys.update(baseline_dict.keys()) keys.update(new_dict.keys()) diff_lines: list[str] = [] for key in sorted(keys): old_val = baseline_dict.get(key) new_val = new_dict.get(key) rendered = renderers[key].render_diff(key if include_names else None, old_val, new_val) diff_lines.append(rendered) return '\n'.join(diff_lines) if diff_lines else EMPTY_CELL_STR def _render_dict( self, case_dict: Mapping[str, EvaluationResult[T] | T], renderers: Mapping[str, _AbstractRenderer[T]], *, include_names: bool = True, ) -> str: diff_lines: list[str] = [] for key, val in case_dict.items(): value = cast(EvaluationResult[T], val).value if isinstance(val, EvaluationResult) else val rendered = renderers[key].render_value(key if include_names else None, value) if self.include_reasons and isinstance(val, EvaluationResult) and (reason := val.reason): rendered += f'\n Reason: {reason}\n' diff_lines.append(rendered) return '\n'.join(diff_lines) if diff_lines else EMPTY_CELL_STR def _render_assertions( self, assertions: list[EvaluationResult[bool]], ) -> str: if not assertions: return EMPTY_CELL_STR lines: list[str] = [] for a in assertions: line = '[green]✔[/]' if a.value else '[red]✗[/]' if self.include_reasons: line = f'{a.name}: {line}\n' line = f'{line} Reason: {a.reason}\n\n' if a.reason else line lines.append(line) return ''.join(lines) @staticmethod def _render_aggregate_assertions( assertions: float | None, ) -> str: return ( default_render_percentage(assertions) + ' [green]✔[/]' if assertions is not None else EMPTY_AGGREGATE_CELL_STR ) @staticmethod def _render_assertions_diff( assertions: list[EvaluationResult[bool]], new_assertions: list[EvaluationResult[bool]] ) -> str: if not assertions and not new_assertions: # pragma: no cover return EMPTY_CELL_STR old = ''.join(['[green]✔[/]' if a.value else '[red]✗[/]' for a in assertions]) new = ''.join(['[green]✔[/]' if a.value else '[red]✗[/]' for a in new_assertions]) return old if old == new else f'{old} → {new}' @staticmethod def _render_aggregate_assertions_diff( baseline: float | None, new: float | None, ) -> str: if baseline is None and new is None: # pragma: no cover return EMPTY_AGGREGATE_CELL_STR rendered_baseline = ( default_render_percentage(baseline) + ' [green]✔[/]' if baseline is not None else EMPTY_CELL_STR ) rendered_new = default_render_percentage(new) + ' [green]✔[/]' if new is not None else EMPTY_CELL_STR return rendered_new if rendered_baseline == rendered_new else f'{rendered_baseline} → {rendered_new}' def _render_evaluator_failures( self, failures: list[EvaluatorFailure], ) -> str: if not failures: return EMPTY_CELL_STR # pragma: no cover lines: list[str] = [] for failure in failures: line = f'[red]{failure.name}[/]' if failure.error_message: line += f': {failure.error_message}' lines.append(line) return '\n'.join(lines) def _render_evaluator_failures_diff( self, baseline_failures: list[EvaluatorFailure], new_failures: list[EvaluatorFailure], ) -> str: baseline_str = 
self._render_evaluator_failures(baseline_failures) new_str = self._render_evaluator_failures(new_failures) if baseline_str == new_str: return baseline_str # pragma: no cover return f'{baseline_str}\n→\n{new_str}' ``` #### build_base_table ```python build_base_table(title: str) -> Table ``` Build and return a Rich Table for the diff output. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def build_base_table(self, title: str) -> Table: """Build and return a Rich Table for the diff output.""" table = Table(title=title, show_lines=True) table.add_column('Case ID', style='bold') if self.include_input: table.add_column('Inputs', overflow='fold') if self.include_metadata: table.add_column('Metadata', overflow='fold') if self.include_expected_output: table.add_column('Expected Output', overflow='fold') if self.include_output: table.add_column('Outputs', overflow='fold') if self.include_scores: table.add_column('Scores', overflow='fold') if self.include_labels: table.add_column('Labels', overflow='fold') if self.include_metrics: table.add_column('Metrics', overflow='fold') if self.include_assertions: table.add_column('Assertions', overflow='fold') if self.include_evaluator_failures: table.add_column('Evaluator Failures', overflow='fold') if self.include_durations: table.add_column('Durations' if self.include_total_duration else 'Duration', justify='right') return table ``` #### build_failures_table ```python build_failures_table(title: str) -> Table ``` Build and return a Rich Table for the failures output. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def build_failures_table(self, title: str) -> Table: """Build and return a Rich Table for the failures output.""" table = Table(title=title, show_lines=True) table.add_column('Case ID', style='bold') if self.include_input: table.add_column('Inputs', overflow='fold') if self.include_metadata: table.add_column('Metadata', overflow='fold') if self.include_expected_output: table.add_column('Expected Output', overflow='fold') if self.include_error_message: table.add_column('Error Message', overflow='fold') if self.include_error_stacktrace: table.add_column('Error Stacktrace', overflow='fold') return table ``` #### build_row ```python build_row(case: ReportCase) -> list[str] ``` Build a table row for a single case. 
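`build_base_table` and `build_row` keep columns and cells aligned by gating both on the same `include_*` flags. The standalone sketch below illustrates that pattern with plain Rich calls; it is not the library's internal code, just the same idea reduced to two optional columns:

```python
from rich.console import Console
from rich.table import Table

include_input = True
include_duration = True

# Columns are added under the same flags that later gate the row cells.
table = Table(title='Pattern sketch', show_lines=True)
table.add_column('Case ID', style='bold')
if include_input:
    table.add_column('Inputs', overflow='fold')
if include_duration:
    table.add_column('Duration', justify='right')

for name, inputs, duration in [('case_1', 'capital of France?', 0.12), ('case_2', '2 + 2?', 0.05)]:
    row = [name]
    if include_input:  # cell order mirrors column order
        row.append(inputs)
    if include_duration:
        row.append(f'{duration * 1000:.0f}ms')
    table.add_row(*row)

Console().print(table)
```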
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def build_row(self, case: ReportCase) -> list[str]: """Build a table row for a single case.""" row = [case.name] if self.include_input: row.append(self.input_renderer.render_value(None, case.inputs) or EMPTY_CELL_STR) if self.include_metadata: row.append(self.metadata_renderer.render_value(None, case.metadata) or EMPTY_CELL_STR) if self.include_expected_output: row.append(self.output_renderer.render_value(None, case.expected_output) or EMPTY_CELL_STR) if self.include_output: row.append(self.output_renderer.render_value(None, case.output) or EMPTY_CELL_STR) if self.include_scores: row.append(self._render_dict({k: v for k, v in case.scores.items()}, self.score_renderers)) if self.include_labels: row.append(self._render_dict({k: v for k, v in case.labels.items()}, self.label_renderers)) if self.include_metrics: row.append(self._render_dict(case.metrics, self.metric_renderers)) if self.include_assertions: row.append(self._render_assertions(list(case.assertions.values()))) if self.include_evaluator_failures: row.append(self._render_evaluator_failures(case.evaluator_failures)) if self.include_durations: row.append(self._render_durations(case)) return row ``` #### build_aggregate_row ```python build_aggregate_row( aggregate: ReportCaseAggregate, ) -> list[str] ``` Build a table row for an aggregated case. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def build_aggregate_row(self, aggregate: ReportCaseAggregate) -> list[str]: """Build a table row for an aggregated case.""" row = [f'[b i]{aggregate.name}[/]'] if self.include_input: row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_metadata: row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_expected_output: row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_output: row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_scores: row.append(self._render_dict(aggregate.scores, self.score_renderers)) if self.include_labels: row.append(self._render_dict(aggregate.labels, self.label_renderers)) if self.include_metrics: row.append(self._render_dict(aggregate.metrics, self.metric_renderers)) if self.include_assertions: row.append(self._render_aggregate_assertions(aggregate.assertions)) if self.include_evaluator_failures: row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_durations: row.append(self._render_durations(aggregate)) return row ``` #### build_diff_row ```python build_diff_row( new_case: ReportCase, baseline: ReportCase ) -> list[str] ``` Build a table row for a given case ID. 
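`build_diff_row` and the assertion/aggregate helpers share one convention for diff cells: show a single value when nothing changed, otherwise render `old → new`. A tiny illustrative helper (not the library's implementation) capturing that convention:

```python
def diff_cell(old: str, new: str) -> str:
    """Collapse to one value when unchanged, otherwise render 'old → new'."""
    return old if old == new else f'{old} → {new}'


assert diff_cell('0.80', '0.80') == '0.80'
assert diff_cell('0.80', '0.95') == '0.80 → 0.95'
```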
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def build_diff_row( self, new_case: ReportCase, baseline: ReportCase, ) -> list[str]: """Build a table row for a given case ID.""" assert baseline.name == new_case.name, 'This should only be called for matching case IDs' row = [baseline.name] if self.include_input: # pragma: no branch input_diff = self.input_renderer.render_diff(None, baseline.inputs, new_case.inputs) or EMPTY_CELL_STR row.append(input_diff) if self.include_metadata: # pragma: no branch metadata_diff = ( self.metadata_renderer.render_diff(None, baseline.metadata, new_case.metadata) or EMPTY_CELL_STR ) row.append(metadata_diff) if self.include_expected_output: # pragma: no branch expected_output_diff = ( self.output_renderer.render_diff(None, baseline.expected_output, new_case.expected_output) or EMPTY_CELL_STR ) row.append(expected_output_diff) if self.include_output: # pragma: no branch output_diff = self.output_renderer.render_diff(None, baseline.output, new_case.output) or EMPTY_CELL_STR row.append(output_diff) if self.include_scores: # pragma: no branch scores_diff = self._render_dicts_diff( {k: v.value for k, v in baseline.scores.items()}, {k: v.value for k, v in new_case.scores.items()}, self.score_renderers, ) row.append(scores_diff) if self.include_labels: # pragma: no branch labels_diff = self._render_dicts_diff( {k: v.value for k, v in baseline.labels.items()}, {k: v.value for k, v in new_case.labels.items()}, self.label_renderers, ) row.append(labels_diff) if self.include_metrics: # pragma: no branch metrics_diff = self._render_dicts_diff(baseline.metrics, new_case.metrics, self.metric_renderers) row.append(metrics_diff) if self.include_assertions: # pragma: no branch assertions_diff = self._render_assertions_diff( list(baseline.assertions.values()), list(new_case.assertions.values()) ) row.append(assertions_diff) if self.include_evaluator_failures: # pragma: no branch evaluator_failures_diff = self._render_evaluator_failures_diff( baseline.evaluator_failures, new_case.evaluator_failures ) row.append(evaluator_failures_diff) if self.include_durations: # pragma: no branch durations_diff = self._render_durations_diff(baseline, new_case) row.append(durations_diff) return row ``` #### build_diff_aggregate_row ```python build_diff_aggregate_row( new: ReportCaseAggregate, baseline: ReportCaseAggregate ) -> list[str] ``` Build a table row for a given case ID. 
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def build_diff_aggregate_row( self, new: ReportCaseAggregate, baseline: ReportCaseAggregate, ) -> list[str]: """Build a table row for a given case ID.""" assert baseline.name == new.name, 'This should only be called for aggregates with matching names' row = [f'[b i]{baseline.name}[/]'] if self.include_input: # pragma: no branch row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_metadata: # pragma: no branch row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_expected_output: # pragma: no branch row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_output: # pragma: no branch row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_scores: # pragma: no branch scores_diff = self._render_dicts_diff(baseline.scores, new.scores, self.score_renderers) row.append(scores_diff) if self.include_labels: # pragma: no branch labels_diff = self._render_dicts_diff(baseline.labels, new.labels, self.label_renderers) row.append(labels_diff) if self.include_metrics: # pragma: no branch metrics_diff = self._render_dicts_diff(baseline.metrics, new.metrics, self.metric_renderers) row.append(metrics_diff) if self.include_assertions: # pragma: no branch assertions_diff = self._render_aggregate_assertions_diff(baseline.assertions, new.assertions) row.append(assertions_diff) if self.include_evaluator_failures: # pragma: no branch row.append(EMPTY_AGGREGATE_CELL_STR) if self.include_durations: # pragma: no branch durations_diff = self._render_durations_diff(baseline, new) row.append(durations_diff) return row ``` #### build_failure_row ```python build_failure_row(case: ReportCaseFailure) -> list[str] ``` Build a table row for a single case failure. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def build_failure_row(self, case: ReportCaseFailure) -> list[str]: """Build a table row for a single case failure.""" row = [case.name] if self.include_input: row.append(self.input_renderer.render_value(None, case.inputs) or EMPTY_CELL_STR) if self.include_metadata: row.append(self.metadata_renderer.render_value(None, case.metadata) or EMPTY_CELL_STR) if self.include_expected_output: row.append(self.output_renderer.render_value(None, case.expected_output) or EMPTY_CELL_STR) if self.include_error_message: row.append(case.error_message or EMPTY_CELL_STR) if self.include_error_stacktrace: row.append(case.error_stacktrace or EMPTY_CELL_STR) return row ``` ### EvaluationRenderer A class for rendering an EvalReport or the diff between two EvalReports. 
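`EvaluationRenderer` is normally constructed for you: `EvaluationReport.print`, `render`, and `console_table` build one from their `*_config` arguments, merging them over the defaults. A hedged sketch of customizing number rendering that way; the `'accuracy'` score and `'total_tokens'` metric names are illustrative and depend on your own evaluators and task:

```python
# Assumes `report` is an EvaluationReport whose evaluators emit an 'accuracy'
# score and whose task records a 'total_tokens' metric.
report.print(
    score_configs={
        'accuracy': {
            'value_formatter': '{:.2%}',  # RenderNumberConfig keys, documented above
            'diff_atol': 0.01,
            'diff_increase_style': 'bold green',
        },
    },
    metric_configs={'total_tokens': {'value_formatter': '{:,}'}},
    duration_config={'value_formatter': '{:.2f}'},
)
```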
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class EvaluationRenderer: """A class for rendering an EvalReport or the diff between two EvalReports.""" # Columns to include include_input: bool include_metadata: bool include_expected_output: bool include_output: bool include_durations: bool include_total_duration: bool # Rows to include include_removed_cases: bool include_averages: bool input_config: RenderValueConfig metadata_config: RenderValueConfig output_config: RenderValueConfig score_configs: dict[str, RenderNumberConfig] label_configs: dict[str, RenderValueConfig] metric_configs: dict[str, RenderNumberConfig] duration_config: RenderNumberConfig # Data to include include_reasons: bool # only applies to reports, not to diffs include_error_message: bool include_error_stacktrace: bool include_evaluator_failures: bool def include_scores(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.scores for case in self._all_cases(report, baseline)) def include_labels(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.labels for case in self._all_cases(report, baseline)) def include_metrics(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.metrics for case in self._all_cases(report, baseline)) def include_assertions(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.assertions for case in self._all_cases(report, baseline)) def include_evaluator_failures_column(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return self.include_evaluator_failures and any( case.evaluator_failures for case in self._all_cases(report, baseline) ) def _all_cases(self, report: EvaluationReport, baseline: EvaluationReport | None) -> list[ReportCase]: if not baseline: return report.cases else: return report.cases + self._baseline_cases_to_include(report, baseline) def _baseline_cases_to_include(self, report: EvaluationReport, baseline: EvaluationReport) -> list[ReportCase]: if self.include_removed_cases: return baseline.cases report_case_names = {case.name for case in report.cases} return [case for case in baseline.cases if case.name in report_case_names] def _get_case_renderer( self, report: EvaluationReport, baseline: EvaluationReport | None = None ) -> ReportCaseRenderer: input_renderer = _ValueRenderer.from_config(self.input_config) metadata_renderer = _ValueRenderer.from_config(self.metadata_config) output_renderer = _ValueRenderer.from_config(self.output_config) score_renderers = self._infer_score_renderers(report, baseline) label_renderers = self._infer_label_renderers(report, baseline) metric_renderers = self._infer_metric_renderers(report, baseline) duration_renderer = _NumberRenderer.infer_from_config( self.duration_config, 'duration', [x.task_duration for x in self._all_cases(report, baseline)] ) return ReportCaseRenderer( include_input=self.include_input, include_metadata=self.include_metadata, include_expected_output=self.include_expected_output, include_output=self.include_output, include_scores=self.include_scores(report, baseline), include_labels=self.include_labels(report, baseline), include_metrics=self.include_metrics(report, baseline), include_assertions=self.include_assertions(report, baseline), include_reasons=self.include_reasons, include_durations=self.include_durations, include_total_duration=self.include_total_duration, 
include_error_message=self.include_error_message, include_error_stacktrace=self.include_error_stacktrace, include_evaluator_failures=self.include_evaluator_failures_column(report, baseline), input_renderer=input_renderer, metadata_renderer=metadata_renderer, output_renderer=output_renderer, score_renderers=score_renderers, label_renderers=label_renderers, metric_renderers=metric_renderers, duration_renderer=duration_renderer, ) # TODO(DavidM): in v2, change the return type here to RenderableType def build_table(self, report: EvaluationReport, *, with_title: bool = True) -> Table: """Build a table for the report. Args: report: The evaluation report to render with_title: Whether to include the title in the table (default True) Returns: A Rich Table object """ case_renderer = self._get_case_renderer(report) title = f'Evaluation Summary: {report.name}' if with_title else '' table = case_renderer.build_base_table(title) for case in report.cases: table.add_row(*case_renderer.build_row(case)) if self.include_averages: # pragma: no branch average = report.averages() if average: # pragma: no branch table.add_row(*case_renderer.build_aggregate_row(average)) return table # TODO(DavidM): in v2, change the return type here to RenderableType def build_diff_table( self, report: EvaluationReport, baseline: EvaluationReport, *, with_title: bool = True ) -> Table: """Build a diff table comparing report to baseline. Args: report: The evaluation report to compare baseline: The baseline report to compare against with_title: Whether to include the title in the table (default True) Returns: A Rich Table object """ report_cases = report.cases baseline_cases = self._baseline_cases_to_include(report, baseline) report_cases_by_id = {case.name: case for case in report_cases} baseline_cases_by_id = {case.name: case for case in baseline_cases} diff_cases: list[tuple[ReportCase, ReportCase]] = [] removed_cases: list[ReportCase] = [] added_cases: list[ReportCase] = [] for case_id in sorted(set(baseline_cases_by_id.keys()) | set(report_cases_by_id.keys())): maybe_baseline_case = baseline_cases_by_id.get(case_id) maybe_report_case = report_cases_by_id.get(case_id) if maybe_baseline_case and maybe_report_case: diff_cases.append((maybe_baseline_case, maybe_report_case)) elif maybe_baseline_case: removed_cases.append(maybe_baseline_case) elif maybe_report_case: added_cases.append(maybe_report_case) else: # pragma: no cover assert False, 'This should be unreachable' case_renderer = self._get_case_renderer(report, baseline) diff_name = baseline.name if baseline.name == report.name else f'{baseline.name} → {report.name}' title = f'Evaluation Diff: {diff_name}' if with_title else '' table = case_renderer.build_base_table(title) for baseline_case, new_case in diff_cases: table.add_row(*case_renderer.build_diff_row(new_case, baseline_case)) for case in added_cases: row = case_renderer.build_row(case) row[0] = f'[green]+ Added Case[/]\n{row[0]}' table.add_row(*row) for case in removed_cases: row = case_renderer.build_row(case) row[0] = f'[red]- Removed Case[/]\n{row[0]}' table.add_row(*row) if self.include_averages: # pragma: no branch report_average = ReportCaseAggregate.average(report_cases) baseline_average = ReportCaseAggregate.average(baseline_cases) table.add_row(*case_renderer.build_diff_aggregate_row(report_average, baseline_average)) return table # TODO(DavidM): in v2, change the return type here to RenderableType def build_failures_table(self, report: EvaluationReport) -> Table: case_renderer = 
self._get_case_renderer(report) table = case_renderer.build_failures_table('Case Failures') for case in report.failures: table.add_row(*case_renderer.build_failure_row(case)) return table def _infer_score_renderers( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> dict[str, _NumberRenderer]: all_cases = self._all_cases(report, baseline) values_by_name: dict[str, list[float | int]] = {} for case in all_cases: for k, score in case.scores.items(): values_by_name.setdefault(k, []).append(score.value) all_renderers: dict[str, _NumberRenderer] = {} for name, values in values_by_name.items(): merged_config = _DEFAULT_NUMBER_CONFIG.copy() merged_config.update(self.score_configs.get(name, {})) all_renderers[name] = _NumberRenderer.infer_from_config(merged_config, 'score', values) return all_renderers def _infer_label_renderers( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> dict[str, _ValueRenderer]: all_cases = self._all_cases(report, baseline) all_names: set[str] = set() for case in all_cases: for k in case.labels: all_names.add(k) all_renderers: dict[str, _ValueRenderer] = {} for name in all_names: merged_config = _DEFAULT_VALUE_CONFIG.copy() merged_config.update(self.label_configs.get(name, {})) all_renderers[name] = _ValueRenderer.from_config(merged_config) return all_renderers def _infer_metric_renderers( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> dict[str, _NumberRenderer]: all_cases = self._all_cases(report, baseline) values_by_name: dict[str, list[float | int]] = {} for case in all_cases: for k, v in case.metrics.items(): values_by_name.setdefault(k, []).append(v) all_renderers: dict[str, _NumberRenderer] = {} for name, values in values_by_name.items(): merged_config = _DEFAULT_NUMBER_CONFIG.copy() merged_config.update(self.metric_configs.get(name, {})) all_renderers[name] = _NumberRenderer.infer_from_config(merged_config, 'metric', values) return all_renderers def _infer_duration_renderer( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> _NumberRenderer: # pragma: no cover all_cases = self._all_cases(report, baseline) all_durations = [x.task_duration for x in all_cases] if self.include_total_duration: all_durations += [x.total_duration for x in all_cases] return _NumberRenderer.infer_from_config(self.duration_config, 'duration', all_durations) ``` #### build_table ```python build_table( report: EvaluationReport, *, with_title: bool = True ) -> Table ``` Build a table for the report. Parameters: | Name | Type | Description | Default | | ------------ | ------------------ | -------------------------------------------------------- | ---------- | | `report` | `EvaluationReport` | The evaluation report to render | *required* | | `with_title` | `bool` | Whether to include the title in the table (default True) | `True` | Returns: | Type | Description | | ------- | ------------------- | | `Table` | A Rich Table object | Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def build_table(self, report: EvaluationReport, *, with_title: bool = True) -> Table: """Build a table for the report. 
Args: report: The evaluation report to render with_title: Whether to include the title in the table (default True) Returns: A Rich Table object """ case_renderer = self._get_case_renderer(report) title = f'Evaluation Summary: {report.name}' if with_title else '' table = case_renderer.build_base_table(title) for case in report.cases: table.add_row(*case_renderer.build_row(case)) if self.include_averages: # pragma: no branch average = report.averages() if average: # pragma: no branch table.add_row(*case_renderer.build_aggregate_row(average)) return table ``` #### build_diff_table ```python build_diff_table( report: EvaluationReport, baseline: EvaluationReport, *, with_title: bool = True ) -> Table ``` Build a diff table comparing report to baseline. Parameters: | Name | Type | Description | Default | | ------------ | ------------------ | -------------------------------------------------------- | ---------- | | `report` | `EvaluationReport` | The evaluation report to compare | *required* | | `baseline` | `EvaluationReport` | The baseline report to compare against | *required* | | `with_title` | `bool` | Whether to include the title in the table (default True) | `True` | Returns: | Type | Description | | ------- | ------------------- | | `Table` | A Rich Table object | Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def build_diff_table( self, report: EvaluationReport, baseline: EvaluationReport, *, with_title: bool = True ) -> Table: """Build a diff table comparing report to baseline. Args: report: The evaluation report to compare baseline: The baseline report to compare against with_title: Whether to include the title in the table (default True) Returns: A Rich Table object """ report_cases = report.cases baseline_cases = self._baseline_cases_to_include(report, baseline) report_cases_by_id = {case.name: case for case in report_cases} baseline_cases_by_id = {case.name: case for case in baseline_cases} diff_cases: list[tuple[ReportCase, ReportCase]] = [] removed_cases: list[ReportCase] = [] added_cases: list[ReportCase] = [] for case_id in sorted(set(baseline_cases_by_id.keys()) | set(report_cases_by_id.keys())): maybe_baseline_case = baseline_cases_by_id.get(case_id) maybe_report_case = report_cases_by_id.get(case_id) if maybe_baseline_case and maybe_report_case: diff_cases.append((maybe_baseline_case, maybe_report_case)) elif maybe_baseline_case: removed_cases.append(maybe_baseline_case) elif maybe_report_case: added_cases.append(maybe_report_case) else: # pragma: no cover assert False, 'This should be unreachable' case_renderer = self._get_case_renderer(report, baseline) diff_name = baseline.name if baseline.name == report.name else f'{baseline.name} → {report.name}' title = f'Evaluation Diff: {diff_name}' if with_title else '' table = case_renderer.build_base_table(title) for baseline_case, new_case in diff_cases: table.add_row(*case_renderer.build_diff_row(new_case, baseline_case)) for case in added_cases: row = case_renderer.build_row(case) row[0] = f'[green]+ Added Case[/]\n{row[0]}' table.add_row(*row) for case in removed_cases: row = case_renderer.build_row(case) row[0] = f'[red]- Removed Case[/]\n{row[0]}' table.add_row(*row) if self.include_averages: # pragma: no branch report_average = ReportCaseAggregate.average(report_cases) baseline_average = ReportCaseAggregate.average(baseline_cases) table.add_row(*case_renderer.build_diff_aggregate_row(report_average, baseline_average)) return table ``` # `pydantic_graph.beta` The next version of the 
pydantic-graph framework with enhanced graph execution capabilities. This module provides a parallel control flow graph execution framework with support for: - 'Step' nodes for task execution - 'Decision' nodes for conditional branching - 'Fork' nodes for parallel execution coordination - 'Join' nodes and 'Reducer's for re-joining parallel executions - Mermaid diagram generation for graph visualization ### Graph Bases: `Generic[StateT, DepsT, InputT, OutputT]` A complete graph definition ready for execution. The Graph class represents a complete workflow graph with typed inputs, outputs, state, and dependencies. It contains all nodes, edges, and metadata needed for execution. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | --------- | -------------------- | ---------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `InputT` | | The type of the input data | *required* | | `OutputT` | | The type of the output data | *required* | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python @dataclass(repr=False) class Graph(Generic[StateT, DepsT, InputT, OutputT]): """A complete graph definition ready for execution. The Graph class represents a complete workflow graph with typed inputs, outputs, state, and dependencies. It contains all nodes, edges, and metadata needed for execution. Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies InputT: The type of the input data OutputT: The type of the output data """ name: str | None """Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.""" state_type: type[StateT] """The type of the graph state.""" deps_type: type[DepsT] """The type of the dependencies.""" input_type: type[InputT] """The type of the input data.""" output_type: type[OutputT] """The type of the output data.""" auto_instrument: bool """Whether to automatically create instrumentation spans.""" nodes: dict[NodeID, AnyNode] """All nodes in the graph indexed by their ID.""" edges_by_source: dict[NodeID, list[Path]] """Outgoing paths from each source node.""" parent_forks: dict[JoinID, ParentFork[NodeID]] """Parent fork information for each join node.""" intermediate_join_nodes: dict[JoinID, set[JoinID]] """For each join, the set of other joins that appear between it and its parent fork. Used to determine which joins are "final" (have no other joins as intermediates) and which joins should preserve fork stacks when proceeding downstream.""" def get_parent_fork(self, join_id: JoinID) -> ParentFork[NodeID]: """Get the parent fork information for a join node. Args: join_id: The ID of the join node Returns: The parent fork information for the join Raises: RuntimeError: If the join ID is not found or has no parent fork """ result = self.parent_forks.get(join_id) if result is None: raise RuntimeError(f'Node {join_id} is not a join node or did not have a dominating fork (this is a bug)') return result def is_final_join(self, join_id: JoinID) -> bool: """Check if a join is 'final' (has no downstream joins with the same parent fork). A join is non-final if it appears as an intermediate node for another join with the same parent fork. 
Args: join_id: The ID of the join node Returns: True if the join is final, False if it's non-final """ # Check if this join appears in any other join's intermediate_join_nodes for intermediate_joins in self.intermediate_join_nodes.values(): if join_id in intermediate_joins: return False return True async def run( self, *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> OutputT: """Execute the graph and return the final output. This is the main entry point for graph execution. It runs the graph to completion and returns the final output value. Args: state: The graph state instance deps: The dependencies instance inputs: The input data for the graph span: Optional span for tracing/instrumentation infer_name: Whether to infer the graph name from the calling frame. Returns: The final output from the graph execution """ if infer_name and self.name is None: inferred_name = infer_obj_name(self, depth=2) if inferred_name is not None: # pragma: no branch self.name = inferred_name async with self.iter(state=state, deps=deps, inputs=inputs, span=span, infer_name=False) as graph_run: # Note: This would probably be better using `async for _ in graph_run`, but this tests the `next` method, # which I'm less confident will be implemented correctly if not used on the critical path. We can change it # once we have tests, etc. event: Any = None while True: try: event = await graph_run.next(event) except StopAsyncIteration: assert isinstance(event, EndMarker), 'Graph run should end with an EndMarker.' return cast(EndMarker[OutputT], event).value @asynccontextmanager async def iter( self, *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, OutputT]]: """Create an iterator for step-by-step graph execution. This method allows for more fine-grained control over graph execution, enabling inspection of intermediate states and results. Args: state: The graph state instance deps: The dependencies instance inputs: The input data for the graph span: Optional span for tracing/instrumentation infer_name: Whether to infer the graph name from the calling frame. Yields: A GraphRun instance that can be iterated for step-by-step execution """ if infer_name and self.name is None: inferred_name = infer_obj_name(self, depth=3) # depth=3 because asynccontextmanager adds one if inferred_name is not None: # pragma: no branch self.name = inferred_name with ExitStack() as stack: entered_span: AbstractSpan | None = None if span is None: if self.auto_instrument: entered_span = stack.enter_context(logfire_span('run graph {graph.name}', graph=self)) else: entered_span = stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) async with GraphRun[StateT, DepsT, OutputT]( graph=self, state=state, deps=deps, inputs=inputs, traceparent=traceparent, ) as graph_run: yield graph_run def render(self, *, title: str | None = None, direction: StateDiagramDirection | None = None) -> str: """Render the graph as a Mermaid diagram string. 
Args: title: Optional title for the diagram direction: Optional direction for the diagram layout Returns: A string containing the Mermaid diagram representation """ from pydantic_graph.beta.mermaid import build_mermaid_graph return build_mermaid_graph(self.nodes, self.edges_by_source).render(title=title, direction=direction) def __repr__(self) -> str: super_repr = super().__repr__() # include class and memory address # Insert the result of calling `__str__` before the final '>' in the repr return f'{super_repr[:-1]}\n{self}\n{super_repr[-1]}' def __str__(self) -> str: """Return a Mermaid diagram representation of the graph. Returns: A string containing the Mermaid diagram of the graph """ return self.render() ``` #### name ```python name: str | None ``` Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. #### state_type ```python state_type: type[StateT] ``` The type of the graph state. #### deps_type ```python deps_type: type[DepsT] ``` The type of the dependencies. #### input_type ```python input_type: type[InputT] ``` The type of the input data. #### output_type ```python output_type: type[OutputT] ``` The type of the output data. #### auto_instrument ```python auto_instrument: bool ``` Whether to automatically create instrumentation spans. #### nodes ```python nodes: dict[NodeID, AnyNode] ``` All nodes in the graph indexed by their ID. #### edges_by_source ```python edges_by_source: dict[NodeID, list[Path]] ``` Outgoing paths from each source node. #### parent_forks ```python parent_forks: dict[JoinID, ParentFork[NodeID]] ``` Parent fork information for each join node. #### intermediate_join_nodes ```python intermediate_join_nodes: dict[JoinID, set[JoinID]] ``` For each join, the set of other joins that appear between it and its parent fork. Used to determine which joins are "final" (have no other joins as intermediates) and which joins should preserve fork stacks when proceeding downstream. #### get_parent_fork ```python get_parent_fork(join_id: JoinID) -> ParentFork[NodeID] ``` Get the parent fork information for a join node. Parameters: | Name | Type | Description | Default | | --------- | -------- | ----------------------- | ---------- | | `join_id` | `JoinID` | The ID of the join node | *required* | Returns: | Type | Description | | -------------------- | ---------------------------------------- | | `ParentFork[NodeID]` | The parent fork information for the join | Raises: | Type | Description | | -------------- | ------------------------------------------------- | | `RuntimeError` | If the join ID is not found or has no parent fork | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python def get_parent_fork(self, join_id: JoinID) -> ParentFork[NodeID]: """Get the parent fork information for a join node. Args: join_id: The ID of the join node Returns: The parent fork information for the join Raises: RuntimeError: If the join ID is not found or has no parent fork """ result = self.parent_forks.get(join_id) if result is None: raise RuntimeError(f'Node {join_id} is not a join node or did not have a dominating fork (this is a bug)') return result ``` #### is_final_join ```python is_final_join(join_id: JoinID) -> bool ``` Check if a join is 'final' (has no downstream joins with the same parent fork). A join is non-final if it appears as an intermediate node for another join with the same parent fork. 
Parameters: | Name | Type | Description | Default | | --------- | -------- | ----------------------- | ---------- | | `join_id` | `JoinID` | The ID of the join node | *required* | Returns: | Type | Description | | ------ | -------------------------------------------------- | | `bool` | True if the join is final, False if it's non-final | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python def is_final_join(self, join_id: JoinID) -> bool: """Check if a join is 'final' (has no downstream joins with the same parent fork). A join is non-final if it appears as an intermediate node for another join with the same parent fork. Args: join_id: The ID of the join node Returns: True if the join is final, False if it's non-final """ # Check if this join appears in any other join's intermediate_join_nodes for intermediate_joins in self.intermediate_join_nodes.values(): if join_id in intermediate_joins: return False return True ``` #### run ```python run( *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: ( AbstractContextManager[AbstractSpan] | None ) = None, infer_name: bool = True ) -> OutputT ``` Execute the graph and return the final output. This is the main entry point for graph execution. It runs the graph to completion and returns the final output value. Parameters: | Name | Type | Description | Default | | ------------ | -------------------------------------- | ------------------------------------------------------- | ----------------------------------------- | | `state` | `StateT` | The graph state instance | `None` | | `deps` | `DepsT` | The dependencies instance | `None` | | `inputs` | `InputT` | The input data for the graph | `None` | | `span` | \`AbstractContextManager[AbstractSpan] | None\` | Optional span for tracing/instrumentation | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: | Type | Description | | --------- | ----------------------------------------- | | `OutputT` | The final output from the graph execution | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python async def run( self, *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> OutputT: """Execute the graph and return the final output. This is the main entry point for graph execution. It runs the graph to completion and returns the final output value. Args: state: The graph state instance deps: The dependencies instance inputs: The input data for the graph span: Optional span for tracing/instrumentation infer_name: Whether to infer the graph name from the calling frame. Returns: The final output from the graph execution """ if infer_name and self.name is None: inferred_name = infer_obj_name(self, depth=2) if inferred_name is not None: # pragma: no branch self.name = inferred_name async with self.iter(state=state, deps=deps, inputs=inputs, span=span, infer_name=False) as graph_run: # Note: This would probably be better using `async for _ in graph_run`, but this tests the `next` method, # which I'm less confident will be implemented correctly if not used on the critical path. We can change it # once we have tests, etc. event: Any = None while True: try: event = await graph_run.next(event) except StopAsyncIteration: assert isinstance(event, EndMarker), 'Graph run should end with an EndMarker.' 
return cast(EndMarker[OutputT], event).value ``` #### iter ```python iter( *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: ( AbstractContextManager[AbstractSpan] | None ) = None, infer_name: bool = True ) -> AsyncIterator[GraphRun[StateT, DepsT, OutputT]] ``` Create an iterator for step-by-step graph execution. This method allows for more fine-grained control over graph execution, enabling inspection of intermediate states and results. Parameters: | Name | Type | Description | Default | | ------------ | -------------------------------------- | ------------------------------------------------------- | ----------------------------------------- | | `state` | `StateT` | The graph state instance | `None` | | `deps` | `DepsT` | The dependencies instance | `None` | | `inputs` | `InputT` | The input data for the graph | `None` | | `span` | \`AbstractContextManager[AbstractSpan] | None\` | Optional span for tracing/instrumentation | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Yields: | Type | Description | | ------------------------------------------------- | ------------------------------------------------------------------- | | `AsyncIterator[GraphRun[StateT, DepsT, OutputT]]` | A GraphRun instance that can be iterated for step-by-step execution | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python @asynccontextmanager async def iter( self, *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, OutputT]]: """Create an iterator for step-by-step graph execution. This method allows for more fine-grained control over graph execution, enabling inspection of intermediate states and results. Args: state: The graph state instance deps: The dependencies instance inputs: The input data for the graph span: Optional span for tracing/instrumentation infer_name: Whether to infer the graph name from the calling frame. Yields: A GraphRun instance that can be iterated for step-by-step execution """ if infer_name and self.name is None: inferred_name = infer_obj_name(self, depth=3) # depth=3 because asynccontextmanager adds one if inferred_name is not None: # pragma: no branch self.name = inferred_name with ExitStack() as stack: entered_span: AbstractSpan | None = None if span is None: if self.auto_instrument: entered_span = stack.enter_context(logfire_span('run graph {graph.name}', graph=self)) else: entered_span = stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) async with GraphRun[StateT, DepsT, OutputT]( graph=self, state=state, deps=deps, inputs=inputs, traceparent=traceparent, ) as graph_run: yield graph_run ``` #### render ```python render( *, title: str | None = None, direction: StateDiagramDirection | None = None ) -> str ``` Render the graph as a Mermaid diagram string. 
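For example, rendering and running a small graph built with `GraphBuilder` might look like the following minimal sketch. The import path and the `greet`/`shout` step names are illustrative assumptions, not part of the reference above:

```python
import asyncio

from pydantic_graph.beta import GraphBuilder, StepContext  # assumed import path

builder = GraphBuilder(name='greeter', input_type=str, output_type=str)

@builder.step
async def greet(ctx: StepContext[None, None, str]) -> str:
    return f'Hello, {ctx.inputs}'

@builder.step
async def shout(ctx: StepContext[None, None, str]) -> str:
    return ctx.inputs.upper() + '!'

builder.add_edge(builder.start_node, greet)
builder.add_edge(greet, shout)
builder.add_edge(shout, builder.end_node)

graph = builder.build()
print(graph.render(title='Greeter'))  # Mermaid source, same as str(graph)
print(asyncio.run(graph.run(inputs='world')))  # expected to print 'HELLO, WORLD!'
```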
Parameters: | Name | Type | Description | Default | | ----------- | ----------------------- | ----------- | ----------------------------------------- | | `title` | \`str | None\` | Optional title for the diagram | | `direction` | \`StateDiagramDirection | None\` | Optional direction for the diagram layout | Returns: | Type | Description | | ----- | ------------------------------------------------------ | | `str` | A string containing the Mermaid diagram representation | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python def render(self, *, title: str | None = None, direction: StateDiagramDirection | None = None) -> str: """Render the graph as a Mermaid diagram string. Args: title: Optional title for the diagram direction: Optional direction for the diagram layout Returns: A string containing the Mermaid diagram representation """ from pydantic_graph.beta.mermaid import build_mermaid_graph return build_mermaid_graph(self.nodes, self.edges_by_source).render(title=title, direction=direction) ``` #### __str__ ```python __str__() -> str ``` Return a Mermaid diagram representation of the graph. Returns: | Type | Description | | ----- | ---------------------------------------------------- | | `str` | A string containing the Mermaid diagram of the graph | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python def __str__(self) -> str: """Return a Mermaid diagram representation of the graph. Returns: A string containing the Mermaid diagram of the graph """ return self.render() ``` ### GraphBuilder Bases: `Generic[StateT, DepsT, GraphInputT, GraphOutputT]` A builder for constructing executable graph definitions. GraphBuilder provides a fluent interface for defining nodes, edges, and routing in a graph workflow. It supports typed state, dependencies, and input/output validation. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | -------------- | -------------------- | --------------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `GraphInputT` | | The type of the graph input data | *required* | | `GraphOutputT` | | The type of the graph output data | *required* | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python @dataclass(init=False) class GraphBuilder(Generic[StateT, DepsT, GraphInputT, GraphOutputT]): """A builder for constructing executable graph definitions. GraphBuilder provides a fluent interface for defining nodes, edges, and routing in a graph workflow. It supports typed state, dependencies, and input/output validation. 
Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies GraphInputT: The type of the graph input data GraphOutputT: The type of the graph output data """ name: str | None """Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.""" state_type: TypeOrTypeExpression[StateT] """The type of the graph state.""" deps_type: TypeOrTypeExpression[DepsT] """The type of the dependencies.""" input_type: TypeOrTypeExpression[GraphInputT] """The type of the graph input data.""" output_type: TypeOrTypeExpression[GraphOutputT] """The type of the graph output data.""" auto_instrument: bool """Whether to automatically create instrumentation spans.""" _nodes: dict[NodeID, AnyNode] """Internal storage for nodes in the graph.""" _edges_by_source: dict[NodeID, list[Path]] """Internal storage for edges by source node.""" _decision_index: int """Counter for generating unique decision node IDs.""" Source = TypeAliasType('Source', SourceNode[StateT, DepsT, OutputT], type_params=(OutputT,)) Destination = TypeAliasType('Destination', DestinationNode[StateT, DepsT, InputT], type_params=(InputT,)) def __init__( self, *, name: str | None = None, state_type: TypeOrTypeExpression[StateT] = NoneType, deps_type: TypeOrTypeExpression[DepsT] = NoneType, input_type: TypeOrTypeExpression[GraphInputT] = NoneType, output_type: TypeOrTypeExpression[GraphOutputT] = NoneType, auto_instrument: bool = True, ): """Initialize a graph builder. Args: name: Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. state_type: The type of the graph state deps_type: The type of the dependencies input_type: The type of the graph input data output_type: The type of the graph output data auto_instrument: Whether to automatically create instrumentation spans """ self.name = name self.state_type = state_type self.deps_type = deps_type self.input_type = input_type self.output_type = output_type self.auto_instrument = auto_instrument self._nodes = {} self._edges_by_source = defaultdict(list) self._decision_index = 1 self._start_node = StartNode[GraphInputT]() self._end_node = EndNode[GraphOutputT]() # Node building @property def start_node(self) -> StartNode[GraphInputT]: """Get the start node for the graph. Returns: The start node that receives the initial graph input """ return self._start_node @property def end_node(self) -> EndNode[GraphOutputT]: """Get the end node for the graph. Returns: The end node that produces the final graph output """ return self._end_node @overload def step( self, *, node_id: str | None = None, label: str | None = None, ) -> Callable[[StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT]]: ... @overload def step( self, call: StepFunction[StateT, DepsT, InputT, OutputT], *, node_id: str | None = None, label: str | None = None, ) -> Step[StateT, DepsT, InputT, OutputT]: ... def step( self, call: StepFunction[StateT, DepsT, InputT, OutputT] | None = None, *, node_id: str | None = None, label: str | None = None, ) -> ( Step[StateT, DepsT, InputT, OutputT] | Callable[[StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT]] ): """Create a step from a step function. This method can be used as a decorator or called directly to create a step node from an async function. 
Args: call: The step function to wrap node_id: Optional ID for the node label: Optional human-readable label Returns: Either a Step instance or a decorator function """ if call is None: def decorator( func: StepFunction[StateT, DepsT, InputT, OutputT], ) -> Step[StateT, DepsT, InputT, OutputT]: return self.step(call=func, node_id=node_id, label=label) return decorator node_id = node_id or get_callable_name(call) step = Step[StateT, DepsT, InputT, OutputT](id=NodeID(node_id), call=call, label=label) return step @overload def stream( self, *, node_id: str | None = None, label: str | None = None, ) -> Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] ]: ... @overload def stream( self, call: StreamFunction[StateT, DepsT, InputT, OutputT], *, node_id: str | None = None, label: str | None = None, ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]: ... @overload def stream( self, call: StreamFunction[StateT, DepsT, InputT, OutputT] | None = None, *, node_id: str | None = None, label: str | None = None, ) -> ( Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ): ... def stream( self, call: StreamFunction[StateT, DepsT, InputT, OutputT] | None = None, *, node_id: str | None = None, label: str | None = None, ) -> ( Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ): """Create a step from an async iterator (which functions like a "stream"). This method can be used as a decorator or called directly to create a step node from an async function. Args: call: The step function to wrap node_id: Optional ID for the node label: Optional human-readable label Returns: Either a Step instance or a decorator function """ if call is None: def decorator( func: StreamFunction[StateT, DepsT, InputT, OutputT], ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]: return self.stream(call=func, node_id=node_id, label=label) return decorator # We need to wrap the call so that we can call `await` even though the result is an async iterator async def wrapper(ctx: StepContext[StateT, DepsT, InputT]): return call(ctx) node_id = node_id or get_callable_name(call) return self.step(call=wrapper, node_id=node_id, label=label) @overload def join( self, reducer: ReducerFunction[StateT, DepsT, InputT, OutputT], *, initial: OutputT, node_id: str | None = None, parent_fork_id: str | None = None, preferred_parent_fork: Literal['farthest', 'closest'] = 'farthest', ) -> Join[StateT, DepsT, InputT, OutputT]: ... @overload def join( self, reducer: ReducerFunction[StateT, DepsT, InputT, OutputT], *, initial_factory: Callable[[], OutputT], node_id: str | None = None, parent_fork_id: str | None = None, preferred_parent_fork: Literal['farthest', 'closest'] = 'farthest', ) -> Join[StateT, DepsT, InputT, OutputT]: ... 
def join( self, reducer: ReducerFunction[StateT, DepsT, InputT, OutputT], *, initial: OutputT | Unset = UNSET, initial_factory: Callable[[], OutputT] | Unset = UNSET, node_id: str | None = None, parent_fork_id: str | None = None, preferred_parent_fork: Literal['farthest', 'closest'] = 'farthest', ) -> Join[StateT, DepsT, InputT, OutputT]: if initial_factory is UNSET: initial_factory = lambda: initial # pyright: ignore[reportAssignmentType] # noqa: E731 return Join[StateT, DepsT, InputT, OutputT]( id=JoinID(NodeID(node_id or generate_placeholder_node_id(get_callable_name(reducer)))), reducer=reducer, initial_factory=cast(Callable[[], OutputT], initial_factory), parent_fork_id=ForkID(parent_fork_id) if parent_fork_id is not None else None, preferred_parent_fork=preferred_parent_fork, ) # Edge building def add(self, *edges: EdgePath[StateT, DepsT]) -> None: # noqa: C901 """Add one or more edge paths to the graph. This method processes edge paths and automatically creates any necessary fork nodes for broadcasts and maps. Args: *edges: The edge paths to add to the graph """ def _handle_path(p: Path): """Process a path and create necessary fork nodes. Args: p: The path to process """ for item in p.items: if isinstance(item, BroadcastMarker): new_node = Fork[Any, Any](id=item.fork_id, is_map=False, downstream_join_id=None) self._insert_node(new_node) for path in item.paths: _handle_path(Path(items=[*path.items])) elif isinstance(item, MapMarker): new_node = Fork[Any, Any](id=item.fork_id, is_map=True, downstream_join_id=item.downstream_join_id) self._insert_node(new_node) elif isinstance(item, DestinationMarker): pass def _handle_destination_node(d: AnyDestinationNode): if id(d) in destination_ids: return # prevent infinite recursion if there is a cycle of decisions destination_ids.add(id(d)) destinations.append(d) self._insert_node(d) if isinstance(d, Decision): for branch in d.branches: _handle_path(branch.path) for d2 in branch.destinations: _handle_destination_node(d2) destination_ids = set[int]() destinations: list[AnyDestinationNode] = [] for edge in edges: for source_node in edge.sources: self._insert_node(source_node) self._edges_by_source[source_node.id].append(edge.path) for destination_node in edge.destinations: _handle_destination_node(destination_node) _handle_path(edge.path) # Automatically create edges from step function return hints including `BaseNode`s for destination in destinations: if not isinstance(destination, Step) or isinstance(destination, NodeStep): continue parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) type_hints = get_type_hints(destination.call, localns=parent_namespace, include_extras=True) try: return_hint = type_hints['return'] except KeyError: pass else: edge = self._edge_from_return_hint(destination, return_hint) if edge is not None: self.add(edge) def add_edge(self, source: Source[T], destination: Destination[T], *, label: str | None = None) -> None: """Add a simple edge between two nodes. Args: source: The source node destination: The destination node label: Optional label for the edge """ builder = self.edge_from(source) if label is not None: builder = builder.label(label) self.add(builder.to(destination)) def add_mapping_edge( self, source: Source[Iterable[T]], map_to: Destination[T], *, pre_map_label: str | None = None, post_map_label: str | None = None, fork_id: ForkID | None = None, downstream_join_id: JoinID | None = None, ) -> None: """Add an edge that maps iterable data across parallel paths. 
Args: source: The source node that produces iterable data map_to: The destination node that receives individual items pre_map_label: Optional label before the map operation post_map_label: Optional label after the map operation fork_id: Optional ID for the fork node produced for this map operation downstream_join_id: Optional ID of a join node that will always be downstream of this map. Specifying this ensures correct handling if you try to map an empty iterable. """ builder = self.edge_from(source) if pre_map_label is not None: builder = builder.label(pre_map_label) builder = builder.map(fork_id=fork_id, downstream_join_id=downstream_join_id) if post_map_label is not None: builder = builder.label(post_map_label) self.add(builder.to(map_to)) # TODO(DavidM): Support adding subgraphs; I think this behaves like a step with the same inputs/outputs but gets rendered as a subgraph in mermaid def edge_from(self, *sources: Source[SourceOutputT]) -> EdgePathBuilder[StateT, DepsT, SourceOutputT]: """Create an edge path builder starting from the given source nodes. Args: *sources: The source nodes to start the edge path from Returns: An EdgePathBuilder for constructing the complete edge path """ return EdgePathBuilder[StateT, DepsT, SourceOutputT]( sources=sources, path_builder=PathBuilder(working_items=[]) ) def decision(self, *, note: str | None = None, node_id: str | None = None) -> Decision[StateT, DepsT, Never]: """Create a new decision node. Args: note: Optional note to describe the decision logic node_id: Optional ID for the node produced for this decision logic Returns: A new Decision node with no branches """ return Decision(id=NodeID(node_id or generate_placeholder_node_id('decision')), branches=[], note=note) def match( self, source: TypeOrTypeExpression[SourceT], *, matches: Callable[[Any], bool] | None = None, ) -> DecisionBranchBuilder[StateT, DepsT, SourceT, SourceT, Never]: """Create a decision branch matcher. Args: source: The type or type expression to match against matches: Optional custom matching function Returns: A DecisionBranchBuilder for constructing the branch """ # Note, the following node_id really is just a placeholder and shouldn't end up in the final graph # This is why we don't expose a way for end users to override the value used here. node_id = NodeID(generate_placeholder_node_id('match_decision')) decision = Decision[StateT, DepsT, Never](id=node_id, branches=[], note=None) new_path_builder = PathBuilder[StateT, DepsT, SourceT](working_items=[]) return DecisionBranchBuilder(decision=decision, source=source, matches=matches, path_builder=new_path_builder) def match_node( self, source: type[SourceNodeT], *, matches: Callable[[Any], bool] | None = None, ) -> DecisionBranch[SourceNodeT]: """Create a decision branch for BaseNode subclasses. This is similar to match() but specifically designed for matching against BaseNode types from the v1 system. Args: source: The BaseNode subclass to match against matches: Optional custom matching function Returns: A DecisionBranch for the BaseNode type """ node = NodeStep(source) path = Path(items=[DestinationMarker(node.id)]) return DecisionBranch(source=source, matches=matches, path=path, destinations=[node]) def node( self, node_type: type[BaseNode[StateT, DepsT, GraphOutputT]], ) -> EdgePath[StateT, DepsT]: """Create an edge path from a BaseNode class. This method integrates v1-style BaseNode classes into the v2 graph system by analyzing their type hints and creating appropriate edges. 
Args: node_type: The BaseNode subclass to integrate Returns: An EdgePath representing the node and its connections Raises: GraphSetupError: If the node type is missing required type hints """ parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) type_hints = get_type_hints(node_type.run, localns=parent_namespace, include_extras=True) try: return_hint = type_hints['return'] except KeyError as e: # pragma: no cover raise exceptions.GraphSetupError( f'Node {node_type} is missing a return type hint on its `run` method' ) from e node = NodeStep(node_type) edge = self._edge_from_return_hint(node, return_hint) if not edge: # pragma: no cover raise exceptions.GraphSetupError(f'Node {node_type} is missing a return type hint on its `run` method') return edge # Helpers def _insert_node(self, node: AnyNode) -> None: """Insert a node into the graph, checking for ID conflicts. Args: node: The node to insert Raises: ValueError: If a different node with the same ID already exists """ existing = self._nodes.get(node.id) if existing is None: self._nodes[node.id] = node elif isinstance(existing, NodeStep) and isinstance(node, NodeStep) and existing.node_type is node.node_type: pass elif existing is not node: raise GraphBuildingError( f'All nodes must have unique node IDs. {node.id!r} was the ID for {existing} and {node}' ) def _edge_from_return_hint( self, node: SourceNode[StateT, DepsT, Any], return_hint: TypeOrTypeExpression[Any] ) -> EdgePath[StateT, DepsT] | None: """Create edges from a return type hint. This method analyzes return type hints from step functions or node methods to automatically create appropriate edges in the graph. Args: node: The source node return_hint: The return type hint to analyze Returns: An EdgePath if edges can be inferred, None otherwise Raises: GraphSetupError: If the return type hint is invalid or incomplete """ destinations: list[AnyDestinationNode] = [] union_args = _utils.get_union_args(return_hint) for return_type in union_args: return_type, annotations = _utils.unpack_annotated(return_type) return_type_origin = get_origin(return_type) or return_type if return_type_origin is End: destinations.append(self.end_node) elif return_type_origin is BaseNode: raise exceptions.GraphSetupError( # pragma: no cover f'Node {node} return type hint includes a plain `BaseNode`. ' 'Edge inference requires each possible returned `BaseNode` subclass to be listed explicitly.' ) elif return_type_origin is StepNode: step = cast( Step[StateT, DepsT, Any, Any] | None, next((a for a in annotations if isinstance(a, Step)), None), # pyright: ignore[reportUnknownArgumentType] ) if step is None: raise exceptions.GraphSetupError( # pragma: no cover f'Node {node} return type hint includes a `StepNode` without a `Step` annotation. ' 'When returning `my_step.as_node()`, use `Annotated[StepNode[StateT, DepsT], my_step]` as the return type hint.' ) destinations.append(step) elif return_type_origin is JoinNode: join = cast( Join[StateT, DepsT, Any, Any] | None, next((a for a in annotations if isinstance(a, Join)), None), # pyright: ignore[reportUnknownArgumentType] ) if join is None: raise exceptions.GraphSetupError( # pragma: no cover f'Node {node} return type hint includes a `JoinNode` without a `Join` annotation. ' 'When returning `my_join.as_node()`, use `Annotated[JoinNode[StateT, DepsT], my_join]` as the return type hint.' 
) destinations.append(join) elif inspect.isclass(return_type_origin) and issubclass(return_type_origin, BaseNode): destinations.append(NodeStep(return_type)) if len(destinations) < len(union_args): # Only build edges if all the return types are nodes return None edge = self.edge_from(node) if len(destinations) == 1: return edge.to(destinations[0]) else: decision = self.decision() for destination in destinations: # We don't actually use this decision mechanism, but we need to build the edges for parent-fork finding decision = decision.branch(self.match(NoneType).to(destination)) return edge.to(decision) # Graph building def build(self, validate_graph_structure: bool = True) -> Graph[StateT, DepsT, GraphInputT, GraphOutputT]: """Build the final executable graph from the accumulated nodes and edges. This method performs validation, normalization, and analysis of the graph structure to create a complete, executable graph instance. Args: validate_graph_structure: whether to perform validation of the graph structure See the docstring of _validate_graph_structure below for more details. Returns: A complete Graph instance ready for execution Raises: ValueError: If the graph structure is invalid (e.g., join without parent fork) """ nodes = self._nodes edges_by_source = self._edges_by_source nodes, edges_by_source = _replace_placeholder_node_ids(nodes, edges_by_source) nodes, edges_by_source = _flatten_paths(nodes, edges_by_source) nodes, edges_by_source = _normalize_forks(nodes, edges_by_source) if validate_graph_structure: _validate_graph_structure(nodes, edges_by_source) parent_forks = _collect_dominating_forks(nodes, edges_by_source) intermediate_join_nodes = _compute_intermediate_join_nodes(nodes, parent_forks) return Graph[StateT, DepsT, GraphInputT, GraphOutputT]( name=self.name, state_type=unpack_type_expression(self.state_type), deps_type=unpack_type_expression(self.deps_type), input_type=unpack_type_expression(self.input_type), output_type=unpack_type_expression(self.output_type), nodes=nodes, edges_by_source=edges_by_source, parent_forks=parent_forks, intermediate_join_nodes=intermediate_join_nodes, auto_instrument=self.auto_instrument, ) ``` #### name ```python name: str | None = name ``` Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. #### state_type ```python state_type: TypeOrTypeExpression[StateT] = state_type ``` The type of the graph state. #### deps_type ```python deps_type: TypeOrTypeExpression[DepsT] = deps_type ``` The type of the dependencies. #### input_type ```python input_type: TypeOrTypeExpression[GraphInputT] = input_type ``` The type of the graph input data. #### output_type ```python output_type: TypeOrTypeExpression[GraphOutputT] = ( output_type ) ``` The type of the graph output data. #### auto_instrument ```python auto_instrument: bool = auto_instrument ``` Whether to automatically create instrumentation spans. #### __init__ ```python __init__( *, name: str | None = None, state_type: TypeOrTypeExpression[StateT] = NoneType, deps_type: TypeOrTypeExpression[DepsT] = NoneType, input_type: TypeOrTypeExpression[ GraphInputT ] = NoneType, output_type: TypeOrTypeExpression[ GraphOutputT ] = NoneType, auto_instrument: bool = True ) ``` Initialize a graph builder. 
Parameters: | Name | Type | Description | Default | | ----------------- | ------------------------------------ | ----------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- | | `name` | \`str | None\` | Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. | | `state_type` | `TypeOrTypeExpression[StateT]` | The type of the graph state | `NoneType` | | `deps_type` | `TypeOrTypeExpression[DepsT]` | The type of the dependencies | `NoneType` | | `input_type` | `TypeOrTypeExpression[GraphInputT]` | The type of the graph input data | `NoneType` | | `output_type` | `TypeOrTypeExpression[GraphOutputT]` | The type of the graph output data | `NoneType` | | `auto_instrument` | `bool` | Whether to automatically create instrumentation spans | `True` | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def __init__( self, *, name: str | None = None, state_type: TypeOrTypeExpression[StateT] = NoneType, deps_type: TypeOrTypeExpression[DepsT] = NoneType, input_type: TypeOrTypeExpression[GraphInputT] = NoneType, output_type: TypeOrTypeExpression[GraphOutputT] = NoneType, auto_instrument: bool = True, ): """Initialize a graph builder. Args: name: Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. state_type: The type of the graph state deps_type: The type of the dependencies input_type: The type of the graph input data output_type: The type of the graph output data auto_instrument: Whether to automatically create instrumentation spans """ self.name = name self.state_type = state_type self.deps_type = deps_type self.input_type = input_type self.output_type = output_type self.auto_instrument = auto_instrument self._nodes = {} self._edges_by_source = defaultdict(list) self._decision_index = 1 self._start_node = StartNode[GraphInputT]() self._end_node = EndNode[GraphOutputT]() ``` #### start_node ```python start_node: StartNode[GraphInputT] ``` Get the start node for the graph. Returns: | Type | Description | | ------------------------ | ---------------------------------------------------- | | `StartNode[GraphInputT]` | The start node that receives the initial graph input | #### end_node ```python end_node: EndNode[GraphOutputT] ``` Get the end node for the graph. Returns: | Type | Description | | ----------------------- | ------------------------------------------------- | | `EndNode[GraphOutputT]` | The end node that produces the final graph output | #### step ```python step( *, node_id: str | None = None, label: str | None = None ) -> Callable[ [StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT], ] ``` ```python step( call: StepFunction[StateT, DepsT, InputT, OutputT], *, node_id: str | None = None, label: str | None = None ) -> Step[StateT, DepsT, InputT, OutputT] ``` ```python step( call: ( StepFunction[StateT, DepsT, InputT, OutputT] | None ) = None, *, node_id: str | None = None, label: str | None = None ) -> ( Step[StateT, DepsT, InputT, OutputT] | Callable[ [StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT], ] ) ``` Create a step from a step function. This method can be used as a decorator or called directly to create a step node from an async function. 
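As a rough illustration of both forms (invented step names; assumes `GraphBuilder` and `StepContext` are importable from `pydantic_graph.beta`):

```python
from pydantic_graph.beta import GraphBuilder, StepContext  # assumed import path

builder = GraphBuilder(input_type=int, output_type=int)

# Decorator form: the async function becomes a Step node named after the function.
@builder.step
async def double(ctx: StepContext[None, None, int]) -> int:
    return ctx.inputs * 2

# Direct-call form, with an explicit node ID and human-readable label.
async def _increment(ctx: StepContext[None, None, int]) -> int:
    return ctx.inputs + 1

increment = builder.step(_increment, node_id='increment', label='Add one')
```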
Parameters: | Name | Type | Description | Default | | --------- | ---------------------------------------------- | ----------- | ----------------------------- | | `call` | \`StepFunction[StateT, DepsT, InputT, OutputT] | None\` | The step function to wrap | | `node_id` | \`str | None\` | Optional ID for the node | | `label` | \`str | None\` | Optional human-readable label | Returns: | Type | Description | | -------------------------------------- | ---------------------------------------------------------------------------------------------------- | | \`Step[StateT, DepsT, InputT, OutputT] | Callable\[\[StepFunction[StateT, DepsT, InputT, OutputT]\], Step[StateT, DepsT, InputT, OutputT]\]\` | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def step( self, call: StepFunction[StateT, DepsT, InputT, OutputT] | None = None, *, node_id: str | None = None, label: str | None = None, ) -> ( Step[StateT, DepsT, InputT, OutputT] | Callable[[StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT]] ): """Create a step from a step function. This method can be used as a decorator or called directly to create a step node from an async function. Args: call: The step function to wrap node_id: Optional ID for the node label: Optional human-readable label Returns: Either a Step instance or a decorator function """ if call is None: def decorator( func: StepFunction[StateT, DepsT, InputT, OutputT], ) -> Step[StateT, DepsT, InputT, OutputT]: return self.step(call=func, node_id=node_id, label=label) return decorator node_id = node_id or get_callable_name(call) step = Step[StateT, DepsT, InputT, OutputT](id=NodeID(node_id), call=call, label=label) return step ``` #### stream ```python stream( *, node_id: str | None = None, label: str | None = None ) -> Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ``` ```python stream( call: StreamFunction[StateT, DepsT, InputT, OutputT], *, node_id: str | None = None, label: str | None = None ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] ``` ```python stream( call: ( StreamFunction[StateT, DepsT, InputT, OutputT] | None ) = None, *, node_id: str | None = None, label: str | None = None ) -> ( Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ) ``` ```python stream( call: ( StreamFunction[StateT, DepsT, InputT, OutputT] | None ) = None, *, node_id: str | None = None, label: str | None = None ) -> ( Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ) ``` Create a step from an async iterator (which functions like a "stream"). This method can be used as a decorator or called directly to create a step node from an async function. 
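For instance, an async generator step and a downstream consumer of its `AsyncIterable` output might be sketched like this (illustrative names; assumed import path):

```python
from collections.abc import AsyncIterable, AsyncIterator

from pydantic_graph.beta import GraphBuilder, StepContext  # assumed import path

builder = GraphBuilder(input_type=int, output_type=int)

# An async generator function becomes a Step whose output type is AsyncIterable[int].
@builder.stream
async def count_up(ctx: StepContext[None, None, int]) -> AsyncIterator[int]:
    for i in range(ctx.inputs):
        yield i

# A downstream step receives the async iterable as its input and consumes it.
@builder.step
async def total(ctx: StepContext[None, None, AsyncIterable[int]]) -> int:
    return sum([i async for i in ctx.inputs])

builder.add_edge(count_up, total)
```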
Parameters: | Name | Type | Description | Default | | --------- | ------------------------------------------------ | ----------- | ----------------------------- | | `call` | \`StreamFunction[StateT, DepsT, InputT, OutputT] | None\` | The step function to wrap | | `node_id` | \`str | None\` | Optional ID for the node | | `label` | \`str | None\` | Optional human-readable label | Returns: | Type | Description | | ------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | | \`Step\[StateT, DepsT, InputT, AsyncIterable[OutputT]\] | Callable\[\[StreamFunction[StateT, DepsT, InputT, OutputT]\], Step\[StateT, DepsT, InputT, AsyncIterable[OutputT]\]\]\` | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def stream( self, call: StreamFunction[StateT, DepsT, InputT, OutputT] | None = None, *, node_id: str | None = None, label: str | None = None, ) -> ( Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ): """Create a step from an async iterator (which functions like a "stream"). This method can be used as a decorator or called directly to create a step node from an async function. Args: call: The step function to wrap node_id: Optional ID for the node label: Optional human-readable label Returns: Either a Step instance or a decorator function """ if call is None: def decorator( func: StreamFunction[StateT, DepsT, InputT, OutputT], ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]: return self.stream(call=func, node_id=node_id, label=label) return decorator # We need to wrap the call so that we can call `await` even though the result is an async iterator async def wrapper(ctx: StepContext[StateT, DepsT, InputT]): return call(ctx) node_id = node_id or get_callable_name(call) return self.step(call=wrapper, node_id=node_id, label=label) ``` #### add ```python add(*edges: EdgePath[StateT, DepsT]) -> None ``` Add one or more edge paths to the graph. This method processes edge paths and automatically creates any necessary fork nodes for broadcasts and maps. Parameters: | Name | Type | Description | Default | | -------- | ------------------------- | ---------------------------------- | ------- | | `*edges` | `EdgePath[StateT, DepsT]` | The edge paths to add to the graph | `()` | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def add(self, *edges: EdgePath[StateT, DepsT]) -> None: # noqa: C901 """Add one or more edge paths to the graph. This method processes edge paths and automatically creates any necessary fork nodes for broadcasts and maps. Args: *edges: The edge paths to add to the graph """ def _handle_path(p: Path): """Process a path and create necessary fork nodes. 
Args: p: The path to process """ for item in p.items: if isinstance(item, BroadcastMarker): new_node = Fork[Any, Any](id=item.fork_id, is_map=False, downstream_join_id=None) self._insert_node(new_node) for path in item.paths: _handle_path(Path(items=[*path.items])) elif isinstance(item, MapMarker): new_node = Fork[Any, Any](id=item.fork_id, is_map=True, downstream_join_id=item.downstream_join_id) self._insert_node(new_node) elif isinstance(item, DestinationMarker): pass def _handle_destination_node(d: AnyDestinationNode): if id(d) in destination_ids: return # prevent infinite recursion if there is a cycle of decisions destination_ids.add(id(d)) destinations.append(d) self._insert_node(d) if isinstance(d, Decision): for branch in d.branches: _handle_path(branch.path) for d2 in branch.destinations: _handle_destination_node(d2) destination_ids = set[int]() destinations: list[AnyDestinationNode] = [] for edge in edges: for source_node in edge.sources: self._insert_node(source_node) self._edges_by_source[source_node.id].append(edge.path) for destination_node in edge.destinations: _handle_destination_node(destination_node) _handle_path(edge.path) # Automatically create edges from step function return hints including `BaseNode`s for destination in destinations: if not isinstance(destination, Step) or isinstance(destination, NodeStep): continue parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) type_hints = get_type_hints(destination.call, localns=parent_namespace, include_extras=True) try: return_hint = type_hints['return'] except KeyError: pass else: edge = self._edge_from_return_hint(destination, return_hint) if edge is not None: self.add(edge) ``` #### add_edge ```python add_edge( source: Source[T], destination: Destination[T], *, label: str | None = None ) -> None ``` Add a simple edge between two nodes. Parameters: | Name | Type | Description | Default | | ------------- | ---------------- | -------------------- | --------------------------- | | `source` | `Source[T]` | The source node | *required* | | `destination` | `Destination[T]` | The destination node | *required* | | `label` | \`str | None\` | Optional label for the edge | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def add_edge(self, source: Source[T], destination: Destination[T], *, label: str | None = None) -> None: """Add a simple edge between two nodes. Args: source: The source node destination: The destination node label: Optional label for the edge """ builder = self.edge_from(source) if label is not None: builder = builder.label(label) self.add(builder.to(destination)) ``` #### add_mapping_edge ```python add_mapping_edge( source: Source[Iterable[T]], map_to: Destination[T], *, pre_map_label: str | None = None, post_map_label: str | None = None, fork_id: ForkID | None = None, downstream_join_id: JoinID | None = None ) -> None ``` Add an edge that maps iterable data across parallel paths. 
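A partial sketch of the fan-out this creates (hypothetical step names; assumed import path; re-joining the per-item results is omitted):

```python
from pydantic_graph.beta import GraphBuilder, StepContext  # assumed import path

builder = GraphBuilder()

@builder.step
async def list_urls(ctx: StepContext[None, None, None]) -> list[str]:
    return ['https://example.com/a', 'https://example.com/b']

@builder.step
async def fetch(ctx: StepContext[None, None, str]) -> str:
    return f'fetched {ctx.inputs}'

# Each string produced by list_urls is dispatched to fetch on its own parallel branch.
builder.add_mapping_edge(list_urls, fetch, pre_map_label='urls', post_map_label='one url')
```

The per-branch results would typically be gathered back together with a `Join` created via `builder.join(...)`; that wiring is not shown here.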
Parameters: | Name | Type | Description | Default | | -------------------- | --------------------- | --------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | | `source` | `Source[Iterable[T]]` | The source node that produces iterable data | *required* | | `map_to` | `Destination[T]` | The destination node that receives individual items | *required* | | `pre_map_label` | \`str | None\` | Optional label before the map operation | | `post_map_label` | \`str | None\` | Optional label after the map operation | | `fork_id` | \`ForkID | None\` | Optional ID for the fork node produced for this map operation | | `downstream_join_id` | \`JoinID | None\` | Optional ID of a join node that will always be downstream of this map. Specifying this ensures correct handling if you try to map an empty iterable. | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def add_mapping_edge( self, source: Source[Iterable[T]], map_to: Destination[T], *, pre_map_label: str | None = None, post_map_label: str | None = None, fork_id: ForkID | None = None, downstream_join_id: JoinID | None = None, ) -> None: """Add an edge that maps iterable data across parallel paths. Args: source: The source node that produces iterable data map_to: The destination node that receives individual items pre_map_label: Optional label before the map operation post_map_label: Optional label after the map operation fork_id: Optional ID for the fork node produced for this map operation downstream_join_id: Optional ID of a join node that will always be downstream of this map. Specifying this ensures correct handling if you try to map an empty iterable. """ builder = self.edge_from(source) if pre_map_label is not None: builder = builder.label(pre_map_label) builder = builder.map(fork_id=fork_id, downstream_join_id=downstream_join_id) if post_map_label is not None: builder = builder.label(post_map_label) self.add(builder.to(map_to)) ``` #### edge_from ```python edge_from( *sources: Source[SourceOutputT], ) -> EdgePathBuilder[StateT, DepsT, SourceOutputT] ``` Create an edge path builder starting from the given source nodes. Parameters: | Name | Type | Description | Default | | ---------- | ----------------------- | -------------------------------------------- | ------- | | `*sources` | `Source[SourceOutputT]` | The source nodes to start the edge path from | `()` | Returns: | Type | Description | | ----------------------------------------------- | ---------------------------------------------------------- | | `EdgePathBuilder[StateT, DepsT, SourceOutputT]` | An EdgePathBuilder for constructing the complete edge path | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def edge_from(self, *sources: Source[SourceOutputT]) -> EdgePathBuilder[StateT, DepsT, SourceOutputT]: """Create an edge path builder starting from the given source nodes. Args: *sources: The source nodes to start the edge path from Returns: An EdgePathBuilder for constructing the complete edge path """ return EdgePathBuilder[StateT, DepsT, SourceOutputT]( sources=sources, path_builder=PathBuilder(working_items=[]) ) ``` #### decision ```python decision( *, note: str | None = None, node_id: str | None = None ) -> Decision[StateT, DepsT, Never] ``` Create a new decision node. 
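Decision nodes are typically populated with branches built via `match` and then used as an edge destination. A rough sketch (invented step names; assumed import path):

```python
from pydantic_graph.beta import GraphBuilder, StepContext  # assumed import path

builder = GraphBuilder(input_type=int, output_type=str)

@builder.step
async def classify(ctx: StepContext[None, None, int]) -> int | str:
    # Odd inputs stay ints, even inputs become strings, purely to give two branch types.
    return ctx.inputs if ctx.inputs % 2 else str(ctx.inputs)

@builder.step
async def handle_odd(ctx: StepContext[None, None, int]) -> str:
    return f'odd: {ctx.inputs}'

@builder.step
async def handle_even(ctx: StepContext[None, None, str]) -> str:
    return f'even: {ctx.inputs}'

# Route on the runtime type of classify's output.
route = builder.decision(note='odd vs. even')
route = route.branch(builder.match(int).to(handle_odd))
route = route.branch(builder.match(str).to(handle_even))
builder.add(builder.edge_from(classify).to(route))
```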
Parameters: | Name | Type | Description | Default | | --------- | ----- | ----------- | --------------------------------------------------------- | | `note` | \`str | None\` | Optional note to describe the decision logic | | `node_id` | \`str | None\` | Optional ID for the node produced for this decision logic | Returns: | Type | Description | | -------------------------------- | ------------------------------------ | | `Decision[StateT, DepsT, Never]` | A new Decision node with no branches | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def decision(self, *, note: str | None = None, node_id: str | None = None) -> Decision[StateT, DepsT, Never]: """Create a new decision node. Args: note: Optional note to describe the decision logic node_id: Optional ID for the node produced for this decision logic Returns: A new Decision node with no branches """ return Decision(id=NodeID(node_id or generate_placeholder_node_id('decision')), branches=[], note=note) ``` #### match ```python match( source: TypeOrTypeExpression[SourceT], *, matches: Callable[[Any], bool] | None = None ) -> DecisionBranchBuilder[ StateT, DepsT, SourceT, SourceT, Never ] ``` Create a decision branch matcher. Parameters: | Name | Type | Description | Default | | --------- | ------------------------------- | -------------------------------------------- | --------------------------------- | | `source` | `TypeOrTypeExpression[SourceT]` | The type or type expression to match against | *required* | | `matches` | \`Callable\[[Any], bool\] | None\` | Optional custom matching function | Returns: | Type | Description | | --------------------------------------------------------------- | --------------------------------------------------- | | `DecisionBranchBuilder[StateT, DepsT, SourceT, SourceT, Never]` | A DecisionBranchBuilder for constructing the branch | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def match( self, source: TypeOrTypeExpression[SourceT], *, matches: Callable[[Any], bool] | None = None, ) -> DecisionBranchBuilder[StateT, DepsT, SourceT, SourceT, Never]: """Create a decision branch matcher. Args: source: The type or type expression to match against matches: Optional custom matching function Returns: A DecisionBranchBuilder for constructing the branch """ # Note, the following node_id really is just a placeholder and shouldn't end up in the final graph # This is why we don't expose a way for end users to override the value used here. node_id = NodeID(generate_placeholder_node_id('match_decision')) decision = Decision[StateT, DepsT, Never](id=node_id, branches=[], note=None) new_path_builder = PathBuilder[StateT, DepsT, SourceT](working_items=[]) return DecisionBranchBuilder(decision=decision, source=source, matches=matches, path_builder=new_path_builder) ``` #### match_node ```python match_node( source: type[SourceNodeT], *, matches: Callable[[Any], bool] | None = None ) -> DecisionBranch[SourceNodeT] ``` Create a decision branch for BaseNode subclasses. This is similar to match() but specifically designed for matching against BaseNode types from the v1 system. 
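A small sketch of routing to a v1-style node (the `LegacyNode` class is invented; `BaseNode`, `End`, and `GraphRunContext` are the existing pydantic-graph types, and the `pydantic_graph.beta` import path is assumed):

```python
from dataclasses import dataclass

from pydantic_graph import BaseNode, End, GraphRunContext
from pydantic_graph.beta import GraphBuilder  # assumed import path

builder = GraphBuilder(output_type=str)

@dataclass
class LegacyNode(BaseNode[None, None, str]):
    """A v1-style node that ends the run."""

    async def run(self, ctx: GraphRunContext[None, None]) -> End[str]:
        return End('handled by the legacy node')

# A branch that fires when the routed value is a LegacyNode instance.
route = builder.decision(note='fall back to the v1 node').branch(builder.match_node(LegacyNode))
```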
Parameters: | Name | Type | Description | Default | | --------- | ------------------------- | -------------------------------------- | --------------------------------- | | `source` | `type[SourceNodeT]` | The BaseNode subclass to match against | *required* | | `matches` | \`Callable\[[Any], bool\] | None\` | Optional custom matching function | Returns: | Type | Description | | ----------------------------- | -------------------------------------- | | `DecisionBranch[SourceNodeT]` | A DecisionBranch for the BaseNode type | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def match_node( self, source: type[SourceNodeT], *, matches: Callable[[Any], bool] | None = None, ) -> DecisionBranch[SourceNodeT]: """Create a decision branch for BaseNode subclasses. This is similar to match() but specifically designed for matching against BaseNode types from the v1 system. Args: source: The BaseNode subclass to match against matches: Optional custom matching function Returns: A DecisionBranch for the BaseNode type """ node = NodeStep(source) path = Path(items=[DestinationMarker(node.id)]) return DecisionBranch(source=source, matches=matches, path=path, destinations=[node]) ``` #### node ```python node( node_type: type[BaseNode[StateT, DepsT, GraphOutputT]], ) -> EdgePath[StateT, DepsT] ``` Create an edge path from a BaseNode class. This method integrates v1-style BaseNode classes into the v2 graph system by analyzing their type hints and creating appropriate edges. Parameters: | Name | Type | Description | Default | | ----------- | --------------------------------------------- | ---------------------------------- | ---------- | | `node_type` | `type[BaseNode[StateT, DepsT, GraphOutputT]]` | The BaseNode subclass to integrate | *required* | Returns: | Type | Description | | ------------------------- | ----------------------------------------------------- | | `EdgePath[StateT, DepsT]` | An EdgePath representing the node and its connections | Raises: | Type | Description | | ----------------- | ----------------------------------------------- | | `GraphSetupError` | If the node type is missing required type hints | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def node( self, node_type: type[BaseNode[StateT, DepsT, GraphOutputT]], ) -> EdgePath[StateT, DepsT]: """Create an edge path from a BaseNode class. This method integrates v1-style BaseNode classes into the v2 graph system by analyzing their type hints and creating appropriate edges. Args: node_type: The BaseNode subclass to integrate Returns: An EdgePath representing the node and its connections Raises: GraphSetupError: If the node type is missing required type hints """ parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) type_hints = get_type_hints(node_type.run, localns=parent_namespace, include_extras=True) try: return_hint = type_hints['return'] except KeyError as e: # pragma: no cover raise exceptions.GraphSetupError( f'Node {node_type} is missing a return type hint on its `run` method' ) from e node = NodeStep(node_type) edge = self._edge_from_return_hint(node, return_hint) if not edge: # pragma: no cover raise exceptions.GraphSetupError(f'Node {node_type} is missing a return type hint on its `run` method') return edge ``` #### build ```python build( validate_graph_structure: bool = True, ) -> Graph[StateT, DepsT, GraphInputT, GraphOutputT] ``` Build the final executable graph from the accumulated nodes and edges. 
This method performs validation, normalization, and analysis of the graph structure to create a complete, executable graph instance. Parameters: | Name | Type | Description | Default | | -------------------------- | ------ | ---------------------------------------------------------------------------------------------------------------------------- | ------- | | `validate_graph_structure` | `bool` | whether to perform validation of the graph structure See the docstring of \_validate_graph_structure below for more details. | `True` | Returns: | Type | Description | | ------------------------------------------------- | --------------------------------------------- | | `Graph[StateT, DepsT, GraphInputT, GraphOutputT]` | A complete Graph instance ready for execution | Raises: | Type | Description | | ------------ | ------------------------------------------------------------------ | | `ValueError` | If the graph structure is invalid (e.g., join without parent fork) | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def build(self, validate_graph_structure: bool = True) -> Graph[StateT, DepsT, GraphInputT, GraphOutputT]: """Build the final executable graph from the accumulated nodes and edges. This method performs validation, normalization, and analysis of the graph structure to create a complete, executable graph instance. Args: validate_graph_structure: whether to perform validation of the graph structure See the docstring of _validate_graph_structure below for more details. Returns: A complete Graph instance ready for execution Raises: ValueError: If the graph structure is invalid (e.g., join without parent fork) """ nodes = self._nodes edges_by_source = self._edges_by_source nodes, edges_by_source = _replace_placeholder_node_ids(nodes, edges_by_source) nodes, edges_by_source = _flatten_paths(nodes, edges_by_source) nodes, edges_by_source = _normalize_forks(nodes, edges_by_source) if validate_graph_structure: _validate_graph_structure(nodes, edges_by_source) parent_forks = _collect_dominating_forks(nodes, edges_by_source) intermediate_join_nodes = _compute_intermediate_join_nodes(nodes, parent_forks) return Graph[StateT, DepsT, GraphInputT, GraphOutputT]( name=self.name, state_type=unpack_type_expression(self.state_type), deps_type=unpack_type_expression(self.deps_type), input_type=unpack_type_expression(self.input_type), output_type=unpack_type_expression(self.output_type), nodes=nodes, edges_by_source=edges_by_source, parent_forks=parent_forks, intermediate_join_nodes=intermediate_join_nodes, auto_instrument=self.auto_instrument, ) ``` ### EndNode Bases: `Generic[InputT]` Terminal node representing the completion of graph execution. The EndNode marks the successful completion of a graph execution flow and can collect the final output data. Source code in `pydantic_graph/pydantic_graph/beta/node.py` ```python class EndNode(Generic[InputT]): """Terminal node representing the completion of graph execution. The EndNode marks the successful completion of a graph execution flow and can collect the final output data. """ id = NodeID('__end__') """Fixed identifier for the end node.""" def _force_variance(self, inputs: InputT) -> None: # pragma: no cover """Force type variance for proper generic typing. This method exists solely for type checking purposes and should never be called. Args: inputs: Input data of type InputT. Raises: RuntimeError: Always, as this method should never be executed. 
""" raise RuntimeError('This method should never be called, it is just defined for typing purposes.') ``` #### id ```python id = NodeID('__end__') ``` Fixed identifier for the end node. ### StartNode Bases: `Generic[OutputT]` Entry point node for graph execution. The StartNode represents the beginning of a graph execution flow. Source code in `pydantic_graph/pydantic_graph/beta/node.py` ```python class StartNode(Generic[OutputT]): """Entry point node for graph execution. The StartNode represents the beginning of a graph execution flow. """ id = NodeID('__start__') """Fixed identifier for the start node.""" ``` #### id ```python id = NodeID('__start__') ``` Fixed identifier for the start node. ### StepContext Bases: `Generic[StateT, DepsT, InputT]` Context information passed to step functions during graph execution. The step context provides access to the current graph state, dependencies, and input data for a step. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | -------- | -------------------- | ---------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `InputT` | | The type of the input data | *required* | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python @dataclass(init=False) class StepContext(Generic[StateT, DepsT, InputT]): """Context information passed to step functions during graph execution. The step context provides access to the current graph state, dependencies, and input data for a step. Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies InputT: The type of the input data """ _state: StateT """The current graph state.""" _deps: DepsT """The graph run dependencies.""" _inputs: InputT """The input data for this step.""" def __init__(self, *, state: StateT, deps: DepsT, inputs: InputT): self._state = state self._deps = deps self._inputs = inputs @property def state(self) -> StateT: return self._state @property def deps(self) -> DepsT: return self._deps @property def inputs(self) -> InputT: """The input data for this step. This must be a property to ensure correct variance behavior """ return self._inputs ``` #### inputs ```python inputs: InputT ``` The input data for this step. This must be a property to ensure correct variance behavior ### StepNode Bases: `BaseNode[StateT, DepsT, Any]` A base node that represents a step with bound inputs. StepNode bridges between the v1 and v2 graph execution systems by wrapping a Step with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps. Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python @dataclass class StepNode(BaseNode[StateT, DepsT, Any]): """A base node that represents a step with bound inputs. StepNode bridges between the v1 and v2 graph execution systems by wrapping a [`Step`][pydantic_graph.beta.step.Step] with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps. """ step: Step[StateT, DepsT, Any, Any] """The step to execute.""" inputs: Any """The inputs bound to this step.""" async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any]: """Attempt to run the step node. 
Args: ctx: The graph execution context Returns: The result of step execution Raises: NotImplementedError: Always raised as StepNode is not meant to be run directly """ raise NotImplementedError( '`StepNode` is not meant to be run directly, it is meant to be used in `BaseNode` subclasses to indicate a transition to v2-style steps.' ) ``` #### step ```python step: Step[StateT, DepsT, Any, Any] ``` The step to execute. #### inputs ```python inputs: Any ``` The inputs bound to this step. #### run ```python run( ctx: GraphRunContext[StateT, DepsT], ) -> BaseNode[StateT, DepsT, Any] | End[Any] ``` Attempt to run the step node. Parameters: | Name | Type | Description | Default | | ----- | -------------------------------- | --------------------------- | ---------- | | `ctx` | `GraphRunContext[StateT, DepsT]` | The graph execution context | *required* | Returns: | Type | Description | | ------------------------------ | ----------- | | \`BaseNode[StateT, DepsT, Any] | End[Any]\` | Raises: | Type | Description | | --------------------- | --------------------------------------------------------- | | `NotImplementedError` | Always raised as StepNode is not meant to be run directly | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any]: """Attempt to run the step node. Args: ctx: The graph execution context Returns: The result of step execution Raises: NotImplementedError: Always raised as StepNode is not meant to be run directly """ raise NotImplementedError( '`StepNode` is not meant to be run directly, it is meant to be used in `BaseNode` subclasses to indicate a transition to v2-style steps.' ) ``` ### TypeExpression Bases: `Generic[T]` A workaround for type checker limitations when using complex type expressions. ```text This class serves as a wrapper for types that cannot normally be used in positions ``` requiring `type[T]`, such as `Any`, `Union[...]`, or `Literal[...]`. It provides a way to pass these complex type expressions to functions expecting concrete types. Example Instead of `output_type=Union[str, int]` (which may cause type errors), use `output_type=TypeExpression[Union[str, int]]`. Note This is a workaround for the lack of TypeForm in the Python type system. Source code in `pydantic_graph/pydantic_graph/beta/util.py` ```python class TypeExpression(Generic[T]): """A workaround for type checker limitations when using complex type expressions. This class serves as a wrapper for types that cannot normally be used in positions requiring `type[T]`, such as `Any`, `Union[...]`, or `Literal[...]`. It provides a way to pass these complex type expressions to functions expecting concrete types. Example: Instead of `output_type=Union[str, int]` (which may cause type errors), use `output_type=TypeExpression[Union[str, int]]`. Note: This is a workaround for the lack of TypeForm in the Python type system. """ pass ``` # `pydantic_graph.beta.decision` Decision node implementation for conditional branching in graph execution. This module provides the Decision node type and related classes for implementing conditional branching logic in parallel control flow graphs. Decision nodes allow the graph to choose different execution paths based on runtime conditions. ### StateT ```python StateT = TypeVar('StateT', infer_variance=True) ``` Type variable for graph state. ### DepsT ```python DepsT = TypeVar('DepsT', infer_variance=True) ``` Type variable for graph dependencies. 
### HandledT ```python HandledT = TypeVar('HandledT', infer_variance=True) ``` Type variable used to track types handled by the branches of a Decision. ### T ```python T = TypeVar('T', infer_variance=True) ``` Generic type variable. ### Decision Bases: `Generic[StateT, DepsT, HandledT]` Decision node for conditional branching in graph execution. A Decision node evaluates conditions and routes execution to different branches based on the input data type or custom matching logic. Source code in `pydantic_graph/pydantic_graph/beta/decision.py` ```python @dataclass(kw_only=True) class Decision(Generic[StateT, DepsT, HandledT]): """Decision node for conditional branching in graph execution. A Decision node evaluates conditions and routes execution to different branches based on the input data type or custom matching logic. """ id: NodeID """Unique identifier for this decision node.""" branches: list[DecisionBranch[Any]] """List of branches that can be taken from this decision.""" note: str | None """Optional documentation note for this decision.""" def branch(self, branch: DecisionBranch[T]) -> Decision[StateT, DepsT, HandledT | T]: """Add a new branch to this decision. Args: branch: The branch to add to this decision. Returns: A new Decision with the additional branch. """ return Decision(id=self.id, branches=self.branches + [branch], note=self.note) def _force_handled_contravariant(self, inputs: HandledT) -> Never: # pragma: no cover """Forces this type to be contravariant in the HandledT type variable. This is an implementation detail of how we can type-check that all possible input types have been exhaustively covered. Args: inputs: Input data of handled types. Raises: RuntimeError: Always, as this method should never be executed. """ raise RuntimeError('This method should never be called, it is just defined for typing purposes.') ``` #### id ```python id: NodeID ``` Unique identifier for this decision node. #### branches ```python branches: list[DecisionBranch[Any]] ``` List of branches that can be taken from this decision. #### note ```python note: str | None ``` Optional documentation note for this decision. #### branch ```python branch( branch: DecisionBranch[T], ) -> Decision[StateT, DepsT, HandledT | T] ``` Add a new branch to this decision. Parameters: | Name | Type | Description | Default | | -------- | ------------------- | ----------------------------------- | ---------- | | `branch` | `DecisionBranch[T]` | The branch to add to this decision. | *required* | Returns: | Type | Description | | ----------------------------------- | ----------- | | \`Decision\[StateT, DepsT, HandledT | T\]\` | Source code in `pydantic_graph/pydantic_graph/beta/decision.py` ```python def branch(self, branch: DecisionBranch[T]) -> Decision[StateT, DepsT, HandledT | T]: """Add a new branch to this decision. Args: branch: The branch to add to this decision. Returns: A new Decision with the additional branch. """ return Decision(id=self.id, branches=self.branches + [branch], note=self.note) ``` ### SourceT ```python SourceT = TypeVar('SourceT', infer_variance=True) ``` Type variable for source data for a DecisionBranch. ### DecisionBranch Bases: `Generic[SourceT]` Represents a single branch within a decision node. Each branch defines the conditions under which it should be taken and the path to follow when those conditions are met. Note: with the current design, it is actually *critical* that this class is invariant in SourceT for the sake of type-checking that inputs to a Decision are actually handled. 
See the `# type: ignore` comment in `tests.graph.beta.test_graph_edge_cases.test_decision_no_matching_branch` for an example of how this works. Source code in `pydantic_graph/pydantic_graph/beta/decision.py` ```python @dataclass class DecisionBranch(Generic[SourceT]): """Represents a single branch within a decision node. Each branch defines the conditions under which it should be taken and the path to follow when those conditions are met. Note: with the current design, it is actually _critical_ that this class is invariant in SourceT for the sake of type-checking that inputs to a Decision are actually handled. See the `# type: ignore` comment in `tests.graph.beta.test_graph_edge_cases.test_decision_no_matching_branch` for an example of how this works. """ source: TypeOrTypeExpression[SourceT] """The expected type of data for this branch. This is necessary for exhaustiveness-checking when handling the inputs to a decision node.""" matches: Callable[[Any], bool] | None """An optional predicate function used to determine whether input data matches this branch. If `None`, default logic is used which attempts to check the value for type-compatibility with the `source` type: * If `source` is `Any` or `object`, the branch will always match * If `source` is a `Literal` type, this branch will match if the value is one of the parametrizing literal values * If `source` is any other type, the value will be checked for matching using `isinstance` Inputs are tested against each branch of a decision node in order, and the path of the first matching branch is used to handle the input value. """ path: Path """The execution path to follow when an input value matches this branch of a decision node. This can include transforming, mapping, and broadcasting the output before sending to the next node or nodes. The path can also include position-aware labels which are used when generating mermaid diagrams.""" destinations: list[AnyDestinationNode] """The destination nodes that can be referenced by DestinationMarker in the path.""" ``` #### source ```python source: TypeOrTypeExpression[SourceT] ``` The expected type of data for this branch. This is necessary for exhaustiveness-checking when handling the inputs to a decision node. #### matches ```python matches: Callable[[Any], bool] | None ``` An optional predicate function used to determine whether input data matches this branch. If `None`, default logic is used which attempts to check the value for type-compatibility with the `source` type: * If `source` is `Any` or `object`, the branch will always match * If `source` is a `Literal` type, this branch will match if the value is one of the parametrizing literal values * If `source` is any other type, the value will be checked for matching using `isinstance` Inputs are tested against each branch of a decision node in order, and the path of the first matching branch is used to handle the input value. #### path ```python path: Path ``` The execution path to follow when an input value matches this branch of a decision node. This can include transforming, mapping, and broadcasting the output before sending to the next node or nodes. The path can also include position-aware labels which are used when generating mermaid diagrams. #### destinations ```python destinations: list[AnyDestinationNode] ``` The destination nodes that can be referenced by DestinationMarker in the path. ### OutputT ```python OutputT = TypeVar('OutputT', infer_variance=True) ``` Type variable for the output data of a node. 
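The default branch-matching rules documented for `matches` above can be restated as a small illustrative sketch; this is not the library's implementation, only a summary of the documented behavior.

```python
from typing import Any, Literal, get_args, get_origin


def default_match(source: Any, value: Any) -> bool:
    """Restatement of the documented default matching rules (illustrative only)."""
    if source is Any or source is object:
        return True  # `Any` and `object` always match
    if get_origin(source) is Literal:
        return value in get_args(source)  # match one of the parametrizing literal values
    return isinstance(value, source)  # fall back to an isinstance check
```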
### NewOutputT ```python NewOutputT = TypeVar('NewOutputT', infer_variance=True) ``` Type variable for transformed output. ### DecisionBranchBuilder Bases: `Generic[StateT, DepsT, OutputT, SourceT, HandledT]` Builder for constructing decision branches with fluent API. This builder provides methods to configure branches with destinations, forks, and transformations in a type-safe manner. Instances of this class should be created using GraphBuilder.match, not created directly. Source code in `pydantic_graph/pydantic_graph/beta/decision.py` ```python @dataclass(init=False) class DecisionBranchBuilder(Generic[StateT, DepsT, OutputT, SourceT, HandledT]): """Builder for constructing decision branches with fluent API. This builder provides methods to configure branches with destinations, forks, and transformations in a type-safe manner. Instances of this class should be created using [`GraphBuilder.match`][pydantic_graph.beta.graph_builder.GraphBuilder], not created directly. """ _decision: Decision[StateT, DepsT, HandledT] """The parent decision node.""" _source: TypeOrTypeExpression[SourceT] """The expected source type for this branch.""" _matches: Callable[[Any], bool] | None """Optional matching predicate.""" _path_builder: PathBuilder[StateT, DepsT, OutputT] """Builder for the execution path.""" def __init__( self, *, decision: Decision[StateT, DepsT, HandledT], source: TypeOrTypeExpression[SourceT], matches: Callable[[Any], bool] | None, path_builder: PathBuilder[StateT, DepsT, OutputT], ): # This manually-defined initializer is necessary due to https://github.com/python/mypy/issues/17623. self._decision = decision self._source = source self._matches = matches self._path_builder = path_builder def to( self, destination: DestinationNode[StateT, DepsT, OutputT] | type[BaseNode[StateT, DepsT, Any]], /, *extra_destinations: DestinationNode[StateT, DepsT, OutputT] | type[BaseNode[StateT, DepsT, Any]], fork_id: str | None = None, ) -> DecisionBranch[SourceT]: """Set the destination(s) for this branch. Args: destination: The primary destination node. *extra_destinations: Additional destination nodes. fork_id: Optional node ID to use for the resulting broadcast fork if multiple destinations are provided. Returns: A completed DecisionBranch with the specified destinations. """ destination = get_origin(destination) or destination extra_destinations = tuple(get_origin(d) or d for d in extra_destinations) destinations = [(NodeStep(d) if inspect.isclass(d) else d) for d in (destination, *extra_destinations)] return DecisionBranch( source=self._source, matches=self._matches, path=self._path_builder.to(*destinations, fork_id=fork_id), destinations=destinations, ) def broadcast( self, get_forks: Callable[[Self], Sequence[DecisionBranch[SourceT]]], /, *, fork_id: str | None = None ) -> DecisionBranch[SourceT]: """Broadcast this decision branch into multiple destinations. Args: get_forks: The callback that will return a sequence of decision branches to broadcast to. fork_id: Optional node ID to use for the resulting broadcast fork. Returns: A completed DecisionBranch with the specified destinations. 
""" fork_decision_branches = get_forks(self) new_paths = [b.path for b in fork_decision_branches] if not new_paths: raise GraphBuildingError(f'The call to {get_forks} returned no branches, but must return at least one.') path = self._path_builder.broadcast(new_paths, fork_id=fork_id) destinations = [d for fdp in fork_decision_branches for d in fdp.destinations] return DecisionBranch(source=self._source, matches=self._matches, path=path, destinations=destinations) def transform( self, func: TransformFunction[StateT, DepsT, OutputT, NewOutputT], / ) -> DecisionBranchBuilder[StateT, DepsT, NewOutputT, SourceT, HandledT]: """Apply a transformation to the branch's output. Args: func: Transformation function to apply. Returns: A new DecisionBranchBuilder where the provided transform is applied prior to generating the final output. """ return DecisionBranchBuilder( decision=self._decision, source=self._source, matches=self._matches, path_builder=self._path_builder.transform(func), ) def map( self: DecisionBranchBuilder[StateT, DepsT, Iterable[T], SourceT, HandledT] | DecisionBranchBuilder[StateT, DepsT, AsyncIterable[T], SourceT, HandledT], *, fork_id: str | None = None, downstream_join_id: str | None = None, ) -> DecisionBranchBuilder[StateT, DepsT, T, SourceT, HandledT]: """Spread the branch's output. To do this, the current output must be iterable, and any subsequent steps in the path being built for this branch will be applied to each item of the current output in parallel. Args: fork_id: Optional ID for the fork, defaults to a generated value downstream_join_id: Optional ID of a downstream join node which is involved when mapping empty iterables Returns: A new DecisionBranchBuilder where mapping is performed prior to generating the final output. """ return DecisionBranchBuilder( decision=self._decision, source=self._source, matches=self._matches, path_builder=self._path_builder.map(fork_id=fork_id, downstream_join_id=downstream_join_id), ) def label(self, label: str) -> DecisionBranchBuilder[StateT, DepsT, OutputT, SourceT, HandledT]: """Apply a label to the branch at the current point in the path being built. These labels are only used in generated mermaid diagrams. Args: label: The label to apply. Returns: A new DecisionBranchBuilder where the label has been applied at the end of the current path being built. """ return DecisionBranchBuilder( decision=self._decision, source=self._source, matches=self._matches, path_builder=self._path_builder.label(label), ) ``` #### to ```python to( destination: ( DestinationNode[StateT, DepsT, OutputT] | type[BaseNode[StateT, DepsT, Any]] ), /, *extra_destinations: DestinationNode[ StateT, DepsT, OutputT ] | type[BaseNode[StateT, DepsT, Any]], fork_id: str | None = None, ) -> DecisionBranch[SourceT] ``` Set the destination(s) for this branch. Parameters: | Name | Type | Description | Default | | --------------------- | ----------------------------------------- | -------------------------------------- | ----------------------------------------------------------------------------------------------- | | `destination` | \`DestinationNode[StateT, DepsT, OutputT] | type\[BaseNode[StateT, DepsT, Any]\]\` | The primary destination node. | | `*extra_destinations` | \`DestinationNode[StateT, DepsT, OutputT] | type\[BaseNode[StateT, DepsT, Any]\]\` | Additional destination nodes. | | `fork_id` | \`str | None\` | Optional node ID to use for the resulting broadcast fork if multiple destinations are provided. 
| Returns: | Type | Description | | ------------------------- | ----------------------------------------------------------- | | `DecisionBranch[SourceT]` | A completed DecisionBranch with the specified destinations. | Source code in `pydantic_graph/pydantic_graph/beta/decision.py` ```python def to( self, destination: DestinationNode[StateT, DepsT, OutputT] | type[BaseNode[StateT, DepsT, Any]], /, *extra_destinations: DestinationNode[StateT, DepsT, OutputT] | type[BaseNode[StateT, DepsT, Any]], fork_id: str | None = None, ) -> DecisionBranch[SourceT]: """Set the destination(s) for this branch. Args: destination: The primary destination node. *extra_destinations: Additional destination nodes. fork_id: Optional node ID to use for the resulting broadcast fork if multiple destinations are provided. Returns: A completed DecisionBranch with the specified destinations. """ destination = get_origin(destination) or destination extra_destinations = tuple(get_origin(d) or d for d in extra_destinations) destinations = [(NodeStep(d) if inspect.isclass(d) else d) for d in (destination, *extra_destinations)] return DecisionBranch( source=self._source, matches=self._matches, path=self._path_builder.to(*destinations, fork_id=fork_id), destinations=destinations, ) ``` #### broadcast ```python broadcast( get_forks: Callable[ [Self], Sequence[DecisionBranch[SourceT]] ], /, *, fork_id: str | None = None, ) -> DecisionBranch[SourceT] ``` Broadcast this decision branch into multiple destinations. Parameters: | Name | Type | Description | Default | | ----------- | ----------------------------------------------------- | ------------------------------------------------------------------------------ | --------------------------------------------------------- | | `get_forks` | `Callable[[Self], Sequence[DecisionBranch[SourceT]]]` | The callback that will return a sequence of decision branches to broadcast to. | *required* | | `fork_id` | \`str | None\` | Optional node ID to use for the resulting broadcast fork. | Returns: | Type | Description | | ------------------------- | ----------------------------------------------------------- | | `DecisionBranch[SourceT]` | A completed DecisionBranch with the specified destinations. | Source code in `pydantic_graph/pydantic_graph/beta/decision.py` ```python def broadcast( self, get_forks: Callable[[Self], Sequence[DecisionBranch[SourceT]]], /, *, fork_id: str | None = None ) -> DecisionBranch[SourceT]: """Broadcast this decision branch into multiple destinations. Args: get_forks: The callback that will return a sequence of decision branches to broadcast to. fork_id: Optional node ID to use for the resulting broadcast fork. Returns: A completed DecisionBranch with the specified destinations. """ fork_decision_branches = get_forks(self) new_paths = [b.path for b in fork_decision_branches] if not new_paths: raise GraphBuildingError(f'The call to {get_forks} returned no branches, but must return at least one.') path = self._path_builder.broadcast(new_paths, fork_id=fork_id) destinations = [d for fdp in fork_decision_branches for d in fdp.destinations] return DecisionBranch(source=self._source, matches=self._matches, path=path, destinations=destinations) ``` #### transform ```python transform( func: TransformFunction[ StateT, DepsT, OutputT, NewOutputT ], ) -> DecisionBranchBuilder[ StateT, DepsT, NewOutputT, SourceT, HandledT ] ``` Apply a transformation to the branch's output. 
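A minimal usage sketch follows; the exact signature of the transform callable is an assumption here (a callable receiving the step context, whose `.inputs` holds the branch's current output), as are `g` and `HandleInt`.

```python
# Assumed transform signature; consult `TransformFunction` for the actual contract.
async def length_of(ctx) -> int:
    return len(ctx.inputs)


branch = g.match(str).transform(length_of).to(HandleInt)  # `g` and `HandleInt` assumed
```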
Parameters: | Name | Type | Description | Default | | ------ | ------------------------------------------------------- | --------------------------------- | ---------- | | `func` | `TransformFunction[StateT, DepsT, OutputT, NewOutputT]` | Transformation function to apply. | *required* | Returns: | Type | Description | | --------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- | | `DecisionBranchBuilder[StateT, DepsT, NewOutputT, SourceT, HandledT]` | A new DecisionBranchBuilder where the provided transform is applied prior to generating the final output. | Source code in `pydantic_graph/pydantic_graph/beta/decision.py` ```python def transform( self, func: TransformFunction[StateT, DepsT, OutputT, NewOutputT], / ) -> DecisionBranchBuilder[StateT, DepsT, NewOutputT, SourceT, HandledT]: """Apply a transformation to the branch's output. Args: func: Transformation function to apply. Returns: A new DecisionBranchBuilder where the provided transform is applied prior to generating the final output. """ return DecisionBranchBuilder( decision=self._decision, source=self._source, matches=self._matches, path_builder=self._path_builder.transform(func), ) ``` #### map ```python map( *, fork_id: str | None = None, downstream_join_id: str | None = None ) -> DecisionBranchBuilder[ StateT, DepsT, T, SourceT, HandledT ] ``` Spread the branch's output. To do this, the current output must be iterable, and any subsequent steps in the path being built for this branch will be applied to each item of the current output in parallel. Parameters: | Name | Type | Description | Default | | -------------------- | ----- | ----------- | ------------------------------------------------------------------------------------ | | `fork_id` | \`str | None\` | Optional ID for the fork, defaults to a generated value | | `downstream_join_id` | \`str | None\` | Optional ID of a downstream join node which is involved when mapping empty iterables | Returns: | Type | Description | | ------------------------------------------------------------ | -------------------------------------------------------------------------------------------- | | `DecisionBranchBuilder[StateT, DepsT, T, SourceT, HandledT]` | A new DecisionBranchBuilder where mapping is performed prior to generating the final output. | Source code in `pydantic_graph/pydantic_graph/beta/decision.py` ```python def map( self: DecisionBranchBuilder[StateT, DepsT, Iterable[T], SourceT, HandledT] | DecisionBranchBuilder[StateT, DepsT, AsyncIterable[T], SourceT, HandledT], *, fork_id: str | None = None, downstream_join_id: str | None = None, ) -> DecisionBranchBuilder[StateT, DepsT, T, SourceT, HandledT]: """Spread the branch's output. To do this, the current output must be iterable, and any subsequent steps in the path being built for this branch will be applied to each item of the current output in parallel. Args: fork_id: Optional ID for the fork, defaults to a generated value downstream_join_id: Optional ID of a downstream join node which is involved when mapping empty iterables Returns: A new DecisionBranchBuilder where mapping is performed prior to generating the final output. 
""" return DecisionBranchBuilder( decision=self._decision, source=self._source, matches=self._matches, path_builder=self._path_builder.map(fork_id=fork_id, downstream_join_id=downstream_join_id), ) ``` #### label ```python label( label: str, ) -> DecisionBranchBuilder[ StateT, DepsT, OutputT, SourceT, HandledT ] ``` Apply a label to the branch at the current point in the path being built. These labels are only used in generated mermaid diagrams. Parameters: | Name | Type | Description | Default | | ------- | ----- | ------------------- | ---------- | | `label` | `str` | The label to apply. | *required* | Returns: | Type | Description | | ------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------- | | `DecisionBranchBuilder[StateT, DepsT, OutputT, SourceT, HandledT]` | A new DecisionBranchBuilder where the label has been applied at the end of the current path being built. | Source code in `pydantic_graph/pydantic_graph/beta/decision.py` ```python def label(self, label: str) -> DecisionBranchBuilder[StateT, DepsT, OutputT, SourceT, HandledT]: """Apply a label to the branch at the current point in the path being built. These labels are only used in generated mermaid diagrams. Args: label: The label to apply. Returns: A new DecisionBranchBuilder where the label has been applied at the end of the current path being built. """ return DecisionBranchBuilder( decision=self._decision, source=self._source, matches=self._matches, path_builder=self._path_builder.label(label), ) ``` # `pydantic_graph.beta.graph` Core graph execution engine for the next version of the pydantic-graph library. This module provides the main `Graph` class and `GraphRun` execution engine that handles the orchestration of nodes, edges, and parallel execution paths in the graph-based workflow system. ### StateT ```python StateT = TypeVar('StateT', infer_variance=True) ``` Type variable for graph state. ### DepsT ```python DepsT = TypeVar('DepsT', infer_variance=True) ``` Type variable for graph dependencies. ### InputT ```python InputT = TypeVar('InputT', infer_variance=True) ``` Type variable for graph inputs. ### OutputT ```python OutputT = TypeVar('OutputT', infer_variance=True) ``` Type variable for graph outputs. ### EndMarker Bases: `Generic[OutputT]` A marker indicating the end of graph execution with a final value. EndMarker is used internally to signal that the graph has completed execution and carries the final output value. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | --------- | -------------------- | ---------------------------------- | ---------- | | `OutputT` | | The type of the final output value | *required* | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python @dataclass(init=False) class EndMarker(Generic[OutputT]): """A marker indicating the end of graph execution with a final value. EndMarker is used internally to signal that the graph has completed execution and carries the final output value. Type Parameters: OutputT: The type of the final output value """ _value: OutputT """The final output value from the graph execution.""" def __init__(self, value: OutputT): # This manually-defined initializer is necessary due to https://github.com/python/mypy/issues/17623. self._value = value @property def value(self) -> OutputT: return self._value ``` ### JoinItem An item representing data flowing into a join operation. 
JoinItem carries input data from a parallel execution path to a join node, along with metadata about which execution 'fork' it originated from. Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python @dataclass class JoinItem: """An item representing data flowing into a join operation. JoinItem carries input data from a parallel execution path to a join node, along with metadata about which execution 'fork' it originated from. """ join_id: JoinID """The ID of the join node this item is targeting.""" inputs: Any """The input data for the join operation.""" fork_stack: ForkStack """The stack of ForkStackItems that led to producing this join item.""" ``` #### join_id ```python join_id: JoinID ``` The ID of the join node this item is targeting. #### inputs ```python inputs: Any ``` The input data for the join operation. #### fork_stack ```python fork_stack: ForkStack ``` The stack of ForkStackItems that led to producing this join item. ### Graph Bases: `Generic[StateT, DepsT, InputT, OutputT]` A complete graph definition ready for execution. The Graph class represents a complete workflow graph with typed inputs, outputs, state, and dependencies. It contains all nodes, edges, and metadata needed for execution. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | --------- | -------------------- | ---------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `InputT` | | The type of the input data | *required* | | `OutputT` | | The type of the output data | *required* | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python @dataclass(repr=False) class Graph(Generic[StateT, DepsT, InputT, OutputT]): """A complete graph definition ready for execution. The Graph class represents a complete workflow graph with typed inputs, outputs, state, and dependencies. It contains all nodes, edges, and metadata needed for execution. Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies InputT: The type of the input data OutputT: The type of the output data """ name: str | None """Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.""" state_type: type[StateT] """The type of the graph state.""" deps_type: type[DepsT] """The type of the dependencies.""" input_type: type[InputT] """The type of the input data.""" output_type: type[OutputT] """The type of the output data.""" auto_instrument: bool """Whether to automatically create instrumentation spans.""" nodes: dict[NodeID, AnyNode] """All nodes in the graph indexed by their ID.""" edges_by_source: dict[NodeID, list[Path]] """Outgoing paths from each source node.""" parent_forks: dict[JoinID, ParentFork[NodeID]] """Parent fork information for each join node.""" intermediate_join_nodes: dict[JoinID, set[JoinID]] """For each join, the set of other joins that appear between it and its parent fork. Used to determine which joins are "final" (have no other joins as intermediates) and which joins should preserve fork stacks when proceeding downstream.""" def get_parent_fork(self, join_id: JoinID) -> ParentFork[NodeID]: """Get the parent fork information for a join node. 
Args: join_id: The ID of the join node Returns: The parent fork information for the join Raises: RuntimeError: If the join ID is not found or has no parent fork """ result = self.parent_forks.get(join_id) if result is None: raise RuntimeError(f'Node {join_id} is not a join node or did not have a dominating fork (this is a bug)') return result def is_final_join(self, join_id: JoinID) -> bool: """Check if a join is 'final' (has no downstream joins with the same parent fork). A join is non-final if it appears as an intermediate node for another join with the same parent fork. Args: join_id: The ID of the join node Returns: True if the join is final, False if it's non-final """ # Check if this join appears in any other join's intermediate_join_nodes for intermediate_joins in self.intermediate_join_nodes.values(): if join_id in intermediate_joins: return False return True async def run( self, *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> OutputT: """Execute the graph and return the final output. This is the main entry point for graph execution. It runs the graph to completion and returns the final output value. Args: state: The graph state instance deps: The dependencies instance inputs: The input data for the graph span: Optional span for tracing/instrumentation infer_name: Whether to infer the graph name from the calling frame. Returns: The final output from the graph execution """ if infer_name and self.name is None: inferred_name = infer_obj_name(self, depth=2) if inferred_name is not None: # pragma: no branch self.name = inferred_name async with self.iter(state=state, deps=deps, inputs=inputs, span=span, infer_name=False) as graph_run: # Note: This would probably be better using `async for _ in graph_run`, but this tests the `next` method, # which I'm less confident will be implemented correctly if not used on the critical path. We can change it # once we have tests, etc. event: Any = None while True: try: event = await graph_run.next(event) except StopAsyncIteration: assert isinstance(event, EndMarker), 'Graph run should end with an EndMarker.' return cast(EndMarker[OutputT], event).value @asynccontextmanager async def iter( self, *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, OutputT]]: """Create an iterator for step-by-step graph execution. This method allows for more fine-grained control over graph execution, enabling inspection of intermediate states and results. Args: state: The graph state instance deps: The dependencies instance inputs: The input data for the graph span: Optional span for tracing/instrumentation infer_name: Whether to infer the graph name from the calling frame. 
Yields: A GraphRun instance that can be iterated for step-by-step execution """ if infer_name and self.name is None: inferred_name = infer_obj_name(self, depth=3) # depth=3 because asynccontextmanager adds one if inferred_name is not None: # pragma: no branch self.name = inferred_name with ExitStack() as stack: entered_span: AbstractSpan | None = None if span is None: if self.auto_instrument: entered_span = stack.enter_context(logfire_span('run graph {graph.name}', graph=self)) else: entered_span = stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) async with GraphRun[StateT, DepsT, OutputT]( graph=self, state=state, deps=deps, inputs=inputs, traceparent=traceparent, ) as graph_run: yield graph_run def render(self, *, title: str | None = None, direction: StateDiagramDirection | None = None) -> str: """Render the graph as a Mermaid diagram string. Args: title: Optional title for the diagram direction: Optional direction for the diagram layout Returns: A string containing the Mermaid diagram representation """ from pydantic_graph.beta.mermaid import build_mermaid_graph return build_mermaid_graph(self.nodes, self.edges_by_source).render(title=title, direction=direction) def __repr__(self) -> str: super_repr = super().__repr__() # include class and memory address # Insert the result of calling `__str__` before the final '>' in the repr return f'{super_repr[:-1]}\n{self}\n{super_repr[-1]}' def __str__(self) -> str: """Return a Mermaid diagram representation of the graph. Returns: A string containing the Mermaid diagram of the graph """ return self.render() ``` #### name ```python name: str | None ``` Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. #### state_type ```python state_type: type[StateT] ``` The type of the graph state. #### deps_type ```python deps_type: type[DepsT] ``` The type of the dependencies. #### input_type ```python input_type: type[InputT] ``` The type of the input data. #### output_type ```python output_type: type[OutputT] ``` The type of the output data. #### auto_instrument ```python auto_instrument: bool ``` Whether to automatically create instrumentation spans. #### nodes ```python nodes: dict[NodeID, AnyNode] ``` All nodes in the graph indexed by their ID. #### edges_by_source ```python edges_by_source: dict[NodeID, list[Path]] ``` Outgoing paths from each source node. #### parent_forks ```python parent_forks: dict[JoinID, ParentFork[NodeID]] ``` Parent fork information for each join node. #### intermediate_join_nodes ```python intermediate_join_nodes: dict[JoinID, set[JoinID]] ``` For each join, the set of other joins that appear between it and its parent fork. Used to determine which joins are "final" (have no other joins as intermediates) and which joins should preserve fork stacks when proceeding downstream. #### get_parent_fork ```python get_parent_fork(join_id: JoinID) -> ParentFork[NodeID] ``` Get the parent fork information for a join node. 
Parameters: | Name | Type | Description | Default | | --------- | -------- | ----------------------- | ---------- | | `join_id` | `JoinID` | The ID of the join node | *required* | Returns: | Type | Description | | -------------------- | ---------------------------------------- | | `ParentFork[NodeID]` | The parent fork information for the join | Raises: | Type | Description | | -------------- | ------------------------------------------------- | | `RuntimeError` | If the join ID is not found or has no parent fork | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python def get_parent_fork(self, join_id: JoinID) -> ParentFork[NodeID]: """Get the parent fork information for a join node. Args: join_id: The ID of the join node Returns: The parent fork information for the join Raises: RuntimeError: If the join ID is not found or has no parent fork """ result = self.parent_forks.get(join_id) if result is None: raise RuntimeError(f'Node {join_id} is not a join node or did not have a dominating fork (this is a bug)') return result ``` #### is_final_join ```python is_final_join(join_id: JoinID) -> bool ``` Check if a join is 'final' (has no downstream joins with the same parent fork). A join is non-final if it appears as an intermediate node for another join with the same parent fork. Parameters: | Name | Type | Description | Default | | --------- | -------- | ----------------------- | ---------- | | `join_id` | `JoinID` | The ID of the join node | *required* | Returns: | Type | Description | | ------ | -------------------------------------------------- | | `bool` | True if the join is final, False if it's non-final | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python def is_final_join(self, join_id: JoinID) -> bool: """Check if a join is 'final' (has no downstream joins with the same parent fork). A join is non-final if it appears as an intermediate node for another join with the same parent fork. Args: join_id: The ID of the join node Returns: True if the join is final, False if it's non-final """ # Check if this join appears in any other join's intermediate_join_nodes for intermediate_joins in self.intermediate_join_nodes.values(): if join_id in intermediate_joins: return False return True ``` #### run ```python run( *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: ( AbstractContextManager[AbstractSpan] | None ) = None, infer_name: bool = True ) -> OutputT ``` Execute the graph and return the final output. This is the main entry point for graph execution. It runs the graph to completion and returns the final output value. Parameters: | Name | Type | Description | Default | | ------------ | -------------------------------------- | ------------------------------------------------------- | ----------------------------------------- | | `state` | `StateT` | The graph state instance | `None` | | `deps` | `DepsT` | The dependencies instance | `None` | | `inputs` | `InputT` | The input data for the graph | `None` | | `span` | \`AbstractContextManager[AbstractSpan] | None\` | Optional span for tracing/instrumentation | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. 
| `True` | Returns: | Type | Description | | --------- | ----------------------------------------- | | `OutputT` | The final output from the graph execution | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python async def run( self, *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> OutputT: """Execute the graph and return the final output. This is the main entry point for graph execution. It runs the graph to completion and returns the final output value. Args: state: The graph state instance deps: The dependencies instance inputs: The input data for the graph span: Optional span for tracing/instrumentation infer_name: Whether to infer the graph name from the calling frame. Returns: The final output from the graph execution """ if infer_name and self.name is None: inferred_name = infer_obj_name(self, depth=2) if inferred_name is not None: # pragma: no branch self.name = inferred_name async with self.iter(state=state, deps=deps, inputs=inputs, span=span, infer_name=False) as graph_run: # Note: This would probably be better using `async for _ in graph_run`, but this tests the `next` method, # which I'm less confident will be implemented correctly if not used on the critical path. We can change it # once we have tests, etc. event: Any = None while True: try: event = await graph_run.next(event) except StopAsyncIteration: assert isinstance(event, EndMarker), 'Graph run should end with an EndMarker.' return cast(EndMarker[OutputT], event).value ``` #### iter ```python iter( *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: ( AbstractContextManager[AbstractSpan] | None ) = None, infer_name: bool = True ) -> AsyncIterator[GraphRun[StateT, DepsT, OutputT]] ``` Create an iterator for step-by-step graph execution. This method allows for more fine-grained control over graph execution, enabling inspection of intermediate states and results. Parameters: | Name | Type | Description | Default | | ------------ | -------------------------------------- | ------------------------------------------------------- | ----------------------------------------- | | `state` | `StateT` | The graph state instance | `None` | | `deps` | `DepsT` | The dependencies instance | `None` | | `inputs` | `InputT` | The input data for the graph | `None` | | `span` | \`AbstractContextManager[AbstractSpan] | None\` | Optional span for tracing/instrumentation | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Yields: | Type | Description | | ------------------------------------------------- | ------------------------------------------------------------------- | | `AsyncIterator[GraphRun[StateT, DepsT, OutputT]]` | A GraphRun instance that can be iterated for step-by-step execution | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python @asynccontextmanager async def iter( self, *, state: StateT = None, deps: DepsT = None, inputs: InputT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, OutputT]]: """Create an iterator for step-by-step graph execution. This method allows for more fine-grained control over graph execution, enabling inspection of intermediate states and results. 
Args: state: The graph state instance deps: The dependencies instance inputs: The input data for the graph span: Optional span for tracing/instrumentation infer_name: Whether to infer the graph name from the calling frame. Yields: A GraphRun instance that can be iterated for step-by-step execution """ if infer_name and self.name is None: inferred_name = infer_obj_name(self, depth=3) # depth=3 because asynccontextmanager adds one if inferred_name is not None: # pragma: no branch self.name = inferred_name with ExitStack() as stack: entered_span: AbstractSpan | None = None if span is None: if self.auto_instrument: entered_span = stack.enter_context(logfire_span('run graph {graph.name}', graph=self)) else: entered_span = stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) async with GraphRun[StateT, DepsT, OutputT]( graph=self, state=state, deps=deps, inputs=inputs, traceparent=traceparent, ) as graph_run: yield graph_run ``` #### render ```python render( *, title: str | None = None, direction: StateDiagramDirection | None = None ) -> str ``` Render the graph as a Mermaid diagram string. Parameters: | Name | Type | Description | Default | | ----------- | ----------------------- | ----------- | ----------------------------------------- | | `title` | \`str | None\` | Optional title for the diagram | | `direction` | \`StateDiagramDirection | None\` | Optional direction for the diagram layout | Returns: | Type | Description | | ----- | ------------------------------------------------------ | | `str` | A string containing the Mermaid diagram representation | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python def render(self, *, title: str | None = None, direction: StateDiagramDirection | None = None) -> str: """Render the graph as a Mermaid diagram string. Args: title: Optional title for the diagram direction: Optional direction for the diagram layout Returns: A string containing the Mermaid diagram representation """ from pydantic_graph.beta.mermaid import build_mermaid_graph return build_mermaid_graph(self.nodes, self.edges_by_source).render(title=title, direction=direction) ``` #### __str__ ```python __str__() -> str ``` Return a Mermaid diagram representation of the graph. Returns: | Type | Description | | ----- | ---------------------------------------------------- | | `str` | A string containing the Mermaid diagram of the graph | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python def __str__(self) -> str: """Return a Mermaid diagram representation of the graph. Returns: A string containing the Mermaid diagram of the graph """ return self.render() ``` ### GraphTaskRequest A request to run a task representing the execution of a node in the graph. GraphTaskRequest encapsulates all the information needed to execute a specific node, including its inputs and the fork context it's executing within. Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python @dataclass class GraphTaskRequest: """A request to run a task representing the execution of a node in the graph. GraphTaskRequest encapsulates all the information needed to execute a specific node, including its inputs and the fork context it's executing within. """ node_id: NodeID """The ID of the node to execute.""" inputs: Any """The input data for the node.""" fork_stack: ForkStack = field(repr=False) """Stack of forks that have been entered. Used by the GraphRun to decide when to proceed through joins. 
""" ``` #### node_id ```python node_id: NodeID ``` The ID of the node to execute. #### inputs ```python inputs: Any ``` The input data for the node. #### fork_stack ```python fork_stack: ForkStack = field(repr=False) ``` Stack of forks that have been entered. Used by the GraphRun to decide when to proceed through joins. ### GraphTask Bases: `GraphTaskRequest` A task representing the execution of a node in the graph. GraphTask encapsulates all the information needed to execute a specific node, including its inputs and the fork context it's executing within, and has a unique ID to identify the task within the graph run. Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python @dataclass class GraphTask(GraphTaskRequest): """A task representing the execution of a node in the graph. GraphTask encapsulates all the information needed to execute a specific node, including its inputs and the fork context it's executing within, and has a unique ID to identify the task within the graph run. """ task_id: TaskID = field(repr=False) """Unique identifier for this task.""" @staticmethod def from_request(request: GraphTaskRequest, get_task_id: Callable[[], TaskID]) -> GraphTask: # Don't call the get_task_id callable, this is already a task if isinstance(request, GraphTask): return request return GraphTask(request.node_id, request.inputs, request.fork_stack, get_task_id()) ``` #### task_id ```python task_id: TaskID = field(repr=False) ``` Unique identifier for this task. ### GraphRun Bases: `Generic[StateT, DepsT, OutputT]` A single execution instance of a graph. GraphRun manages the execution state for a single run of a graph, including task scheduling, fork/join coordination, and result tracking. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | --------- | -------------------- | ---------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `OutputT` | | The type of the output data | *required* | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python class GraphRun(Generic[StateT, DepsT, OutputT]): """A single execution instance of a graph. GraphRun manages the execution state for a single run of a graph, including task scheduling, fork/join coordination, and result tracking. Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies OutputT: The type of the output data """ def __init__( self, graph: Graph[StateT, DepsT, InputT, OutputT], *, state: StateT, deps: DepsT, inputs: InputT, traceparent: str | None, ): """Initialize a graph run. 
Args: graph: The graph to execute state: The graph state instance deps: The dependencies instance inputs: The input data for the graph traceparent: Optional trace parent for instrumentation """ self.graph = graph """The graph being executed.""" self.state = state """The graph state instance.""" self.deps = deps """The dependencies instance.""" self.inputs = inputs """The initial input data.""" self._active_reducers: dict[tuple[JoinID, NodeRunID], JoinState] = {} """Active reducers for join operations.""" self._next: EndMarker[OutputT] | Sequence[GraphTask] | None = None """The next item to be processed.""" self._next_task_id = 0 self._next_node_run_id = 0 initial_fork_stack: ForkStack = (ForkStackItem(StartNode.id, self._get_next_node_run_id(), 0),) self._first_task = GraphTask( node_id=StartNode.id, inputs=inputs, fork_stack=initial_fork_stack, task_id=self._get_next_task_id() ) self._iterator_task_group = create_task_group() self._iterator_instance = _GraphIterator[StateT, DepsT, OutputT]( self.graph, self.state, self.deps, self._iterator_task_group, self._get_next_node_run_id, self._get_next_task_id, ) self._iterator = self._iterator_instance.iter_graph(self._first_task) self.__traceparent = traceparent self._async_exit_stack = AsyncExitStack() async def __aenter__(self): self._async_exit_stack.enter_context(_unwrap_exception_groups()) await self._async_exit_stack.enter_async_context(self._iterator_task_group) await self._async_exit_stack.enter_async_context(self._iterator_context()) return self async def __aexit__(self, exc_type: Any, exc_val: Any, exc_tb: Any): await self._async_exit_stack.__aexit__(exc_type, exc_val, exc_tb) @asynccontextmanager async def _iterator_context(self): try: yield finally: self._iterator_instance.iter_stream_sender.close() self._iterator_instance.iter_stream_receiver.close() await self._iterator.aclose() @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: """Get the trace parent for instrumentation. Args: required: Whether to raise an error if no traceparent exists Returns: The traceparent string, or None if not required and not set Raises: GraphRuntimeError: If required is True and no traceparent exists """ if self.__traceparent is None and required: # pragma: no cover raise exceptions.GraphRuntimeError('No span was created for this graph run') return self.__traceparent def __aiter__(self) -> AsyncIterator[EndMarker[OutputT] | Sequence[GraphTask]]: """Return self as an async iterator. Returns: Self for async iteration """ return self async def __anext__(self) -> EndMarker[OutputT] | Sequence[GraphTask]: """Get the next item in the async iteration. Returns: The next execution result from the graph """ if self._next is None: self._next = await anext(self._iterator) else: self._next = await self._iterator.asend(self._next) return self._next async def next( self, value: EndMarker[OutputT] | Sequence[GraphTaskRequest] | None = None ) -> EndMarker[OutputT] | Sequence[GraphTask]: """Advance the graph execution by one step. This method allows for sending a value to the iterator, which is useful for resuming iteration or overriding intermediate results. 
Args: value: Optional value to send to the iterator Returns: The next execution result: either an EndMarker, or sequence of GraphTasks """ if self._next is None: # Prevent `TypeError: can't send non-None value to a just-started async generator` # if `next` is called before the `first_node` has run. await anext(self) if value is not None: if isinstance(value, EndMarker): self._next = value else: self._next = [GraphTask.from_request(gtr, self._get_next_task_id) for gtr in value] return await anext(self) @property def next_task(self) -> EndMarker[OutputT] | Sequence[GraphTask]: """Get the next task(s) to be executed. Returns: The next execution item, or the initial task if none is set """ return self._next or [self._first_task] @property def output(self) -> OutputT | None: """Get the final output if the graph has completed. Returns: The output value if execution is complete, None otherwise """ if isinstance(self._next, EndMarker): return self._next.value return None def _get_next_task_id(self) -> TaskID: next_id = TaskID(f'task:{self._next_task_id}') self._next_task_id += 1 return next_id def _get_next_node_run_id(self) -> NodeRunID: next_id = NodeRunID(f'task:{self._next_node_run_id}') self._next_node_run_id += 1 return next_id ``` #### __init__ ```python __init__( graph: Graph[StateT, DepsT, InputT, OutputT], *, state: StateT, deps: DepsT, inputs: InputT, traceparent: str | None ) ``` Initialize a graph run. Parameters: | Name | Type | Description | Default | | ------------- | --------------------------------------- | ---------------------------- | ----------------------------------------- | | `graph` | `Graph[StateT, DepsT, InputT, OutputT]` | The graph to execute | *required* | | `state` | `StateT` | The graph state instance | *required* | | `deps` | `DepsT` | The dependencies instance | *required* | | `inputs` | `InputT` | The input data for the graph | *required* | | `traceparent` | \`str | None\` | Optional trace parent for instrumentation | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python def __init__( self, graph: Graph[StateT, DepsT, InputT, OutputT], *, state: StateT, deps: DepsT, inputs: InputT, traceparent: str | None, ): """Initialize a graph run. 
Args: graph: The graph to execute state: The graph state instance deps: The dependencies instance inputs: The input data for the graph traceparent: Optional trace parent for instrumentation """ self.graph = graph """The graph being executed.""" self.state = state """The graph state instance.""" self.deps = deps """The dependencies instance.""" self.inputs = inputs """The initial input data.""" self._active_reducers: dict[tuple[JoinID, NodeRunID], JoinState] = {} """Active reducers for join operations.""" self._next: EndMarker[OutputT] | Sequence[GraphTask] | None = None """The next item to be processed.""" self._next_task_id = 0 self._next_node_run_id = 0 initial_fork_stack: ForkStack = (ForkStackItem(StartNode.id, self._get_next_node_run_id(), 0),) self._first_task = GraphTask( node_id=StartNode.id, inputs=inputs, fork_stack=initial_fork_stack, task_id=self._get_next_task_id() ) self._iterator_task_group = create_task_group() self._iterator_instance = _GraphIterator[StateT, DepsT, OutputT]( self.graph, self.state, self.deps, self._iterator_task_group, self._get_next_node_run_id, self._get_next_task_id, ) self._iterator = self._iterator_instance.iter_graph(self._first_task) self.__traceparent = traceparent self._async_exit_stack = AsyncExitStack() ``` #### graph ```python graph = graph ``` The graph being executed. #### state ```python state = state ``` The graph state instance. #### deps ```python deps = deps ``` The dependencies instance. #### inputs ```python inputs = inputs ``` The initial input data. #### __aiter__ ```python __aiter__() -> ( AsyncIterator[EndMarker[OutputT] | Sequence[GraphTask]] ) ``` Return self as an async iterator. Returns: | Type | Description | | ----------------------------------- | ----------------------- | | \`AsyncIterator\[EndMarker[OutputT] | Sequence[GraphTask]\]\` | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python def __aiter__(self) -> AsyncIterator[EndMarker[OutputT] | Sequence[GraphTask]]: """Return self as an async iterator. Returns: Self for async iteration """ return self ``` #### __anext__ ```python __anext__() -> EndMarker[OutputT] | Sequence[GraphTask] ``` Get the next item in the async iteration. Returns: | Type | Description | | -------------------- | --------------------- | | \`EndMarker[OutputT] | Sequence[GraphTask]\` | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python async def __anext__(self) -> EndMarker[OutputT] | Sequence[GraphTask]: """Get the next item in the async iteration. Returns: The next execution result from the graph """ if self._next is None: self._next = await anext(self._iterator) else: self._next = await self._iterator.asend(self._next) return self._next ``` #### next ```python next( value: ( EndMarker[OutputT] | Sequence[GraphTaskRequest] | None ) = None, ) -> EndMarker[OutputT] | Sequence[GraphTask] ``` Advance the graph execution by one step. This method allows for sending a value to the iterator, which is useful for resuming iteration or overriding intermediate results. 
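For example, here's a minimal sketch of stepping a run by hand. The `my_graph` argument, the `None` state and deps, and the import location of `EndMarker` are assumptions for illustration only:

```python
from pydantic_graph.beta.graph import EndMarker, GraphRun  # assumed import location


async def drive(my_graph, initial_inputs):
    # Construct a run directly via the documented constructor; `state` and `deps`
    # are placeholders for whatever your graph actually uses.
    run = GraphRun(my_graph, state=None, deps=None, inputs=initial_inputs, traceparent=None)
    async with run:
        item = await run.next()
        while not isinstance(item, EndMarker):
            # `item` is the sequence of GraphTasks that will execute next; inspect it,
            # or pass a modified sequence back into `next()` to override what runs.
            item = await run.next()
    return run.output
```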
Parameters: | Name | Type | Description | Default | | ------- | -------------------- | -------------------------- | ------- | | `value` | \`EndMarker[OutputT] | Sequence[GraphTaskRequest] | None\` | Returns: | Type | Description | | -------------------- | --------------------- | | \`EndMarker[OutputT] | Sequence[GraphTask]\` | Source code in `pydantic_graph/pydantic_graph/beta/graph.py` ```python async def next( self, value: EndMarker[OutputT] | Sequence[GraphTaskRequest] | None = None ) -> EndMarker[OutputT] | Sequence[GraphTask]: """Advance the graph execution by one step. This method allows for sending a value to the iterator, which is useful for resuming iteration or overriding intermediate results. Args: value: Optional value to send to the iterator Returns: The next execution result: either an EndMarker, or sequence of GraphTasks """ if self._next is None: # Prevent `TypeError: can't send non-None value to a just-started async generator` # if `next` is called before the `first_node` has run. await anext(self) if value is not None: if isinstance(value, EndMarker): self._next = value else: self._next = [GraphTask.from_request(gtr, self._get_next_task_id) for gtr in value] return await anext(self) ``` #### next_task ```python next_task: EndMarker[OutputT] | Sequence[GraphTask] ``` Get the next task(s) to be executed. Returns: | Type | Description | | -------------------- | --------------------- | | \`EndMarker[OutputT] | Sequence[GraphTask]\` | #### output ```python output: OutputT | None ``` Get the final output if the graph has completed. Returns: | Type | Description | | --------- | ----------- | | \`OutputT | None\` | # `pydantic_graph.beta.graph_builder` Graph builder for constructing executable graph definitions. This module provides the GraphBuilder class and related utilities for constructing typed, executable graph definitions with steps, joins, decisions, and edge routing. ### GraphBuilder Bases: `Generic[StateT, DepsT, GraphInputT, GraphOutputT]` A builder for constructing executable graph definitions. GraphBuilder provides a fluent interface for defining nodes, edges, and routing in a graph workflow. It supports typed state, dependencies, and input/output validation. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | -------------- | -------------------- | --------------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `GraphInputT` | | The type of the graph input data | *required* | | `GraphOutputT` | | The type of the graph output data | *required* | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python @dataclass(init=False) class GraphBuilder(Generic[StateT, DepsT, GraphInputT, GraphOutputT]): """A builder for constructing executable graph definitions. GraphBuilder provides a fluent interface for defining nodes, edges, and routing in a graph workflow. It supports typed state, dependencies, and input/output validation. 
Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies GraphInputT: The type of the graph input data GraphOutputT: The type of the graph output data """ name: str | None """Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method.""" state_type: TypeOrTypeExpression[StateT] """The type of the graph state.""" deps_type: TypeOrTypeExpression[DepsT] """The type of the dependencies.""" input_type: TypeOrTypeExpression[GraphInputT] """The type of the graph input data.""" output_type: TypeOrTypeExpression[GraphOutputT] """The type of the graph output data.""" auto_instrument: bool """Whether to automatically create instrumentation spans.""" _nodes: dict[NodeID, AnyNode] """Internal storage for nodes in the graph.""" _edges_by_source: dict[NodeID, list[Path]] """Internal storage for edges by source node.""" _decision_index: int """Counter for generating unique decision node IDs.""" Source = TypeAliasType('Source', SourceNode[StateT, DepsT, OutputT], type_params=(OutputT,)) Destination = TypeAliasType('Destination', DestinationNode[StateT, DepsT, InputT], type_params=(InputT,)) def __init__( self, *, name: str | None = None, state_type: TypeOrTypeExpression[StateT] = NoneType, deps_type: TypeOrTypeExpression[DepsT] = NoneType, input_type: TypeOrTypeExpression[GraphInputT] = NoneType, output_type: TypeOrTypeExpression[GraphOutputT] = NoneType, auto_instrument: bool = True, ): """Initialize a graph builder. Args: name: Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. state_type: The type of the graph state deps_type: The type of the dependencies input_type: The type of the graph input data output_type: The type of the graph output data auto_instrument: Whether to automatically create instrumentation spans """ self.name = name self.state_type = state_type self.deps_type = deps_type self.input_type = input_type self.output_type = output_type self.auto_instrument = auto_instrument self._nodes = {} self._edges_by_source = defaultdict(list) self._decision_index = 1 self._start_node = StartNode[GraphInputT]() self._end_node = EndNode[GraphOutputT]() # Node building @property def start_node(self) -> StartNode[GraphInputT]: """Get the start node for the graph. Returns: The start node that receives the initial graph input """ return self._start_node @property def end_node(self) -> EndNode[GraphOutputT]: """Get the end node for the graph. Returns: The end node that produces the final graph output """ return self._end_node @overload def step( self, *, node_id: str | None = None, label: str | None = None, ) -> Callable[[StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT]]: ... @overload def step( self, call: StepFunction[StateT, DepsT, InputT, OutputT], *, node_id: str | None = None, label: str | None = None, ) -> Step[StateT, DepsT, InputT, OutputT]: ... def step( self, call: StepFunction[StateT, DepsT, InputT, OutputT] | None = None, *, node_id: str | None = None, label: str | None = None, ) -> ( Step[StateT, DepsT, InputT, OutputT] | Callable[[StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT]] ): """Create a step from a step function. This method can be used as a decorator or called directly to create a step node from an async function. 
Args: call: The step function to wrap node_id: Optional ID for the node label: Optional human-readable label Returns: Either a Step instance or a decorator function """ if call is None: def decorator( func: StepFunction[StateT, DepsT, InputT, OutputT], ) -> Step[StateT, DepsT, InputT, OutputT]: return self.step(call=func, node_id=node_id, label=label) return decorator node_id = node_id or get_callable_name(call) step = Step[StateT, DepsT, InputT, OutputT](id=NodeID(node_id), call=call, label=label) return step @overload def stream( self, *, node_id: str | None = None, label: str | None = None, ) -> Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] ]: ... @overload def stream( self, call: StreamFunction[StateT, DepsT, InputT, OutputT], *, node_id: str | None = None, label: str | None = None, ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]: ... @overload def stream( self, call: StreamFunction[StateT, DepsT, InputT, OutputT] | None = None, *, node_id: str | None = None, label: str | None = None, ) -> ( Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ): ... def stream( self, call: StreamFunction[StateT, DepsT, InputT, OutputT] | None = None, *, node_id: str | None = None, label: str | None = None, ) -> ( Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ): """Create a step from an async iterator (which functions like a "stream"). This method can be used as a decorator or called directly to create a step node from an async function. Args: call: The step function to wrap node_id: Optional ID for the node label: Optional human-readable label Returns: Either a Step instance or a decorator function """ if call is None: def decorator( func: StreamFunction[StateT, DepsT, InputT, OutputT], ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]: return self.stream(call=func, node_id=node_id, label=label) return decorator # We need to wrap the call so that we can call `await` even though the result is an async iterator async def wrapper(ctx: StepContext[StateT, DepsT, InputT]): return call(ctx) node_id = node_id or get_callable_name(call) return self.step(call=wrapper, node_id=node_id, label=label) @overload def join( self, reducer: ReducerFunction[StateT, DepsT, InputT, OutputT], *, initial: OutputT, node_id: str | None = None, parent_fork_id: str | None = None, preferred_parent_fork: Literal['farthest', 'closest'] = 'farthest', ) -> Join[StateT, DepsT, InputT, OutputT]: ... @overload def join( self, reducer: ReducerFunction[StateT, DepsT, InputT, OutputT], *, initial_factory: Callable[[], OutputT], node_id: str | None = None, parent_fork_id: str | None = None, preferred_parent_fork: Literal['farthest', 'closest'] = 'farthest', ) -> Join[StateT, DepsT, InputT, OutputT]: ... 
def join( self, reducer: ReducerFunction[StateT, DepsT, InputT, OutputT], *, initial: OutputT | Unset = UNSET, initial_factory: Callable[[], OutputT] | Unset = UNSET, node_id: str | None = None, parent_fork_id: str | None = None, preferred_parent_fork: Literal['farthest', 'closest'] = 'farthest', ) -> Join[StateT, DepsT, InputT, OutputT]: if initial_factory is UNSET: initial_factory = lambda: initial # pyright: ignore[reportAssignmentType] # noqa: E731 return Join[StateT, DepsT, InputT, OutputT]( id=JoinID(NodeID(node_id or generate_placeholder_node_id(get_callable_name(reducer)))), reducer=reducer, initial_factory=cast(Callable[[], OutputT], initial_factory), parent_fork_id=ForkID(parent_fork_id) if parent_fork_id is not None else None, preferred_parent_fork=preferred_parent_fork, ) # Edge building def add(self, *edges: EdgePath[StateT, DepsT]) -> None: # noqa: C901 """Add one or more edge paths to the graph. This method processes edge paths and automatically creates any necessary fork nodes for broadcasts and maps. Args: *edges: The edge paths to add to the graph """ def _handle_path(p: Path): """Process a path and create necessary fork nodes. Args: p: The path to process """ for item in p.items: if isinstance(item, BroadcastMarker): new_node = Fork[Any, Any](id=item.fork_id, is_map=False, downstream_join_id=None) self._insert_node(new_node) for path in item.paths: _handle_path(Path(items=[*path.items])) elif isinstance(item, MapMarker): new_node = Fork[Any, Any](id=item.fork_id, is_map=True, downstream_join_id=item.downstream_join_id) self._insert_node(new_node) elif isinstance(item, DestinationMarker): pass def _handle_destination_node(d: AnyDestinationNode): if id(d) in destination_ids: return # prevent infinite recursion if there is a cycle of decisions destination_ids.add(id(d)) destinations.append(d) self._insert_node(d) if isinstance(d, Decision): for branch in d.branches: _handle_path(branch.path) for d2 in branch.destinations: _handle_destination_node(d2) destination_ids = set[int]() destinations: list[AnyDestinationNode] = [] for edge in edges: for source_node in edge.sources: self._insert_node(source_node) self._edges_by_source[source_node.id].append(edge.path) for destination_node in edge.destinations: _handle_destination_node(destination_node) _handle_path(edge.path) # Automatically create edges from step function return hints including `BaseNode`s for destination in destinations: if not isinstance(destination, Step) or isinstance(destination, NodeStep): continue parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) type_hints = get_type_hints(destination.call, localns=parent_namespace, include_extras=True) try: return_hint = type_hints['return'] except KeyError: pass else: edge = self._edge_from_return_hint(destination, return_hint) if edge is not None: self.add(edge) def add_edge(self, source: Source[T], destination: Destination[T], *, label: str | None = None) -> None: """Add a simple edge between two nodes. Args: source: The source node destination: The destination node label: Optional label for the edge """ builder = self.edge_from(source) if label is not None: builder = builder.label(label) self.add(builder.to(destination)) def add_mapping_edge( self, source: Source[Iterable[T]], map_to: Destination[T], *, pre_map_label: str | None = None, post_map_label: str | None = None, fork_id: ForkID | None = None, downstream_join_id: JoinID | None = None, ) -> None: """Add an edge that maps iterable data across parallel paths. 
Args: source: The source node that produces iterable data map_to: The destination node that receives individual items pre_map_label: Optional label before the map operation post_map_label: Optional label after the map operation fork_id: Optional ID for the fork node produced for this map operation downstream_join_id: Optional ID of a join node that will always be downstream of this map. Specifying this ensures correct handling if you try to map an empty iterable. """ builder = self.edge_from(source) if pre_map_label is not None: builder = builder.label(pre_map_label) builder = builder.map(fork_id=fork_id, downstream_join_id=downstream_join_id) if post_map_label is not None: builder = builder.label(post_map_label) self.add(builder.to(map_to)) # TODO(DavidM): Support adding subgraphs; I think this behaves like a step with the same inputs/outputs but gets rendered as a subgraph in mermaid def edge_from(self, *sources: Source[SourceOutputT]) -> EdgePathBuilder[StateT, DepsT, SourceOutputT]: """Create an edge path builder starting from the given source nodes. Args: *sources: The source nodes to start the edge path from Returns: An EdgePathBuilder for constructing the complete edge path """ return EdgePathBuilder[StateT, DepsT, SourceOutputT]( sources=sources, path_builder=PathBuilder(working_items=[]) ) def decision(self, *, note: str | None = None, node_id: str | None = None) -> Decision[StateT, DepsT, Never]: """Create a new decision node. Args: note: Optional note to describe the decision logic node_id: Optional ID for the node produced for this decision logic Returns: A new Decision node with no branches """ return Decision(id=NodeID(node_id or generate_placeholder_node_id('decision')), branches=[], note=note) def match( self, source: TypeOrTypeExpression[SourceT], *, matches: Callable[[Any], bool] | None = None, ) -> DecisionBranchBuilder[StateT, DepsT, SourceT, SourceT, Never]: """Create a decision branch matcher. Args: source: The type or type expression to match against matches: Optional custom matching function Returns: A DecisionBranchBuilder for constructing the branch """ # Note, the following node_id really is just a placeholder and shouldn't end up in the final graph # This is why we don't expose a way for end users to override the value used here. node_id = NodeID(generate_placeholder_node_id('match_decision')) decision = Decision[StateT, DepsT, Never](id=node_id, branches=[], note=None) new_path_builder = PathBuilder[StateT, DepsT, SourceT](working_items=[]) return DecisionBranchBuilder(decision=decision, source=source, matches=matches, path_builder=new_path_builder) def match_node( self, source: type[SourceNodeT], *, matches: Callable[[Any], bool] | None = None, ) -> DecisionBranch[SourceNodeT]: """Create a decision branch for BaseNode subclasses. This is similar to match() but specifically designed for matching against BaseNode types from the v1 system. Args: source: The BaseNode subclass to match against matches: Optional custom matching function Returns: A DecisionBranch for the BaseNode type """ node = NodeStep(source) path = Path(items=[DestinationMarker(node.id)]) return DecisionBranch(source=source, matches=matches, path=path, destinations=[node]) def node( self, node_type: type[BaseNode[StateT, DepsT, GraphOutputT]], ) -> EdgePath[StateT, DepsT]: """Create an edge path from a BaseNode class. This method integrates v1-style BaseNode classes into the v2 graph system by analyzing their type hints and creating appropriate edges. 
Args: node_type: The BaseNode subclass to integrate Returns: An EdgePath representing the node and its connections Raises: GraphSetupError: If the node type is missing required type hints """ parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) type_hints = get_type_hints(node_type.run, localns=parent_namespace, include_extras=True) try: return_hint = type_hints['return'] except KeyError as e: # pragma: no cover raise exceptions.GraphSetupError( f'Node {node_type} is missing a return type hint on its `run` method' ) from e node = NodeStep(node_type) edge = self._edge_from_return_hint(node, return_hint) if not edge: # pragma: no cover raise exceptions.GraphSetupError(f'Node {node_type} is missing a return type hint on its `run` method') return edge # Helpers def _insert_node(self, node: AnyNode) -> None: """Insert a node into the graph, checking for ID conflicts. Args: node: The node to insert Raises: ValueError: If a different node with the same ID already exists """ existing = self._nodes.get(node.id) if existing is None: self._nodes[node.id] = node elif isinstance(existing, NodeStep) and isinstance(node, NodeStep) and existing.node_type is node.node_type: pass elif existing is not node: raise GraphBuildingError( f'All nodes must have unique node IDs. {node.id!r} was the ID for {existing} and {node}' ) def _edge_from_return_hint( self, node: SourceNode[StateT, DepsT, Any], return_hint: TypeOrTypeExpression[Any] ) -> EdgePath[StateT, DepsT] | None: """Create edges from a return type hint. This method analyzes return type hints from step functions or node methods to automatically create appropriate edges in the graph. Args: node: The source node return_hint: The return type hint to analyze Returns: An EdgePath if edges can be inferred, None otherwise Raises: GraphSetupError: If the return type hint is invalid or incomplete """ destinations: list[AnyDestinationNode] = [] union_args = _utils.get_union_args(return_hint) for return_type in union_args: return_type, annotations = _utils.unpack_annotated(return_type) return_type_origin = get_origin(return_type) or return_type if return_type_origin is End: destinations.append(self.end_node) elif return_type_origin is BaseNode: raise exceptions.GraphSetupError( # pragma: no cover f'Node {node} return type hint includes a plain `BaseNode`. ' 'Edge inference requires each possible returned `BaseNode` subclass to be listed explicitly.' ) elif return_type_origin is StepNode: step = cast( Step[StateT, DepsT, Any, Any] | None, next((a for a in annotations if isinstance(a, Step)), None), # pyright: ignore[reportUnknownArgumentType] ) if step is None: raise exceptions.GraphSetupError( # pragma: no cover f'Node {node} return type hint includes a `StepNode` without a `Step` annotation. ' 'When returning `my_step.as_node()`, use `Annotated[StepNode[StateT, DepsT], my_step]` as the return type hint.' ) destinations.append(step) elif return_type_origin is JoinNode: join = cast( Join[StateT, DepsT, Any, Any] | None, next((a for a in annotations if isinstance(a, Join)), None), # pyright: ignore[reportUnknownArgumentType] ) if join is None: raise exceptions.GraphSetupError( # pragma: no cover f'Node {node} return type hint includes a `JoinNode` without a `Join` annotation. ' 'When returning `my_join.as_node()`, use `Annotated[JoinNode[StateT, DepsT], my_join]` as the return type hint.' 
) destinations.append(join) elif inspect.isclass(return_type_origin) and issubclass(return_type_origin, BaseNode): destinations.append(NodeStep(return_type)) if len(destinations) < len(union_args): # Only build edges if all the return types are nodes return None edge = self.edge_from(node) if len(destinations) == 1: return edge.to(destinations[0]) else: decision = self.decision() for destination in destinations: # We don't actually use this decision mechanism, but we need to build the edges for parent-fork finding decision = decision.branch(self.match(NoneType).to(destination)) return edge.to(decision) # Graph building def build(self, validate_graph_structure: bool = True) -> Graph[StateT, DepsT, GraphInputT, GraphOutputT]: """Build the final executable graph from the accumulated nodes and edges. This method performs validation, normalization, and analysis of the graph structure to create a complete, executable graph instance. Args: validate_graph_structure: whether to perform validation of the graph structure See the docstring of _validate_graph_structure below for more details. Returns: A complete Graph instance ready for execution Raises: ValueError: If the graph structure is invalid (e.g., join without parent fork) """ nodes = self._nodes edges_by_source = self._edges_by_source nodes, edges_by_source = _replace_placeholder_node_ids(nodes, edges_by_source) nodes, edges_by_source = _flatten_paths(nodes, edges_by_source) nodes, edges_by_source = _normalize_forks(nodes, edges_by_source) if validate_graph_structure: _validate_graph_structure(nodes, edges_by_source) parent_forks = _collect_dominating_forks(nodes, edges_by_source) intermediate_join_nodes = _compute_intermediate_join_nodes(nodes, parent_forks) return Graph[StateT, DepsT, GraphInputT, GraphOutputT]( name=self.name, state_type=unpack_type_expression(self.state_type), deps_type=unpack_type_expression(self.deps_type), input_type=unpack_type_expression(self.input_type), output_type=unpack_type_expression(self.output_type), nodes=nodes, edges_by_source=edges_by_source, parent_forks=parent_forks, intermediate_join_nodes=intermediate_join_nodes, auto_instrument=self.auto_instrument, ) ``` #### __init__ ```python __init__( *, name: str | None = None, state_type: TypeOrTypeExpression[StateT] = NoneType, deps_type: TypeOrTypeExpression[DepsT] = NoneType, input_type: TypeOrTypeExpression[ GraphInputT ] = NoneType, output_type: TypeOrTypeExpression[ GraphOutputT ] = NoneType, auto_instrument: bool = True ) ``` Initialize a graph builder. Parameters: | Name | Type | Description | Default | | ----------------- | ------------------------------------ | ----------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- | | `name` | \`str | None\` | Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. 
| | `state_type` | `TypeOrTypeExpression[StateT]` | The type of the graph state | `NoneType` | | `deps_type` | `TypeOrTypeExpression[DepsT]` | The type of the dependencies | `NoneType` | | `input_type` | `TypeOrTypeExpression[GraphInputT]` | The type of the graph input data | `NoneType` | | `output_type` | `TypeOrTypeExpression[GraphOutputT]` | The type of the graph output data | `NoneType` | | `auto_instrument` | `bool` | Whether to automatically create instrumentation spans | `True` | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def __init__( self, *, name: str | None = None, state_type: TypeOrTypeExpression[StateT] = NoneType, deps_type: TypeOrTypeExpression[DepsT] = NoneType, input_type: TypeOrTypeExpression[GraphInputT] = NoneType, output_type: TypeOrTypeExpression[GraphOutputT] = NoneType, auto_instrument: bool = True, ): """Initialize a graph builder. Args: name: Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. state_type: The type of the graph state deps_type: The type of the dependencies input_type: The type of the graph input data output_type: The type of the graph output data auto_instrument: Whether to automatically create instrumentation spans """ self.name = name self.state_type = state_type self.deps_type = deps_type self.input_type = input_type self.output_type = output_type self.auto_instrument = auto_instrument self._nodes = {} self._edges_by_source = defaultdict(list) self._decision_index = 1 self._start_node = StartNode[GraphInputT]() self._end_node = EndNode[GraphOutputT]() ``` #### name ```python name: str | None = name ``` Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. #### state_type ```python state_type: TypeOrTypeExpression[StateT] = state_type ``` The type of the graph state. #### deps_type ```python deps_type: TypeOrTypeExpression[DepsT] = deps_type ``` The type of the dependencies. #### input_type ```python input_type: TypeOrTypeExpression[GraphInputT] = input_type ``` The type of the graph input data. #### output_type ```python output_type: TypeOrTypeExpression[GraphOutputT] = ( output_type ) ``` The type of the graph output data. #### auto_instrument ```python auto_instrument: bool = auto_instrument ``` Whether to automatically create instrumentation spans. #### start_node ```python start_node: StartNode[GraphInputT] ``` Get the start node for the graph. Returns: | Type | Description | | ------------------------ | ---------------------------------------------------- | | `StartNode[GraphInputT]` | The start node that receives the initial graph input | #### end_node ```python end_node: EndNode[GraphOutputT] ``` Get the end node for the graph. 
Returns: | Type | Description | | ----------------------- | ------------------------------------------------- | | `EndNode[GraphOutputT]` | The end node that produces the final graph output | #### step ```python step( *, node_id: str | None = None, label: str | None = None ) -> Callable[ [StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT], ] ``` ```python step( call: StepFunction[StateT, DepsT, InputT, OutputT], *, node_id: str | None = None, label: str | None = None ) -> Step[StateT, DepsT, InputT, OutputT] ``` ```python step( call: ( StepFunction[StateT, DepsT, InputT, OutputT] | None ) = None, *, node_id: str | None = None, label: str | None = None ) -> ( Step[StateT, DepsT, InputT, OutputT] | Callable[ [StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT], ] ) ``` Create a step from a step function. This method can be used as a decorator or called directly to create a step node from an async function. Parameters: | Name | Type | Description | Default | | --------- | ---------------------------------------------- | ----------- | ----------------------------- | | `call` | \`StepFunction[StateT, DepsT, InputT, OutputT] | None\` | The step function to wrap | | `node_id` | \`str | None\` | Optional ID for the node | | `label` | \`str | None\` | Optional human-readable label | Returns: | Type | Description | | -------------------------------------- | ---------------------------------------------------------------------------------------------------- | | \`Step[StateT, DepsT, InputT, OutputT] | Callable\[\[StepFunction[StateT, DepsT, InputT, OutputT]\], Step[StateT, DepsT, InputT, OutputT]\]\` | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def step( self, call: StepFunction[StateT, DepsT, InputT, OutputT] | None = None, *, node_id: str | None = None, label: str | None = None, ) -> ( Step[StateT, DepsT, InputT, OutputT] | Callable[[StepFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, OutputT]] ): """Create a step from a step function. This method can be used as a decorator or called directly to create a step node from an async function. 
Args: call: The step function to wrap node_id: Optional ID for the node label: Optional human-readable label Returns: Either a Step instance or a decorator function """ if call is None: def decorator( func: StepFunction[StateT, DepsT, InputT, OutputT], ) -> Step[StateT, DepsT, InputT, OutputT]: return self.step(call=func, node_id=node_id, label=label) return decorator node_id = node_id or get_callable_name(call) step = Step[StateT, DepsT, InputT, OutputT](id=NodeID(node_id), call=call, label=label) return step ``` #### stream ```python stream( *, node_id: str | None = None, label: str | None = None ) -> Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ``` ```python stream( call: StreamFunction[StateT, DepsT, InputT, OutputT], *, node_id: str | None = None, label: str | None = None ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] ``` ```python stream( call: ( StreamFunction[StateT, DepsT, InputT, OutputT] | None ) = None, *, node_id: str | None = None, label: str | None = None ) -> ( Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ) ``` ```python stream( call: ( StreamFunction[StateT, DepsT, InputT, OutputT] | None ) = None, *, node_id: str | None = None, label: str | None = None ) -> ( Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ) ``` Create a step from an async iterator (which functions like a "stream"). This method can be used as a decorator or called directly to create a step node from an async function. Parameters: | Name | Type | Description | Default | | --------- | ------------------------------------------------ | ----------- | ----------------------------- | | `call` | \`StreamFunction[StateT, DepsT, InputT, OutputT] | None\` | The step function to wrap | | `node_id` | \`str | None\` | Optional ID for the node | | `label` | \`str | None\` | Optional human-readable label | Returns: | Type | Description | | ------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | | \`Step\[StateT, DepsT, InputT, AsyncIterable[OutputT]\] | Callable\[\[StreamFunction[StateT, DepsT, InputT, OutputT]\], Step\[StateT, DepsT, InputT, AsyncIterable[OutputT]\]\]\` | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def stream( self, call: StreamFunction[StateT, DepsT, InputT, OutputT] | None = None, *, node_id: str | None = None, label: str | None = None, ) -> ( Step[StateT, DepsT, InputT, AsyncIterable[OutputT]] | Callable[ [StreamFunction[StateT, DepsT, InputT, OutputT]], Step[StateT, DepsT, InputT, AsyncIterable[OutputT]], ] ): """Create a step from an async iterator (which functions like a "stream"). This method can be used as a decorator or called directly to create a step node from an async function. 
Args: call: The step function to wrap node_id: Optional ID for the node label: Optional human-readable label Returns: Either a Step instance or a decorator function """ if call is None: def decorator( func: StreamFunction[StateT, DepsT, InputT, OutputT], ) -> Step[StateT, DepsT, InputT, AsyncIterable[OutputT]]: return self.stream(call=func, node_id=node_id, label=label) return decorator # We need to wrap the call so that we can call `await` even though the result is an async iterator async def wrapper(ctx: StepContext[StateT, DepsT, InputT]): return call(ctx) node_id = node_id or get_callable_name(call) return self.step(call=wrapper, node_id=node_id, label=label) ``` #### add ```python add(*edges: EdgePath[StateT, DepsT]) -> None ``` Add one or more edge paths to the graph. This method processes edge paths and automatically creates any necessary fork nodes for broadcasts and maps. Parameters: | Name | Type | Description | Default | | -------- | ------------------------- | ---------------------------------- | ------- | | `*edges` | `EdgePath[StateT, DepsT]` | The edge paths to add to the graph | `()` | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def add(self, *edges: EdgePath[StateT, DepsT]) -> None: # noqa: C901 """Add one or more edge paths to the graph. This method processes edge paths and automatically creates any necessary fork nodes for broadcasts and maps. Args: *edges: The edge paths to add to the graph """ def _handle_path(p: Path): """Process a path and create necessary fork nodes. Args: p: The path to process """ for item in p.items: if isinstance(item, BroadcastMarker): new_node = Fork[Any, Any](id=item.fork_id, is_map=False, downstream_join_id=None) self._insert_node(new_node) for path in item.paths: _handle_path(Path(items=[*path.items])) elif isinstance(item, MapMarker): new_node = Fork[Any, Any](id=item.fork_id, is_map=True, downstream_join_id=item.downstream_join_id) self._insert_node(new_node) elif isinstance(item, DestinationMarker): pass def _handle_destination_node(d: AnyDestinationNode): if id(d) in destination_ids: return # prevent infinite recursion if there is a cycle of decisions destination_ids.add(id(d)) destinations.append(d) self._insert_node(d) if isinstance(d, Decision): for branch in d.branches: _handle_path(branch.path) for d2 in branch.destinations: _handle_destination_node(d2) destination_ids = set[int]() destinations: list[AnyDestinationNode] = [] for edge in edges: for source_node in edge.sources: self._insert_node(source_node) self._edges_by_source[source_node.id].append(edge.path) for destination_node in edge.destinations: _handle_destination_node(destination_node) _handle_path(edge.path) # Automatically create edges from step function return hints including `BaseNode`s for destination in destinations: if not isinstance(destination, Step) or isinstance(destination, NodeStep): continue parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) type_hints = get_type_hints(destination.call, localns=parent_namespace, include_extras=True) try: return_hint = type_hints['return'] except KeyError: pass else: edge = self._edge_from_return_hint(destination, return_hint) if edge is not None: self.add(edge) ``` #### add_edge ```python add_edge( source: Source[T], destination: Destination[T], *, label: str | None = None ) -> None ``` Add a simple edge between two nodes. 
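For instance, a two-step pipeline wired with `add_edge` might look like the sketch below. The step bodies, the `StepContext` import path, and the `ctx.inputs` attribute are assumptions; only the builder calls come from this page:

```python
from pydantic_graph.beta.graph_builder import GraphBuilder
from pydantic_graph.beta.step import StepContext  # assumed import path

g = GraphBuilder(input_type=int, output_type=str)


@g.step
async def double(ctx: StepContext[None, None, int]) -> int:
    return ctx.inputs * 2  # `ctx.inputs` is assumed to carry the step's input


@g.step
async def describe(ctx: StepContext[None, None, int]) -> str:
    return f'doubled value is {ctx.inputs}'


g.add_edge(g.start_node, double)
g.add_edge(double, describe)
g.add_edge(describe, g.end_node, label='done')

graph = g.build()
```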
Parameters: | Name | Type | Description | Default | | ------------- | ---------------- | -------------------- | --------------------------- | | `source` | `Source[T]` | The source node | *required* | | `destination` | `Destination[T]` | The destination node | *required* | | `label` | \`str | None\` | Optional label for the edge | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def add_edge(self, source: Source[T], destination: Destination[T], *, label: str | None = None) -> None: """Add a simple edge between two nodes. Args: source: The source node destination: The destination node label: Optional label for the edge """ builder = self.edge_from(source) if label is not None: builder = builder.label(label) self.add(builder.to(destination)) ``` #### add_mapping_edge ```python add_mapping_edge( source: Source[Iterable[T]], map_to: Destination[T], *, pre_map_label: str | None = None, post_map_label: str | None = None, fork_id: ForkID | None = None, downstream_join_id: JoinID | None = None ) -> None ``` Add an edge that maps iterable data across parallel paths. Parameters: | Name | Type | Description | Default | | -------------------- | --------------------- | --------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | | `source` | `Source[Iterable[T]]` | The source node that produces iterable data | *required* | | `map_to` | `Destination[T]` | The destination node that receives individual items | *required* | | `pre_map_label` | \`str | None\` | Optional label before the map operation | | `post_map_label` | \`str | None\` | Optional label after the map operation | | `fork_id` | \`ForkID | None\` | Optional ID for the fork node produced for this map operation | | `downstream_join_id` | \`JoinID | None\` | Optional ID of a join node that will always be downstream of this map. Specifying this ensures correct handling if you try to map an empty iterable. | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def add_mapping_edge( self, source: Source[Iterable[T]], map_to: Destination[T], *, pre_map_label: str | None = None, post_map_label: str | None = None, fork_id: ForkID | None = None, downstream_join_id: JoinID | None = None, ) -> None: """Add an edge that maps iterable data across parallel paths. Args: source: The source node that produces iterable data map_to: The destination node that receives individual items pre_map_label: Optional label before the map operation post_map_label: Optional label after the map operation fork_id: Optional ID for the fork node produced for this map operation downstream_join_id: Optional ID of a join node that will always be downstream of this map. Specifying this ensures correct handling if you try to map an empty iterable. """ builder = self.edge_from(source) if pre_map_label is not None: builder = builder.label(pre_map_label) builder = builder.map(fork_id=fork_id, downstream_join_id=downstream_join_id) if post_map_label is not None: builder = builder.label(post_map_label) self.add(builder.to(map_to)) ``` #### edge_from ```python edge_from( *sources: Source[SourceOutputT], ) -> EdgePathBuilder[StateT, DepsT, SourceOutputT] ``` Create an edge path builder starting from the given source nodes. 
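The wiring from the `add_edge` sketch above can equivalently be spelled out with the fluent builder, which is essentially what `add_edge` and `add_mapping_edge` do internally:

```python
# Equivalent to the three add_edge calls in the earlier sketch.
g.add(
    g.edge_from(g.start_node).to(double),
    g.edge_from(double).label('then').to(describe),
    g.edge_from(describe).to(g.end_node),
)
```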
Parameters: | Name | Type | Description | Default | | ---------- | ----------------------- | -------------------------------------------- | ------- | | `*sources` | `Source[SourceOutputT]` | The source nodes to start the edge path from | `()` | Returns: | Type | Description | | ----------------------------------------------- | ---------------------------------------------------------- | | `EdgePathBuilder[StateT, DepsT, SourceOutputT]` | An EdgePathBuilder for constructing the complete edge path | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def edge_from(self, *sources: Source[SourceOutputT]) -> EdgePathBuilder[StateT, DepsT, SourceOutputT]: """Create an edge path builder starting from the given source nodes. Args: *sources: The source nodes to start the edge path from Returns: An EdgePathBuilder for constructing the complete edge path """ return EdgePathBuilder[StateT, DepsT, SourceOutputT]( sources=sources, path_builder=PathBuilder(working_items=[]) ) ``` #### decision ```python decision( *, note: str | None = None, node_id: str | None = None ) -> Decision[StateT, DepsT, Never] ``` Create a new decision node. Parameters: | Name | Type | Description | Default | | --------- | ----- | ----------- | --------------------------------------------------------- | | `note` | \`str | None\` | Optional note to describe the decision logic | | `node_id` | \`str | None\` | Optional ID for the node produced for this decision logic | Returns: | Type | Description | | -------------------------------- | ------------------------------------ | | `Decision[StateT, DepsT, Never]` | A new Decision node with no branches | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def decision(self, *, note: str | None = None, node_id: str | None = None) -> Decision[StateT, DepsT, Never]: """Create a new decision node. Args: note: Optional note to describe the decision logic node_id: Optional ID for the node produced for this decision logic Returns: A new Decision node with no branches """ return Decision(id=NodeID(node_id or generate_placeholder_node_id('decision')), branches=[], note=note) ``` #### match ```python match( source: TypeOrTypeExpression[SourceT], *, matches: Callable[[Any], bool] | None = None ) -> DecisionBranchBuilder[ StateT, DepsT, SourceT, SourceT, Never ] ``` Create a decision branch matcher. Parameters: | Name | Type | Description | Default | | --------- | ------------------------------- | -------------------------------------------- | --------------------------------- | | `source` | `TypeOrTypeExpression[SourceT]` | The type or type expression to match against | *required* | | `matches` | \`Callable\[[Any], bool\] | None\` | Optional custom matching function | Returns: | Type | Description | | --------------------------------------------------------------- | --------------------------------------------------- | | `DecisionBranchBuilder[StateT, DepsT, SourceT, SourceT, Never]` | A DecisionBranchBuilder for constructing the branch | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def match( self, source: TypeOrTypeExpression[SourceT], *, matches: Callable[[Any], bool] | None = None, ) -> DecisionBranchBuilder[StateT, DepsT, SourceT, SourceT, Never]: """Create a decision branch matcher. 
Args: source: The type or type expression to match against matches: Optional custom matching function Returns: A DecisionBranchBuilder for constructing the branch """ # Note, the following node_id really is just a placeholder and shouldn't end up in the final graph # This is why we don't expose a way for end users to override the value used here. node_id = NodeID(generate_placeholder_node_id('match_decision')) decision = Decision[StateT, DepsT, Never](id=node_id, branches=[], note=None) new_path_builder = PathBuilder[StateT, DepsT, SourceT](working_items=[]) return DecisionBranchBuilder(decision=decision, source=source, matches=matches, path_builder=new_path_builder) ``` #### match_node ```python match_node( source: type[SourceNodeT], *, matches: Callable[[Any], bool] | None = None ) -> DecisionBranch[SourceNodeT] ``` Create a decision branch for BaseNode subclasses. This is similar to match() but specifically designed for matching against BaseNode types from the v1 system. Parameters: | Name | Type | Description | Default | | --------- | ------------------------- | -------------------------------------- | --------------------------------- | | `source` | `type[SourceNodeT]` | The BaseNode subclass to match against | *required* | | `matches` | \`Callable\[[Any], bool\] | None\` | Optional custom matching function | Returns: | Type | Description | | ----------------------------- | -------------------------------------- | | `DecisionBranch[SourceNodeT]` | A DecisionBranch for the BaseNode type | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def match_node( self, source: type[SourceNodeT], *, matches: Callable[[Any], bool] | None = None, ) -> DecisionBranch[SourceNodeT]: """Create a decision branch for BaseNode subclasses. This is similar to match() but specifically designed for matching against BaseNode types from the v1 system. Args: source: The BaseNode subclass to match against matches: Optional custom matching function Returns: A DecisionBranch for the BaseNode type """ node = NodeStep(source) path = Path(items=[DestinationMarker(node.id)]) return DecisionBranch(source=source, matches=matches, path=path, destinations=[node]) ``` #### node ```python node( node_type: type[BaseNode[StateT, DepsT, GraphOutputT]], ) -> EdgePath[StateT, DepsT] ``` Create an edge path from a BaseNode class. This method integrates v1-style BaseNode classes into the v2 graph system by analyzing their type hints and creating appropriate edges. Parameters: | Name | Type | Description | Default | | ----------- | --------------------------------------------- | ---------------------------------- | ---------- | | `node_type` | `type[BaseNode[StateT, DepsT, GraphOutputT]]` | The BaseNode subclass to integrate | *required* | Returns: | Type | Description | | ------------------------- | ----------------------------------------------------- | | `EdgePath[StateT, DepsT]` | An EdgePath representing the node and its connections | Raises: | Type | Description | | ----------------- | ----------------------------------------------- | | `GraphSetupError` | If the node type is missing required type hints | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def node( self, node_type: type[BaseNode[StateT, DepsT, GraphOutputT]], ) -> EdgePath[StateT, DepsT]: """Create an edge path from a BaseNode class. This method integrates v1-style BaseNode classes into the v2 graph system by analyzing their type hints and creating appropriate edges. 
Args: node_type: The BaseNode subclass to integrate Returns: An EdgePath representing the node and its connections Raises: GraphSetupError: If the node type is missing required type hints """ parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) type_hints = get_type_hints(node_type.run, localns=parent_namespace, include_extras=True) try: return_hint = type_hints['return'] except KeyError as e: # pragma: no cover raise exceptions.GraphSetupError( f'Node {node_type} is missing a return type hint on its `run` method' ) from e node = NodeStep(node_type) edge = self._edge_from_return_hint(node, return_hint) if not edge: # pragma: no cover raise exceptions.GraphSetupError(f'Node {node_type} is missing a return type hint on its `run` method') return edge ``` #### build ```python build( validate_graph_structure: bool = True, ) -> Graph[StateT, DepsT, GraphInputT, GraphOutputT] ``` Build the final executable graph from the accumulated nodes and edges. This method performs validation, normalization, and analysis of the graph structure to create a complete, executable graph instance. Parameters: | Name | Type | Description | Default | | -------------------------- | ------ | ---------------------------------------------------------------------------------------------------------------------------- | ------- | | `validate_graph_structure` | `bool` | whether to perform validation of the graph structure See the docstring of \_validate_graph_structure below for more details. | `True` | Returns: | Type | Description | | ------------------------------------------------- | --------------------------------------------- | | `Graph[StateT, DepsT, GraphInputT, GraphOutputT]` | A complete Graph instance ready for execution | Raises: | Type | Description | | ------------ | ------------------------------------------------------------------ | | `ValueError` | If the graph structure is invalid (e.g., join without parent fork) | Source code in `pydantic_graph/pydantic_graph/beta/graph_builder.py` ```python def build(self, validate_graph_structure: bool = True) -> Graph[StateT, DepsT, GraphInputT, GraphOutputT]: """Build the final executable graph from the accumulated nodes and edges. This method performs validation, normalization, and analysis of the graph structure to create a complete, executable graph instance. Args: validate_graph_structure: whether to perform validation of the graph structure See the docstring of _validate_graph_structure below for more details. 
Returns: A complete Graph instance ready for execution Raises: ValueError: If the graph structure is invalid (e.g., join without parent fork) """ nodes = self._nodes edges_by_source = self._edges_by_source nodes, edges_by_source = _replace_placeholder_node_ids(nodes, edges_by_source) nodes, edges_by_source = _flatten_paths(nodes, edges_by_source) nodes, edges_by_source = _normalize_forks(nodes, edges_by_source) if validate_graph_structure: _validate_graph_structure(nodes, edges_by_source) parent_forks = _collect_dominating_forks(nodes, edges_by_source) intermediate_join_nodes = _compute_intermediate_join_nodes(nodes, parent_forks) return Graph[StateT, DepsT, GraphInputT, GraphOutputT]( name=self.name, state_type=unpack_type_expression(self.state_type), deps_type=unpack_type_expression(self.deps_type), input_type=unpack_type_expression(self.input_type), output_type=unpack_type_expression(self.output_type), nodes=nodes, edges_by_source=edges_by_source, parent_forks=parent_forks, intermediate_join_nodes=intermediate_join_nodes, auto_instrument=self.auto_instrument, ) ``` # `pydantic_graph.beta.join` Join operations and reducers for graph execution. This module provides the core components for joining parallel execution paths in a graph, including various reducer types that aggregate data from multiple sources into a single output. ### JoinState The state of a join during graph execution associated to a particular fork run. Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python @dataclass class JoinState: """The state of a join during graph execution associated to a particular fork run.""" current: Any downstream_fork_stack: ForkStack cancelled_sibling_tasks: bool = False ``` ### ReducerContext Bases: `Generic[StateT, DepsT]` Context information passed to reducer functions during graph execution. The reducer context provides access to the current graph state and dependencies. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | -------- | -------------------- | ---------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python @dataclass(init=False) class ReducerContext(Generic[StateT, DepsT]): """Context information passed to reducer functions during graph execution. The reducer context provides access to the current graph state and dependencies. Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies """ _state: StateT """The current graph state.""" _deps: DepsT """The dependencies of the current graph run.""" _join_state: JoinState """The JoinState for this reducer context.""" def __init__(self, *, state: StateT, deps: DepsT, join_state: JoinState): self._state = state self._deps = deps self._join_state = join_state @property def state(self) -> StateT: """The state of the graph run.""" return self._state @property def deps(self) -> DepsT: """The deps for the graph run.""" return self._deps def cancel_sibling_tasks(self): """Cancel all sibling tasks created from the same fork. You can call this if you want your join to have early-stopping behavior. """ self._join_state.cancelled_sibling_tasks = True ``` #### state ```python state: StateT ``` The state of the graph run. #### deps ```python deps: DepsT ``` The deps for the graph run. #### cancel_sibling_tasks ```python cancel_sibling_tasks() ``` Cancel all sibling tasks created from the same fork. 
You can call this if you want your join to have early-stopping behavior. Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python def cancel_sibling_tasks(self): """Cancel all sibling tasks created from the same fork. You can call this if you want your join to have early-stopping behavior. """ self._join_state.cancelled_sibling_tasks = True ``` ### ReducerFunction ```python ReducerFunction = TypeAliasType( "ReducerFunction", ContextReducerFunction[StateT, DepsT, InputT, OutputT] | PlainReducerFunction[InputT, OutputT], type_params=(StateT, DepsT, InputT, OutputT), ) ``` A function used for reducing inputs to a join node. ### reduce_null ```python reduce_null(current: None, inputs: Any) -> None ``` A reducer that discards all input data and returns None. Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python def reduce_null(current: None, inputs: Any) -> None: """A reducer that discards all input data and returns None.""" return None ``` ### reduce_list_append ```python reduce_list_append( current: list[T], inputs: T ) -> list[T] ``` A reducer that appends to a list. Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python def reduce_list_append(current: list[T], inputs: T) -> list[T]: """A reducer that appends to a list.""" current.append(inputs) return current ``` ### reduce_list_extend ```python reduce_list_extend( current: list[T], inputs: Iterable[T] ) -> list[T] ``` A reducer that extends a list. Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python def reduce_list_extend(current: list[T], inputs: Iterable[T]) -> list[T]: """A reducer that extends a list.""" current.extend(inputs) return current ``` ### reduce_dict_update ```python reduce_dict_update( current: dict[K, V], inputs: Mapping[K, V] ) -> dict[K, V] ``` A reducer that updates a dict. Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python def reduce_dict_update(current: dict[K, V], inputs: Mapping[K, V]) -> dict[K, V]: """A reducer that updates a dict.""" current.update(inputs) return current ``` ### SupportsSum Bases: `Protocol` A protocol for a type that supports adding to itself. Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python class SupportsSum(Protocol): """A protocol for a type that supports adding to itself.""" @abstractmethod def __add__(self, other: Self, /) -> Self: pass ``` ### reduce_sum ```python reduce_sum(current: NumericT, inputs: NumericT) -> NumericT ``` A reducer that sums numbers. Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python def reduce_sum(current: NumericT, inputs: NumericT) -> NumericT: """A reducer that sums numbers.""" return current + inputs ``` ### ReduceFirstValue Bases: `Generic[T]` A reducer that returns the first value it encounters, and cancels all other tasks. Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python @dataclass class ReduceFirstValue(Generic[T]): """A reducer that returns the first value it encounters, and cancels all other tasks.""" def __call__(self, ctx: ReducerContext[object, object], current: T, inputs: T) -> T: """The reducer function.""" ctx.cancel_sibling_tasks() return inputs ``` #### __call__ ```python __call__( ctx: ReducerContext[object, object], current: T, inputs: T, ) -> T ``` The reducer function. 
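Reducers like these are passed to `GraphBuilder.join`. Here's a hedged sketch of a map/join fan-out and fan-in; it assumes a `GraphBuilder` `g` plus two placeholder steps, `load_ids` (returns an iterable of IDs) and `fetch_user` (handles a single ID):

```python
from pydantic_graph.beta.join import reduce_list_append

# Fan-in: accumulate the mapped results into a single list.
collect = g.join(reduce_list_append, initial_factory=list, node_id='collect')

# Fan-out: each item yielded by load_ids is routed to fetch_user on its own path.
# Passing downstream_join_id keeps empty iterables handled correctly.
g.add_mapping_edge(load_ids, fetch_user, downstream_join_id=collect.id)
g.add_edge(fetch_user, collect)
# `collect` then produces the joined list for whatever node you connect it to.
```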
Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python def __call__(self, ctx: ReducerContext[object, object], current: T, inputs: T) -> T: """The reducer function.""" ctx.cancel_sibling_tasks() return inputs ``` ### Join Bases: `Generic[StateT, DepsT, InputT, OutputT]` A join operation that synchronizes and aggregates parallel execution paths. A join defines how to combine outputs from multiple parallel execution paths using a ReducerFunction. It specifies which fork it joins (if any) and manages the initialization of reducers. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | --------- | -------------------- | ----------------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `InputT` | | The type of input data to join | *required* | | `OutputT` | | The type of the final joined output | *required* | Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python @dataclass(init=False) class Join(Generic[StateT, DepsT, InputT, OutputT]): """A join operation that synchronizes and aggregates parallel execution paths. A join defines how to combine outputs from multiple parallel execution paths using a [`ReducerFunction`][pydantic_graph.beta.join.ReducerFunction]. It specifies which fork it joins (if any) and manages the initialization of reducers. Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies InputT: The type of input data to join OutputT: The type of the final joined output """ id: JoinID _reducer: ReducerFunction[StateT, DepsT, InputT, OutputT] _initial_factory: Callable[[], OutputT] parent_fork_id: ForkID | None preferred_parent_fork: Literal['closest', 'farthest'] def __init__( self, *, id: JoinID, reducer: ReducerFunction[StateT, DepsT, InputT, OutputT], initial_factory: Callable[[], OutputT], parent_fork_id: ForkID | None = None, preferred_parent_fork: Literal['farthest', 'closest'] = 'farthest', ): self.id = id self._reducer = reducer self._initial_factory = initial_factory self.parent_fork_id = parent_fork_id self.preferred_parent_fork = preferred_parent_fork @property def reducer(self): return self._reducer @property def initial_factory(self): return self._initial_factory def reduce(self, ctx: ReducerContext[StateT, DepsT], current: OutputT, inputs: InputT) -> OutputT: n_parameters = len(inspect.signature(self.reducer).parameters) if n_parameters == 2: return cast(PlainReducerFunction[InputT, OutputT], self.reducer)(current, inputs) else: return cast(ContextReducerFunction[StateT, DepsT, InputT, OutputT], self.reducer)(ctx, current, inputs) @overload def as_node(self, inputs: None = None) -> JoinNode[StateT, DepsT]: ... @overload def as_node(self, inputs: InputT) -> JoinNode[StateT, DepsT]: ... def as_node(self, inputs: InputT | None = None) -> JoinNode[StateT, DepsT]: """Create a join node with bound inputs. Args: inputs: The input data to bind to this join, or None Returns: A [`JoinNode`][pydantic_graph.beta.join.JoinNode] with this join and the bound inputs """ return JoinNode(self, inputs) ``` #### as_node ```python as_node(inputs: None = None) -> JoinNode[StateT, DepsT] ``` ```python as_node(inputs: InputT) -> JoinNode[StateT, DepsT] ``` ```python as_node( inputs: InputT | None = None, ) -> JoinNode[StateT, DepsT] ``` Create a join node with bound inputs.
Parameters: | Name | Type | Description | Default | | -------- | -------- | ----------- | -------------------------------------------- | | `inputs` | \`InputT | None\` | The input data to bind to this join, or None | Returns: | Type | Description | | ------------------------- | ---------------------------------------------- | | `JoinNode[StateT, DepsT]` | A JoinNode with this join and the bound inputs | Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python def as_node(self, inputs: InputT | None = None) -> JoinNode[StateT, DepsT]: """Create a join node with bound inputs. Args: inputs: The input data to bind to this join, or None Returns: A [`JoinNode`][pydantic_graph.beta.join.JoinNode] with this join and the bound inputs """ return JoinNode(self, inputs) ``` ### JoinNode Bases: `BaseNode[StateT, DepsT, Any]` A base node that represents a join item with bound inputs. JoinNode bridges between the v1 and v2 graph execution systems by wrapping a Join with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps. Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python @dataclass class JoinNode(BaseNode[StateT, DepsT, Any]): """A base node that represents a join item with bound inputs. JoinNode bridges between the v1 and v2 graph execution systems by wrapping a [`Join`][pydantic_graph.beta.join.Join] with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps. """ join: Join[StateT, DepsT, Any, Any] """The join to execute.""" inputs: Any """The inputs bound to this join.""" async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any]: """Attempt to run the join node. Args: ctx: The graph execution context Returns: The result of join execution Raises: NotImplementedError: Always raised as JoinNode is not meant to be run directly """ raise NotImplementedError( '`JoinNode` is not meant to be run directly, it is meant to be used in `BaseNode` subclasses to indicate a transition to v2-style steps.' ) ``` #### join ```python join: Join[StateT, DepsT, Any, Any] ``` The join to execute. #### inputs ```python inputs: Any ``` The inputs bound to this join. #### run ```python run( ctx: GraphRunContext[StateT, DepsT], ) -> BaseNode[StateT, DepsT, Any] | End[Any] ``` Attempt to run the join node. Parameters: | Name | Type | Description | Default | | ----- | -------------------------------- | --------------------------- | ---------- | | `ctx` | `GraphRunContext[StateT, DepsT]` | The graph execution context | *required* | Returns: | Type | Description | | ------------------------------ | ----------- | | \`BaseNode[StateT, DepsT, Any] | End[Any]\` | Raises: | Type | Description | | --------------------- | --------------------------------------------------------- | | `NotImplementedError` | Always raised as JoinNode is not meant to be run directly | Source code in `pydantic_graph/pydantic_graph/beta/join.py` ```python async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any]: """Attempt to run the join node. Args: ctx: The graph execution context Returns: The result of join execution Raises: NotImplementedError: Always raised as JoinNode is not meant to be run directly """ raise NotImplementedError( '`JoinNode` is not meant to be run directly, it is meant to be used in `BaseNode` subclasses to indicate a transition to v2-style steps.'
) ``` # `pydantic_graph.beta.node` Core node types for graph construction and execution. This module defines the fundamental node types used to build execution graphs, including start/end nodes and fork nodes for parallel execution. ### StateT ```python StateT = TypeVar('StateT', infer_variance=True) ``` Type variable for graph state. ### OutputT ```python OutputT = TypeVar('OutputT', infer_variance=True) ``` Type variable for node output data. ### InputT ```python InputT = TypeVar('InputT', infer_variance=True) ``` Type variable for node input data. ### StartNode Bases: `Generic[OutputT]` Entry point node for graph execution. The StartNode represents the beginning of a graph execution flow. Source code in `pydantic_graph/pydantic_graph/beta/node.py` ```python class StartNode(Generic[OutputT]): """Entry point node for graph execution. The StartNode represents the beginning of a graph execution flow. """ id = NodeID('__start__') """Fixed identifier for the start node.""" ``` #### id ```python id = NodeID('__start__') ``` Fixed identifier for the start node. ### EndNode Bases: `Generic[InputT]` Terminal node representing the completion of graph execution. The EndNode marks the successful completion of a graph execution flow and can collect the final output data. Source code in `pydantic_graph/pydantic_graph/beta/node.py` ```python class EndNode(Generic[InputT]): """Terminal node representing the completion of graph execution. The EndNode marks the successful completion of a graph execution flow and can collect the final output data. """ id = NodeID('__end__') """Fixed identifier for the end node.""" def _force_variance(self, inputs: InputT) -> None: # pragma: no cover """Force type variance for proper generic typing. This method exists solely for type checking purposes and should never be called. Args: inputs: Input data of type InputT. Raises: RuntimeError: Always, as this method should never be executed. """ raise RuntimeError('This method should never be called, it is just defined for typing purposes.') ``` #### id ```python id = NodeID('__end__') ``` Fixed identifier for the end node. ### Fork Bases: `Generic[InputT, OutputT]` Fork node that creates parallel execution branches. A Fork node splits the execution flow into multiple parallel branches, enabling concurrent execution of downstream nodes. It can either map a sequence across multiple branches or duplicate data to each branch. Source code in `pydantic_graph/pydantic_graph/beta/node.py` ```python @dataclass class Fork(Generic[InputT, OutputT]): """Fork node that creates parallel execution branches. A Fork node splits the execution flow into multiple parallel branches, enabling concurrent execution of downstream nodes. It can either map a sequence across multiple branches or duplicate data to each branch. """ id: ForkID """Unique identifier for this fork node.""" is_map: bool """Determines fork behavior. If True, InputT must be Sequence[OutputT] and each element is sent to a separate branch. If False, InputT must be OutputT and the same data is sent to all branches. """ downstream_join_id: JoinID | None """Optional identifier of a downstream join node that should be jumped to if mapping an empty iterable.""" def _force_variance(self, inputs: InputT) -> OutputT: # pragma: no cover """Force type variance for proper generic typing. This method exists solely for type checking purposes and should never be called. Args: inputs: Input data to be forked. Returns: Output data type (never actually returned). 
Raises: RuntimeError: Always, as this method should never be executed. """ raise RuntimeError('This method should never be called, it is just defined for typing purposes.') ``` #### id ```python id: ForkID ``` Unique identifier for this fork node. #### is_map ```python is_map: bool ``` Determines fork behavior. If True, InputT must be Sequence[OutputT] and each element is sent to a separate branch. If False, InputT must be OutputT and the same data is sent to all branches. #### downstream_join_id ```python downstream_join_id: JoinID | None ``` Optional identifier of a downstream join node that should be jumped to if mapping an empty iterable. # `pydantic_graph.beta.step` Step-based graph execution components. This module provides the core abstractions for step-based graph execution, including step contexts, step functions, and step nodes that bridge between the v1 and v2 graph execution systems. ### StepContext Bases: `Generic[StateT, DepsT, InputT]` Context information passed to step functions during graph execution. The step context provides access to the current graph state, dependencies, and input data for a step. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | -------- | -------------------- | ---------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `InputT` | | The type of the input data | *required* | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python @dataclass(init=False) class StepContext(Generic[StateT, DepsT, InputT]): """Context information passed to step functions during graph execution. The step context provides access to the current graph state, dependencies, and input data for a step. Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies InputT: The type of the input data """ _state: StateT """The current graph state.""" _deps: DepsT """The graph run dependencies.""" _inputs: InputT """The input data for this step.""" def __init__(self, *, state: StateT, deps: DepsT, inputs: InputT): self._state = state self._deps = deps self._inputs = inputs @property def state(self) -> StateT: return self._state @property def deps(self) -> DepsT: return self._deps @property def inputs(self) -> InputT: """The input data for this step. This must be a property to ensure correct variance behavior """ return self._inputs ``` #### inputs ```python inputs: InputT ``` The input data for this step. This must be a property to ensure correct variance behavior ### StepFunction Bases: `Protocol[StateT, DepsT, InputT, OutputT]` Protocol for step functions that can be executed in the graph. Step functions are async callables that receive a step context and return a result. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | --------- | -------------------- | ---------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `InputT` | | The type of the input data | *required* | | `OutputT` | | The type of the output data | *required* | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python class StepFunction(Protocol[StateT, DepsT, InputT, OutputT]): """Protocol for step functions that can be executed in the graph. Step functions are async callables that receive a step context and return a result. 
Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies InputT: The type of the input data OutputT: The type of the output data """ def __call__(self, ctx: StepContext[StateT, DepsT, InputT]) -> Awaitable[OutputT]: """Execute the step function with the given context. Args: ctx: The step context containing state, dependencies, and inputs Returns: An awaitable that resolves to the step's output """ raise NotImplementedError ``` #### __call__ ```python __call__( ctx: StepContext[StateT, DepsT, InputT], ) -> Awaitable[OutputT] ``` Execute the step function with the given context. Parameters: | Name | Type | Description | Default | | ----- | ------------------------------------ | ----------------------------------------------------------- | ---------- | | `ctx` | `StepContext[StateT, DepsT, InputT]` | The step context containing state, dependencies, and inputs | *required* | Returns: | Type | Description | | -------------------- | ----------------------------------------------- | | `Awaitable[OutputT]` | An awaitable that resolves to the step's output | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python def __call__(self, ctx: StepContext[StateT, DepsT, InputT]) -> Awaitable[OutputT]: """Execute the step function with the given context. Args: ctx: The step context containing state, dependencies, and inputs Returns: An awaitable that resolves to the step's output """ raise NotImplementedError ``` ### StreamFunction Bases: `Protocol[StateT, DepsT, InputT, OutputT]` Protocol for stream functions that can be executed in the graph. Stream functions are async callables that receive a step context and return an async iterator. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | --------- | -------------------- | ---------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `InputT` | | The type of the input data | *required* | | `OutputT` | | The type of the output data | *required* | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python class StreamFunction(Protocol[StateT, DepsT, InputT, OutputT]): """Protocol for stream functions that can be executed in the graph. Stream functions are async callables that receive a step context and return an async iterator. Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies InputT: The type of the input data OutputT: The type of the output data """ def __call__(self, ctx: StepContext[StateT, DepsT, InputT]) -> AsyncIterator[OutputT]: """Execute the stream function with the given context. Args: ctx: The step context containing state, dependencies, and inputs Returns: An async iterator yielding the streamed output """ raise NotImplementedError yield ``` #### __call__ ```python __call__( ctx: StepContext[StateT, DepsT, InputT], ) -> AsyncIterator[OutputT] ``` Execute the stream function with the given context. 
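Concretely, any async generator function over a `StepContext` has this shape. A minimal sketch, with the import path assumed from the source paths shown above:

```python
from collections.abc import AsyncIterator

from pydantic_graph.beta.step import StepContext  # assumed import path


async def count_up(ctx: StepContext[None, None, int]) -> AsyncIterator[int]:
    """A stream function: yields one value at a time instead of returning a single output."""
    for i in range(ctx.inputs):
        yield i
```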
Parameters: | Name | Type | Description | Default | | ----- | ------------------------------------ | ----------------------------------------------------------- | ---------- | | `ctx` | `StepContext[StateT, DepsT, InputT]` | The step context containing state, dependencies, and inputs | *required* | Returns: | Type | Description | | ------------------------ | ---------------------------------------------- | | `AsyncIterator[OutputT]` | An async iterator yielding the streamed output | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python def __call__(self, ctx: StepContext[StateT, DepsT, InputT]) -> AsyncIterator[OutputT]: """Execute the stream function with the given context. Args: ctx: The step context containing state, dependencies, and inputs Returns: An async iterator yielding the streamed output """ raise NotImplementedError yield ``` ### AnyStepFunction ```python AnyStepFunction = StepFunction[Any, Any, Any, Any] ``` Type alias for a step function with any type parameters. ### Step Bases: `Generic[StateT, DepsT, InputT, OutputT]` A step in the graph execution that wraps a step function. Steps represent individual units of execution in the graph, encapsulating a step function along with metadata like ID and label. Class Type Parameters: | Name | Bound or Constraints | Description | Default | | --------- | -------------------- | ---------------------------- | ---------- | | `StateT` | | The type of the graph state | *required* | | `DepsT` | | The type of the dependencies | *required* | | `InputT` | | The type of the input data | *required* | | `OutputT` | | The type of the output data | *required* | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python @dataclass(init=False) class Step(Generic[StateT, DepsT, InputT, OutputT]): """A step in the graph execution that wraps a step function. Steps represent individual units of execution in the graph, encapsulating a step function along with metadata like ID and label. Type Parameters: StateT: The type of the graph state DepsT: The type of the dependencies InputT: The type of the input data OutputT: The type of the output data """ id: NodeID """Unique identifier for this step.""" _call: StepFunction[StateT, DepsT, InputT, OutputT] """The step function to execute.""" label: str | None """Optional human-readable label for this step.""" def __init__(self, *, id: NodeID, call: StepFunction[StateT, DepsT, InputT, OutputT], label: str | None = None): self.id = id self._call = call self.label = label @property def call(self) -> StepFunction[StateT, DepsT, InputT, OutputT]: """The step function to execute. This needs to be a property for proper variance inference.""" return self._call @overload def as_node(self, inputs: None = None) -> StepNode[StateT, DepsT]: ... @overload def as_node(self, inputs: InputT) -> StepNode[StateT, DepsT]: ... def as_node(self, inputs: InputT | None = None) -> StepNode[StateT, DepsT]: """Create a step node with bound inputs. Args: inputs: The input data to bind to this step, or None Returns: A [`StepNode`][pydantic_graph.beta.step.StepNode] with this step and the bound inputs """ return StepNode(self, inputs) ``` #### id ```python id: NodeID = id ``` Unique identifier for this step. #### label ```python label: str | None = label ``` Optional human-readable label for this step. #### call ```python call: StepFunction[StateT, DepsT, InputT, OutputT] ``` The step function to execute. This needs to be a property for proper variance inference. 
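To tie `StepContext`, the step function, and `Step` together, here is a minimal sketch of wrapping an async function as a step. The import path and the use of a plain string as the node ID are assumptions for illustration; in most applications steps are created through the graph builder rather than constructed directly:

```python
from dataclasses import dataclass

from pydantic_graph.beta.step import Step, StepContext  # assumed import path


@dataclass
class CounterState:
    total: int = 0


async def add_to_total(ctx: StepContext[CounterState, None, int]) -> int:
    """A step function: reads the bound inputs and updates the graph state."""
    ctx.state.total += ctx.inputs
    return ctx.state.total


# Wrap the function in a Step, giving it an ID and an optional human-readable label.
add_step = Step(id='add_to_total', call=add_to_total, label='Add inputs to the running total')

# Binding inputs produces a StepNode with this step and the bound inputs.
add_node = add_step.as_node(5)
```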
#### as_node ```python as_node(inputs: None = None) -> StepNode[StateT, DepsT] ``` ```python as_node(inputs: InputT) -> StepNode[StateT, DepsT] ``` ```python as_node( inputs: InputT | None = None, ) -> StepNode[StateT, DepsT] ``` Create a step node with bound inputs. Parameters: | Name | Type | Description | Default | | -------- | -------- | ----------- | -------------------------------------------- | | `inputs` | \`InputT | None\` | The input data to bind to this step, or None | Returns: | Type | Description | | ------------------------- | ---------------------------------------------- | | `StepNode[StateT, DepsT]` | A StepNode with this step and the bound inputs | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python def as_node(self, inputs: InputT | None = None) -> StepNode[StateT, DepsT]: """Create a step node with bound inputs. Args: inputs: The input data to bind to this step, or None Returns: A [`StepNode`][pydantic_graph.beta.step.StepNode] with this step and the bound inputs """ return StepNode(self, inputs) ``` ### StepNode Bases: `BaseNode[StateT, DepsT, Any]` A base node that represents a step with bound inputs. StepNode bridges between the v1 and v2 graph execution systems by wrapping a Step with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps. Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python @dataclass class StepNode(BaseNode[StateT, DepsT, Any]): """A base node that represents a step with bound inputs. StepNode bridges between the v1 and v2 graph execution systems by wrapping a [`Step`][pydantic_graph.beta.step.Step] with bound inputs in a BaseNode interface. It is not meant to be run directly but rather used to indicate transitions to v2-style steps. """ step: Step[StateT, DepsT, Any, Any] """The step to execute.""" inputs: Any """The inputs bound to this step.""" async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any]: """Attempt to run the step node. Args: ctx: The graph execution context Returns: The result of step execution Raises: NotImplementedError: Always raised as StepNode is not meant to be run directly """ raise NotImplementedError( '`StepNode` is not meant to be run directly, it is meant to be used in `BaseNode` subclasses to indicate a transition to v2-style steps.' ) ``` #### step ```python step: Step[StateT, DepsT, Any, Any] ``` The step to execute. #### inputs ```python inputs: Any ``` The inputs bound to this step. #### run ```python run( ctx: GraphRunContext[StateT, DepsT], ) -> BaseNode[StateT, DepsT, Any] | End[Any] ``` Attempt to run the step node. Parameters: | Name | Type | Description | Default | | ----- | -------------------------------- | --------------------------- | ---------- | | `ctx` | `GraphRunContext[StateT, DepsT]` | The graph execution context | *required* | Returns: | Type | Description | | ------------------------------ | ----------- | | \`BaseNode[StateT, DepsT, Any] | End[Any]\` | Raises: | Type | Description | | --------------------- | --------------------------------------------------------- | | `NotImplementedError` | Always raised as StepNode is not meant to be run directly | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[Any]: """Attempt to run the step node. 
Args: ctx: The graph execution context Returns: The result of step execution Raises: NotImplementedError: Always raised as StepNode is not meant to be run directly """ raise NotImplementedError( '`StepNode` is not meant to be run directly, it is meant to be used in `BaseNode` subclasses to indicate a transition to v2-style steps.' ) ``` ### NodeStep Bases: `Step[StateT, DepsT, Any, BaseNode[StateT, DepsT, Any] | End[Any]]` A step that wraps a BaseNode type for execution. NodeStep allows v1-style BaseNode classes to be used as steps in the v2 graph execution system. It validates that the input is of the expected node type and runs it with the appropriate graph context. Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python class NodeStep(Step[StateT, DepsT, Any, BaseNode[StateT, DepsT, Any] | End[Any]]): """A step that wraps a BaseNode type for execution. NodeStep allows v1-style BaseNode classes to be used as steps in the v2 graph execution system. It validates that the input is of the expected node type and runs it with the appropriate graph context. """ node_type: type[BaseNode[StateT, DepsT, Any]] """The BaseNode type this step executes.""" def __init__( self, node_type: type[BaseNode[StateT, DepsT, Any]], *, id: NodeID | None = None, label: str | None = None, ): """Initialize a node step. Args: node_type: The BaseNode class this step will execute id: Optional unique identifier, defaults to the node's get_node_id() label: Optional human-readable label for this step """ super().__init__( id=id or NodeID(node_type.get_node_id()), call=self._call_node, label=label, ) # `type[BaseNode[StateT, DepsT, Any]]` could actually be a `typing._GenericAlias` like `pydantic_ai._agent_graph.UserPromptNode[~DepsT, ~OutputT]`, # so we get the origin to get to the actual class self.node_type = get_origin(node_type) or node_type async def _call_node(self, ctx: StepContext[StateT, DepsT, Any]) -> BaseNode[StateT, DepsT, Any] | End[Any]: """Execute the wrapped node with the step context. Args: ctx: The step context containing the node instance to run Returns: The result of running the node, either another BaseNode or End Raises: ValueError: If the input node is not of the expected type """ node = ctx.inputs if not isinstance(node, self.node_type): raise ValueError(f'Node {node} is not of type {self.node_type}') # pragma: no cover node = cast(BaseNode[StateT, DepsT, Any], node) return await node.run(GraphRunContext(state=ctx.state, deps=ctx.deps)) ``` #### __init__ ```python __init__( node_type: type[BaseNode[StateT, DepsT, Any]], *, id: NodeID | None = None, label: str | None = None ) ``` Initialize a node step. Parameters: | Name | Type | Description | Default | | ----------- | ------------------------------------ | ----------------------------------------- | ---------------------------------------------------------------- | | `node_type` | `type[BaseNode[StateT, DepsT, Any]]` | The BaseNode class this step will execute | *required* | | `id` | \`NodeID | None\` | Optional unique identifier, defaults to the node's get_node_id() | | `label` | \`str | None\` | Optional human-readable label for this step | Source code in `pydantic_graph/pydantic_graph/beta/step.py` ```python def __init__( self, node_type: type[BaseNode[StateT, DepsT, Any]], *, id: NodeID | None = None, label: str | None = None, ): """Initialize a node step. 
Args: node_type: The BaseNode class this step will execute id: Optional unique identifier, defaults to the node's get_node_id() label: Optional human-readable label for this step """ super().__init__( id=id or NodeID(node_type.get_node_id()), call=self._call_node, label=label, ) # `type[BaseNode[StateT, DepsT, Any]]` could actually be a `typing._GenericAlias` like `pydantic_ai._agent_graph.UserPromptNode[~DepsT, ~OutputT]`, # so we get the origin to get to the actual class self.node_type = get_origin(node_type) or node_type ``` #### node_type ```python node_type: type[BaseNode[StateT, DepsT, Any]] = ( get_origin(node_type) or node_type ) ``` The BaseNode type this step executes. # `pydantic_graph.exceptions` ### GraphSetupError Bases: `TypeError` Error caused by an incorrectly configured graph. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphSetupError(TypeError): """Error caused by an incorrectly configured graph.""" message: str """Description of the mistake.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` Description of the mistake. ### GraphBuildingError Bases: `ValueError` An error raised during graph-building. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphBuildingError(ValueError): """An error raised during graph-building.""" message: str """The error message.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` The error message. ### GraphValidationError Bases: `ValueError` An error raised during graph validation. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphValidationError(ValueError): """An error raised during graph validation.""" message: str """The error message.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` The error message. ### GraphRuntimeError Bases: `RuntimeError` Error caused by an issue during graph execution. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphRuntimeError(RuntimeError): """Error caused by an issue during graph execution.""" message: str """The error message.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` The error message. ### GraphNodeStatusError Bases: `GraphRuntimeError` Error caused by trying to run a node that already has status `'running'`, `'success'`, or `'error'`. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphNodeStatusError(GraphRuntimeError): """Error caused by trying to run a node that already has status `'running'`, `'success'`, or `'error'`.""" def __init__(self, actual_status: 'SnapshotStatus'): self.actual_status = actual_status super().__init__(f"Incorrect snapshot status {actual_status!r}, must be 'created' or 'pending'.") @classmethod def check(cls, status: 'SnapshotStatus') -> None: """Check if the status is valid.""" if status not in {'created', 'pending'}: raise cls(status) ``` #### check ```python check(status: SnapshotStatus) -> None ``` Check if the status is valid. 
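For example, a status of `'created'` or `'pending'` passes the check, while any other status raises the error:

```python
from pydantic_graph.exceptions import GraphNodeStatusError

GraphNodeStatusError.check('created')  # no error: 'created' and 'pending' may still be run

try:
    GraphNodeStatusError.check('running')
except GraphNodeStatusError as exc:
    print(exc)
    #> Incorrect snapshot status 'running', must be 'created' or 'pending'.
```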
Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python @classmethod def check(cls, status: 'SnapshotStatus') -> None: """Check if the status is valid.""" if status not in {'created', 'pending'}: raise cls(status) ``` # `pydantic_graph` ### Graph Bases: `Generic[StateT, DepsT, RunEndT]` Definition of a graph. In `pydantic-graph`, a graph is a collection of nodes that can be run in sequence. The nodes define their outgoing edges — e.g. which nodes may be run next, and thereby the structure of the graph. Here's a very simple example of a graph which increments a number by 1, but makes sure the number is never 42 at the end. never_42.py ```py from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MyState: number: int @dataclass class Increment(BaseNode[MyState]): async def run(self, ctx: GraphRunContext) -> Check42: ctx.state.number += 1 return Check42() @dataclass class Check42(BaseNode[MyState, None, int]): async def run(self, ctx: GraphRunContext) -> Increment | End[int]: if ctx.state.number == 42: return Increment() else: return End(ctx.state.number) never_42_graph = Graph(nodes=(Increment, Check42)) ``` *(This example is complete, it can be run "as is")* See run For an example of running graph, and mermaid_code for an example of generating a mermaid diagram from the graph. Source code in `pydantic_graph/pydantic_graph/graph.py` ````python @dataclass(init=False) class Graph(Generic[StateT, DepsT, RunEndT]): """Definition of a graph. In `pydantic-graph`, a graph is a collection of nodes that can be run in sequence. The nodes define their outgoing edges — e.g. which nodes may be run next, and thereby the structure of the graph. Here's a very simple example of a graph which increments a number by 1, but makes sure the number is never 42 at the end. ```py {title="never_42.py" noqa="I001"} from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MyState: number: int @dataclass class Increment(BaseNode[MyState]): async def run(self, ctx: GraphRunContext) -> Check42: ctx.state.number += 1 return Check42() @dataclass class Check42(BaseNode[MyState, None, int]): async def run(self, ctx: GraphRunContext) -> Increment | End[int]: if ctx.state.number == 42: return Increment() else: return End(ctx.state.number) never_42_graph = Graph(nodes=(Increment, Check42)) ``` _(This example is complete, it can be run "as is")_ See [`run`][pydantic_graph.graph.Graph.run] For an example of running graph, and [`mermaid_code`][pydantic_graph.graph.Graph.mermaid_code] for an example of generating a mermaid diagram from the graph. """ name: str | None node_defs: dict[str, NodeDef[StateT, DepsT, RunEndT]] _state_type: type[StateT] | _utils.Unset = field(repr=False) _run_end_type: type[RunEndT] | _utils.Unset = field(repr=False) auto_instrument: bool = field(repr=False) def __init__( self, *, nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]], name: str | None = None, state_type: type[StateT] | _utils.Unset = _utils.UNSET, run_end_type: type[RunEndT] | _utils.Unset = _utils.UNSET, auto_instrument: bool = True, ): """Create a graph from a sequence of nodes. Args: nodes: The nodes which make up the graph, nodes need to be unique and all be generic in the same state type. name: Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. 
state_type: The type of the state for the graph, this can generally be inferred from `nodes`. run_end_type: The type of the result of running the graph, this can generally be inferred from `nodes`. auto_instrument: Whether to create a span for the graph run and the execution of each node's run method. """ self.name = name self._state_type = state_type self._run_end_type = run_end_type self.auto_instrument = auto_instrument parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) self.node_defs = {} for node in nodes: self._register_node(node, parent_namespace) self._validate_edges() async def run( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Run the graph from a starting node until it ends. Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. Returns: A `GraphRunResult` containing information about the run, including its final result. Here's an example of running the graph from [above][pydantic_graph.graph.Graph]: ```py {title="run_never_42.py" noqa="I001" requires="never_42.py"} from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=2) state = MyState(41) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=43) ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) async with self.iter( start_node, state=state, deps=deps, persistence=persistence, infer_name=False ) as graph_run: async for _node in graph_run: pass result = graph_run.result assert result is not None, 'GraphRun should have a result' return result def run_sync( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Synchronously run the graph. This is a convenience method that wraps [`self.run`][pydantic_graph.graph.Graph.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. Returns: The result type from ending the run and the history of the run. 
""" if infer_name and self.name is None: # pragma: no branch self._infer_name(inspect.currentframe()) return _utils.get_event_loop().run_until_complete( self.run(start_node, state=state, deps=deps, persistence=persistence, infer_name=False) ) @asynccontextmanager async def iter( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager which can be used to iterate over the graph's nodes as they are executed. This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds. The `GraphRun` can also be used to manually drive the graph execution by calling [`GraphRun.next`][pydantic_graph.graph.GraphRun.next]. The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed. For more details, see the API documentation of [`GraphRun`][pydantic_graph.graph.GraphRun]. Args: start_node: the first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) if persistence is None: persistence = SimpleStatePersistence() persistence.set_graph_types(self) with ExitStack() as stack: entered_span: AbstractSpan | None = None if span is None: if self.auto_instrument: # pragma: no branch # Separate variable because we actually don't want logfire's f-string magic here, # we want the span_name to be preformatted for other backends # as requested in https://github.com/pydantic/pydantic-ai/issues/3173. span_name = f'run graph {self.name}' entered_span = stack.enter_context(logfire_span(span_name, graph=self)) else: entered_span = stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=start_node, persistence=persistence, state=state, deps=deps, traceparent=traceparent, ) @asynccontextmanager async def iter_from_persistence( self, persistence: BaseStatePersistence[StateT, RunEndT], *, deps: DepsT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object. This method has similar functionality to [`iter`][pydantic_graph.graph.Graph.iter], but instead of passing the node to run, it will restore the node and state from state persistence. Args: persistence: The state persistence interface to use. deps: The dependencies of the graph. 
span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) persistence.set_graph_types(self) snapshot = await persistence.load_next() if snapshot is None: raise exceptions.GraphRuntimeError('Unable to restore snapshot from state persistence.') snapshot.node.set_snapshot_id(snapshot.id) if self.auto_instrument and span is None: # pragma: no branch span = logfire_span('run graph {graph.name}', graph=self) with ExitStack() as stack: entered_span = None if span is None else stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=snapshot.node, persistence=persistence, state=snapshot.state, deps=deps, snapshot_id=snapshot.id, traceparent=traceparent, ) async def initialize( self, node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], *, state: StateT = None, infer_name: bool = True, ) -> None: """Initialize a new graph run in persistence without running it. This is useful if you want to set up a graph run to be run later, e.g. via [`iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence]. Args: node: The node to run first. persistence: State persistence interface. state: The start state of the graph. infer_name: Whether to infer the graph name from the calling frame. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) persistence.set_graph_types(self) await persistence.snapshot_node(state, node) def mermaid_code( self, *, start_node: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, title: str | None | typing_extensions.Literal[False] = None, edge_labels: bool = True, notes: bool = True, highlighted_nodes: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, highlight_css: str = mermaid.DEFAULT_HIGHLIGHT_CSS, infer_name: bool = True, direction: mermaid.StateDiagramDirection | None = None, ) -> str: """Generate a diagram representing the graph as [mermaid](https://mermaid.js.org/) diagram. This method calls [`pydantic_graph.mermaid.generate_code`][pydantic_graph.mermaid.generate_code]. Args: start_node: The node or nodes which can start the graph. title: The title of the diagram, use `False` to not include a title. edge_labels: Whether to include edge labels. notes: Whether to include notes on each node. highlighted_nodes: Optional node or nodes to highlight. highlight_css: The CSS to use for highlighting nodes. infer_name: Whether to infer the graph name from the calling frame. direction: The direction of flow. Returns: The mermaid code for the graph, which can then be rendered as a diagram. 
Here's an example of generating a diagram for the graph from [above][pydantic_graph.graph.Graph]: ```py {title="mermaid_never_42.py" requires="never_42.py"} from never_42 import Increment, never_42_graph print(never_42_graph.mermaid_code(start_node=Increment)) ''' --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ''' ``` The rendered diagram will look like this: ```mermaid --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if title is None and self.name: title = self.name return mermaid.generate_code( self, start_node=start_node, highlighted_nodes=highlighted_nodes, highlight_css=highlight_css, title=title or None, edge_labels=edge_labels, notes=notes, direction=direction, ) def mermaid_image( self, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> bytes: """Generate a diagram representing the graph as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.request_image`. Returns: The image bytes. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name return mermaid.request_image(self, **kwargs) def mermaid_save( self, path: Path | str, /, *, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> None: """Generate a diagram representing the graph and save it as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: path: The path to save the image to. infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.save_image`. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name mermaid.save_image(path, self, **kwargs) def get_nodes(self) -> Sequence[type[BaseNode[StateT, DepsT, RunEndT]]]: """Get the nodes in the graph.""" return [node_def.node for node_def in self.node_defs.values()] @cached_property def inferred_types(self) -> tuple[type[StateT], type[RunEndT]]: # Get the types of the state and run end from the graph. 
if _utils.is_set(self._state_type) and _utils.is_set(self._run_end_type): return self._state_type, self._run_end_type state_type = self._state_type run_end_type = self._run_end_type for node_def in self.node_defs.values(): for base in typing_extensions.get_original_bases(node_def.node): if typing_extensions.get_origin(base) is BaseNode: args = typing_extensions.get_args(base) if not _utils.is_set(state_type) and args: state_type = args[0] if not _utils.is_set(run_end_type) and len(args) == 3: t = args[2] if not typing_objects.is_never(t): run_end_type = t if _utils.is_set(state_type) and _utils.is_set(run_end_type): return state_type, run_end_type # pyright: ignore[reportReturnType] # break the inner (bases) loop break if not _utils.is_set(state_type): # pragma: no branch # state defaults to None, so use that if we can't infer it state_type = None if not _utils.is_set(run_end_type): # this happens if a graph has no return nodes, use None so any downstream errors are clear run_end_type = None return state_type, run_end_type # pyright: ignore[reportReturnType] def _register_node( self, node: type[BaseNode[StateT, DepsT, RunEndT]], parent_namespace: dict[str, Any] | None, ) -> None: node_id = node.get_node_id() if existing_node := self.node_defs.get(node_id): raise exceptions.GraphSetupError( f'Node ID `{node_id}` is not unique — found on {existing_node.node} and {node}' ) else: self.node_defs[node_id] = node.get_node_def(parent_namespace) def _validate_edges(self): known_node_ids = self.node_defs.keys() bad_edges: dict[str, list[str]] = {} for node_id, node_def in self.node_defs.items(): for edge in node_def.next_node_edges.keys(): if edge not in known_node_ids: bad_edges.setdefault(edge, []).append(f'`{node_id}`') if bad_edges: bad_edges_list = [f'`{k}` is referenced by {_utils.comma_and(v)}' for k, v in bad_edges.items()] if len(bad_edges_list) == 1: raise exceptions.GraphSetupError(f'{bad_edges_list[0]} but not included in the graph.') else: b = '\n'.join(f' {be}' for be in bad_edges_list) raise exceptions.GraphSetupError( f'Nodes are referenced in the graph but not included in the graph:\n{b}' ) def _infer_name(self, function_frame: types.FrameType | None) -> None: """Infer the agent name from the call frame. Usage should be `self._infer_name(inspect.currentframe())`. Copied from `Agent`. """ assert self.name is None, 'Name already set' if function_frame is not None and (parent_frame := function_frame.f_back): # pragma: no branch for name, item in parent_frame.f_locals.items(): if item is self: self.name = name return if parent_frame.f_locals != parent_frame.f_globals: # pragma: no branch # if we couldn't find the agent in locals and globals are a different dict, try globals for name, item in parent_frame.f_globals.items(): # pragma: no branch if item is self: self.name = name return ```` #### __init__ ```python __init__( *, nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]], name: str | None = None, state_type: type[StateT] | Unset = UNSET, run_end_type: type[RunEndT] | Unset = UNSET, auto_instrument: bool = True ) ``` Create a graph from a sequence of nodes. 
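For instance, building on the `never_42.py` nodes above, the state and run-end types can be passed explicitly instead of being inferred from the nodes (a short sketch; in the earlier example they are inferred automatically):

```python
from never_42 import Check42, Increment, MyState

from pydantic_graph import Graph

explicit_graph = Graph(
    nodes=(Increment, Check42),
    name='never_42_graph',  # otherwise inferred from the calling frame on first use
    state_type=MyState,
    run_end_type=int,
)
```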
Parameters: | Name | Type | Description | Default | | ----------------- | -------------------------------------------------- | ----------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- | | `nodes` | `Sequence[type[BaseNode[StateT, DepsT, RunEndT]]]` | The nodes which make up the graph, nodes need to be unique and all be generic in the same state type. | *required* | | `name` | \`str | None\` | Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. | | `state_type` | \`type[StateT] | Unset\` | The type of the state for the graph, this can generally be inferred from nodes. | | `run_end_type` | \`type[RunEndT] | Unset\` | The type of the result of running the graph, this can generally be inferred from nodes. | | `auto_instrument` | `bool` | Whether to create a span for the graph run and the execution of each node's run method. | `True` | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def __init__( self, *, nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]], name: str | None = None, state_type: type[StateT] | _utils.Unset = _utils.UNSET, run_end_type: type[RunEndT] | _utils.Unset = _utils.UNSET, auto_instrument: bool = True, ): """Create a graph from a sequence of nodes. Args: nodes: The nodes which make up the graph, nodes need to be unique and all be generic in the same state type. name: Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. state_type: The type of the state for the graph, this can generally be inferred from `nodes`. run_end_type: The type of the result of running the graph, this can generally be inferred from `nodes`. auto_instrument: Whether to create a span for the graph run and the execution of each node's run method. """ self.name = name self._state_type = state_type self._run_end_type = run_end_type self.auto_instrument = auto_instrument parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) self.node_defs = {} for node in nodes: self._register_node(node, parent_namespace) self._validate_edges() ``` #### run ```python run( start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: ( BaseStatePersistence[StateT, RunEndT] | None ) = None, infer_name: bool = True ) -> GraphRunResult[StateT, RunEndT] ``` Run the graph from a starting node until it ends. Parameters: | Name | Type | Description | Default | | ------------- | --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. | *required* | | `state` | `StateT` | The initial state of the graph. | `None` | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `persistence` | \`BaseStatePersistence[StateT, RunEndT] | None\` | State persistence interface, defaults to SimpleStatePersistence if None. | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. 
| `True` | Returns: | Type | Description | | --------------------------------- | ---------------------------------------------------------------------------------- | | `GraphRunResult[StateT, RunEndT]` | A GraphRunResult containing information about the run, including its final result. | Here's an example of running the graph from above: run_never_42.py ```py from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=2) state = MyState(41) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=43) ``` Source code in `pydantic_graph/pydantic_graph/graph.py` ````python async def run( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Run the graph from a starting node until it ends. Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. Returns: A `GraphRunResult` containing information about the run, including its final result. Here's an example of running the graph from [above][pydantic_graph.graph.Graph]: ```py {title="run_never_42.py" noqa="I001" requires="never_42.py"} from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=2) state = MyState(41) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=43) ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) async with self.iter( start_node, state=state, deps=deps, persistence=persistence, infer_name=False ) as graph_run: async for _node in graph_run: pass result = graph_run.result assert result is not None, 'GraphRun should have a result' return result ```` #### run_sync ```python run_sync( start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: ( BaseStatePersistence[StateT, RunEndT] | None ) = None, infer_name: bool = True ) -> GraphRunResult[StateT, RunEndT] ``` Synchronously run the graph. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Parameters: | Name | Type | Description | Default | | ------------- | --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. | *required* | | `state` | `StateT` | The initial state of the graph. | `None` | | `deps` | `DepsT` | The dependencies of the graph. 
| `None` | | `persistence` | \`BaseStatePersistence[StateT, RunEndT] | None\` | State persistence interface, defaults to SimpleStatePersistence if None. | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: | Type | Description | | --------------------------------- | --------------------------------------------------------------- | | `GraphRunResult[StateT, RunEndT]` | The result type from ending the run and the history of the run. | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def run_sync( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Synchronously run the graph. This is a convenience method that wraps [`self.run`][pydantic_graph.graph.Graph.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. Returns: The result type from ending the run and the history of the run. """ if infer_name and self.name is None: # pragma: no branch self._infer_name(inspect.currentframe()) return _utils.get_event_loop().run_until_complete( self.run(start_node, state=state, deps=deps, persistence=persistence, infer_name=False) ) ``` #### iter ```python iter( start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: ( BaseStatePersistence[StateT, RunEndT] | None ) = None, span: ( AbstractContextManager[AbstractSpan] | None ) = None, infer_name: bool = True ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]] ``` A contextmanager which can be used to iterate over the graph's nodes as they are executed. This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds. The `GraphRun` can also be used to manually drive the graph execution by calling GraphRun.next. The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed. For more details, see the API documentation of GraphRun. Parameters: | Name | Type | Description | Default | | ------------- | --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------- | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | the first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. | *required* | | `state` | `StateT` | The initial state of the graph. | `None` | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `persistence` | \`BaseStatePersistence[StateT, RunEndT] | None\` | State persistence interface, defaults to SimpleStatePersistence if None. 
| | `span` | \`AbstractContextManager[AbstractSpan] | None\` | The span to use for the graph run. If not provided, a new span will be created. | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: A GraphRun that can be async iterated over to drive the graph to completion. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python @asynccontextmanager async def iter( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager which can be used to iterate over the graph's nodes as they are executed. This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds. The `GraphRun` can also be used to manually drive the graph execution by calling [`GraphRun.next`][pydantic_graph.graph.GraphRun.next]. The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed. For more details, see the API documentation of [`GraphRun`][pydantic_graph.graph.GraphRun]. Args: start_node: the first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) if persistence is None: persistence = SimpleStatePersistence() persistence.set_graph_types(self) with ExitStack() as stack: entered_span: AbstractSpan | None = None if span is None: if self.auto_instrument: # pragma: no branch # Separate variable because we actually don't want logfire's f-string magic here, # we want the span_name to be preformatted for other backends # as requested in https://github.com/pydantic/pydantic-ai/issues/3173. span_name = f'run graph {self.name}' entered_span = stack.enter_context(logfire_span(span_name, graph=self)) else: entered_span = stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=start_node, persistence=persistence, state=state, deps=deps, traceparent=traceparent, ) ``` #### iter_from_persistence ```python iter_from_persistence( persistence: BaseStatePersistence[StateT, RunEndT], *, deps: DepsT = None, span: ( AbstractContextManager[AbstractSpan] | None ) = None, infer_name: bool = True ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]] ``` A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object. This method has similar functionality to iter, but instead of passing the node to run, it will restore the node and state from state persistence. 
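For example, here is a minimal sketch of pairing `initialize` with `iter_from_persistence` to set up a run and resume it later. It reuses the `never_42.py` example from above and assumes a file-backed persistence implementation such as `FileStatePersistence` from `pydantic_graph.persistence.file`, which is not covered in this section:

```python
from pathlib import Path

from pydantic_graph.persistence.file import FileStatePersistence  # assumed import path

from never_42 import Increment, MyState, never_42_graph


async def main():
    persistence = FileStatePersistence(Path('never_42_run.json'))

    # Record the starting node and state without executing anything yet.
    await never_42_graph.initialize(Increment(), persistence, state=MyState(1))

    # Later, possibly in another process, restore the node and state and drive the run.
    async with never_42_graph.iter_from_persistence(persistence) as graph_run:
        async for node in graph_run:
            print(node)
```
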
Parameters: | Name | Type | Description | Default | | ------------- | --------------------------------------- | ------------------------------------------------------- | ------------------------------------------------------------------------------- | | `persistence` | `BaseStatePersistence[StateT, RunEndT]` | The state persistence interface to use. | *required* | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `span` | \`AbstractContextManager[AbstractSpan] | None\` | The span to use for the graph run. If not provided, a new span will be created. | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: A GraphRun that can be async iterated over to drive the graph to completion. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python @asynccontextmanager async def iter_from_persistence( self, persistence: BaseStatePersistence[StateT, RunEndT], *, deps: DepsT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object. This method has similar functionality to [`iter`][pydantic_graph.graph.Graph.iter], but instead of passing the node to run, it will restore the node and state from state persistence. Args: persistence: The state persistence interface to use. deps: The dependencies of the graph. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) persistence.set_graph_types(self) snapshot = await persistence.load_next() if snapshot is None: raise exceptions.GraphRuntimeError('Unable to restore snapshot from state persistence.') snapshot.node.set_snapshot_id(snapshot.id) if self.auto_instrument and span is None: # pragma: no branch span = logfire_span('run graph {graph.name}', graph=self) with ExitStack() as stack: entered_span = None if span is None else stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=snapshot.node, persistence=persistence, state=snapshot.state, deps=deps, snapshot_id=snapshot.id, traceparent=traceparent, ) ``` #### initialize ```python initialize( node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], *, state: StateT = None, infer_name: bool = True ) -> None ``` Initialize a new graph run in persistence without running it. This is useful if you want to set up a graph run to be run later, e.g. via iter_from_persistence. Parameters: | Name | Type | Description | Default | | ------------- | --------------------------------------- | ------------------------------------------------------- | ---------- | | `node` | `BaseNode[StateT, DepsT, RunEndT]` | The node to run first. | *required* | | `persistence` | `BaseStatePersistence[StateT, RunEndT]` | State persistence interface. | *required* | | `state` | `StateT` | The start state of the graph. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. 
| `True` | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python async def initialize( self, node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], *, state: StateT = None, infer_name: bool = True, ) -> None: """Initialize a new graph run in persistence without running it. This is useful if you want to set up a graph run to be run later, e.g. via [`iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence]. Args: node: The node to run first. persistence: State persistence interface. state: The start state of the graph. infer_name: Whether to infer the graph name from the calling frame. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) persistence.set_graph_types(self) await persistence.snapshot_node(state, node) ``` #### mermaid_code ```python mermaid_code( *, start_node: ( Sequence[NodeIdent] | NodeIdent | None ) = None, title: str | None | Literal[False] = None, edge_labels: bool = True, notes: bool = True, highlighted_nodes: ( Sequence[NodeIdent] | NodeIdent | None ) = None, highlight_css: str = DEFAULT_HIGHLIGHT_CSS, infer_name: bool = True, direction: StateDiagramDirection | None = None ) -> str ``` Generate a diagram representing the graph as [mermaid](https://mermaid.js.org/) diagram. This method calls pydantic_graph.mermaid.generate_code. Parameters: | Name | Type | Description | Default | | ------------------- | ----------------------- | ------------------------------------------------------- | ----------------------- | | `start_node` | \`Sequence[NodeIdent] | NodeIdent | None\` | | `title` | \`str | None | Literal[False]\` | | `edge_labels` | `bool` | Whether to include edge labels. | `True` | | `notes` | `bool` | Whether to include notes on each node. | `True` | | `highlighted_nodes` | \`Sequence[NodeIdent] | NodeIdent | None\` | | `highlight_css` | `str` | The CSS to use for highlighting nodes. | `DEFAULT_HIGHLIGHT_CSS` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | | `direction` | \`StateDiagramDirection | None\` | The direction of flow. | Returns: | Type | Description | | ----- | ------------------------------------------------------------------------ | | `str` | The mermaid code for the graph, which can then be rendered as a diagram. | Here's an example of generating a diagram for the graph from above: mermaid_never_42.py ```py from never_42 import Increment, never_42_graph print(never_42_graph.mermaid_code(start_node=Increment)) ''' --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ''' ``` The rendered diagram will look like this: ``` --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ``` Source code in `pydantic_graph/pydantic_graph/graph.py` ````python def mermaid_code( self, *, start_node: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, title: str | None | typing_extensions.Literal[False] = None, edge_labels: bool = True, notes: bool = True, highlighted_nodes: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, highlight_css: str = mermaid.DEFAULT_HIGHLIGHT_CSS, infer_name: bool = True, direction: mermaid.StateDiagramDirection | None = None, ) -> str: """Generate a diagram representing the graph as [mermaid](https://mermaid.js.org/) diagram. This method calls [`pydantic_graph.mermaid.generate_code`][pydantic_graph.mermaid.generate_code]. 
Args: start_node: The node or nodes which can start the graph. title: The title of the diagram, use `False` to not include a title. edge_labels: Whether to include edge labels. notes: Whether to include notes on each node. highlighted_nodes: Optional node or nodes to highlight. highlight_css: The CSS to use for highlighting nodes. infer_name: Whether to infer the graph name from the calling frame. direction: The direction of flow. Returns: The mermaid code for the graph, which can then be rendered as a diagram. Here's an example of generating a diagram for the graph from [above][pydantic_graph.graph.Graph]: ```py {title="mermaid_never_42.py" requires="never_42.py"} from never_42 import Increment, never_42_graph print(never_42_graph.mermaid_code(start_node=Increment)) ''' --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ''' ``` The rendered diagram will look like this: ```mermaid --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if title is None and self.name: title = self.name return mermaid.generate_code( self, start_node=start_node, highlighted_nodes=highlighted_nodes, highlight_css=highlight_css, title=title or None, edge_labels=edge_labels, notes=notes, direction=direction, ) ```` #### mermaid_image ```python mermaid_image( infer_name: bool = True, **kwargs: Unpack[MermaidConfig] ) -> bytes ``` Generate a diagram representing the graph as an image. The format and diagram can be customized using `kwargs`, see pydantic_graph.mermaid.MermaidConfig. Uses external service This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Parameters: | Name | Type | Description | Default | | ------------ | ----------------------- | ------------------------------------------------------- | ------- | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | | `**kwargs` | `Unpack[MermaidConfig]` | Additional arguments to pass to mermaid.request_image. | `{}` | Returns: | Type | Description | | ------- | ---------------- | | `bytes` | The image bytes. | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def mermaid_image( self, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> bytes: """Generate a diagram representing the graph as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.request_image`. Returns: The image bytes. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name return mermaid.request_image(self, **kwargs) ``` #### mermaid_save ```python mermaid_save( path: Path | str, /, *, infer_name: bool = True, **kwargs: Unpack[MermaidConfig], ) -> None ``` Generate a diagram representing the graph and save it as an image. 
The format and diagram can be customized using `kwargs`, see pydantic_graph.mermaid.MermaidConfig. Uses external service This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Parameters: | Name | Type | Description | Default | | ------------ | ----------------------- | ------------------------------------------------------- | ------------------------------ | | `path` | \`Path | str\` | The path to save the image to. | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | | `**kwargs` | `Unpack[MermaidConfig]` | Additional arguments to pass to mermaid.save_image. | `{}` | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def mermaid_save( self, path: Path | str, /, *, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> None: """Generate a diagram representing the graph and save it as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: path: The path to save the image to. infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.save_image`. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name mermaid.save_image(path, self, **kwargs) ``` #### get_nodes ```python get_nodes() -> ( Sequence[type[BaseNode[StateT, DepsT, RunEndT]]] ) ``` Get the nodes in the graph. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def get_nodes(self) -> Sequence[type[BaseNode[StateT, DepsT, RunEndT]]]: """Get the nodes in the graph.""" return [node_def.node for node_def in self.node_defs.values()] ``` ### GraphRun Bases: `Generic[StateT, DepsT, RunEndT]` A stateful, async-iterable run of a Graph. You typically get a `GraphRun` instance from calling `async with [my_graph.iter(...)][pydantic_graph.graph.Graph.iter] as graph_run:`. That gives you the ability to iterate through nodes as they run, either by `async for` iteration or by repeatedly calling `.next(...)`. 
Here's an example of iterating over the graph from above: iter_never_42.py ```py from copy import deepcopy from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=1)), (Increment(), MyState(number=1)), (Check42(), MyState(number=2)), (End(data=2), MyState(number=2)), ] ''' state = MyState(41) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=41)), (Increment(), MyState(number=41)), (Check42(), MyState(number=42)), (Increment(), MyState(number=42)), (Check42(), MyState(number=43)), (End(data=43), MyState(number=43)), ] ''' ``` See the GraphRun.next documentation for an example of how to manually drive the graph run. Source code in `pydantic_graph/pydantic_graph/graph.py` ````python class GraphRun(Generic[StateT, DepsT, RunEndT]): """A stateful, async-iterable run of a [`Graph`][pydantic_graph.graph.Graph]. You typically get a `GraphRun` instance from calling `async with [my_graph.iter(...)][pydantic_graph.graph.Graph.iter] as graph_run:`. That gives you the ability to iterate through nodes as they run, either by `async for` iteration or by repeatedly calling `.next(...)`. Here's an example of iterating over the graph from [above][pydantic_graph.graph.Graph]: ```py {title="iter_never_42.py" noqa="I001" requires="never_42.py"} from copy import deepcopy from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=1)), (Increment(), MyState(number=1)), (Check42(), MyState(number=2)), (End(data=2), MyState(number=2)), ] ''' state = MyState(41) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=41)), (Increment(), MyState(number=41)), (Check42(), MyState(number=42)), (Increment(), MyState(number=42)), (Check42(), MyState(number=43)), (End(data=43), MyState(number=43)), ] ''' ``` See the [`GraphRun.next` documentation][pydantic_graph.graph.GraphRun.next] for an example of how to manually drive the graph run. """ def __init__( self, *, graph: Graph[StateT, DepsT, RunEndT], start_node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], state: StateT, deps: DepsT, traceparent: str | None, snapshot_id: str | None = None, ): """Create a new run for a given graph, starting at the specified node. Typically, you'll use [`Graph.iter`][pydantic_graph.graph.Graph.iter] rather than calling this directly. Args: graph: The [`Graph`][pydantic_graph.graph.Graph] to run. start_node: The node where execution will begin. persistence: State persistence interface. state: A shared state object or primitive (like a counter, dataclass, etc.) 
that is available to all nodes via `ctx.state`. deps: Optional dependencies that each node can access via `ctx.deps`, e.g. database connections, configuration, or logging clients. traceparent: The traceparent for the span used for the graph run. snapshot_id: The ID of the snapshot the node came from. """ self.graph = graph self.persistence = persistence self._snapshot_id: str | None = snapshot_id self.state = state self.deps = deps self.__traceparent = traceparent self._next_node: BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] = start_node self._is_started: bool = False @overload def _traceparent(self, *, required: typing_extensions.Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: if self.__traceparent is None and required: # pragma: no cover raise exceptions.GraphRuntimeError('No span was created for this graph run') return self.__traceparent @property def next_node(self) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """The next node that will be run in the graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. """ return self._next_node @property def result(self) -> GraphRunResult[StateT, RunEndT] | None: """The final result of the graph run if the run is completed, otherwise `None`.""" if not isinstance(self._next_node, End): return None # The GraphRun has not finished running return GraphRunResult[StateT, RunEndT]( self._next_node.data, state=self.state, persistence=self.persistence, traceparent=self._traceparent(required=False), ) async def next( self, node: BaseNode[StateT, DepsT, RunEndT] | None = None ) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Manually drive the graph run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an [`End`][pydantic_graph.nodes.End] node. Here's an example of using `next` to drive the graph from [above][pydantic_graph.graph.Graph]: ```py {title="next_never_42.py" noqa="I001" requires="never_42.py"} from copy import deepcopy from pydantic_graph import End from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(48) async with never_42_graph.iter(Increment(), state=state) as graph_run: next_node = graph_run.next_node # start with the first node node_states = [(next_node, deepcopy(graph_run.state))] while not isinstance(next_node, End): if graph_run.state.number == 50: graph_run.state.number = 42 next_node = await graph_run.next(next_node) node_states.append((next_node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=48)), (Check42(), MyState(number=49)), (End(data=49), MyState(number=49)), ] ''' ``` Args: node: The node to run next in the graph. If not specified, uses `self.next_node`, which is initialized to the `start_node` of the run and updated each time a new node is returned. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ if node is None: # This cast is necessary because self._next_node could be an `End`. You'll get a runtime error if that's # the case, but if it is, the only way to get there would be to have tried calling next manually after # the run finished. Either way, maybe it would be better to not do this cast... 
node = cast(BaseNode[StateT, DepsT, RunEndT], self._next_node) node_snapshot_id = node.get_snapshot_id() else: node_snapshot_id = node.get_snapshot_id() if node_snapshot_id != self._snapshot_id: await self.persistence.snapshot_node_if_new(node_snapshot_id, self.state, node) self._snapshot_id = node_snapshot_id if not isinstance(node, BaseNode): # While technically this is not compatible with the documented method signature, it's an easy mistake to # make, and we should eagerly provide a more helpful error message than you'd get otherwise. raise TypeError(f'`next` must be called with a `BaseNode` instance, got {node!r}.') node_id = node.get_node_id() if node_id not in self.graph.node_defs: raise exceptions.GraphRuntimeError(f'Node `{node}` is not in the graph.') with ExitStack() as stack: if self.graph.auto_instrument: # pragma: no branch # Separate variable because we actually don't want logfire's f-string magic here, # we want the span_name to be preformatted for other backends # as requested in https://github.com/pydantic/pydantic-ai/issues/3173. span_name = f'run node {node_id}' stack.enter_context(logfire_span(span_name, node_id=node_id, node=node)) async with self.persistence.record_run(node_snapshot_id): ctx = GraphRunContext(state=self.state, deps=self.deps) self._next_node = await node.run(ctx) if isinstance(self._next_node, End): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_end(self.state, self._next_node) elif isinstance(self._next_node, BaseNode): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_node(self.state, self._next_node) else: raise exceptions.GraphRuntimeError( f'Invalid node return type: `{type(self._next_node).__name__}`. Expected `BaseNode` or `End`.' ) return self._next_node def __aiter__(self) -> AsyncIterator[BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]]: return self async def __anext__(self) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Use the last returned node as the input to `Graph.next`.""" if not self._is_started: self._is_started = True return self._next_node if isinstance(self._next_node, End): raise StopAsyncIteration return await self.next(self._next_node) def __repr__(self) -> str: return f'' ```` #### __init__ ```python __init__( *, graph: Graph[StateT, DepsT, RunEndT], start_node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], state: StateT, deps: DepsT, traceparent: str | None, snapshot_id: str | None = None ) ``` Create a new run for a given graph, starting at the specified node. Typically, you'll use Graph.iter rather than calling this directly. Parameters: | Name | Type | Description | Default | | ------------- | --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------- | | `graph` | `Graph[StateT, DepsT, RunEndT]` | The Graph to run. | *required* | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | The node where execution will begin. | *required* | | `persistence` | `BaseStatePersistence[StateT, RunEndT]` | State persistence interface. | *required* | | `state` | `StateT` | A shared state object or primitive (like a counter, dataclass, etc.) that is available to all nodes via ctx.state. | *required* | | `deps` | `DepsT` | Optional dependencies that each node can access via ctx.deps, e.g. database connections, configuration, or logging clients. 
| *required* | | `traceparent` | \`str | None\` | The traceparent for the span used for the graph run. | | `snapshot_id` | \`str | None\` | The ID of the snapshot the node came from. | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def __init__( self, *, graph: Graph[StateT, DepsT, RunEndT], start_node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], state: StateT, deps: DepsT, traceparent: str | None, snapshot_id: str | None = None, ): """Create a new run for a given graph, starting at the specified node. Typically, you'll use [`Graph.iter`][pydantic_graph.graph.Graph.iter] rather than calling this directly. Args: graph: The [`Graph`][pydantic_graph.graph.Graph] to run. start_node: The node where execution will begin. persistence: State persistence interface. state: A shared state object or primitive (like a counter, dataclass, etc.) that is available to all nodes via `ctx.state`. deps: Optional dependencies that each node can access via `ctx.deps`, e.g. database connections, configuration, or logging clients. traceparent: The traceparent for the span used for the graph run. snapshot_id: The ID of the snapshot the node came from. """ self.graph = graph self.persistence = persistence self._snapshot_id: str | None = snapshot_id self.state = state self.deps = deps self.__traceparent = traceparent self._next_node: BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] = start_node self._is_started: bool = False ``` #### next_node ```python next_node: BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] ``` The next node that will be run in the graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. #### result ```python result: GraphRunResult[StateT, RunEndT] | None ``` The final result of the graph run if the run is completed, otherwise `None`. #### next ```python next( node: BaseNode[StateT, DepsT, RunEndT] | None = None, ) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] ``` Manually drive the graph run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an End node. Here's an example of using `next` to drive the graph from above: next_never_42.py ```py from copy import deepcopy from pydantic_graph import End from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(48) async with never_42_graph.iter(Increment(), state=state) as graph_run: next_node = graph_run.next_node # start with the first node node_states = [(next_node, deepcopy(graph_run.state))] while not isinstance(next_node, End): if graph_run.state.number == 50: graph_run.state.number = 42 next_node = await graph_run.next(next_node) node_states.append((next_node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=48)), (Check42(), MyState(number=49)), (End(data=49), MyState(number=49)), ] ''' ``` Parameters: | Name | Type | Description | Default | | ------ | ---------------------------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `node` | \`BaseNode[StateT, DepsT, RunEndT] | None\` | The node to run next in the graph. If not specified, uses self.next_node, which is initialized to the start_node of the run and updated each time a new node is returned. 
| Returns: | Type | Description | | ---------------------------------- | -------------- | | \`BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]\` | | \`BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]\` | Source code in `pydantic_graph/pydantic_graph/graph.py` ````python async def next( self, node: BaseNode[StateT, DepsT, RunEndT] | None = None ) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Manually drive the graph run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an [`End`][pydantic_graph.nodes.End] node. Here's an example of using `next` to drive the graph from [above][pydantic_graph.graph.Graph]: ```py {title="next_never_42.py" noqa="I001" requires="never_42.py"} from copy import deepcopy from pydantic_graph import End from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(48) async with never_42_graph.iter(Increment(), state=state) as graph_run: next_node = graph_run.next_node # start with the first node node_states = [(next_node, deepcopy(graph_run.state))] while not isinstance(next_node, End): if graph_run.state.number == 50: graph_run.state.number = 42 next_node = await graph_run.next(next_node) node_states.append((next_node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=48)), (Check42(), MyState(number=49)), (End(data=49), MyState(number=49)), ] ''' ``` Args: node: The node to run next in the graph. If not specified, uses `self.next_node`, which is initialized to the `start_node` of the run and updated each time a new node is returned. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ if node is None: # This cast is necessary because self._next_node could be an `End`. You'll get a runtime error if that's # the case, but if it is, the only way to get there would be to have tried calling next manually after # the run finished. Either way, maybe it would be better to not do this cast... node = cast(BaseNode[StateT, DepsT, RunEndT], self._next_node) node_snapshot_id = node.get_snapshot_id() else: node_snapshot_id = node.get_snapshot_id() if node_snapshot_id != self._snapshot_id: await self.persistence.snapshot_node_if_new(node_snapshot_id, self.state, node) self._snapshot_id = node_snapshot_id if not isinstance(node, BaseNode): # While technically this is not compatible with the documented method signature, it's an easy mistake to # make, and we should eagerly provide a more helpful error message than you'd get otherwise. raise TypeError(f'`next` must be called with a `BaseNode` instance, got {node!r}.') node_id = node.get_node_id() if node_id not in self.graph.node_defs: raise exceptions.GraphRuntimeError(f'Node `{node}` is not in the graph.') with ExitStack() as stack: if self.graph.auto_instrument: # pragma: no branch # Separate variable because we actually don't want logfire's f-string magic here, # we want the span_name to be preformatted for other backends # as requested in https://github.com/pydantic/pydantic-ai/issues/3173. 
span_name = f'run node {node_id}' stack.enter_context(logfire_span(span_name, node_id=node_id, node=node)) async with self.persistence.record_run(node_snapshot_id): ctx = GraphRunContext(state=self.state, deps=self.deps) self._next_node = await node.run(ctx) if isinstance(self._next_node, End): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_end(self.state, self._next_node) elif isinstance(self._next_node, BaseNode): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_node(self.state, self._next_node) else: raise exceptions.GraphRuntimeError( f'Invalid node return type: `{type(self._next_node).__name__}`. Expected `BaseNode` or `End`.' ) return self._next_node ```` #### __anext__ ```python __anext__() -> ( BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] ) ``` Use the last returned node as the input to `Graph.next`. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python async def __anext__(self) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Use the last returned node as the input to `Graph.next`.""" if not self._is_started: self._is_started = True return self._next_node if isinstance(self._next_node, End): raise StopAsyncIteration return await self.next(self._next_node) ``` ### GraphRunResult Bases: `Generic[StateT, RunEndT]` The final result of running a graph. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python @dataclass(init=False) class GraphRunResult(Generic[StateT, RunEndT]): """The final result of running a graph.""" output: RunEndT state: StateT persistence: BaseStatePersistence[StateT, RunEndT] = field(repr=False) def __init__( self, output: RunEndT, state: StateT, persistence: BaseStatePersistence[StateT, RunEndT], traceparent: str | None = None, ): self.output = output self.state = state self.persistence = persistence self.__traceparent = traceparent @overload def _traceparent(self, *, required: typing_extensions.Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: # pragma: no cover if self.__traceparent is None and required: raise exceptions.GraphRuntimeError('No span was created for this graph run.') return self.__traceparent ``` # `pydantic_graph.mermaid` ### DEFAULT_HIGHLIGHT_CSS ```python DEFAULT_HIGHLIGHT_CSS = 'fill:#fdff32' ``` The default CSS to use for highlighting nodes. ### StateDiagramDirection ```python StateDiagramDirection = Literal['TB', 'LR', 'RL', 'BT'] ``` Used to specify the direction of the state diagram generated by mermaid. - `'TB'`: Top to bottom, this is the default for mermaid charts. - `'LR'`: Left to right - `'RL'`: Right to left - `'BT'`: Bottom to top ### generate_code ```python generate_code( graph: Graph[Any, Any, Any], /, *, start_node: ( Sequence[NodeIdent] | NodeIdent | None ) = None, highlighted_nodes: ( Sequence[NodeIdent] | NodeIdent | None ) = None, highlight_css: str = DEFAULT_HIGHLIGHT_CSS, title: str | None = None, edge_labels: bool = True, notes: bool = True, direction: StateDiagramDirection | None, ) -> str ``` Generate [Mermaid state diagram](https://mermaid.js.org/syntax/stateDiagram.html) code for a graph. Parameters: | Name | Type | Description | Default | | ------------------- | ----------------------- | ---------------------------------------------- | ------------------------- | | `graph` | `Graph[Any, Any, Any]` | The graph to generate the image for. 
| *required* | | `start_node` | \`Sequence[NodeIdent] | NodeIdent | None\` | | `highlighted_nodes` | \`Sequence[NodeIdent] | NodeIdent | None\` | | `highlight_css` | `str` | CSS to use for highlighting nodes. | `DEFAULT_HIGHLIGHT_CSS` | | `title` | \`str | None\` | The title of the diagram. | | `edge_labels` | `bool` | Whether to include edge labels in the diagram. | `True` | | `notes` | `bool` | Whether to include notes in the diagram. | `True` | | `direction` | \`StateDiagramDirection | None\` | The direction of flow. | Returns: | Type | Description | | ----- | ------------------------------- | | `str` | The Mermaid code for the graph. | Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python def generate_code( # noqa: C901 graph: Graph[Any, Any, Any], /, *, start_node: Sequence[NodeIdent] | NodeIdent | None = None, highlighted_nodes: Sequence[NodeIdent] | NodeIdent | None = None, highlight_css: str = DEFAULT_HIGHLIGHT_CSS, title: str | None = None, edge_labels: bool = True, notes: bool = True, direction: StateDiagramDirection | None, ) -> str: """Generate [Mermaid state diagram](https://mermaid.js.org/syntax/stateDiagram.html) code for a graph. Args: graph: The graph to generate the image for. start_node: Identifiers of nodes that start the graph. highlighted_nodes: Identifiers of nodes to highlight. highlight_css: CSS to use for highlighting nodes. title: The title of the diagram. edge_labels: Whether to include edge labels in the diagram. notes: Whether to include notes in the diagram. direction: The direction of flow. Returns: The Mermaid code for the graph. """ start_node_ids = set(_node_ids(start_node or ())) for node_id in start_node_ids: if node_id not in graph.node_defs: raise LookupError(f'Start node "{node_id}" is not in the graph.') lines: list[str] = [] if title: lines = ['---', f'title: {title}', '---'] lines.append('stateDiagram-v2') if direction is not None: lines.append(f' direction {direction}') for node_id, node_def in graph.node_defs.items(): # we use round brackets (rounded box) for nodes other than the start and end if node_id in start_node_ids: lines.append(f' [*] --> {node_id}') if node_def.returns_base_node: for next_node_id in graph.node_defs: lines.append(f' {node_id} --> {next_node_id}') else: for next_node_id, edge in node_def.next_node_edges.items(): line = f' {node_id} --> {next_node_id}' if edge_labels and edge.label: line += f': {edge.label}' lines.append(line) if end_edge := node_def.end_edge: line = f' {node_id} --> [*]' if edge_labels and end_edge.label: line += f': {end_edge.label}' lines.append(line) if notes and node_def.note: lines.append(f' note right of {node_id}') # mermaid doesn't like multiple paragraphs in a note, and shows if so clean_docs = re.sub('\n{2,}', '\n', node_def.note) lines.append(indent(clean_docs, ' ')) lines.append(' end note') if highlighted_nodes: lines.append('') lines.append(f'classDef highlighted {highlight_css}') for node_id in _node_ids(highlighted_nodes): if node_id not in graph.node_defs: raise LookupError(f'Highlighted node "{node_id}" is not in the graph.') lines.append(f'class {node_id} highlighted') return '\n'.join(lines) ``` ### request_image ```python request_image( graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> bytes ``` Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink). 
Parameters: | Name | Type | Description | Default | | ---------- | ----------------------- | ------------------------------------------------------------ | ---------- | | `graph` | `Graph[Any, Any, Any]` | The graph to generate the image for. | *required* | | `**kwargs` | `Unpack[MermaidConfig]` | Additional parameters to configure mermaid chart generation. | `{}` | Returns: | Type | Description | | ------- | --------------- | | `bytes` | The image data. | Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python def request_image( graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> bytes: """Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink). Args: graph: The graph to generate the image for. **kwargs: Additional parameters to configure mermaid chart generation. Returns: The image data. """ code = generate_code( graph, start_node=kwargs.get('start_node'), highlighted_nodes=kwargs.get('highlighted_nodes'), highlight_css=kwargs.get('highlight_css', DEFAULT_HIGHLIGHT_CSS), title=kwargs.get('title'), edge_labels=kwargs.get('edge_labels', True), notes=kwargs.get('notes', True), direction=kwargs.get('direction'), ) code_base64 = base64.b64encode(code.encode()).decode() params: dict[str, str | float] = {} if kwargs.get('image_type') == 'pdf': url = f'https://mermaid.ink/pdf/{code_base64}' if kwargs.get('pdf_fit'): params['fit'] = '' if kwargs.get('pdf_landscape'): params['landscape'] = '' if pdf_paper := kwargs.get('pdf_paper'): params['paper'] = pdf_paper elif kwargs.get('image_type') == 'svg': url = f'https://mermaid.ink/svg/{code_base64}' else: url = f'https://mermaid.ink/img/{code_base64}' if image_type := kwargs.get('image_type'): params['type'] = image_type if background_color := kwargs.get('background_color'): params['bgColor'] = background_color if theme := kwargs.get('theme'): params['theme'] = theme if width := kwargs.get('width'): params['width'] = width if height := kwargs.get('height'): params['height'] = height if scale := kwargs.get('scale'): params['scale'] = scale httpx_client = kwargs.get('httpx_client') or httpx.Client() response = httpx_client.get(url, params=params) if not response.is_success: raise httpx.HTTPStatusError( f'{response.status_code} error generating image:\n{response.text}', request=response.request, response=response, ) return response.content ``` ### save_image ```python save_image( path: Path | str, graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> None ``` Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink) and save it to a local file. Parameters: | Name | Type | Description | Default | | ---------- | ----------------------- | ------------------------------------------------------------ | ------------------------------ | | `path` | \`Path | str\` | The path to save the image to. | | `graph` | `Graph[Any, Any, Any]` | The graph to generate the image for. | *required* | | `**kwargs` | `Unpack[MermaidConfig]` | Additional parameters to configure mermaid chart generation. | `{}` | Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python def save_image( path: Path | str, graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> None: """Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink) and save it to a local file. Args: path: The path to save the image to. graph: The graph to generate the image for. **kwargs: Additional parameters to configure mermaid chart generation. 
""" if isinstance(path, str): path = Path(path) if 'image_type' not in kwargs: ext = path.suffix.lower()[1:] # no need to check for .jpeg/.jpg, as it is the default if ext in ('png', 'webp', 'svg', 'pdf'): kwargs['image_type'] = ext image_data = request_image(graph, **kwargs) path.write_bytes(image_data) ``` ### MermaidConfig Bases: `TypedDict` Parameters to configure mermaid chart generation. Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python class MermaidConfig(TypedDict, total=False): """Parameters to configure mermaid chart generation.""" start_node: Sequence[NodeIdent] | NodeIdent """Identifiers of nodes that start the graph.""" highlighted_nodes: Sequence[NodeIdent] | NodeIdent """Identifiers of nodes to highlight.""" highlight_css: str """CSS to use for highlighting nodes.""" title: str | None """The title of the diagram.""" edge_labels: bool """Whether to include edge labels in the diagram.""" notes: bool """Whether to include notes on nodes in the diagram, defaults to true.""" image_type: Literal['jpeg', 'png', 'webp', 'svg', 'pdf'] """The image type to generate. If unspecified, the default behavior is `'jpeg'`.""" pdf_fit: bool """When using image_type='pdf', whether to fit the diagram to the PDF page.""" pdf_landscape: bool """When using image_type='pdf', whether to use landscape orientation for the PDF. This has no effect if using `pdf_fit`. """ pdf_paper: Literal['letter', 'legal', 'tabloid', 'ledger', 'a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6'] """When using image_type='pdf', the paper size of the PDF.""" background_color: str """The background color of the diagram. If None, the default transparent background is used. The color value is interpreted as a hexadecimal color code by default (and should not have a leading '#'), but you can also use named colors by prefixing the value with `'!'`. For example, valid choices include `background_color='!white'` or `background_color='FF0000'`. """ theme: Literal['default', 'neutral', 'dark', 'forest'] """The theme of the diagram. Defaults to 'default'.""" width: int """The width of the diagram.""" height: int """The height of the diagram.""" scale: Annotated[float, Ge(1), Le(3)] """The scale of the diagram. The scale must be a number between 1 and 3, and you can only set a scale if one or both of width and height are set. """ httpx_client: httpx.Client """An HTTPX client to use for requests, mostly for testing purposes.""" direction: StateDiagramDirection """The direction of the state diagram.""" ``` #### start_node ```python start_node: Sequence[NodeIdent] | NodeIdent ``` Identifiers of nodes that start the graph. #### highlighted_nodes ```python highlighted_nodes: Sequence[NodeIdent] | NodeIdent ``` Identifiers of nodes to highlight. #### highlight_css ```python highlight_css: str ``` CSS to use for highlighting nodes. #### title ```python title: str | None ``` The title of the diagram. #### edge_labels ```python edge_labels: bool ``` Whether to include edge labels in the diagram. #### notes ```python notes: bool ``` Whether to include notes on nodes in the diagram, defaults to true. #### image_type ```python image_type: Literal['jpeg', 'png', 'webp', 'svg', 'pdf'] ``` The image type to generate. If unspecified, the default behavior is `'jpeg'`. #### pdf_fit ```python pdf_fit: bool ``` When using image_type='pdf', whether to fit the diagram to the PDF page. #### pdf_landscape ```python pdf_landscape: bool ``` When using image_type='pdf', whether to use landscape orientation for the PDF. 
This has no effect if using `pdf_fit`. #### pdf_paper ```python pdf_paper: Literal[ "letter", "legal", "tabloid", "ledger", "a0", "a1", "a2", "a3", "a4", "a5", "a6", ] ``` When using image_type='pdf', the paper size of the PDF. #### background_color ```python background_color: str ``` The background color of the diagram. If None, the default transparent background is used. The color value is interpreted as a hexadecimal color code by default (and should not have a leading '#'), but you can also use named colors by prefixing the value with `'!'`. For example, valid choices include `background_color='!white'` or `background_color='FF0000'`. #### theme ```python theme: Literal['default', 'neutral', 'dark', 'forest'] ``` The theme of the diagram. Defaults to 'default'. #### width ```python width: int ``` The width of the diagram. #### height ```python height: int ``` The height of the diagram. #### scale ```python scale: Annotated[float, Ge(1), Le(3)] ``` The scale of the diagram. The scale must be a number between 1 and 3, and you can only set a scale if one or both of width and height are set. #### httpx_client ```python httpx_client: Client ``` An HTTPX client to use for requests, mostly for testing purposes. #### direction ```python direction: StateDiagramDirection ``` The direction of the state diagram. ### NodeIdent ```python NodeIdent: TypeAlias = ( "type[BaseNode[Any, Any, Any]] | BaseNode[Any, Any, Any] | str" ) ``` A type alias for a node identifier. This can be: - A node instance (instance of a subclass of BaseNode). - A node class (subclass of BaseNode). - A string representing the node ID. # `pydantic_graph.nodes` ### StateT ```python StateT = TypeVar('StateT', default=None) ``` Type variable for the state in a graph. ### GraphRunContext Bases: `Generic[StateT, DepsT]` Context for a graph. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @dataclass(kw_only=True) class GraphRunContext(Generic[StateT, DepsT]): """Context for a graph.""" state: StateT """The state of the graph.""" deps: DepsT """Dependencies for the graph.""" ``` #### state ```python state: StateT ``` The state of the graph. #### deps ```python deps: DepsT ``` Dependencies for the graph. ### BaseNode Bases: `ABC`, `Generic[StateT, DepsT, NodeRunEndT]` Base class for a node. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python class BaseNode(ABC, Generic[StateT, DepsT, NodeRunEndT]): """Base class for a node.""" docstring_notes: ClassVar[bool] = False """Set to `True` to generate mermaid diagram notes from the class's docstring. While this can add valuable information to the diagram, it can make diagrams harder to view, hence it is disabled by default. You can also customise notes overriding the [`get_note`][pydantic_graph.nodes.BaseNode.get_note] method. """ @abstractmethod async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT]: """Run the node. This is an abstract method that must be implemented by subclasses. !!! note "Return types used at runtime" The return type of this method are read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](mermaid.md) and enforced when running the graph. Args: ctx: The graph context. Returns: The next node to run or [`End`][pydantic_graph.nodes.End] to signal the end of the graph. """ ... 
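    # The snapshot ID is generated lazily from the node ID on first access and cached on the
    # instance; state persistence uses it to match this node to its stored snapshot.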
def get_snapshot_id(self) -> str: if snapshot_id := getattr(self, '__snapshot_id', None): return snapshot_id else: snapshot_id = generate_snapshot_id(self.get_node_id()) object.__setattr__(self, '__snapshot_id', snapshot_id) return snapshot_id def set_snapshot_id(self, snapshot_id: str) -> None: object.__setattr__(self, '__snapshot_id', snapshot_id) @classmethod @cache def get_node_id(cls) -> str: """Get the ID of the node.""" return cls.__name__ @classmethod def get_note(cls) -> str | None: """Get a note about the node to render on mermaid charts. By default, this returns a note only if [`docstring_notes`][pydantic_graph.nodes.BaseNode.docstring_notes] is `True`. You can override this method to customise the node notes. """ if not cls.docstring_notes: return None docstring = cls.__doc__ # dataclasses get an automatic docstring which is just their signature, we don't want that if docstring and is_dataclass(cls) and docstring.startswith(f'{cls.__name__}('): docstring = None # pragma: no cover if docstring: # pragma: no branch # remove indentation from docstring import inspect docstring = inspect.cleandoc(docstring) return docstring @classmethod def get_node_def(cls, local_ns: dict[str, Any] | None) -> NodeDef[StateT, DepsT, NodeRunEndT]: """Get the node definition.""" type_hints = get_type_hints(cls.run, localns=local_ns, include_extras=True) try: return_hint = type_hints['return'] except KeyError as e: raise exceptions.GraphSetupError(f'Node {cls} is missing a return type hint on its `run` method') from e next_node_edges: dict[str, Edge] = {} end_edge: Edge | None = None returns_base_node: bool = False for return_type in _utils.get_union_args(return_hint): return_type, annotations = _utils.unpack_annotated(return_type) edge = next((a for a in annotations if isinstance(a, Edge)), Edge(None)) return_type_origin = get_origin(return_type) or return_type if return_type_origin is End: end_edge = edge elif return_type_origin is BaseNode: returns_base_node = True elif issubclass(return_type_origin, BaseNode): next_node_edges[return_type.get_node_id()] = edge else: raise exceptions.GraphSetupError(f'Invalid return type: {return_type}') return NodeDef( node=cls, node_id=cls.get_node_id(), note=cls.get_note(), next_node_edges=next_node_edges, end_edge=end_edge, returns_base_node=returns_base_node, ) def deep_copy(self) -> Self: """Returns a deep copy of the node.""" return copy.deepcopy(self) ``` #### docstring_notes ```python docstring_notes: bool = False ``` Set to `True` to generate mermaid diagram notes from the class's docstring. While this can add valuable information to the diagram, it can make diagrams harder to view, hence it is disabled by default. You can also customise notes overriding the get_note method. #### run ```python run( ctx: GraphRunContext[StateT, DepsT], ) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT] ``` Run the node. This is an abstract method that must be implemented by subclasses. Return types used at runtime The return type of this method are read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](https://ai.pydantic.dev/api/pydantic_graph/mermaid/index.md) and enforced when running the graph. Parameters: | Name | Type | Description | Default | | ----- | -------------------------------- | ------------------ | ---------- | | `ctx` | `GraphRunContext[StateT, DepsT]` | The graph context. 
| *required* | Returns: | Type | Description | | ------------------------------ | ------------------ | | \`BaseNode[StateT, DepsT, Any] | End[NodeRunEndT]\` | Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @abstractmethod async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT]: """Run the node. This is an abstract method that must be implemented by subclasses. !!! note "Return types used at runtime" The return type of this method are read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](mermaid.md) and enforced when running the graph. Args: ctx: The graph context. Returns: The next node to run or [`End`][pydantic_graph.nodes.End] to signal the end of the graph. """ ... ``` #### get_node_id ```python get_node_id() -> str ``` Get the ID of the node. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @classmethod @cache def get_node_id(cls) -> str: """Get the ID of the node.""" return cls.__name__ ``` #### get_note ```python get_note() -> str | None ``` Get a note about the node to render on mermaid charts. By default, this returns a note only if docstring_notes is `True`. You can override this method to customise the node notes. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @classmethod def get_note(cls) -> str | None: """Get a note about the node to render on mermaid charts. By default, this returns a note only if [`docstring_notes`][pydantic_graph.nodes.BaseNode.docstring_notes] is `True`. You can override this method to customise the node notes. """ if not cls.docstring_notes: return None docstring = cls.__doc__ # dataclasses get an automatic docstring which is just their signature, we don't want that if docstring and is_dataclass(cls) and docstring.startswith(f'{cls.__name__}('): docstring = None # pragma: no cover if docstring: # pragma: no branch # remove indentation from docstring import inspect docstring = inspect.cleandoc(docstring) return docstring ``` #### get_node_def ```python get_node_def( local_ns: dict[str, Any] | None, ) -> NodeDef[StateT, DepsT, NodeRunEndT] ``` Get the node definition. 
Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @classmethod def get_node_def(cls, local_ns: dict[str, Any] | None) -> NodeDef[StateT, DepsT, NodeRunEndT]: """Get the node definition.""" type_hints = get_type_hints(cls.run, localns=local_ns, include_extras=True) try: return_hint = type_hints['return'] except KeyError as e: raise exceptions.GraphSetupError(f'Node {cls} is missing a return type hint on its `run` method') from e next_node_edges: dict[str, Edge] = {} end_edge: Edge | None = None returns_base_node: bool = False for return_type in _utils.get_union_args(return_hint): return_type, annotations = _utils.unpack_annotated(return_type) edge = next((a for a in annotations if isinstance(a, Edge)), Edge(None)) return_type_origin = get_origin(return_type) or return_type if return_type_origin is End: end_edge = edge elif return_type_origin is BaseNode: returns_base_node = True elif issubclass(return_type_origin, BaseNode): next_node_edges[return_type.get_node_id()] = edge else: raise exceptions.GraphSetupError(f'Invalid return type: {return_type}') return NodeDef( node=cls, node_id=cls.get_node_id(), note=cls.get_note(), next_node_edges=next_node_edges, end_edge=end_edge, returns_base_node=returns_base_node, ) ``` #### deep_copy ```python deep_copy() -> Self ``` Returns a deep copy of the node. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python def deep_copy(self) -> Self: """Returns a deep copy of the node.""" return copy.deepcopy(self) ``` ### End Bases: `Generic[RunEndT]` Type to return from a node to signal the end of the graph. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @dataclass class End(Generic[RunEndT]): """Type to return from a node to signal the end of the graph.""" data: RunEndT """Data to return from the graph.""" def deep_copy_data(self) -> End[RunEndT]: """Returns a deep copy of the end of the run.""" if self.data is None: return self else: end = End(copy.deepcopy(self.data)) end.set_snapshot_id(self.get_snapshot_id()) return end def get_snapshot_id(self) -> str: if snapshot_id := getattr(self, '__snapshot_id', None): return snapshot_id else: self.__dict__['__snapshot_id'] = snapshot_id = generate_snapshot_id('end') return snapshot_id def set_snapshot_id(self, set_id: str) -> None: self.__dict__['__snapshot_id'] = set_id ``` #### data ```python data: RunEndT ``` Data to return from the graph. #### deep_copy_data ```python deep_copy_data() -> End[RunEndT] ``` Returns a deep copy of the end of the run. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python def deep_copy_data(self) -> End[RunEndT]: """Returns a deep copy of the end of the run.""" if self.data is None: return self else: end = End(copy.deepcopy(self.data)) end.set_snapshot_id(self.get_snapshot_id()) return end ``` ### Edge Annotation to apply a label to an edge in a graph. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @dataclass(frozen=True) class Edge: """Annotation to apply a label to an edge in a graph.""" label: str | None """Label for the edge.""" ``` #### label ```python label: str | None ``` Label for the edge. ### DepsT ```python DepsT = TypeVar('DepsT', default=None, contravariant=True) ``` Type variable for the dependencies of a graph and node. ### RunEndT ```python RunEndT = TypeVar('RunEndT', covariant=True, default=None) ``` Covariant type variable for the return type of a graph run. 
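To see how `BaseNode`, `GraphRunContext`, and `End` fit together, here is a minimal sketch of a single-node graph. The `CountDownState`/`CountDown` names are illustrative only, and it assumes `Graph` and `Graph.run_sync` from `pydantic_graph.graph` behave as described elsewhere in these docs; the union in `run`'s return type hint is what tells `pydantic_graph` which edges exist.

```python
from __future__ import annotations

from dataclasses import dataclass

from pydantic_graph import BaseNode, End, Graph, GraphRunContext


@dataclass
class CountDownState:
    counter: int


@dataclass
class CountDown(BaseNode[CountDownState, None, int]):
    """Decrement the counter, ending the run once it reaches zero."""

    async def run(self, ctx: GraphRunContext[CountDownState]) -> CountDown | End[int]:
        # The `CountDown | End[int]` return hint is read at runtime to build the
        # graph's edges, as described for `BaseNode.run` above.
        if ctx.state.counter <= 0:
            return End(ctx.state.counter)
        ctx.state.counter -= 1
        return CountDown()


count_down_graph = Graph(nodes=[CountDown])
result = count_down_graph.run_sync(CountDown(), state=CountDownState(counter=3))
print(result.output)
#> 0
```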
### NodeRunEndT ```python NodeRunEndT = TypeVar( "NodeRunEndT", covariant=True, default=Never ) ``` Covariant type variable for the return type of a node run. # `pydantic_graph.persistence` ### SnapshotStatus ```python SnapshotStatus = Literal[ "created", "pending", "running", "success", "error" ] ``` The status of a snapshot. - `'created'`: The snapshot has been created but not yet run. - `'pending'`: The snapshot has been retrieved with load_next but not yet run. - `'running'`: The snapshot is currently running. - `'success'`: The snapshot has been run successfully. - `'error'`: The snapshot has been run but an error occurred. ### NodeSnapshot Bases: `Generic[StateT, RunEndT]` History step describing the execution of a node in a graph. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @dataclass(kw_only=True) class NodeSnapshot(Generic[StateT, RunEndT]): """History step describing the execution of a node in a graph.""" state: StateT """The state of the graph before the node is run.""" node: Annotated[BaseNode[StateT, Any, RunEndT], _utils.CustomNodeSchema()] """The node to run next.""" start_ts: datetime | None = None """The timestamp when the node started running, `None` until the run starts.""" duration: float | None = None """The duration of the node run in seconds, if the node has been run.""" status: SnapshotStatus = 'created' """The status of the snapshot.""" kind: Literal['node'] = 'node' """The kind of history step, can be used as a discriminator when deserializing history.""" id: str = UNSET_SNAPSHOT_ID """Unique ID of the snapshot.""" def __post_init__(self) -> None: if self.id == UNSET_SNAPSHOT_ID: self.id = self.node.get_snapshot_id() ``` #### state ```python state: StateT ``` The state of the graph before the node is run. #### node ```python node: Annotated[ BaseNode[StateT, Any, RunEndT], CustomNodeSchema() ] ``` The node to run next. #### start_ts ```python start_ts: datetime | None = None ``` The timestamp when the node started running, `None` until the run starts. #### duration ```python duration: float | None = None ``` The duration of the node run in seconds, if the node has been run. #### status ```python status: SnapshotStatus = 'created' ``` The status of the snapshot. #### kind ```python kind: Literal['node'] = 'node' ``` The kind of history step, can be used as a discriminator when deserializing history. #### id ```python id: str = UNSET_SNAPSHOT_ID ``` Unique ID of the snapshot. ### EndSnapshot Bases: `Generic[StateT, RunEndT]` History step describing the end of a graph run. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @dataclass(kw_only=True) class EndSnapshot(Generic[StateT, RunEndT]): """History step describing the end of a graph run.""" state: StateT """The state of the graph at the end of the run.""" result: End[RunEndT] """The result of the graph run.""" ts: datetime = field(default_factory=_utils.now_utc) """The timestamp when the graph run ended.""" kind: Literal['end'] = 'end' """The kind of history step, can be used as a discriminator when deserializing history.""" id: str = UNSET_SNAPSHOT_ID """Unique ID of the snapshot.""" def __post_init__(self) -> None: if self.id == UNSET_SNAPSHOT_ID: self.id = self.node.get_snapshot_id() @property def node(self) -> End[RunEndT]: """Shim to get the [`result`][pydantic_graph.persistence.EndSnapshot.result]. Useful to allow `[snapshot.node for snapshot in persistence.history]`. 
""" return self.result ``` #### state ```python state: StateT ``` The state of the graph at the end of the run. #### result ```python result: End[RunEndT] ``` The result of the graph run. #### ts ```python ts: datetime = field(default_factory=now_utc) ``` The timestamp when the graph run ended. #### kind ```python kind: Literal['end'] = 'end' ``` The kind of history step, can be used as a discriminator when deserializing history. #### id ```python id: str = UNSET_SNAPSHOT_ID ``` Unique ID of the snapshot. #### node ```python node: End[RunEndT] ``` Shim to get the result. Useful to allow `[snapshot.node for snapshot in persistence.history]`. ### Snapshot ```python Snapshot = ( NodeSnapshot[StateT, RunEndT] | EndSnapshot[StateT, RunEndT] ) ``` A step in the history of a graph run. Graph.run returns a list of these steps describing the execution of the graph, together with the run return value. ### BaseStatePersistence Bases: `ABC`, `Generic[StateT, RunEndT]` Abstract base class for storing the state of a graph run. Each instance of a `BaseStatePersistence` subclass should be used for a single graph run. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python class BaseStatePersistence(ABC, Generic[StateT, RunEndT]): """Abstract base class for storing the state of a graph run. Each instance of a `BaseStatePersistence` subclass should be used for a single graph run. """ @abstractmethod async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: """Snapshot the state of a graph, when the next step is to run a node. This method should add a [`NodeSnapshot`][pydantic_graph.persistence.NodeSnapshot] to persistence. Args: state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError @abstractmethod async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: """Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence. This method will generally call [`snapshot_node`][pydantic_graph.persistence.BaseStatePersistence.snapshot_node] but should do so in an atomic way. Args: snapshot_id: The ID of the snapshot to check. state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError @abstractmethod async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: """Snapshot the state of a graph when the graph has ended. This method should add an [`EndSnapshot`][pydantic_graph.persistence.EndSnapshot] to persistence. Args: state: The state of the graph. end: data from the end of the run. """ raise NotImplementedError @abstractmethod def record_run(self, snapshot_id: str) -> AbstractAsyncContextManager[None]: """Record the run of the node, or error if the node is already running. Args: snapshot_id: The ID of the snapshot to record. Raises: GraphNodeRunningError: if the node status it not `'created'` or `'pending'`. LookupError: if the snapshot ID is not found in persistence. Returns: An async context manager that records the run of the node. In particular this should set: - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'running'` and [`NodeSnapshot.start_ts`][pydantic_graph.persistence.NodeSnapshot.start_ts] when the run starts. - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'success'` or `'error'` and [`NodeSnapshot.duration`][pydantic_graph.persistence.NodeSnapshot.duration] when the run finishes. 
""" raise NotImplementedError @abstractmethod async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: """Retrieve a node snapshot with status `'created`' and set its status to `'pending'`. This is used by [`Graph.iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence] to get the next node to run. Returns: The snapshot, or `None` if no snapshot with status `'created`' exists. """ raise NotImplementedError @abstractmethod async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: """Load the entire history of snapshots. `load_all` is not used by pydantic-graph itself, instead it's provided to make it convenient to get all [snapshots][pydantic_graph.persistence.Snapshot] from persistence. Returns: The list of snapshots. """ raise NotImplementedError def set_graph_types(self, graph: Graph[StateT, Any, RunEndT]) -> None: """Set the types of the state and run end from a graph. You generally won't need to customise this method, instead implement [`set_types`][pydantic_graph.persistence.BaseStatePersistence.set_types] and [`should_set_types`][pydantic_graph.persistence.BaseStatePersistence.should_set_types]. """ if self.should_set_types(): with _utils.set_nodes_type_context(graph.get_nodes()): self.set_types(*graph.inferred_types) def should_set_types(self) -> bool: """Whether types need to be set. Implementations should override this method to return `True` when types have not been set if they are needed. """ return False def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: """Set the types of the state and run end. This can be used to create [type adapters][pydantic.TypeAdapter] for serializing and deserializing snapshots, e.g. with [`build_snapshot_list_type_adapter`][pydantic_graph.persistence.build_snapshot_list_type_adapter]. Args: state_type: The state type. run_end_type: The run end type. """ pass ``` #### snapshot_node ```python snapshot_node( state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None ``` Snapshot the state of a graph, when the next step is to run a node. This method should add a NodeSnapshot to persistence. Parameters: | Name | Type | Description | Default | | ----------- | -------------------------------- | ----------------------- | ---------- | | `state` | `StateT` | The state of the graph. | *required* | | `next_node` | `BaseNode[StateT, Any, RunEndT]` | The next node to run. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: """Snapshot the state of a graph, when the next step is to run a node. This method should add a [`NodeSnapshot`][pydantic_graph.persistence.NodeSnapshot] to persistence. Args: state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError ``` #### snapshot_node_if_new ```python snapshot_node_if_new( snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT], ) -> None ``` Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence. This method will generally call snapshot_node but should do so in an atomic way. Parameters: | Name | Type | Description | Default | | ------------- | -------------------------------- | -------------------------------- | ---------- | | `snapshot_id` | `str` | The ID of the snapshot to check. | *required* | | `state` | `StateT` | The state of the graph. 
| *required* | | `next_node` | `BaseNode[StateT, Any, RunEndT]` | The next node to run. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: """Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence. This method will generally call [`snapshot_node`][pydantic_graph.persistence.BaseStatePersistence.snapshot_node] but should do so in an atomic way. Args: snapshot_id: The ID of the snapshot to check. state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError ``` #### snapshot_end ```python snapshot_end(state: StateT, end: End[RunEndT]) -> None ``` Snapshot the state of a graph when the graph has ended. This method should add an EndSnapshot to persistence. Parameters: | Name | Type | Description | Default | | ------- | -------------- | ----------------------------- | ---------- | | `state` | `StateT` | The state of the graph. | *required* | | `end` | `End[RunEndT]` | data from the end of the run. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: """Snapshot the state of a graph when the graph has ended. This method should add an [`EndSnapshot`][pydantic_graph.persistence.EndSnapshot] to persistence. Args: state: The state of the graph. end: data from the end of the run. """ raise NotImplementedError ``` #### record_run ```python record_run( snapshot_id: str, ) -> AbstractAsyncContextManager[None] ``` Record the run of the node, or error if the node is already running. Parameters: | Name | Type | Description | Default | | ------------- | ----- | --------------------------------- | ---------- | | `snapshot_id` | `str` | The ID of the snapshot to record. | *required* | Raises: | Type | Description | | ----------------------- | ------------------------------------------------- | | `GraphNodeRunningError` | if the node status it not 'created' or 'pending'. | | `LookupError` | if the snapshot ID is not found in persistence. | Returns: | Type | Description | | ----------------------------------- | ---------------------------------------------------------- | | `AbstractAsyncContextManager[None]` | An async context manager that records the run of the node. | In particular this should set: - NodeSnapshot.status to `'running'` and NodeSnapshot.start_ts when the run starts. - NodeSnapshot.status to `'success'` or `'error'` and NodeSnapshot.duration when the run finishes. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod def record_run(self, snapshot_id: str) -> AbstractAsyncContextManager[None]: """Record the run of the node, or error if the node is already running. Args: snapshot_id: The ID of the snapshot to record. Raises: GraphNodeRunningError: if the node status it not `'created'` or `'pending'`. LookupError: if the snapshot ID is not found in persistence. Returns: An async context manager that records the run of the node. In particular this should set: - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'running'` and [`NodeSnapshot.start_ts`][pydantic_graph.persistence.NodeSnapshot.start_ts] when the run starts. 
- [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'success'` or `'error'` and [`NodeSnapshot.duration`][pydantic_graph.persistence.NodeSnapshot.duration] when the run finishes. """ raise NotImplementedError ``` #### load_next ```python load_next() -> NodeSnapshot[StateT, RunEndT] | None ``` Retrieve a node snapshot with status `'created`' and set its status to `'pending'`. This is used by Graph.iter_from_persistence to get the next node to run. Returns: The snapshot, or `None` if no snapshot with status `'created`' exists. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: """Retrieve a node snapshot with status `'created`' and set its status to `'pending'`. This is used by [`Graph.iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence] to get the next node to run. Returns: The snapshot, or `None` if no snapshot with status `'created`' exists. """ raise NotImplementedError ``` #### load_all ```python load_all() -> list[Snapshot[StateT, RunEndT]] ``` Load the entire history of snapshots. `load_all` is not used by pydantic-graph itself, instead it's provided to make it convenient to get all snapshots from persistence. Returns: The list of snapshots. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: """Load the entire history of snapshots. `load_all` is not used by pydantic-graph itself, instead it's provided to make it convenient to get all [snapshots][pydantic_graph.persistence.Snapshot] from persistence. Returns: The list of snapshots. """ raise NotImplementedError ``` #### set_graph_types ```python set_graph_types(graph: Graph[StateT, Any, RunEndT]) -> None ``` Set the types of the state and run end from a graph. You generally won't need to customise this method, instead implement set_types and should_set_types. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def set_graph_types(self, graph: Graph[StateT, Any, RunEndT]) -> None: """Set the types of the state and run end from a graph. You generally won't need to customise this method, instead implement [`set_types`][pydantic_graph.persistence.BaseStatePersistence.set_types] and [`should_set_types`][pydantic_graph.persistence.BaseStatePersistence.should_set_types]. """ if self.should_set_types(): with _utils.set_nodes_type_context(graph.get_nodes()): self.set_types(*graph.inferred_types) ``` #### should_set_types ```python should_set_types() -> bool ``` Whether types need to be set. Implementations should override this method to return `True` when types have not been set if they are needed. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def should_set_types(self) -> bool: """Whether types need to be set. Implementations should override this method to return `True` when types have not been set if they are needed. """ return False ``` #### set_types ```python set_types( state_type: type[StateT], run_end_type: type[RunEndT] ) -> None ``` Set the types of the state and run end. This can be used to create type adapters for serializing and deserializing snapshots, e.g. with build_snapshot_list_type_adapter. Parameters: | Name | Type | Description | Default | | -------------- | --------------- | ----------------- | ---------- | | `state_type` | `type[StateT]` | The state type. 
| *required* | | `run_end_type` | `type[RunEndT]` | The run end type. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: """Set the types of the state and run end. This can be used to create [type adapters][pydantic.TypeAdapter] for serializing and deserializing snapshots, e.g. with [`build_snapshot_list_type_adapter`][pydantic_graph.persistence.build_snapshot_list_type_adapter]. Args: state_type: The state type. run_end_type: The run end type. """ pass ``` ### build_snapshot_list_type_adapter ```python build_snapshot_list_type_adapter( state_t: type[StateT], run_end_t: type[RunEndT] ) -> TypeAdapter[list[Snapshot[StateT, RunEndT]]] ``` Build a type adapter for a list of snapshots. This method should be called from within set_types where context variables will be set such that Pydantic can create a schema for NodeSnapshot.node. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def build_snapshot_list_type_adapter( state_t: type[StateT], run_end_t: type[RunEndT] ) -> pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]]: """Build a type adapter for a list of snapshots. This method should be called from within [`set_types`][pydantic_graph.persistence.BaseStatePersistence.set_types] where context variables will be set such that Pydantic can create a schema for [`NodeSnapshot.node`][pydantic_graph.persistence.NodeSnapshot.node]. """ return pydantic.TypeAdapter(list[Annotated[Snapshot[state_t, run_end_t], pydantic.Discriminator('kind')]]) ``` In memory state persistence. This module provides simple in memory state persistence for graphs. ### SimpleStatePersistence Bases: `BaseStatePersistence[StateT, RunEndT]` Simple in memory state persistence that just hold the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default. Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python @dataclass class SimpleStatePersistence(BaseStatePersistence[StateT, RunEndT]): """Simple in memory state persistence that just hold the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default. 
""" last_snapshot: Snapshot[StateT, RunEndT] | None = None """The last snapshot.""" async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: self.last_snapshot = NodeSnapshot(state=state, node=next_node) async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: if self.last_snapshot and self.last_snapshot.id == snapshot_id: return # pragma: no cover else: await self.snapshot_node(state, next_node) async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: self.last_snapshot = EndSnapshot(state=state, result=end) @asynccontextmanager async def record_run(self, snapshot_id: str) -> AsyncIterator[None]: if self.last_snapshot is None or snapshot_id != self.last_snapshot.id: raise LookupError(f'No snapshot found with id={snapshot_id!r}') assert isinstance(self.last_snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' exceptions.GraphNodeStatusError.check(self.last_snapshot.status) self.last_snapshot.status = 'running' self.last_snapshot.start_ts = _utils.now_utc() start = perf_counter() try: yield except Exception: # pragma: no cover self.last_snapshot.duration = perf_counter() - start self.last_snapshot.status = 'error' raise else: self.last_snapshot.duration = perf_counter() - start self.last_snapshot.status = 'success' async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: if isinstance(self.last_snapshot, NodeSnapshot) and self.last_snapshot.status == 'created': self.last_snapshot.status = 'pending' return copy.deepcopy(self.last_snapshot) async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: raise NotImplementedError('load is not supported for SimpleStatePersistence') ``` #### last_snapshot ```python last_snapshot: Snapshot[StateT, RunEndT] | None = None ``` The last snapshot. ### FullStatePersistence Bases: `BaseStatePersistence[StateT, RunEndT]` In memory state persistence that hold a list of snapshots. Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python @dataclass class FullStatePersistence(BaseStatePersistence[StateT, RunEndT]): """In memory state persistence that hold a list of snapshots.""" deep_copy: bool = True """Whether to deep copy the state and nodes when storing them. Defaults to `True` so even if nodes or state are modified after the snapshot is taken, the persistence history will record the value at the time of the snapshot. 
""" history: list[Snapshot[StateT, RunEndT]] = field(default_factory=list[Snapshot[StateT, RunEndT]]) """List of snapshots taken during the graph run.""" _snapshots_type_adapter: pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]] | None = field( default=None, init=False, repr=False ) async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: snapshot = NodeSnapshot( state=self._prep_state(state), node=next_node.deep_copy() if self.deep_copy else next_node, ) self.history.append(snapshot) async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: if not any(s.id == snapshot_id for s in self.history): await self.snapshot_node(state, next_node) async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: snapshot = EndSnapshot( state=self._prep_state(state), result=end.deep_copy_data() if self.deep_copy else end, ) self.history.append(snapshot) @asynccontextmanager async def record_run(self, snapshot_id: str) -> AsyncIterator[None]: try: snapshot = next(s for s in self.history if s.id == snapshot_id) except StopIteration as e: raise LookupError(f'No snapshot found with id={snapshot_id!r}') from e assert isinstance(snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' exceptions.GraphNodeStatusError.check(snapshot.status) snapshot.status = 'running' snapshot.start_ts = _utils.now_utc() start = perf_counter() try: yield except Exception: snapshot.duration = perf_counter() - start snapshot.status = 'error' raise else: snapshot.duration = perf_counter() - start snapshot.status = 'success' async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: if snapshot := next((s for s in self.history if isinstance(s, NodeSnapshot) and s.status == 'created'), None): snapshot.status = 'pending' return copy.deepcopy(snapshot) async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: return self.history def should_set_types(self) -> bool: return self._snapshots_type_adapter is None def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: self._snapshots_type_adapter = build_snapshot_list_type_adapter(state_type, run_end_type) def dump_json(self, *, indent: int | None = None) -> bytes: """Dump the history to JSON bytes.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `dump_json`' return self._snapshots_type_adapter.dump_json(self.history, indent=indent) def load_json(self, json_data: str | bytes | bytearray) -> None: """Load the history from JSON.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `load_json`' self.history = self._snapshots_type_adapter.validate_json(json_data) def _prep_state(self, state: StateT) -> StateT: """Prepare state for snapshot, uses [`copy.deepcopy`][copy.deepcopy] by default.""" if not self.deep_copy or state is None: return state else: return copy.deepcopy(state) ``` #### deep_copy ```python deep_copy: bool = True ``` Whether to deep copy the state and nodes when storing them. Defaults to `True` so even if nodes or state are modified after the snapshot is taken, the persistence history will record the value at the time of the snapshot. #### history ```python history: list[Snapshot[StateT, RunEndT]] = field( default_factory=list[Snapshot[StateT, RunEndT]] ) ``` List of snapshots taken during the graph run. #### dump_json ```python dump_json(*, indent: int | None = None) -> bytes ``` Dump the history to JSON bytes. 
Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python def dump_json(self, *, indent: int | None = None) -> bytes: """Dump the history to JSON bytes.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `dump_json`' return self._snapshots_type_adapter.dump_json(self.history, indent=indent) ``` #### load_json ```python load_json(json_data: str | bytes | bytearray) -> None ``` Load the history from JSON. Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python def load_json(self, json_data: str | bytes | bytearray) -> None: """Load the history from JSON.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `load_json`' self.history = self._snapshots_type_adapter.validate_json(json_data) ``` ### FileStatePersistence Bases: `BaseStatePersistence[StateT, RunEndT]` File based state persistence that hold graph run state in a JSON file. Source code in `pydantic_graph/pydantic_graph/persistence/file.py` ````python @dataclass class FileStatePersistence(BaseStatePersistence[StateT, RunEndT]): """File based state persistence that hold graph run state in a JSON file.""" json_file: Path """Path to the JSON file where the snapshots are stored. You should use a different file for each graph run, but a single file should be reused for multiple steps of the same run. For example if you have a run ID of the form `run_123abc`, you might create a `FileStatePersistence` thus: ```py from pathlib import Path from pydantic_graph import FullStatePersistence run_id = 'run_123abc' persistence = FullStatePersistence(Path('runs') / f'{run_id}.json') ``` """ _snapshots_type_adapter: pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]] | None = field( default=None, init=False, repr=False ) async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: await self._append_save(NodeSnapshot(state=state, node=next_node)) async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: async with self._lock(): snapshots = await self.load_all() if not any(s.id == snapshot_id for s in snapshots): # pragma: no branch await self._append_save(NodeSnapshot(state=state, node=next_node), lock=False) async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: await self._append_save(EndSnapshot(state=state, result=end)) @asynccontextmanager async def record_run(self, snapshot_id: str) -> AsyncIterator[None]: async with self._lock(): snapshots = await self.load_all() try: snapshot = next(s for s in snapshots if s.id == snapshot_id) except StopIteration as e: raise LookupError(f'No snapshot found with id={snapshot_id!r}') from e assert isinstance(snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' exceptions.GraphNodeStatusError.check(snapshot.status) snapshot.status = 'running' snapshot.start_ts = _utils.now_utc() await self._save(snapshots) start = perf_counter() try: yield except Exception: duration = perf_counter() - start async with self._lock(): await _graph_utils.run_in_executor(self._after_run_sync, snapshot_id, duration, 'error') raise else: snapshot.duration = perf_counter() - start async with self._lock(): await _graph_utils.run_in_executor(self._after_run_sync, snapshot_id, snapshot.duration, 'success') async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: async with self._lock(): snapshots = await self.load_all() if snapshot := next((s for s in snapshots if isinstance(s, NodeSnapshot) and 
s.status == 'created'), None): snapshot.status = 'pending' await self._save(snapshots) return snapshot def should_set_types(self) -> bool: """Whether types need to be set.""" return self._snapshots_type_adapter is None def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: self._snapshots_type_adapter = build_snapshot_list_type_adapter(state_type, run_end_type) async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: return await _graph_utils.run_in_executor(self._load_sync) def _load_sync(self) -> list[Snapshot[StateT, RunEndT]]: assert self._snapshots_type_adapter is not None, 'snapshots type adapter must be set' try: content = self.json_file.read_bytes() except FileNotFoundError: return [] else: return self._snapshots_type_adapter.validate_json(content) def _after_run_sync(self, snapshot_id: str, duration: float, status: SnapshotStatus) -> None: snapshots = self._load_sync() snapshot = next(s for s in snapshots if s.id == snapshot_id) assert isinstance(snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' snapshot.duration = duration snapshot.status = status self._save_sync(snapshots) async def _save(self, snapshots: list[Snapshot[StateT, RunEndT]]) -> None: await _graph_utils.run_in_executor(self._save_sync, snapshots) def _save_sync(self, snapshots: list[Snapshot[StateT, RunEndT]]) -> None: assert self._snapshots_type_adapter is not None, 'snapshots type adapter must be set' self.json_file.write_bytes(self._snapshots_type_adapter.dump_json(snapshots, indent=2)) async def _append_save(self, snapshot: Snapshot[StateT, RunEndT], *, lock: bool = True) -> None: assert self._snapshots_type_adapter is not None, 'snapshots type adapter must be set' async with AsyncExitStack() as stack: if lock: await stack.enter_async_context(self._lock()) snapshots = await self.load_all() snapshots.append(snapshot) await self._save(snapshots) @asynccontextmanager async def _lock(self, *, timeout: float = 1.0) -> AsyncIterator[None]: """Lock a file by checking and writing a `.pydantic-graph-persistence-lock` to it. Args: timeout: how long to wait for the lock Returns: an async context manager that holds the lock """ lock_file = self.json_file.parent / f'{self.json_file.name}.pydantic-graph-persistence-lock' lock_id = secrets.token_urlsafe().encode() with anyio.fail_after(timeout): while not await _file_append_check(lock_file, lock_id): await anyio.sleep(0.01) try: yield finally: await _graph_utils.run_in_executor(lock_file.unlink, missing_ok=True) ```` #### json_file ```python json_file: Path ``` Path to the JSON file where the snapshots are stored. You should use a different file for each graph run, but a single file should be reused for multiple steps of the same run. For example if you have a run ID of the form `run_123abc`, you might create a `FileStatePersistence` thus: ```py from pathlib import Path from pydantic_graph.persistence.file import FileStatePersistence run_id = 'run_123abc' persistence = FileStatePersistence(Path('runs') / f'{run_id}.json') ``` #### should_set_types ```python should_set_types() -> bool ``` Whether types need to be set. Source code in `pydantic_graph/pydantic_graph/persistence/file.py` ```python def should_set_types(self) -> bool: """Whether types need to be set.""" return self._snapshots_type_adapter is None ``` # `pydantic_ai.ui.ag_ui` AG-UI protocol integration for Pydantic AI agents. ### AGUIAdapter Bases: `UIAdapter[RunAgentInput, Message, BaseEvent, AgentDepsT, OutputDataT]` UI adapter for the Agent-User Interaction (AG-UI) protocol.
Source code in `pydantic_ai_slim/pydantic_ai/ui/ag_ui/_adapter.py` ```python class AGUIAdapter(UIAdapter[RunAgentInput, Message, BaseEvent, AgentDepsT, OutputDataT]): """UI adapter for the Agent-User Interaction (AG-UI) protocol.""" @classmethod def build_run_input(cls, body: bytes) -> RunAgentInput: """Build an AG-UI run input object from the request body.""" return RunAgentInput.model_validate_json(body) def build_event_stream(self) -> UIEventStream[RunAgentInput, BaseEvent, AgentDepsT, OutputDataT]: """Build an AG-UI event stream transformer.""" return AGUIEventStream(self.run_input, accept=self.accept) @cached_property def messages(self) -> list[ModelMessage]: """Pydantic AI messages from the AG-UI run input.""" return self.load_messages(self.run_input.messages) @cached_property def toolset(self) -> AbstractToolset[AgentDepsT] | None: """Toolset representing frontend tools from the AG-UI run input.""" if self.run_input.tools: return _AGUIFrontendToolset[AgentDepsT](self.run_input.tools) return None @cached_property def state(self) -> dict[str, Any] | None: """Frontend state from the AG-UI run input.""" state = self.run_input.state if state is None: return None if isinstance(state, Mapping) and not state: return None return cast('dict[str, Any]', state) @classmethod def load_messages(cls, messages: Sequence[Message]) -> list[ModelMessage]: # noqa: C901 """Transform AG-UI messages into Pydantic AI messages.""" builder = MessagesBuilder() tool_calls: dict[str, str] = {} # Tool call ID to tool name mapping. for msg in messages: match msg: case UserMessage(content=content): if isinstance(content, str): builder.add(UserPromptPart(content=content)) else: user_prompt_content: list[Any] = [] for part in content: match part: case TextInputContent(text=text): user_prompt_content.append(text) case BinaryInputContent(): if part.url: try: binary_part = BinaryContent.from_data_uri(part.url) except ValueError: media_type_constructors = { 'image': ImageUrl, 'video': VideoUrl, 'audio': AudioUrl, } media_type_prefix = part.mime_type.split('/', 1)[0] constructor = media_type_constructors.get(media_type_prefix, DocumentUrl) binary_part = constructor(url=part.url, media_type=part.mime_type) elif part.data: binary_part = BinaryContent( data=b64decode(part.data), media_type=part.mime_type ) else: # pragma: no cover raise ValueError('BinaryInputContent must have either a `url` or `data` field.') user_prompt_content.append(binary_part) case _: # pragma: no cover raise ValueError(f'Unsupported user message part type: {type(part)}') if user_prompt_content: # pragma: no branch content_to_add = ( user_prompt_content[0] if len(user_prompt_content) == 1 and isinstance(user_prompt_content[0], str) else user_prompt_content ) builder.add(UserPromptPart(content=content_to_add)) case SystemMessage(content=content) | DeveloperMessage(content=content): builder.add(SystemPromptPart(content=content)) case AssistantMessage(content=content, tool_calls=tool_calls_list): if content: builder.add(TextPart(content=content)) if tool_calls_list: for tool_call in tool_calls_list: tool_call_id = tool_call.id tool_name = tool_call.function.name tool_calls[tool_call_id] = tool_name if tool_call_id.startswith(BUILTIN_TOOL_CALL_ID_PREFIX): _, provider_name, original_id = tool_call_id.split('|', 2) builder.add( BuiltinToolCallPart( tool_name=tool_name, args=tool_call.function.arguments, tool_call_id=original_id, provider_name=provider_name, ) ) else: builder.add( ToolCallPart( tool_name=tool_name, tool_call_id=tool_call_id, 
args=tool_call.function.arguments, ) ) case ToolMessage() as tool_msg: tool_call_id = tool_msg.tool_call_id tool_name = tool_calls.get(tool_call_id) if tool_name is None: # pragma: no cover raise ValueError(f'Tool call with ID {tool_call_id} not found in the history.') if tool_call_id.startswith(BUILTIN_TOOL_CALL_ID_PREFIX): _, provider_name, original_id = tool_call_id.split('|', 2) builder.add( BuiltinToolReturnPart( tool_name=tool_name, content=tool_msg.content, tool_call_id=original_id, provider_name=provider_name, ) ) else: builder.add( ToolReturnPart( tool_name=tool_name, content=tool_msg.content, tool_call_id=tool_call_id, ) ) case ActivityMessage(): pass return builder.messages ``` #### build_run_input ```python build_run_input(body: bytes) -> RunAgentInput ``` Build an AG-UI run input object from the request body. Source code in `pydantic_ai_slim/pydantic_ai/ui/ag_ui/_adapter.py` ```python @classmethod def build_run_input(cls, body: bytes) -> RunAgentInput: """Build an AG-UI run input object from the request body.""" return RunAgentInput.model_validate_json(body) ``` #### build_event_stream ```python build_event_stream() -> ( UIEventStream[ RunAgentInput, BaseEvent, AgentDepsT, OutputDataT ] ) ``` Build an AG-UI event stream transformer. Source code in `pydantic_ai_slim/pydantic_ai/ui/ag_ui/_adapter.py` ```python def build_event_stream(self) -> UIEventStream[RunAgentInput, BaseEvent, AgentDepsT, OutputDataT]: """Build an AG-UI event stream transformer.""" return AGUIEventStream(self.run_input, accept=self.accept) ``` #### messages ```python messages: list[ModelMessage] ``` Pydantic AI messages from the AG-UI run input. #### toolset ```python toolset: AbstractToolset[AgentDepsT] | None ``` Toolset representing frontend tools from the AG-UI run input. #### state ```python state: dict[str, Any] | None ``` Frontend state from the AG-UI run input. #### load_messages ```python load_messages( messages: Sequence[Message], ) -> list[ModelMessage] ``` Transform AG-UI messages into Pydantic AI messages. Source code in `pydantic_ai_slim/pydantic_ai/ui/ag_ui/_adapter.py` ```python @classmethod def load_messages(cls, messages: Sequence[Message]) -> list[ModelMessage]: # noqa: C901 """Transform AG-UI messages into Pydantic AI messages.""" builder = MessagesBuilder() tool_calls: dict[str, str] = {} # Tool call ID to tool name mapping. 
for msg in messages: match msg: case UserMessage(content=content): if isinstance(content, str): builder.add(UserPromptPart(content=content)) else: user_prompt_content: list[Any] = [] for part in content: match part: case TextInputContent(text=text): user_prompt_content.append(text) case BinaryInputContent(): if part.url: try: binary_part = BinaryContent.from_data_uri(part.url) except ValueError: media_type_constructors = { 'image': ImageUrl, 'video': VideoUrl, 'audio': AudioUrl, } media_type_prefix = part.mime_type.split('/', 1)[0] constructor = media_type_constructors.get(media_type_prefix, DocumentUrl) binary_part = constructor(url=part.url, media_type=part.mime_type) elif part.data: binary_part = BinaryContent( data=b64decode(part.data), media_type=part.mime_type ) else: # pragma: no cover raise ValueError('BinaryInputContent must have either a `url` or `data` field.') user_prompt_content.append(binary_part) case _: # pragma: no cover raise ValueError(f'Unsupported user message part type: {type(part)}') if user_prompt_content: # pragma: no branch content_to_add = ( user_prompt_content[0] if len(user_prompt_content) == 1 and isinstance(user_prompt_content[0], str) else user_prompt_content ) builder.add(UserPromptPart(content=content_to_add)) case SystemMessage(content=content) | DeveloperMessage(content=content): builder.add(SystemPromptPart(content=content)) case AssistantMessage(content=content, tool_calls=tool_calls_list): if content: builder.add(TextPart(content=content)) if tool_calls_list: for tool_call in tool_calls_list: tool_call_id = tool_call.id tool_name = tool_call.function.name tool_calls[tool_call_id] = tool_name if tool_call_id.startswith(BUILTIN_TOOL_CALL_ID_PREFIX): _, provider_name, original_id = tool_call_id.split('|', 2) builder.add( BuiltinToolCallPart( tool_name=tool_name, args=tool_call.function.arguments, tool_call_id=original_id, provider_name=provider_name, ) ) else: builder.add( ToolCallPart( tool_name=tool_name, tool_call_id=tool_call_id, args=tool_call.function.arguments, ) ) case ToolMessage() as tool_msg: tool_call_id = tool_msg.tool_call_id tool_name = tool_calls.get(tool_call_id) if tool_name is None: # pragma: no cover raise ValueError(f'Tool call with ID {tool_call_id} not found in the history.') if tool_call_id.startswith(BUILTIN_TOOL_CALL_ID_PREFIX): _, provider_name, original_id = tool_call_id.split('|', 2) builder.add( BuiltinToolReturnPart( tool_name=tool_name, content=tool_msg.content, tool_call_id=original_id, provider_name=provider_name, ) ) else: builder.add( ToolReturnPart( tool_name=tool_name, content=tool_msg.content, tool_call_id=tool_call_id, ) ) case ActivityMessage(): pass return builder.messages ``` ### AGUIEventStream Bases: `UIEventStream[RunAgentInput, BaseEvent, AgentDepsT, OutputDataT]` UI event stream transformer for the Agent-User Interaction (AG-UI) protocol. 
Source code in `pydantic_ai_slim/pydantic_ai/ui/ag_ui/_event_stream.py` ```python @dataclass class AGUIEventStream(UIEventStream[RunAgentInput, BaseEvent, AgentDepsT, OutputDataT]): """UI event stream transformer for the Agent-User Interaction (AG-UI) protocol.""" _thinking_text: bool = False _builtin_tool_call_ids: dict[str, str] = field(default_factory=dict[str, str]) _error: bool = False @property def _event_encoder(self) -> EventEncoder: return EventEncoder(accept=self.accept or SSE_CONTENT_TYPE) @property def content_type(self) -> str: return self._event_encoder.get_content_type() def encode_event(self, event: BaseEvent) -> str: return self._event_encoder.encode(event) @staticmethod def _get_timestamp() -> int: return int(now_utc().timestamp() * 1_000) async def handle_event(self, event: NativeEvent) -> AsyncIterator[BaseEvent]: """Override to set timestamps on all AG-UI events.""" async for agui_event in super().handle_event(event): if agui_event.timestamp is None: agui_event.timestamp = self._get_timestamp() yield agui_event async def before_stream(self) -> AsyncIterator[BaseEvent]: yield RunStartedEvent( thread_id=self.run_input.thread_id, run_id=self.run_input.run_id, timestamp=self._get_timestamp(), ) async def before_response(self) -> AsyncIterator[BaseEvent]: # Prevent parts from a subsequent response being tied to parts from an earlier response. # See https://github.com/pydantic/pydantic-ai/issues/3316 self.new_message_id() return yield # Make this an async generator async def after_stream(self) -> AsyncIterator[BaseEvent]: if not self._error: yield RunFinishedEvent( thread_id=self.run_input.thread_id, run_id=self.run_input.run_id, timestamp=self._get_timestamp(), ) async def on_error(self, error: Exception) -> AsyncIterator[BaseEvent]: self._error = True yield RunErrorEvent(message=str(error), timestamp=self._get_timestamp()) async def handle_text_start(self, part: TextPart, follows_text: bool = False) -> AsyncIterator[BaseEvent]: if follows_text: message_id = self.message_id else: message_id = self.new_message_id() yield TextMessageStartEvent(message_id=message_id) if part.content: # pragma: no branch yield TextMessageContentEvent(message_id=message_id, delta=part.content) async def handle_text_delta(self, delta: TextPartDelta) -> AsyncIterator[BaseEvent]: if delta.content_delta: # pragma: no branch yield TextMessageContentEvent(message_id=self.message_id, delta=delta.content_delta) async def handle_text_end(self, part: TextPart, followed_by_text: bool = False) -> AsyncIterator[BaseEvent]: if not followed_by_text: yield TextMessageEndEvent(message_id=self.message_id) async def handle_thinking_start( self, part: ThinkingPart, follows_thinking: bool = False ) -> AsyncIterator[BaseEvent]: if not follows_thinking: yield ThinkingStartEvent(type=EventType.THINKING_START) if part.content: yield ThinkingTextMessageStartEvent(type=EventType.THINKING_TEXT_MESSAGE_START) yield ThinkingTextMessageContentEvent(type=EventType.THINKING_TEXT_MESSAGE_CONTENT, delta=part.content) self._thinking_text = True async def handle_thinking_delta(self, delta: ThinkingPartDelta) -> AsyncIterator[BaseEvent]: if not delta.content_delta: return # pragma: no cover if not self._thinking_text: yield ThinkingTextMessageStartEvent(type=EventType.THINKING_TEXT_MESSAGE_START) self._thinking_text = True yield ThinkingTextMessageContentEvent(type=EventType.THINKING_TEXT_MESSAGE_CONTENT, delta=delta.content_delta) async def handle_thinking_end( self, part: ThinkingPart, followed_by_thinking: bool = False ) -> 
AsyncIterator[BaseEvent]: if self._thinking_text: yield ThinkingTextMessageEndEvent(type=EventType.THINKING_TEXT_MESSAGE_END) self._thinking_text = False if not followed_by_thinking: yield ThinkingEndEvent(type=EventType.THINKING_END) def handle_tool_call_start(self, part: ToolCallPart | BuiltinToolCallPart) -> AsyncIterator[BaseEvent]: return self._handle_tool_call_start(part) def handle_builtin_tool_call_start(self, part: BuiltinToolCallPart) -> AsyncIterator[BaseEvent]: tool_call_id = part.tool_call_id builtin_tool_call_id = '|'.join([BUILTIN_TOOL_CALL_ID_PREFIX, part.provider_name or '', tool_call_id]) self._builtin_tool_call_ids[tool_call_id] = builtin_tool_call_id tool_call_id = builtin_tool_call_id return self._handle_tool_call_start(part, tool_call_id) async def _handle_tool_call_start( self, part: ToolCallPart | BuiltinToolCallPart, tool_call_id: str | None = None ) -> AsyncIterator[BaseEvent]: tool_call_id = tool_call_id or part.tool_call_id parent_message_id = self.message_id yield ToolCallStartEvent( tool_call_id=tool_call_id, tool_call_name=part.tool_name, parent_message_id=parent_message_id ) if part.args: yield ToolCallArgsEvent(tool_call_id=tool_call_id, delta=part.args_as_json_str()) async def handle_tool_call_delta(self, delta: ToolCallPartDelta) -> AsyncIterator[BaseEvent]: tool_call_id = delta.tool_call_id assert tool_call_id, '`ToolCallPartDelta.tool_call_id` must be set' if tool_call_id in self._builtin_tool_call_ids: tool_call_id = self._builtin_tool_call_ids[tool_call_id] yield ToolCallArgsEvent( tool_call_id=tool_call_id, delta=delta.args_delta if isinstance(delta.args_delta, str) else json.dumps(delta.args_delta), ) async def handle_tool_call_end(self, part: ToolCallPart) -> AsyncIterator[BaseEvent]: yield ToolCallEndEvent(tool_call_id=part.tool_call_id) async def handle_builtin_tool_call_end(self, part: BuiltinToolCallPart) -> AsyncIterator[BaseEvent]: yield ToolCallEndEvent(tool_call_id=self._builtin_tool_call_ids[part.tool_call_id]) async def handle_builtin_tool_return(self, part: BuiltinToolReturnPart) -> AsyncIterator[BaseEvent]: tool_call_id = self._builtin_tool_call_ids[part.tool_call_id] yield ToolCallResultEvent( message_id=self.new_message_id(), type=EventType.TOOL_CALL_RESULT, role='tool', tool_call_id=tool_call_id, content=part.model_response_str(), ) async def handle_function_tool_result(self, event: FunctionToolResultEvent) -> AsyncIterator[BaseEvent]: result = event.result output = result.model_response() if isinstance(result, RetryPromptPart) else result.model_response_str() yield ToolCallResultEvent( message_id=self.new_message_id(), type=EventType.TOOL_CALL_RESULT, role='tool', tool_call_id=result.tool_call_id, content=output, ) # ToolCallResultEvent.content may hold user parts (e.g. text, images) that AG-UI does not currently have events for if isinstance(result, ToolReturnPart): # Check for AG-UI events returned by tool calls. possible_event = result.metadata or result.content if isinstance(possible_event, BaseEvent): yield possible_event elif isinstance(possible_event, str | bytes): # pragma: no branch # Avoid iterable check for strings and bytes. pass elif isinstance(possible_event, Iterable): # pragma: no branch for item in possible_event: # type: ignore[reportUnknownMemberType] if isinstance(item, BaseEvent): # pragma: no branch yield item ``` #### handle_event ```python handle_event( event: NativeEvent, ) -> AsyncIterator[BaseEvent] ``` Override to set timestamps on all AG-UI events. 
Source code in `pydantic_ai_slim/pydantic_ai/ui/ag_ui/_event_stream.py` ```python async def handle_event(self, event: NativeEvent) -> AsyncIterator[BaseEvent]: """Override to set timestamps on all AG-UI events.""" async for agui_event in super().handle_event(event): if agui_event.timestamp is None: agui_event.timestamp = self._get_timestamp() yield agui_event ``` AG-UI protocol integration for Pydantic AI agents. ### AGUIApp Bases: `Generic[AgentDepsT, OutputDataT]`, `Starlette` ASGI application for running Pydantic AI agents with AG-UI protocol support. Source code in `pydantic_ai_slim/pydantic_ai/ui/ag_ui/app.py` ```python class AGUIApp(Generic[AgentDepsT, OutputDataT], Starlette): """ASGI application for running Pydantic AI agents with AG-UI protocol support.""" def __init__( self, agent: AbstractAgent[AgentDepsT, OutputDataT], *, # AGUIAdapter.dispatch_request parameters output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[Any] | None = None, # Starlette parameters debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[Self] | None = None, ) -> None: """An ASGI application that handles every request by running the agent and streaming the response. Note that the `deps` will be the same for each request, with the exception of the frontend state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`AGUIAdapter.run_stream()`][pydantic_ai.ui.ag_ui.AGUIAdapter.run_stream] or [`AGUIAdapter.dispatch_request()`][pydantic_ai.ui.ag_ui.AGUIAdapter.dispatch_request] instead. Args: agent: The agent to run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. on_complete: Optional callback function called when the agent run completes successfully. 
The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can access `all_messages()` and other result data. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. """ super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) async def run_agent(request: Request) -> Response: """Endpoint to run the agent with the provided input data.""" # `dispatch_request` will store the frontend state from the request on `deps.state` (if it implements the `StateHandler` protocol), # so we need to copy the deps to avoid different requests mutating the same deps object. 
nonlocal deps if isinstance(deps, StateHandler): # pragma: no branch deps = replace(deps) return await AGUIAdapter[AgentDepsT, OutputDataT].dispatch_request( request, agent=agent, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, on_complete=on_complete, ) self.router.add_route('/', run_agent, methods=['POST']) ``` #### __init__ ```python __init__( agent: AbstractAgent[AgentDepsT, OutputDataT], *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, on_complete: OnCompleteFunc[Any] | None = None, debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: ( Mapping[Any, ExceptionHandler] | None ) = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[Self] | None = None ) -> None ``` An ASGI application that handles every request by running the agent and streaming the response. Note that the `deps` will be the same for each request, with the exception of the frontend state that's injected into the `state` field of a `deps` object that implements the StateHandler protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use AGUIAdapter.run_stream() or AGUIAdapter.dispatch_request() instead. Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `agent` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to run. | *required* | | `output_type` | \`OutputSpec[Any] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. 
| | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence[AbstractBuiltinTool] | None\` | Optional additional builtin tools for this run. | | `on_complete` | \`OnCompleteFunc[Any] | None\` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can access all_messages() and other result data. | | `debug` | `bool` | Boolean indicating if debug tracebacks should be returned on errors. | `False` | | `routes` | \`Sequence[BaseRoute] | None\` | A list of routes to serve incoming HTTP and WebSocket requests. | | `middleware` | \`Sequence[Middleware] | None\` | A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. ServerErrorMiddleware is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. ExceptionMiddleware is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. | | `exception_handlers` | \`Mapping[Any, ExceptionHandler] | None\` | A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form handler(request, exc) -> response and may be either standard functions, or async functions. | | `on_startup` | \`Sequence\[Callable\[[], Any\]\] | None\` | A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. | | `on_shutdown` | \`Sequence\[Callable\[[], Any\]\] | None\` | A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. | | `lifespan` | \`Lifespan[Self] | None\` | A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the on_startup and on_shutdown handlers. Use one or the other, not both. 
| Source code in `pydantic_ai_slim/pydantic_ai/ui/ag_ui/app.py` ```python def __init__( self, agent: AbstractAgent[AgentDepsT, OutputDataT], *, # AGUIAdapter.dispatch_request parameters output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[Any] | None = None, # Starlette parameters debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[Self] | None = None, ) -> None: """An ASGI application that handles every request by running the agent and streaming the response. Note that the `deps` will be the same for each request, with the exception of the frontend state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`AGUIAdapter.run_stream()`][pydantic_ai.ui.ag_ui.AGUIAdapter.run_stream] or [`AGUIAdapter.dispatch_request()`][pydantic_ai.ui.ag_ui.AGUIAdapter.dispatch_request] instead. Args: agent: The agent to run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can access `all_messages()` and other result data. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. 
exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. """ super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) async def run_agent(request: Request) -> Response: """Endpoint to run the agent with the provided input data.""" # `dispatch_request` will store the frontend state from the request on `deps.state` (if it implements the `StateHandler` protocol), # so we need to copy the deps to avoid different requests mutating the same deps object. nonlocal deps if isinstance(deps, StateHandler): # pragma: no branch deps = replace(deps) return await AGUIAdapter[AgentDepsT, OutputDataT].dispatch_request( request, agent=agent, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, on_complete=on_complete, ) self.router.add_route('/', run_agent, methods=['POST']) ```
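Putting this together, exposing an agent over AG-UI only takes a few lines. A minimal sketch, assuming `AGUIApp` and `StateDeps` are importable from the modules documented on this page (the model, state type, and instructions are illustrative):

```python
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.ui import StateDeps
from pydantic_ai.ui.ag_ui.app import AGUIApp


class DocumentState(BaseModel):
    """Illustrative frontend state shared with the agent."""

    document: str = ''


agent = Agent(
    'anthropic:claude-sonnet-4-5',
    deps_type=StateDeps[DocumentState],
    instructions='Help the user edit the document.',
)

# Each POST to '/' runs the agent and streams AG-UI events back to the frontend;
# the state sent with each request is validated into `deps.state` as a `DocumentState`.
app = AGUIApp(agent, deps=StateDeps(DocumentState()))
```

The resulting ASGI app can be served with any ASGI server, e.g. `uvicorn my_module:app`. Since the same `deps` object is reused for every request, per-request dependencies (such as the authenticated user) should go through `AGUIAdapter.run_stream()` or `AGUIAdapter.dispatch_request()` instead, as noted in the docstring above.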
# `pydantic_ai.ui`

### StateDeps

Bases: `Generic[StateT]`

Dependency type that holds state. This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`. The state is set using the `state` setter by the `Adapter` when the run starts. Implements the `StateHandler` protocol.

Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py`

```python
@dataclass
class StateDeps(Generic[StateT]):
    """Dependency type that holds state.

    This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`.

    The state is set using the `state` setter by the `Adapter` when the run starts.

    Implements the `StateHandler` protocol.
    """

    state: StateT
```

### StateHandler

Bases: `Protocol`

Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field.

Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py`

```python
@runtime_checkable
class StateHandler(Protocol):
    """Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field."""

    # Has to be a dataclass so we can use `replace` to update the state.
    # From https://github.com/python/typeshed/blob/9ab7fde0a0cd24ed7a72837fcb21093b811b80d8/stdlib/_typeshed/__init__.pyi#L352
    __dataclass_fields__: ClassVar[dict[str, Field[Any]]]

    @property
    def state(self) -> Any:
        """Get the current state of the agent run."""
        ...

    @state.setter
    def state(self, state: Any) -> None:
        """Set the state of the agent run.

        This method is called to update the state of the agent run with the provided state.

        Args:
            state: The run state.
        """
        ...
```

#### state

```python
state: Any
```

Get the current state of the agent run.

### UIAdapter

Bases: `ABC`, `Generic[RunInputT, MessageT, EventT, AgentDepsT, OutputDataT]`

Base class for UI adapters. This class is responsible for transforming agent run input received from the frontend into arguments for Agent.run_stream_events(), running the agent, and then transforming Pydantic AI events into protocol-specific events. The event stream transformation is handled by a protocol-specific UIEventStream subclass.

Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python @dataclass class UIAdapter(ABC, Generic[RunInputT, MessageT, EventT, AgentDepsT, OutputDataT]): """Base class for UI adapters. This class is responsible for transforming agent run input received from the frontend into arguments for [`Agent.run_stream_events()`][pydantic_ai.agent.Agent.run_stream_events], running the agent, and then transforming Pydantic AI events into protocol-specific events. The event stream transformation is handled by a protocol-specific [`UIEventStream`][pydantic_ai.ui.UIEventStream] subclass. """ agent: AbstractAgent[AgentDepsT, OutputDataT] """The Pydantic AI agent to run.""" run_input: RunInputT """The protocol-specific run input object.""" _: KW_ONLY accept: str | None = None """The `Accept` header value of the request, used to determine how to encode the protocol-specific events for the streaming response.""" @classmethod async def from_request( cls, request: Request, *, agent: AbstractAgent[AgentDepsT, OutputDataT] ) -> UIAdapter[RunInputT, MessageT, EventT, AgentDepsT, OutputDataT]: """Create an adapter from a request.""" return cls( agent=agent, run_input=cls.build_run_input(await request.body()), accept=request.headers.get('accept'), ) @classmethod @abstractmethod def build_run_input(cls, body: bytes) -> RunInputT: """Build a protocol-specific run input object from the request body.""" raise NotImplementedError @classmethod @abstractmethod def load_messages(cls, messages: Sequence[MessageT]) -> list[ModelMessage]: """Transform protocol-specific messages into Pydantic AI messages.""" raise NotImplementedError @classmethod def dump_messages(cls, messages: Sequence[ModelMessage]) -> list[MessageT]: """Transform Pydantic AI messages into protocol-specific messages.""" raise NotImplementedError @abstractmethod def build_event_stream(self) -> UIEventStream[RunInputT, EventT, AgentDepsT, OutputDataT]: """Build a protocol-specific event stream transformer.""" raise NotImplementedError @cached_property @abstractmethod def messages(self) -> list[ModelMessage]: """Pydantic AI messages from the protocol-specific run input.""" raise NotImplementedError @cached_property def toolset(self) -> AbstractToolset[AgentDepsT] | None: """Toolset representing frontend tools from the protocol-specific run input.""" return None @cached_property def state(self) -> dict[str, Any] | None: """Frontend state from the protocol-specific run input.""" return None def transform_stream( self, stream: AsyncIterator[NativeEvent], on_complete: OnCompleteFunc[EventT] | None = None, ) -> AsyncIterator[EventT]: """Transform a stream of Pydantic AI events into protocol-specific events. Args: stream: The stream of Pydantic AI events to transform. on_complete: Optional callback function called when the agent run completes successfully.
The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can optionally yield additional protocol-specific events. """ return self.build_event_stream().transform_stream(stream, on_complete=on_complete) def encode_stream(self, stream: AsyncIterator[EventT]) -> AsyncIterator[str]: """Encode a stream of protocol-specific events as strings according to the `Accept` header value. Args: stream: The stream of protocol-specific events to encode. """ return self.build_event_stream().encode_stream(stream) def streaming_response(self, stream: AsyncIterator[EventT]) -> StreamingResponse: """Generate a streaming response from a stream of protocol-specific events. Args: stream: The stream of protocol-specific events to encode. """ return self.build_event_stream().streaming_response(stream) def run_stream_native( self, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[NativeEvent]: """Run the agent with the protocol-specific run input and stream Pydantic AI events. Args: output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools to use for this run. """ message_history = [*(message_history or []), *self.messages] toolset = self.toolset if toolset: output_type = [output_type or self.agent.output_type, DeferredToolRequests] toolsets = [*(toolsets or []), toolset] if isinstance(deps, StateHandler): raw_state = self.state or {} if isinstance(deps.state, BaseModel): state = type(deps.state).model_validate(raw_state) else: state = raw_state deps.state = state elif self.state: warnings.warn( f'State was provided but `deps` of type `{type(deps).__name__}` does not implement the `StateHandler` protocol, so the state was ignored. 
Use `StateDeps[...]` or implement `StateHandler` to receive AG-UI state.', UserWarning, stacklevel=2, ) return self.agent.run_stream_events( output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, instructions=instructions, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) def run_stream( self, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[EventT] | None = None, ) -> AsyncIterator[EventT]: """Run the agent with the protocol-specific run input and stream protocol-specific events. Args: output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools to use for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can optionally yield additional protocol-specific events. 
""" return self.transform_stream( self.run_stream_native( output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ), on_complete=on_complete, ) @classmethod async def dispatch_request( cls, request: Request, *, agent: AbstractAgent[DispatchDepsT, DispatchOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[DispatchDepsT] = None, deps: DispatchDepsT = None, output_type: OutputSpec[Any] | None = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[DispatchDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[DispatchDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[EventT] | None = None, ) -> Response: """Handle a protocol-specific HTTP request by running the agent and returning a streaming response of protocol-specific events. Args: request: The incoming Starlette/FastAPI request. agent: The agent to run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools to use for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can optionally yield additional protocol-specific events. Returns: A streaming Starlette response with protocol-specific events encoded per the request's `Accept` header value. 
""" try: from starlette.responses import Response except ImportError as e: # pragma: no cover raise ImportError( 'Please install the `starlette` package to use `dispatch_request()` method, ' 'you can use the `ui` optional group — `pip install "pydantic-ai-slim[ui]"`' ) from e try: # The DepsT and OutputDataT come from `agent`, not from `cls`; the cast is necessary to explain this to pyright adapter = cast( UIAdapter[RunInputT, MessageT, EventT, DispatchDepsT, DispatchOutputDataT], await cls.from_request(request, agent=cast(AbstractAgent[AgentDepsT, OutputDataT], agent)), ) except ValidationError as e: # pragma: no cover return Response( content=e.json(), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) return adapter.streaming_response( adapter.run_stream( message_history=message_history, deferred_tool_results=deferred_tool_results, deps=deps, output_type=output_type, model=model, instructions=instructions, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, on_complete=on_complete, ), ) ``` #### agent ```python agent: AbstractAgent[AgentDepsT, OutputDataT] ``` The Pydantic AI agent to run. #### run_input ```python run_input: RunInputT ``` The protocol-specific run input object. #### accept ```python accept: str | None = None ``` The `Accept` header value of the request, used to determine how to encode the protocol-specific events for the streaming response. #### from_request ```python from_request( request: Request, *, agent: AbstractAgent[AgentDepsT, OutputDataT] ) -> UIAdapter[ RunInputT, MessageT, EventT, AgentDepsT, OutputDataT ] ``` Create an adapter from a request. Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python @classmethod async def from_request( cls, request: Request, *, agent: AbstractAgent[AgentDepsT, OutputDataT] ) -> UIAdapter[RunInputT, MessageT, EventT, AgentDepsT, OutputDataT]: """Create an adapter from a request.""" return cls( agent=agent, run_input=cls.build_run_input(await request.body()), accept=request.headers.get('accept'), ) ``` #### build_run_input ```python build_run_input(body: bytes) -> RunInputT ``` Build a protocol-specific run input object from the request body. Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python @classmethod @abstractmethod def build_run_input(cls, body: bytes) -> RunInputT: """Build a protocol-specific run input object from the request body.""" raise NotImplementedError ``` #### load_messages ```python load_messages( messages: Sequence[MessageT], ) -> list[ModelMessage] ``` Transform protocol-specific messages into Pydantic AI messages. Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python @classmethod @abstractmethod def load_messages(cls, messages: Sequence[MessageT]) -> list[ModelMessage]: """Transform protocol-specific messages into Pydantic AI messages.""" raise NotImplementedError ``` #### dump_messages ```python dump_messages( messages: Sequence[ModelMessage], ) -> list[MessageT] ``` Transform Pydantic AI messages into protocol-specific messages. 
Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python @classmethod def dump_messages(cls, messages: Sequence[ModelMessage]) -> list[MessageT]: """Transform Pydantic AI messages into protocol-specific messages.""" raise NotImplementedError ``` #### build_event_stream ```python build_event_stream() -> ( UIEventStream[ RunInputT, EventT, AgentDepsT, OutputDataT ] ) ``` Build a protocol-specific event stream transformer. Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python @abstractmethod def build_event_stream(self) -> UIEventStream[RunInputT, EventT, AgentDepsT, OutputDataT]: """Build a protocol-specific event stream transformer.""" raise NotImplementedError ``` #### messages ```python messages: list[ModelMessage] ``` Pydantic AI messages from the protocol-specific run input. #### toolset ```python toolset: AbstractToolset[AgentDepsT] | None ``` Toolset representing frontend tools from the protocol-specific run input. #### state ```python state: dict[str, Any] | None ``` Frontend state from the protocol-specific run input. #### transform_stream ```python transform_stream( stream: AsyncIterator[NativeEvent], on_complete: OnCompleteFunc[EventT] | None = None, ) -> AsyncIterator[EventT] ``` Transform a stream of Pydantic AI events into protocol-specific events. Parameters: | Name | Type | Description | Default | | ------------- | ---------------------------- | ---------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `stream` | `AsyncIterator[NativeEvent]` | The stream of Pydantic AI events to transform. | *required* | | `on_complete` | \`OnCompleteFunc[EventT] | None\` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can optionally yield additional protocol-specific events. | Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python def transform_stream( self, stream: AsyncIterator[NativeEvent], on_complete: OnCompleteFunc[EventT] | None = None, ) -> AsyncIterator[EventT]: """Transform a stream of Pydantic AI events into protocol-specific events. Args: stream: The stream of Pydantic AI events to transform. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can optionally yield additional protocol-specific events. """ return self.build_event_stream().transform_stream(stream, on_complete=on_complete) ``` #### encode_stream ```python encode_stream( stream: AsyncIterator[EventT], ) -> AsyncIterator[str] ``` Encode a stream of protocol-specific events as strings according to the `Accept` header value. Parameters: | Name | Type | Description | Default | | -------- | ----------------------- | ------------------------------------------------- | ---------- | | `stream` | `AsyncIterator[EventT]` | The stream of protocol-specific events to encode. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python def encode_stream(self, stream: AsyncIterator[EventT]) -> AsyncIterator[str]: """Encode a stream of protocol-specific events as strings according to the `Accept` header value. Args: stream: The stream of protocol-specific events to encode. 
""" return self.build_event_stream().encode_stream(stream) ``` #### streaming_response ```python streaming_response( stream: AsyncIterator[EventT], ) -> StreamingResponse ``` Generate a streaming response from a stream of protocol-specific events. Parameters: | Name | Type | Description | Default | | -------- | ----------------------- | ------------------------------------------------- | ---------- | | `stream` | `AsyncIterator[EventT]` | The stream of protocol-specific events to encode. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python def streaming_response(self, stream: AsyncIterator[EventT]) -> StreamingResponse: """Generate a streaming response from a stream of protocol-specific events. Args: stream: The stream of protocol-specific events to encode. """ return self.build_event_stream().streaming_response(stream) ``` #### run_stream_native ```python run_stream_native( *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[NativeEvent] ``` Run the agent with the protocol-specific run input and stream Pydantic AI events. Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_type` | \`OutputSpec[Any] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. 
| | `builtin_tools` | \`Sequence[AbstractBuiltinTool] | None\` | Optional additional builtin tools to use for this run. | Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python def run_stream_native( self, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[NativeEvent]: """Run the agent with the protocol-specific run input and stream Pydantic AI events. Args: output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools to use for this run. """ message_history = [*(message_history or []), *self.messages] toolset = self.toolset if toolset: output_type = [output_type or self.agent.output_type, DeferredToolRequests] toolsets = [*(toolsets or []), toolset] if isinstance(deps, StateHandler): raw_state = self.state or {} if isinstance(deps.state, BaseModel): state = type(deps.state).model_validate(raw_state) else: state = raw_state deps.state = state elif self.state: warnings.warn( f'State was provided but `deps` of type `{type(deps).__name__}` does not implement the `StateHandler` protocol, so the state was ignored. 
Use `StateDeps[...]` or implement `StateHandler` to receive AG-UI state.', UserWarning, stacklevel=2, ) return self.agent.run_stream_events( output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, instructions=instructions, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) ``` #### run_stream ```python run_stream( *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, on_complete: OnCompleteFunc[EventT] | None = None ) -> AsyncIterator[EventT] ``` Run the agent with the protocol-specific run input and stream protocol-specific events. Parameters: | Name | Type | Description | Default | | ----------------------- | ----------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `output_type` | \`OutputSpec[Any] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[AgentDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[AgentDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[AgentDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence[AbstractBuiltinTool] | None\` | Optional additional builtin tools to use for this run. | | `on_complete` | \`OnCompleteFunc[EventT] | None\` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can optionally yield additional protocol-specific events. 
| Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python def run_stream( self, *, output_type: OutputSpec[Any] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[AgentDepsT] = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[AgentDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[EventT] | None = None, ) -> AsyncIterator[EventT]: """Run the agent with the protocol-specific run input and stream protocol-specific events. Args: output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools to use for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can optionally yield additional protocol-specific events. 
""" return self.transform_stream( self.run_stream_native( output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, instructions=instructions, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ), on_complete=on_complete, ) ``` #### dispatch_request ```python dispatch_request( request: Request, *, agent: AbstractAgent[ DispatchDepsT, DispatchOutputDataT ], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[DispatchDepsT] = None, deps: DispatchDepsT = None, output_type: OutputSpec[Any] | None = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[DispatchDepsT] | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[DispatchDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, on_complete: OnCompleteFunc[EventT] | None = None ) -> Response ``` Handle a protocol-specific HTTP request by running the agent and returning a streaming response of protocol-specific events. Parameters: | Name | Type | Description | Default | | ----------------------- | --------------------------------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `request` | `Request` | The incoming Starlette/FastAPI request. | *required* | | `agent` | `AbstractAgent[DispatchDepsT, DispatchOutputDataT]` | The agent to run. | *required* | | `output_type` | \`OutputSpec[Any] | None\` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | | `message_history` | \`Sequence[ModelMessage] | None\` | History of the conversation so far. | | `deferred_tool_results` | \`DeferredToolResults | None\` | Optional results for deferred tool calls in the message history. | | `model` | \`Model | KnownModelName | str | | `instructions` | `Instructions[DispatchDepsT]` | Optional additional instructions to use for this run. | `None` | | `deps` | `DispatchDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | \`ModelSettings | None\` | Optional settings to use for this model's request. | | `usage_limits` | \`UsageLimits | None\` | Optional limits on model request count or token usage. | | `usage` | \`RunUsage | None\` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | | `metadata` | \`AgentMetadata[DispatchDepsT] | None\` | Optional metadata to attach to this run. Accepts a dictionary or a callable taking RunContext; merged with the agent's configured metadata. | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | \`Sequence\[AbstractToolset[DispatchDepsT]\] | None\` | Optional additional toolsets for this run. | | `builtin_tools` | \`Sequence[AbstractBuiltinTool] | None\` | Optional additional builtin tools to use for this run. 
| | `on_complete` | \`OnCompleteFunc[EventT] | None\` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can optionally yield additional protocol-specific events. | Returns: | Type | Description | | ---------- | ----------------------------------------------------------------------------------------------------------- | | `Response` | A streaming Starlette response with protocol-specific events encoded per the request's Accept header value. | Source code in `pydantic_ai_slim/pydantic_ai/ui/_adapter.py` ```python @classmethod async def dispatch_request( cls, request: Request, *, agent: AbstractAgent[DispatchDepsT, DispatchOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: Model | KnownModelName | str | None = None, instructions: Instructions[DispatchDepsT] = None, deps: DispatchDepsT = None, output_type: OutputSpec[Any] | None = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, metadata: AgentMetadata[DispatchDepsT] | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[DispatchDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, on_complete: OnCompleteFunc[EventT] | None = None, ) -> Response: """Handle a protocol-specific HTTP request by running the agent and returning a streaming response of protocol-specific events. Args: request: The incoming Starlette/FastAPI request. agent: The agent to run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. instructions: Optional additional instructions to use for this run. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. metadata: Optional metadata to attach to this run. Accepts a dictionary or a callable taking [`RunContext`][pydantic_ai.tools.RunContext]; merged with the agent's configured metadata. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools to use for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can optionally yield additional protocol-specific events. Returns: A streaming Starlette response with protocol-specific events encoded per the request's `Accept` header value. 
""" try: from starlette.responses import Response except ImportError as e: # pragma: no cover raise ImportError( 'Please install the `starlette` package to use `dispatch_request()` method, ' 'you can use the `ui` optional group — `pip install "pydantic-ai-slim[ui]"`' ) from e try: # The DepsT and OutputDataT come from `agent`, not from `cls`; the cast is necessary to explain this to pyright adapter = cast( UIAdapter[RunInputT, MessageT, EventT, DispatchDepsT, DispatchOutputDataT], await cls.from_request(request, agent=cast(AbstractAgent[AgentDepsT, OutputDataT], agent)), ) except ValidationError as e: # pragma: no cover return Response( content=e.json(), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) return adapter.streaming_response( adapter.run_stream( message_history=message_history, deferred_tool_results=deferred_tool_results, deps=deps, output_type=output_type, model=model, instructions=instructions, model_settings=model_settings, usage_limits=usage_limits, usage=usage, metadata=metadata, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, on_complete=on_complete, ), ) ``` ### SSE_CONTENT_TYPE ```python SSE_CONTENT_TYPE = 'text/event-stream' ``` Content type header value for Server-Sent Events (SSE). ### NativeEvent ```python NativeEvent: TypeAlias = ( AgentStreamEvent | AgentRunResultEvent[Any] ) ``` Type alias for the native event type, which is either an `AgentStreamEvent` or an `AgentRunResultEvent`. ### OnCompleteFunc ```python OnCompleteFunc: TypeAlias = ( Callable[[AgentRunResult[Any]], None] | Callable[[AgentRunResult[Any]], Awaitable[None]] | Callable[[AgentRunResult[Any]], AsyncIterator[EventT]] ) ``` Callback function type that receives the `AgentRunResult` of the completed run. Can be sync, async, or an async generator of protocol-specific events. ### UIEventStream Bases: `ABC`, `Generic[RunInputT, EventT, AgentDepsT, OutputDataT]` Base class for UI event stream transformers. This class is responsible for transforming Pydantic AI events into protocol-specific events. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python @dataclass class UIEventStream(ABC, Generic[RunInputT, EventT, AgentDepsT, OutputDataT]): """Base class for UI event stream transformers. This class is responsible for transforming Pydantic AI events into protocol-specific events. """ run_input: RunInputT accept: str | None = None """The `Accept` header value of the request, used to determine how to encode the protocol-specific events for the streaming response.""" message_id: str = field(default_factory=lambda: str(uuid4())) """The message ID to use for the next event.""" _turn: Literal['request', 'response'] | None = None _result: AgentRunResult[OutputDataT] | None = None _final_result_event: FinalResultEvent | None = None def new_message_id(self) -> str: """Generate and store a new message ID.""" self.message_id = str(uuid4()) return self.message_id @property def response_headers(self) -> Mapping[str, str] | None: """Response headers to return to the frontend.""" return None @property def content_type(self) -> str: """Get the content type for the event stream, compatible with the `Accept` header value. By default, this returns the Server-Sent Events content type (`text/event-stream`). If a subclass supports other types as well, it should consider `self.accept` in [`encode_event()`][pydantic_ai.ui.UIEventStream.encode_event] and return the resulting content type. 
""" return SSE_CONTENT_TYPE @abstractmethod def encode_event(self, event: EventT) -> str: """Encode a protocol-specific event as a string.""" raise NotImplementedError async def encode_stream(self, stream: AsyncIterator[EventT]) -> AsyncIterator[str]: """Encode a stream of protocol-specific events as strings according to the `Accept` header value.""" async for event in stream: yield self.encode_event(event) def streaming_response(self, stream: AsyncIterator[EventT]) -> StreamingResponse: """Generate a streaming response from a stream of protocol-specific events.""" try: from starlette.responses import StreamingResponse except ImportError as e: # pragma: no cover raise ImportError( 'Please install the `starlette` package to use the `streaming_response()` method, ' 'you can use the `ui` optional group — `pip install "pydantic-ai-slim[ui]"`' ) from e return StreamingResponse( self.encode_stream(stream), headers=self.response_headers, media_type=self.content_type, ) async def transform_stream( # noqa: C901 self, stream: AsyncIterator[NativeEvent], on_complete: OnCompleteFunc[EventT] | None = None ) -> AsyncIterator[EventT]: """Transform a stream of Pydantic AI events into protocol-specific events. This method dispatches to specific hooks and `handle_*` methods that subclasses can override: - [`before_stream()`][pydantic_ai.ui.UIEventStream.before_stream] - [`after_stream()`][pydantic_ai.ui.UIEventStream.after_stream] - [`on_error()`][pydantic_ai.ui.UIEventStream.on_error] - [`before_request()`][pydantic_ai.ui.UIEventStream.before_request] - [`after_request()`][pydantic_ai.ui.UIEventStream.after_request] - [`before_response()`][pydantic_ai.ui.UIEventStream.before_response] - [`after_response()`][pydantic_ai.ui.UIEventStream.after_response] - [`handle_event()`][pydantic_ai.ui.UIEventStream.handle_event] Args: stream: The stream of Pydantic AI events to transform. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can optionally yield additional protocol-specific events. """ async for e in self.before_stream(): yield e try: async for event in stream: if isinstance(event, PartStartEvent): async for e in self._turn_to('response'): yield e elif isinstance(event, FunctionToolCallEvent): async for e in self._turn_to('request'): yield e elif isinstance(event, AgentRunResultEvent): if ( self._final_result_event and (tool_call_id := self._final_result_event.tool_call_id) and (tool_name := self._final_result_event.tool_name) ): async for e in self._turn_to('request'): yield e self._final_result_event = None # Ensure the stream does not end on a dangling tool call without a result. 
output_tool_result_event = FunctionToolResultEvent( result=ToolReturnPart( tool_call_id=tool_call_id, tool_name=tool_name, content='Final result processed.', ) ) async for e in self.handle_function_tool_result(output_tool_result_event): yield e result = cast(AgentRunResult[OutputDataT], event.result) self._result = result async for e in self._turn_to(None): yield e if on_complete is not None: if inspect.isasyncgenfunction(on_complete): async for e in on_complete(result): yield e elif _utils.is_async_callable(on_complete): await on_complete(result) else: await _utils.run_in_executor(on_complete, result) elif isinstance(event, FinalResultEvent): self._final_result_event = event if isinstance(event, BuiltinToolCallEvent | BuiltinToolResultEvent): # pyright: ignore[reportDeprecated] # These events were deprecated before this feature was introduced continue async for e in self.handle_event(event): yield e except Exception as e: async for e in self.on_error(e): yield e finally: async for e in self._turn_to(None): yield e async for e in self.after_stream(): yield e async def _turn_to(self, to_turn: Literal['request', 'response'] | None) -> AsyncIterator[EventT]: """Fire hooks when turning from request to response or vice versa.""" if to_turn == self._turn: return if self._turn == 'request': async for e in self.after_request(): yield e elif self._turn == 'response': async for e in self.after_response(): yield e self._turn = to_turn if to_turn == 'request': async for e in self.before_request(): yield e elif to_turn == 'response': async for e in self.before_response(): yield e async def handle_event(self, event: NativeEvent) -> AsyncIterator[EventT]: """Transform a Pydantic AI event into one or more protocol-specific events. This method dispatches to specific `handle_*` methods based on event type: - [`PartStartEvent`][pydantic_ai.messages.PartStartEvent] -> [`handle_part_start()`][pydantic_ai.ui.UIEventStream.handle_part_start] - [`PartDeltaEvent`][pydantic_ai.messages.PartDeltaEvent] -> `handle_part_delta` - [`PartEndEvent`][pydantic_ai.messages.PartEndEvent] -> `handle_part_end` - [`FinalResultEvent`][pydantic_ai.messages.FinalResultEvent] -> `handle_final_result` - [`FunctionToolCallEvent`][pydantic_ai.messages.FunctionToolCallEvent] -> `handle_function_tool_call` - [`FunctionToolResultEvent`][pydantic_ai.messages.FunctionToolResultEvent] -> `handle_function_tool_result` - [`AgentRunResultEvent`][pydantic_ai.run.AgentRunResultEvent] -> `handle_run_result` Subclasses are encouraged to override the individual `handle_*` methods rather than this one. If you need specific behavior for all events, make sure you call the super method. """ match event: case PartStartEvent(): async for e in self.handle_part_start(event): yield e case PartDeltaEvent(): async for e in self.handle_part_delta(event): yield e case PartEndEvent(): async for e in self.handle_part_end(event): yield e case FinalResultEvent(): async for e in self.handle_final_result(event): yield e case FunctionToolCallEvent(): async for e in self.handle_function_tool_call(event): yield e case FunctionToolResultEvent(): async for e in self.handle_function_tool_result(event): yield e case AgentRunResultEvent(): async for e in self.handle_run_result(event): yield e case _: pass async def handle_part_start(self, event: PartStartEvent) -> AsyncIterator[EventT]: """Handle a `PartStartEvent`. 
This method dispatches to specific `handle_*` methods based on part type: - [`TextPart`][pydantic_ai.messages.TextPart] -> [`handle_text_start()`][pydantic_ai.ui.UIEventStream.handle_text_start] - [`ThinkingPart`][pydantic_ai.messages.ThinkingPart] -> [`handle_thinking_start()`][pydantic_ai.ui.UIEventStream.handle_thinking_start] - [`ToolCallPart`][pydantic_ai.messages.ToolCallPart] -> [`handle_tool_call_start()`][pydantic_ai.ui.UIEventStream.handle_tool_call_start] - [`BuiltinToolCallPart`][pydantic_ai.messages.BuiltinToolCallPart] -> [`handle_builtin_tool_call_start()`][pydantic_ai.ui.UIEventStream.handle_builtin_tool_call_start] - [`BuiltinToolReturnPart`][pydantic_ai.messages.BuiltinToolReturnPart] -> [`handle_builtin_tool_return()`][pydantic_ai.ui.UIEventStream.handle_builtin_tool_return] - [`FilePart`][pydantic_ai.messages.FilePart] -> [`handle_file()`][pydantic_ai.ui.UIEventStream.handle_file] Subclasses are encouraged to override the individual `handle_*` methods rather than this one. If you need specific behavior for all part start events, make sure you call the super method. Args: event: The part start event. """ part = event.part previous_part_kind = event.previous_part_kind match part: case TextPart(): async for e in self.handle_text_start(part, follows_text=previous_part_kind == 'text'): yield e case ThinkingPart(): async for e in self.handle_thinking_start(part, follows_thinking=previous_part_kind == 'thinking'): yield e case ToolCallPart(): async for e in self.handle_tool_call_start(part): yield e case BuiltinToolCallPart(): async for e in self.handle_builtin_tool_call_start(part): yield e case BuiltinToolReturnPart(): async for e in self.handle_builtin_tool_return(part): yield e case FilePart(): # pragma: no branch async for e in self.handle_file(part): yield e async def handle_part_delta(self, event: PartDeltaEvent) -> AsyncIterator[EventT]: """Handle a PartDeltaEvent. This method dispatches to specific `handle_*_delta` methods based on part delta type: - [`TextPartDelta`][pydantic_ai.messages.TextPartDelta] -> [`handle_text_delta()`][pydantic_ai.ui.UIEventStream.handle_text_delta] - [`ThinkingPartDelta`][pydantic_ai.messages.ThinkingPartDelta] -> [`handle_thinking_delta()`][pydantic_ai.ui.UIEventStream.handle_thinking_delta] - [`ToolCallPartDelta`][pydantic_ai.messages.ToolCallPartDelta] -> [`handle_tool_call_delta()`][pydantic_ai.ui.UIEventStream.handle_tool_call_delta] Subclasses are encouraged to override the individual `handle_*_delta` methods rather than this one. If you need specific behavior for all part delta events, make sure you call the super method. Args: event: The PartDeltaEvent. """ delta = event.delta match delta: case TextPartDelta(): async for e in self.handle_text_delta(delta): yield e case ThinkingPartDelta(): async for e in self.handle_thinking_delta(delta): yield e case ToolCallPartDelta(): # pragma: no branch async for e in self.handle_tool_call_delta(delta): yield e async def handle_part_end(self, event: PartEndEvent) -> AsyncIterator[EventT]: """Handle a `PartEndEvent`. 
This method dispatches to specific `handle_*_end` methods based on part type: - [`TextPart`][pydantic_ai.messages.TextPart] -> [`handle_text_end()`][pydantic_ai.ui.UIEventStream.handle_text_end] - [`ThinkingPart`][pydantic_ai.messages.ThinkingPart] -> [`handle_thinking_end()`][pydantic_ai.ui.UIEventStream.handle_thinking_end] - [`ToolCallPart`][pydantic_ai.messages.ToolCallPart] -> [`handle_tool_call_end()`][pydantic_ai.ui.UIEventStream.handle_tool_call_end] - [`BuiltinToolCallPart`][pydantic_ai.messages.BuiltinToolCallPart] -> [`handle_builtin_tool_call_end()`][pydantic_ai.ui.UIEventStream.handle_builtin_tool_call_end] Subclasses are encouraged to override the individual `handle_*_end` methods rather than this one. If you need specific behavior for all part end events, make sure you call the super method. Args: event: The part end event. """ part = event.part next_part_kind = event.next_part_kind match part: case TextPart(): async for e in self.handle_text_end(part, followed_by_text=next_part_kind == 'text'): yield e case ThinkingPart(): async for e in self.handle_thinking_end(part, followed_by_thinking=next_part_kind == 'thinking'): yield e case ToolCallPart(): async for e in self.handle_tool_call_end(part): yield e case BuiltinToolCallPart(): async for e in self.handle_builtin_tool_call_end(part): yield e case BuiltinToolReturnPart() | FilePart(): # pragma: no cover # These don't have deltas, so they don't need to be ended. pass async def before_stream(self) -> AsyncIterator[EventT]: """Yield events before agent streaming starts. This hook is called before any agent events are processed. Override this to inject custom events at the start of the stream. """ return # pragma: no cover yield # Make this an async generator async def after_stream(self) -> AsyncIterator[EventT]: """Yield events after agent streaming completes. This hook is called after all agent events have been processed. Override this to inject custom events at the end of the stream. """ return # pragma: no cover yield # Make this an async generator async def on_error(self, error: Exception) -> AsyncIterator[EventT]: """Handle errors that occur during streaming. Args: error: The error that occurred during streaming. """ return # pragma: no cover yield # Make this an async generator async def before_request(self) -> AsyncIterator[EventT]: """Yield events before a model request is processed. Override this to inject custom events at the start of the request. """ return # pragma: lax no cover yield # Make this an async generator async def after_request(self) -> AsyncIterator[EventT]: """Yield events after a model request is processed. Override this to inject custom events at the end of the request. """ return # pragma: lax no cover yield # Make this an async generator async def before_response(self) -> AsyncIterator[EventT]: """Yield events before a model response is processed. Override this to inject custom events at the start of the response. """ return # pragma: no cover yield # Make this an async generator async def after_response(self) -> AsyncIterator[EventT]: """Yield events after a model response is processed. Override this to inject custom events at the end of the response. """ return # pragma: lax no cover yield # Make this an async generator async def handle_text_start(self, part: TextPart, follows_text: bool = False) -> AsyncIterator[EventT]: """Handle the start of a `TextPart`. Args: part: The text part. follows_text: Whether the part is directly preceded by another text part. 
In this case, you may want to yield a "text-delta" event instead of a "text-start" event. """ return # pragma: no cover yield # Make this an async generator async def handle_text_delta(self, delta: TextPartDelta) -> AsyncIterator[EventT]: """Handle a `TextPartDelta`. Args: delta: The text part delta. """ return # pragma: no cover yield # Make this an async generator async def handle_text_end(self, part: TextPart, followed_by_text: bool = False) -> AsyncIterator[EventT]: """Handle the end of a `TextPart`. Args: part: The text part. followed_by_text: Whether the part is directly followed by another text part. In this case, you may not want to yield a "text-end" event yet. """ return # pragma: no cover yield # Make this an async generator async def handle_thinking_start(self, part: ThinkingPart, follows_thinking: bool = False) -> AsyncIterator[EventT]: """Handle the start of a `ThinkingPart`. Args: part: The thinking part. follows_thinking: Whether the part is directly preceded by another thinking part. In this case, you may want to yield a "thinking-delta" event instead of a "thinking-start" event. """ return # pragma: no cover yield # Make this an async generator async def handle_thinking_delta(self, delta: ThinkingPartDelta) -> AsyncIterator[EventT]: """Handle a `ThinkingPartDelta`. Args: delta: The thinking part delta. """ return # pragma: no cover yield # Make this an async generator async def handle_thinking_end( self, part: ThinkingPart, followed_by_thinking: bool = False ) -> AsyncIterator[EventT]: """Handle the end of a `ThinkingPart`. Args: part: The thinking part. followed_by_thinking: Whether the part is directly followed by another thinking part. In this case, you may not want to yield a "thinking-end" event yet. """ return # pragma: no cover yield # Make this an async generator async def handle_tool_call_start(self, part: ToolCallPart) -> AsyncIterator[EventT]: """Handle the start of a `ToolCallPart`. Args: part: The tool call part. """ return # pragma: no cover yield # Make this an async generator async def handle_tool_call_delta(self, delta: ToolCallPartDelta) -> AsyncIterator[EventT]: """Handle a `ToolCallPartDelta`. Args: delta: The tool call part delta. """ return # pragma: no cover yield # Make this an async generator async def handle_tool_call_end(self, part: ToolCallPart) -> AsyncIterator[EventT]: """Handle the end of a `ToolCallPart`. Args: part: The tool call part. """ return # pragma: no cover yield # Make this an async generator async def handle_builtin_tool_call_start(self, part: BuiltinToolCallPart) -> AsyncIterator[EventT]: """Handle a `BuiltinToolCallPart` at start. Args: part: The builtin tool call part. """ return # pragma: no cover yield # Make this an async generator async def handle_builtin_tool_call_end(self, part: BuiltinToolCallPart) -> AsyncIterator[EventT]: """Handle the end of a `BuiltinToolCallPart`. Args: part: The builtin tool call part. """ return # pragma: no cover yield # Make this an async generator async def handle_builtin_tool_return(self, part: BuiltinToolReturnPart) -> AsyncIterator[EventT]: """Handle a `BuiltinToolReturnPart`. Args: part: The builtin tool return part. """ return # pragma: no cover yield # Make this an async generator async def handle_file(self, part: FilePart) -> AsyncIterator[EventT]: """Handle a `FilePart`. Args: part: The file part. """ return # pragma: no cover yield # Make this an async generator async def handle_final_result(self, event: FinalResultEvent) -> AsyncIterator[EventT]: """Handle a `FinalResultEvent`. 
Args: event: The final result event. """ return yield # Make this an async generator async def handle_function_tool_call(self, event: FunctionToolCallEvent) -> AsyncIterator[EventT]: """Handle a `FunctionToolCallEvent`. Args: event: The function tool call event. """ return yield # Make this an async generator async def handle_function_tool_result(self, event: FunctionToolResultEvent) -> AsyncIterator[EventT]: """Handle a `FunctionToolResultEvent`. Args: event: The function tool result event. """ return # pragma: no cover yield # Make this an async generator async def handle_run_result(self, event: AgentRunResultEvent) -> AsyncIterator[EventT]: """Handle an `AgentRunResultEvent`. Args: event: The agent run result event. """ return yield # Make this an async generator ``` #### accept ```python accept: str | None = None ``` The `Accept` header value of the request, used to determine how to encode the protocol-specific events for the streaming response. #### message_id ```python message_id: str = field( default_factory=lambda: str(uuid4()) ) ``` The message ID to use for the next event. #### new_message_id ```python new_message_id() -> str ``` Generate and store a new message ID. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python def new_message_id(self) -> str: """Generate and store a new message ID.""" self.message_id = str(uuid4()) return self.message_id ``` #### response_headers ```python response_headers: Mapping[str, str] | None ``` Response headers to return to the frontend. #### content_type ```python content_type: str ``` Get the content type for the event stream, compatible with the `Accept` header value. By default, this returns the Server-Sent Events content type (`text/event-stream`). If a subclass supports other types as well, it should consider `self.accept` in encode_event() and return the resulting content type. #### encode_event ```python encode_event(event: EventT) -> str ``` Encode a protocol-specific event as a string. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python @abstractmethod def encode_event(self, event: EventT) -> str: """Encode a protocol-specific event as a string.""" raise NotImplementedError ``` #### encode_stream ```python encode_stream( stream: AsyncIterator[EventT], ) -> AsyncIterator[str] ``` Encode a stream of protocol-specific events as strings according to the `Accept` header value. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def encode_stream(self, stream: AsyncIterator[EventT]) -> AsyncIterator[str]: """Encode a stream of protocol-specific events as strings according to the `Accept` header value.""" async for event in stream: yield self.encode_event(event) ``` #### streaming_response ```python streaming_response( stream: AsyncIterator[EventT], ) -> StreamingResponse ``` Generate a streaming response from a stream of protocol-specific events. 
Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python def streaming_response(self, stream: AsyncIterator[EventT]) -> StreamingResponse: """Generate a streaming response from a stream of protocol-specific events.""" try: from starlette.responses import StreamingResponse except ImportError as e: # pragma: no cover raise ImportError( 'Please install the `starlette` package to use the `streaming_response()` method, ' 'you can use the `ui` optional group — `pip install "pydantic-ai-slim[ui]"`' ) from e return StreamingResponse( self.encode_stream(stream), headers=self.response_headers, media_type=self.content_type, ) ``` #### transform_stream ```python transform_stream( stream: AsyncIterator[NativeEvent], on_complete: OnCompleteFunc[EventT] | None = None, ) -> AsyncIterator[EventT] ``` Transform a stream of Pydantic AI events into protocol-specific events. This method dispatches to specific hooks and `handle_*` methods that subclasses can override: - before_stream() - after_stream() - on_error() - before_request() - after_request() - before_response() - after_response() - handle_event() Parameters: | Name | Type | Description | Default | | ------------- | ---------------------------- | ---------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `stream` | `AsyncIterator[NativeEvent]` | The stream of Pydantic AI events to transform. | *required* | | `on_complete` | \`OnCompleteFunc[EventT] | None\` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can optionally yield additional protocol-specific events. | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def transform_stream( # noqa: C901 self, stream: AsyncIterator[NativeEvent], on_complete: OnCompleteFunc[EventT] | None = None ) -> AsyncIterator[EventT]: """Transform a stream of Pydantic AI events into protocol-specific events. This method dispatches to specific hooks and `handle_*` methods that subclasses can override: - [`before_stream()`][pydantic_ai.ui.UIEventStream.before_stream] - [`after_stream()`][pydantic_ai.ui.UIEventStream.after_stream] - [`on_error()`][pydantic_ai.ui.UIEventStream.on_error] - [`before_request()`][pydantic_ai.ui.UIEventStream.before_request] - [`after_request()`][pydantic_ai.ui.UIEventStream.after_request] - [`before_response()`][pydantic_ai.ui.UIEventStream.before_response] - [`after_response()`][pydantic_ai.ui.UIEventStream.after_response] - [`handle_event()`][pydantic_ai.ui.UIEventStream.handle_event] Args: stream: The stream of Pydantic AI events to transform. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can optionally yield additional protocol-specific events. 
""" async for e in self.before_stream(): yield e try: async for event in stream: if isinstance(event, PartStartEvent): async for e in self._turn_to('response'): yield e elif isinstance(event, FunctionToolCallEvent): async for e in self._turn_to('request'): yield e elif isinstance(event, AgentRunResultEvent): if ( self._final_result_event and (tool_call_id := self._final_result_event.tool_call_id) and (tool_name := self._final_result_event.tool_name) ): async for e in self._turn_to('request'): yield e self._final_result_event = None # Ensure the stream does not end on a dangling tool call without a result. output_tool_result_event = FunctionToolResultEvent( result=ToolReturnPart( tool_call_id=tool_call_id, tool_name=tool_name, content='Final result processed.', ) ) async for e in self.handle_function_tool_result(output_tool_result_event): yield e result = cast(AgentRunResult[OutputDataT], event.result) self._result = result async for e in self._turn_to(None): yield e if on_complete is not None: if inspect.isasyncgenfunction(on_complete): async for e in on_complete(result): yield e elif _utils.is_async_callable(on_complete): await on_complete(result) else: await _utils.run_in_executor(on_complete, result) elif isinstance(event, FinalResultEvent): self._final_result_event = event if isinstance(event, BuiltinToolCallEvent | BuiltinToolResultEvent): # pyright: ignore[reportDeprecated] # These events were deprecated before this feature was introduced continue async for e in self.handle_event(event): yield e except Exception as e: async for e in self.on_error(e): yield e finally: async for e in self._turn_to(None): yield e async for e in self.after_stream(): yield e ``` #### handle_event ```python handle_event(event: NativeEvent) -> AsyncIterator[EventT] ``` Transform a Pydantic AI event into one or more protocol-specific events. This method dispatches to specific `handle_*` methods based on event type: - PartStartEvent -> handle_part_start() - PartDeltaEvent -> `handle_part_delta` - PartEndEvent -> `handle_part_end` - FinalResultEvent -> `handle_final_result` - FunctionToolCallEvent -> `handle_function_tool_call` - FunctionToolResultEvent -> `handle_function_tool_result` - AgentRunResultEvent -> `handle_run_result` Subclasses are encouraged to override the individual `handle_*` methods rather than this one. If you need specific behavior for all events, make sure you call the super method. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_event(self, event: NativeEvent) -> AsyncIterator[EventT]: """Transform a Pydantic AI event into one or more protocol-specific events. This method dispatches to specific `handle_*` methods based on event type: - [`PartStartEvent`][pydantic_ai.messages.PartStartEvent] -> [`handle_part_start()`][pydantic_ai.ui.UIEventStream.handle_part_start] - [`PartDeltaEvent`][pydantic_ai.messages.PartDeltaEvent] -> `handle_part_delta` - [`PartEndEvent`][pydantic_ai.messages.PartEndEvent] -> `handle_part_end` - [`FinalResultEvent`][pydantic_ai.messages.FinalResultEvent] -> `handle_final_result` - [`FunctionToolCallEvent`][pydantic_ai.messages.FunctionToolCallEvent] -> `handle_function_tool_call` - [`FunctionToolResultEvent`][pydantic_ai.messages.FunctionToolResultEvent] -> `handle_function_tool_result` - [`AgentRunResultEvent`][pydantic_ai.run.AgentRunResultEvent] -> `handle_run_result` Subclasses are encouraged to override the individual `handle_*` methods rather than this one. 
If you need specific behavior for all events, make sure you call the super method. """ match event: case PartStartEvent(): async for e in self.handle_part_start(event): yield e case PartDeltaEvent(): async for e in self.handle_part_delta(event): yield e case PartEndEvent(): async for e in self.handle_part_end(event): yield e case FinalResultEvent(): async for e in self.handle_final_result(event): yield e case FunctionToolCallEvent(): async for e in self.handle_function_tool_call(event): yield e case FunctionToolResultEvent(): async for e in self.handle_function_tool_result(event): yield e case AgentRunResultEvent(): async for e in self.handle_run_result(event): yield e case _: pass ``` #### handle_part_start ```python handle_part_start( event: PartStartEvent, ) -> AsyncIterator[EventT] ``` Handle a `PartStartEvent`. This method dispatches to specific `handle_*` methods based on part type: - TextPart -> handle_text_start() - ThinkingPart -> handle_thinking_start() - ToolCallPart -> handle_tool_call_start() - BuiltinToolCallPart -> handle_builtin_tool_call_start() - BuiltinToolReturnPart -> handle_builtin_tool_return() - FilePart -> handle_file() Subclasses are encouraged to override the individual `handle_*` methods rather than this one. If you need specific behavior for all part start events, make sure you call the super method. Parameters: | Name | Type | Description | Default | | ------- | ---------------- | --------------------- | ---------- | | `event` | `PartStartEvent` | The part start event. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_part_start(self, event: PartStartEvent) -> AsyncIterator[EventT]: """Handle a `PartStartEvent`. This method dispatches to specific `handle_*` methods based on part type: - [`TextPart`][pydantic_ai.messages.TextPart] -> [`handle_text_start()`][pydantic_ai.ui.UIEventStream.handle_text_start] - [`ThinkingPart`][pydantic_ai.messages.ThinkingPart] -> [`handle_thinking_start()`][pydantic_ai.ui.UIEventStream.handle_thinking_start] - [`ToolCallPart`][pydantic_ai.messages.ToolCallPart] -> [`handle_tool_call_start()`][pydantic_ai.ui.UIEventStream.handle_tool_call_start] - [`BuiltinToolCallPart`][pydantic_ai.messages.BuiltinToolCallPart] -> [`handle_builtin_tool_call_start()`][pydantic_ai.ui.UIEventStream.handle_builtin_tool_call_start] - [`BuiltinToolReturnPart`][pydantic_ai.messages.BuiltinToolReturnPart] -> [`handle_builtin_tool_return()`][pydantic_ai.ui.UIEventStream.handle_builtin_tool_return] - [`FilePart`][pydantic_ai.messages.FilePart] -> [`handle_file()`][pydantic_ai.ui.UIEventStream.handle_file] Subclasses are encouraged to override the individual `handle_*` methods rather than this one. If you need specific behavior for all part start events, make sure you call the super method. Args: event: The part start event. 
""" part = event.part previous_part_kind = event.previous_part_kind match part: case TextPart(): async for e in self.handle_text_start(part, follows_text=previous_part_kind == 'text'): yield e case ThinkingPart(): async for e in self.handle_thinking_start(part, follows_thinking=previous_part_kind == 'thinking'): yield e case ToolCallPart(): async for e in self.handle_tool_call_start(part): yield e case BuiltinToolCallPart(): async for e in self.handle_builtin_tool_call_start(part): yield e case BuiltinToolReturnPart(): async for e in self.handle_builtin_tool_return(part): yield e case FilePart(): # pragma: no branch async for e in self.handle_file(part): yield e ``` #### handle_part_delta ```python handle_part_delta( event: PartDeltaEvent, ) -> AsyncIterator[EventT] ``` Handle a PartDeltaEvent. This method dispatches to specific `handle_*_delta` methods based on part delta type: - TextPartDelta -> handle_text_delta() - ThinkingPartDelta -> handle_thinking_delta() - ToolCallPartDelta -> handle_tool_call_delta() Subclasses are encouraged to override the individual `handle_*_delta` methods rather than this one. If you need specific behavior for all part delta events, make sure you call the super method. Parameters: | Name | Type | Description | Default | | ------- | ---------------- | ------------------- | ---------- | | `event` | `PartDeltaEvent` | The PartDeltaEvent. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_part_delta(self, event: PartDeltaEvent) -> AsyncIterator[EventT]: """Handle a PartDeltaEvent. This method dispatches to specific `handle_*_delta` methods based on part delta type: - [`TextPartDelta`][pydantic_ai.messages.TextPartDelta] -> [`handle_text_delta()`][pydantic_ai.ui.UIEventStream.handle_text_delta] - [`ThinkingPartDelta`][pydantic_ai.messages.ThinkingPartDelta] -> [`handle_thinking_delta()`][pydantic_ai.ui.UIEventStream.handle_thinking_delta] - [`ToolCallPartDelta`][pydantic_ai.messages.ToolCallPartDelta] -> [`handle_tool_call_delta()`][pydantic_ai.ui.UIEventStream.handle_tool_call_delta] Subclasses are encouraged to override the individual `handle_*_delta` methods rather than this one. If you need specific behavior for all part delta events, make sure you call the super method. Args: event: The PartDeltaEvent. """ delta = event.delta match delta: case TextPartDelta(): async for e in self.handle_text_delta(delta): yield e case ThinkingPartDelta(): async for e in self.handle_thinking_delta(delta): yield e case ToolCallPartDelta(): # pragma: no branch async for e in self.handle_tool_call_delta(delta): yield e ``` #### handle_part_end ```python handle_part_end( event: PartEndEvent, ) -> AsyncIterator[EventT] ``` Handle a `PartEndEvent`. This method dispatches to specific `handle_*_end` methods based on part type: - TextPart -> handle_text_end() - ThinkingPart -> handle_thinking_end() - ToolCallPart -> handle_tool_call_end() - BuiltinToolCallPart -> handle_builtin_tool_call_end() Subclasses are encouraged to override the individual `handle_*_end` methods rather than this one. If you need specific behavior for all part end events, make sure you call the super method. Parameters: | Name | Type | Description | Default | | ------- | -------------- | ------------------- | ---------- | | `event` | `PartEndEvent` | The part end event. 
| *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_part_end(self, event: PartEndEvent) -> AsyncIterator[EventT]: """Handle a `PartEndEvent`. This method dispatches to specific `handle_*_end` methods based on part type: - [`TextPart`][pydantic_ai.messages.TextPart] -> [`handle_text_end()`][pydantic_ai.ui.UIEventStream.handle_text_end] - [`ThinkingPart`][pydantic_ai.messages.ThinkingPart] -> [`handle_thinking_end()`][pydantic_ai.ui.UIEventStream.handle_thinking_end] - [`ToolCallPart`][pydantic_ai.messages.ToolCallPart] -> [`handle_tool_call_end()`][pydantic_ai.ui.UIEventStream.handle_tool_call_end] - [`BuiltinToolCallPart`][pydantic_ai.messages.BuiltinToolCallPart] -> [`handle_builtin_tool_call_end()`][pydantic_ai.ui.UIEventStream.handle_builtin_tool_call_end] Subclasses are encouraged to override the individual `handle_*_end` methods rather than this one. If you need specific behavior for all part end events, make sure you call the super method. Args: event: The part end event. """ part = event.part next_part_kind = event.next_part_kind match part: case TextPart(): async for e in self.handle_text_end(part, followed_by_text=next_part_kind == 'text'): yield e case ThinkingPart(): async for e in self.handle_thinking_end(part, followed_by_thinking=next_part_kind == 'thinking'): yield e case ToolCallPart(): async for e in self.handle_tool_call_end(part): yield e case BuiltinToolCallPart(): async for e in self.handle_builtin_tool_call_end(part): yield e case BuiltinToolReturnPart() | FilePart(): # pragma: no cover # These don't have deltas, so they don't need to be ended. pass ``` #### before_stream ```python before_stream() -> AsyncIterator[EventT] ``` Yield events before agent streaming starts. This hook is called before any agent events are processed. Override this to inject custom events at the start of the stream. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def before_stream(self) -> AsyncIterator[EventT]: """Yield events before agent streaming starts. This hook is called before any agent events are processed. Override this to inject custom events at the start of the stream. """ return # pragma: no cover yield # Make this an async generator ``` #### after_stream ```python after_stream() -> AsyncIterator[EventT] ``` Yield events after agent streaming completes. This hook is called after all agent events have been processed. Override this to inject custom events at the end of the stream. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def after_stream(self) -> AsyncIterator[EventT]: """Yield events after agent streaming completes. This hook is called after all agent events have been processed. Override this to inject custom events at the end of the stream. """ return # pragma: no cover yield # Make this an async generator ``` #### on_error ```python on_error(error: Exception) -> AsyncIterator[EventT] ``` Handle errors that occur during streaming. Parameters: | Name | Type | Description | Default | | ------- | ----------- | ----------------------------------------- | ---------- | | `error` | `Exception` | The error that occurred during streaming. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def on_error(self, error: Exception) -> AsyncIterator[EventT]: """Handle errors that occur during streaming. Args: error: The error that occurred during streaming. 
""" return # pragma: no cover yield # Make this an async generator ``` #### before_request ```python before_request() -> AsyncIterator[EventT] ``` Yield events before a model request is processed. Override this to inject custom events at the start of the request. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def before_request(self) -> AsyncIterator[EventT]: """Yield events before a model request is processed. Override this to inject custom events at the start of the request. """ return # pragma: lax no cover yield # Make this an async generator ``` #### after_request ```python after_request() -> AsyncIterator[EventT] ``` Yield events after a model request is processed. Override this to inject custom events at the end of the request. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def after_request(self) -> AsyncIterator[EventT]: """Yield events after a model request is processed. Override this to inject custom events at the end of the request. """ return # pragma: lax no cover yield # Make this an async generator ``` #### before_response ```python before_response() -> AsyncIterator[EventT] ``` Yield events before a model response is processed. Override this to inject custom events at the start of the response. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def before_response(self) -> AsyncIterator[EventT]: """Yield events before a model response is processed. Override this to inject custom events at the start of the response. """ return # pragma: no cover yield # Make this an async generator ``` #### after_response ```python after_response() -> AsyncIterator[EventT] ``` Yield events after a model response is processed. Override this to inject custom events at the end of the response. Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def after_response(self) -> AsyncIterator[EventT]: """Yield events after a model response is processed. Override this to inject custom events at the end of the response. """ return # pragma: lax no cover yield # Make this an async generator ``` #### handle_text_start ```python handle_text_start( part: TextPart, follows_text: bool = False ) -> AsyncIterator[EventT] ``` Handle the start of a `TextPart`. Parameters: | Name | Type | Description | Default | | -------------- | ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | | `part` | `TextPart` | The text part. | *required* | | `follows_text` | `bool` | Whether the part is directly preceded by another text part. In this case, you may want to yield a "text-delta" event instead of a "text-start" event. | `False` | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_text_start(self, part: TextPart, follows_text: bool = False) -> AsyncIterator[EventT]: """Handle the start of a `TextPart`. Args: part: The text part. follows_text: Whether the part is directly preceded by another text part. In this case, you may want to yield a "text-delta" event instead of a "text-start" event. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_text_delta ```python handle_text_delta( delta: TextPartDelta, ) -> AsyncIterator[EventT] ``` Handle a `TextPartDelta`. 
Parameters: | Name | Type | Description | Default | | ------- | --------------- | -------------------- | ---------- | | `delta` | `TextPartDelta` | The text part delta. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_text_delta(self, delta: TextPartDelta) -> AsyncIterator[EventT]: """Handle a `TextPartDelta`. Args: delta: The text part delta. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_text_end ```python handle_text_end( part: TextPart, followed_by_text: bool = False ) -> AsyncIterator[EventT] ``` Handle the end of a `TextPart`. Parameters: | Name | Type | Description | Default | | ------------------ | ---------- | --------------------------------------------------------------------------------------------------------------------------- | ---------- | | `part` | `TextPart` | The text part. | *required* | | `followed_by_text` | `bool` | Whether the part is directly followed by another text part. In this case, you may not want to yield a "text-end" event yet. | `False` | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_text_end(self, part: TextPart, followed_by_text: bool = False) -> AsyncIterator[EventT]: """Handle the end of a `TextPart`. Args: part: The text part. followed_by_text: Whether the part is directly followed by another text part. In this case, you may not want to yield a "text-end" event yet. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_thinking_start ```python handle_thinking_start( part: ThinkingPart, follows_thinking: bool = False ) -> AsyncIterator[EventT] ``` Handle the start of a `ThinkingPart`. Parameters: | Name | Type | Description | Default | | ------------------ | -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | | `part` | `ThinkingPart` | The thinking part. | *required* | | `follows_thinking` | `bool` | Whether the part is directly preceded by another thinking part. In this case, you may want to yield a "thinking-delta" event instead of a "thinking-start" event. | `False` | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_thinking_start(self, part: ThinkingPart, follows_thinking: bool = False) -> AsyncIterator[EventT]: """Handle the start of a `ThinkingPart`. Args: part: The thinking part. follows_thinking: Whether the part is directly preceded by another thinking part. In this case, you may want to yield a "thinking-delta" event instead of a "thinking-start" event. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_thinking_delta ```python handle_thinking_delta( delta: ThinkingPartDelta, ) -> AsyncIterator[EventT] ``` Handle a `ThinkingPartDelta`. Parameters: | Name | Type | Description | Default | | ------- | ------------------- | ------------------------ | ---------- | | `delta` | `ThinkingPartDelta` | The thinking part delta. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_thinking_delta(self, delta: ThinkingPartDelta) -> AsyncIterator[EventT]: """Handle a `ThinkingPartDelta`. Args: delta: The thinking part delta. 
""" return # pragma: no cover yield # Make this an async generator ``` #### handle_thinking_end ```python handle_thinking_end( part: ThinkingPart, followed_by_thinking: bool = False ) -> AsyncIterator[EventT] ``` Handle the end of a `ThinkingPart`. Parameters: | Name | Type | Description | Default | | ---------------------- | -------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ---------- | | `part` | `ThinkingPart` | The thinking part. | *required* | | `followed_by_thinking` | `bool` | Whether the part is directly followed by another thinking part. In this case, you may not want to yield a "thinking-end" event yet. | `False` | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_thinking_end( self, part: ThinkingPart, followed_by_thinking: bool = False ) -> AsyncIterator[EventT]: """Handle the end of a `ThinkingPart`. Args: part: The thinking part. followed_by_thinking: Whether the part is directly followed by another thinking part. In this case, you may not want to yield a "thinking-end" event yet. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_tool_call_start ```python handle_tool_call_start( part: ToolCallPart, ) -> AsyncIterator[EventT] ``` Handle the start of a `ToolCallPart`. Parameters: | Name | Type | Description | Default | | ------ | -------------- | ------------------- | ---------- | | `part` | `ToolCallPart` | The tool call part. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_tool_call_start(self, part: ToolCallPart) -> AsyncIterator[EventT]: """Handle the start of a `ToolCallPart`. Args: part: The tool call part. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_tool_call_delta ```python handle_tool_call_delta( delta: ToolCallPartDelta, ) -> AsyncIterator[EventT] ``` Handle a `ToolCallPartDelta`. Parameters: | Name | Type | Description | Default | | ------- | ------------------- | ------------------------- | ---------- | | `delta` | `ToolCallPartDelta` | The tool call part delta. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_tool_call_delta(self, delta: ToolCallPartDelta) -> AsyncIterator[EventT]: """Handle a `ToolCallPartDelta`. Args: delta: The tool call part delta. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_tool_call_end ```python handle_tool_call_end( part: ToolCallPart, ) -> AsyncIterator[EventT] ``` Handle the end of a `ToolCallPart`. Parameters: | Name | Type | Description | Default | | ------ | -------------- | ------------------- | ---------- | | `part` | `ToolCallPart` | The tool call part. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_tool_call_end(self, part: ToolCallPart) -> AsyncIterator[EventT]: """Handle the end of a `ToolCallPart`. Args: part: The tool call part. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_builtin_tool_call_start ```python handle_builtin_tool_call_start( part: BuiltinToolCallPart, ) -> AsyncIterator[EventT] ``` Handle a `BuiltinToolCallPart` at start. Parameters: | Name | Type | Description | Default | | ------ | --------------------- | --------------------------- | ---------- | | `part` | `BuiltinToolCallPart` | The builtin tool call part. 
| *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_builtin_tool_call_start(self, part: BuiltinToolCallPart) -> AsyncIterator[EventT]: """Handle a `BuiltinToolCallPart` at start. Args: part: The builtin tool call part. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_builtin_tool_call_end ```python handle_builtin_tool_call_end( part: BuiltinToolCallPart, ) -> AsyncIterator[EventT] ``` Handle the end of a `BuiltinToolCallPart`. Parameters: | Name | Type | Description | Default | | ------ | --------------------- | --------------------------- | ---------- | | `part` | `BuiltinToolCallPart` | The builtin tool call part. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_builtin_tool_call_end(self, part: BuiltinToolCallPart) -> AsyncIterator[EventT]: """Handle the end of a `BuiltinToolCallPart`. Args: part: The builtin tool call part. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_builtin_tool_return ```python handle_builtin_tool_return( part: BuiltinToolReturnPart, ) -> AsyncIterator[EventT] ``` Handle a `BuiltinToolReturnPart`. Parameters: | Name | Type | Description | Default | | ------ | ----------------------- | ----------------------------- | ---------- | | `part` | `BuiltinToolReturnPart` | The builtin tool return part. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_builtin_tool_return(self, part: BuiltinToolReturnPart) -> AsyncIterator[EventT]: """Handle a `BuiltinToolReturnPart`. Args: part: The builtin tool return part. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_file ```python handle_file(part: FilePart) -> AsyncIterator[EventT] ``` Handle a `FilePart`. Parameters: | Name | Type | Description | Default | | ------ | ---------- | -------------- | ---------- | | `part` | `FilePart` | The file part. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_file(self, part: FilePart) -> AsyncIterator[EventT]: """Handle a `FilePart`. Args: part: The file part. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_final_result ```python handle_final_result( event: FinalResultEvent, ) -> AsyncIterator[EventT] ``` Handle a `FinalResultEvent`. Parameters: | Name | Type | Description | Default | | ------- | ------------------ | ----------------------- | ---------- | | `event` | `FinalResultEvent` | The final result event. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_final_result(self, event: FinalResultEvent) -> AsyncIterator[EventT]: """Handle a `FinalResultEvent`. Args: event: The final result event. """ return yield # Make this an async generator ``` #### handle_function_tool_call ```python handle_function_tool_call( event: FunctionToolCallEvent, ) -> AsyncIterator[EventT] ``` Handle a `FunctionToolCallEvent`. Parameters: | Name | Type | Description | Default | | ------- | ----------------------- | ----------------------------- | ---------- | | `event` | `FunctionToolCallEvent` | The function tool call event. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_function_tool_call(self, event: FunctionToolCallEvent) -> AsyncIterator[EventT]: """Handle a `FunctionToolCallEvent`. Args: event: The function tool call event. 
""" return yield # Make this an async generator ``` #### handle_function_tool_result ```python handle_function_tool_result( event: FunctionToolResultEvent, ) -> AsyncIterator[EventT] ``` Handle a `FunctionToolResultEvent`. Parameters: | Name | Type | Description | Default | | ------- | ------------------------- | ------------------------------- | ---------- | | `event` | `FunctionToolResultEvent` | The function tool result event. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_function_tool_result(self, event: FunctionToolResultEvent) -> AsyncIterator[EventT]: """Handle a `FunctionToolResultEvent`. Args: event: The function tool result event. """ return # pragma: no cover yield # Make this an async generator ``` #### handle_run_result ```python handle_run_result( event: AgentRunResultEvent, ) -> AsyncIterator[EventT] ``` Handle an `AgentRunResultEvent`. Parameters: | Name | Type | Description | Default | | ------- | --------------------- | --------------------------- | ---------- | | `event` | `AgentRunResultEvent` | The agent run result event. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/ui/_event_stream.py` ```python async def handle_run_result(self, event: AgentRunResultEvent) -> AsyncIterator[EventT]: """Handle an `AgentRunResultEvent`. Args: event: The agent run result event. """ return yield # Make this an async generator ``` ### MessagesBuilder Helper class to build Pydantic AI messages from request/response parts. Source code in `pydantic_ai_slim/pydantic_ai/ui/_messages_builder.py` ```python @dataclass class MessagesBuilder: """Helper class to build Pydantic AI messages from request/response parts.""" messages: list[ModelMessage] = field(default_factory=list[ModelMessage]) def add(self, part: ModelRequestPart | ModelResponsePart) -> None: """Add a new part, creating a new request or response message if necessary.""" last_message = self.messages[-1] if self.messages else None if isinstance(part, get_union_args(ModelRequestPart)): part = cast(ModelRequestPart, part) if isinstance(last_message, ModelRequest): last_message.parts = [*last_message.parts, part] else: self.messages.append(ModelRequest(parts=[part])) else: part = cast(ModelResponsePart, part) if isinstance(last_message, ModelResponse): last_message.parts = [*last_message.parts, part] else: self.messages.append(ModelResponse(parts=[part])) ``` #### add ```python add(part: ModelRequestPart | ModelResponsePart) -> None ``` Add a new part, creating a new request or response message if necessary. Source code in `pydantic_ai_slim/pydantic_ai/ui/_messages_builder.py` ```python def add(self, part: ModelRequestPart | ModelResponsePart) -> None: """Add a new part, creating a new request or response message if necessary.""" last_message = self.messages[-1] if self.messages else None if isinstance(part, get_union_args(ModelRequestPart)): part = cast(ModelRequestPart, part) if isinstance(last_message, ModelRequest): last_message.parts = [*last_message.parts, part] else: self.messages.append(ModelRequest(parts=[part])) else: part = cast(ModelResponsePart, part) if isinstance(last_message, ModelResponse): last_message.parts = [*last_message.parts, part] else: self.messages.append(ModelResponse(parts=[part])) ``` # `pydantic_ai.ui.vercel_ai` Vercel AI protocol adapter for Pydantic AI agents. This module provides classes for integrating Pydantic AI agents with the Vercel AI protocol, enabling streaming event-based communication for interactive AI applications. 
Converted to Python from: https://github.com/vercel/ai/blob/ai%405.0.34/packages/ai/src/ui/ui-messages.ts ### VercelAIAdapter Bases: `UIAdapter[RequestData, UIMessage, BaseChunk, AgentDepsT, OutputDataT]` UI adapter for the Vercel AI protocol. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/_adapter.py` ```python @dataclass class VercelAIAdapter(UIAdapter[RequestData, UIMessage, BaseChunk, AgentDepsT, OutputDataT]): """UI adapter for the Vercel AI protocol.""" _: KW_ONLY sdk_version: Literal[5, 6] = 5 """Vercel AI SDK version to target. Default is 5 for backwards compatibility.""" @classmethod def build_run_input(cls, body: bytes) -> RequestData: """Build a Vercel AI run input object from the request body.""" return request_data_ta.validate_json(body) def build_event_stream(self) -> UIEventStream[RequestData, BaseChunk, AgentDepsT, OutputDataT]: """Build a Vercel AI event stream transformer.""" return VercelAIEventStream(self.run_input, accept=self.accept, sdk_version=self.sdk_version) @cached_property def messages(self) -> list[ModelMessage]: """Pydantic AI messages from the Vercel AI run input.""" return self.load_messages(self.run_input.messages) @classmethod def load_messages(cls, messages: Sequence[UIMessage]) -> list[ModelMessage]: # noqa: C901 """Transform Vercel AI messages into Pydantic AI messages.""" builder = MessagesBuilder() for msg in messages: if msg.role == 'system': for part in msg.parts: if isinstance(part, TextUIPart): builder.add(SystemPromptPart(content=part.text)) else: # pragma: no cover raise ValueError(f'Unsupported system message part type: {type(part)}') elif msg.role == 'user': user_prompt_content: str | list[UserContent] = [] for part in msg.parts: if isinstance(part, TextUIPart): user_prompt_content.append(part.text) elif isinstance(part, FileUIPart): try: file = BinaryContent.from_data_uri(part.url) except ValueError: media_type_prefix = part.media_type.split('/', 1)[0] match media_type_prefix: case 'image': file = ImageUrl(url=part.url, media_type=part.media_type) case 'video': file = VideoUrl(url=part.url, media_type=part.media_type) case 'audio': file = AudioUrl(url=part.url, media_type=part.media_type) case _: file = DocumentUrl(url=part.url, media_type=part.media_type) user_prompt_content.append(file) else: # pragma: no cover raise ValueError(f'Unsupported user message part type: {type(part)}') if user_prompt_content: # pragma: no branch if len(user_prompt_content) == 1 and isinstance(user_prompt_content[0], str): user_prompt_content = user_prompt_content[0] builder.add(UserPromptPart(content=user_prompt_content)) elif msg.role == 'assistant': for part in msg.parts: if isinstance(part, TextUIPart): provider_meta = load_provider_metadata(part.provider_metadata) builder.add( TextPart( content=part.text, id=provider_meta.get('id'), provider_name=provider_meta.get('provider_name'), provider_details=provider_meta.get('provider_details'), ) ) elif isinstance(part, ReasoningUIPart): provider_meta = load_provider_metadata(part.provider_metadata) builder.add( ThinkingPart( content=part.text, id=provider_meta.get('id'), signature=provider_meta.get('signature'), provider_name=provider_meta.get('provider_name'), provider_details=provider_meta.get('provider_details'), ) ) elif isinstance(part, FileUIPart): try: file = BinaryContent.from_data_uri(part.url) except ValueError as e: # pragma: no cover # We don't yet handle non-data-URI file URLs returned by assistants, as no Pydantic AI models do this. 
raise ValueError( 'Vercel AI integration can currently only handle assistant file parts with data URIs.' ) from e provider_meta = load_provider_metadata(part.provider_metadata) builder.add( FilePart( content=file, id=provider_meta.get('id'), provider_name=provider_meta.get('provider_name'), provider_details=provider_meta.get('provider_details'), ) ) elif isinstance(part, ToolUIPart | DynamicToolUIPart): if isinstance(part, DynamicToolUIPart): tool_name = part.tool_name builtin_tool = False else: tool_name = part.type.removeprefix('tool-') builtin_tool = part.provider_executed tool_call_id = part.tool_call_id args: str | dict[str, Any] | None = part.input if isinstance(args, str): try: parsed = json.loads(args) if isinstance(parsed, dict): args = cast(dict[str, Any], parsed) except json.JSONDecodeError: pass elif isinstance(args, dict) or args is None: pass else: assert_never(args) provider_meta = load_provider_metadata(part.call_provider_metadata) part_id = provider_meta.get('id') provider_name = provider_meta.get('provider_name') provider_details = provider_meta.get('provider_details') if builtin_tool: # For builtin tools, we need to create 2 parts (BuiltinToolCall & BuiltinToolReturn) for a single Vercel ToolOutput # The call and return metadata are combined in the output part. # So we extract and return them to the respective parts call_meta = return_meta = {} has_tool_output = isinstance(part, (ToolOutputAvailablePart, ToolOutputErrorPart)) if has_tool_output: call_meta, return_meta = cls._load_builtin_tool_meta(provider_meta) builder.add( BuiltinToolCallPart( tool_name=tool_name, tool_call_id=tool_call_id, args=args, id=call_meta.get('id') or part_id, provider_name=call_meta.get('provider_name') or provider_name, provider_details=call_meta.get('provider_details') or provider_details, ) ) if has_tool_output: output: Any | None = None if isinstance(part, ToolOutputAvailablePart): output = part.output elif isinstance(part, ToolOutputErrorPart): # pragma: no branch output = {'error_text': part.error_text, 'is_error': True} builder.add( BuiltinToolReturnPart( tool_name=tool_name, tool_call_id=tool_call_id, content=output, provider_name=return_meta.get('provider_name') or provider_name, provider_details=return_meta.get('provider_details') or provider_details, ) ) else: builder.add( ToolCallPart( tool_name=tool_name, tool_call_id=tool_call_id, args=args, id=part_id, provider_name=provider_name, provider_details=provider_details, ) ) if part.state == 'output-available': builder.add( ToolReturnPart(tool_name=tool_name, tool_call_id=tool_call_id, content=part.output) ) elif part.state == 'output-error': builder.add( RetryPromptPart( tool_name=tool_name, tool_call_id=tool_call_id, content=part.error_text ) ) elif isinstance(part, DataUIPart): # pragma: no cover # Contains custom data that shouldn't be sent to the model pass elif isinstance(part, SourceUrlUIPart): # pragma: no cover # TODO: Once we support citations: https://github.com/pydantic/pydantic-ai/issues/3126 pass elif isinstance(part, SourceDocumentUIPart): # pragma: no cover # TODO: Once we support citations: https://github.com/pydantic/pydantic-ai/issues/3126 pass elif isinstance(part, StepStartUIPart): # pragma: no cover # Nothing to do here pass else: assert_never(part) else: assert_never(msg.role) return builder.messages @staticmethod def _dump_builtin_tool_meta( call_provider_metadata: ProviderMetadata | None, return_provider_metadata: ProviderMetadata | None ) -> ProviderMetadata | None: """Use special keys (call_meta and 
return_meta) to dump combined provider metadata.""" return dump_provider_metadata(call_meta=call_provider_metadata, return_meta=return_provider_metadata) @staticmethod def _load_builtin_tool_meta( provider_metadata: ProviderMetadata, ) -> tuple[dict[str, Any], dict[str, Any]]: """Use special keys (call_meta and return_meta) to load combined provider metadata.""" return provider_metadata.get('call_meta') or {}, provider_metadata.get('return_meta') or {} @staticmethod def _dump_request_message(msg: ModelRequest) -> tuple[list[UIMessagePart], list[UIMessagePart]]: """Convert a ModelRequest into a UIMessage.""" system_ui_parts: list[UIMessagePart] = [] user_ui_parts: list[UIMessagePart] = [] for part in msg.parts: if isinstance(part, SystemPromptPart): system_ui_parts.append(TextUIPart(text=part.content, state='done')) elif isinstance(part, UserPromptPart): user_ui_parts.extend(_convert_user_prompt_part(part)) elif isinstance(part, ToolReturnPart): # Tool returns are merged into the tool call in the assistant message pass elif isinstance(part, RetryPromptPart): if part.tool_name: # Tool-related retries are handled when processing ToolCallPart in ModelResponse pass else: # Non-tool retries (e.g., output validation errors) become user text user_ui_parts.append(TextUIPart(text=part.model_response(), state='done')) else: assert_never(part) return system_ui_parts, user_ui_parts @classmethod def _dump_response_message( cls, msg: ModelResponse, tool_results: dict[str, ToolReturnPart | RetryPromptPart] ) -> list[UIMessagePart]: """Convert a ModelResponse into a UIMessage.""" ui_parts: list[UIMessagePart] = [] # For builtin tools, returns can be in the same ModelResponse as calls local_builtin_returns: dict[str, BuiltinToolReturnPart] = { part.tool_call_id: part for part in msg.parts if isinstance(part, BuiltinToolReturnPart) } for part in msg.parts: if isinstance(part, BuiltinToolReturnPart): continue elif isinstance(part, TextPart): # Combine consecutive text parts if ui_parts and isinstance(ui_parts[-1], TextUIPart): ui_parts[-1].text += part.content else: provider_metadata = dump_provider_metadata( id=part.id, provider_name=part.provider_name, provider_details=part.provider_details ) ui_parts.append(TextUIPart(text=part.content, state='done', provider_metadata=provider_metadata)) elif isinstance(part, ThinkingPart): provider_metadata = dump_provider_metadata( id=part.id, signature=part.signature, provider_name=part.provider_name, provider_details=part.provider_details, ) ui_parts.append(ReasoningUIPart(text=part.content, state='done', provider_metadata=provider_metadata)) elif isinstance(part, FilePart): ui_parts.append( FileUIPart( url=part.content.data_uri, media_type=part.content.media_type, provider_metadata=dump_provider_metadata( id=part.id, provider_name=part.provider_name, provider_details=part.provider_details ), ) ) elif isinstance(part, BuiltinToolCallPart): tool_name = f'tool-{part.tool_name}' if builtin_return := local_builtin_returns.get(part.tool_call_id): # Builtin tool calls are represented by two parts in pydantic_ai: # 1. BuiltinToolCallPart (the tool request) -> part # 2. BuiltinToolReturnPart (the tool's output) -> builtin_return # The Vercel AI SDK only has a single ToolOutputPart (ToolOutputAvailablePart or ToolOutputErrorPart). # So, we need to combine the metadata so that when we later convert back from Vercel AI to pydantic_ai, # we can properly reconstruct both the call and return parts with their respective metadata. 
# Note: This extra metadata handling is only needed for built-in tools, since normal tool returns # (ToolReturnPart) do not include provider metadata. call_meta = dump_provider_metadata( wrapper_key=None, id=part.id, provider_name=part.provider_name, provider_details=part.provider_details, ) return_meta = dump_provider_metadata( wrapper_key=None, provider_name=builtin_return.provider_name, provider_details=builtin_return.provider_details, ) combined_provider_meta = cls._dump_builtin_tool_meta(call_meta, return_meta) response_object = builtin_return.model_response_object() # These `is_error`/`error_text` fields are only present when the BuiltinToolReturnPart # was parsed from an incoming VercelAI request. We can't detect errors for other sources # until BuiltinToolReturnPart has standardized error fields (see https://github.com/pydantic/pydantic-ai/issues/3561).3 if response_object.get('is_error') is True and ( (error_text := response_object.get('error_text')) is not None ): ui_parts.append( ToolOutputErrorPart( type=tool_name, tool_call_id=part.tool_call_id, input=part.args_as_json_str(), error_text=error_text, state='output-error', provider_executed=True, call_provider_metadata=combined_provider_meta, ) ) else: content = builtin_return.model_response_str() ui_parts.append( ToolOutputAvailablePart( type=tool_name, tool_call_id=part.tool_call_id, input=part.args_as_json_str(), output=content, state='output-available', provider_executed=True, call_provider_metadata=combined_provider_meta, ) ) else: call_provider_metadata = dump_provider_metadata( id=part.id, provider_name=part.provider_name, provider_details=part.provider_details ) ui_parts.append( ToolInputAvailablePart( type=tool_name, tool_call_id=part.tool_call_id, input=part.args_as_json_str(), state='input-available', provider_executed=True, call_provider_metadata=call_provider_metadata, ) ) elif isinstance(part, ToolCallPart): tool_result = tool_results.get(part.tool_call_id) call_provider_metadata = dump_provider_metadata( id=part.id, provider_name=part.provider_name, provider_details=part.provider_details ) if isinstance(tool_result, ToolReturnPart): content = tool_result.model_response_str() ui_parts.append( DynamicToolOutputAvailablePart( tool_name=part.tool_name, tool_call_id=part.tool_call_id, input=part.args_as_json_str(), output=content, state='output-available', call_provider_metadata=call_provider_metadata, ) ) elif isinstance(tool_result, RetryPromptPart): error_text = tool_result.model_response() ui_parts.append( DynamicToolOutputErrorPart( tool_name=part.tool_name, tool_call_id=part.tool_call_id, input=part.args_as_json_str(), error_text=error_text, state='output-error', call_provider_metadata=call_provider_metadata, ) ) else: ui_parts.append( DynamicToolInputAvailablePart( tool_name=part.tool_name, tool_call_id=part.tool_call_id, input=part.args_as_json_str(), state='input-available', call_provider_metadata=call_provider_metadata, ) ) else: assert_never(part) return ui_parts @classmethod def dump_messages( cls, messages: Sequence[ModelMessage], ) -> list[UIMessage]: """Transform Pydantic AI messages into Vercel AI messages. 
Args: messages: A sequence of ModelMessage objects to convert Returns: A list of UIMessage objects in Vercel AI format """ tool_results: dict[str, ToolReturnPart | RetryPromptPart] = {} for msg in messages: if isinstance(msg, ModelRequest): for part in msg.parts: if isinstance(part, ToolReturnPart): tool_results[part.tool_call_id] = part elif isinstance(part, RetryPromptPart) and part.tool_name: tool_results[part.tool_call_id] = part result: list[UIMessage] = [] for msg in messages: if isinstance(msg, ModelRequest): system_ui_parts, user_ui_parts = cls._dump_request_message(msg) if system_ui_parts: result.append(UIMessage(id=str(uuid.uuid4()), role='system', parts=system_ui_parts)) if user_ui_parts: result.append(UIMessage(id=str(uuid.uuid4()), role='user', parts=user_ui_parts)) elif isinstance( # pragma: no branch msg, ModelResponse ): ui_parts: list[UIMessagePart] = cls._dump_response_message(msg, tool_results) if ui_parts: # pragma: no branch result.append(UIMessage(id=str(uuid.uuid4()), role='assistant', parts=ui_parts)) else: assert_never(msg) return result ``` #### sdk_version ```python sdk_version: Literal[5, 6] = 5 ``` Vercel AI SDK version to target. Default is 5 for backwards compatibility. #### build_run_input ```python build_run_input(body: bytes) -> RequestData ``` Build a Vercel AI run input object from the request body. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/_adapter.py` ```python @classmethod def build_run_input(cls, body: bytes) -> RequestData: """Build a Vercel AI run input object from the request body.""" return request_data_ta.validate_json(body) ``` #### build_event_stream ```python build_event_stream() -> ( UIEventStream[ RequestData, BaseChunk, AgentDepsT, OutputDataT ] ) ``` Build a Vercel AI event stream transformer. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/_adapter.py` ```python def build_event_stream(self) -> UIEventStream[RequestData, BaseChunk, AgentDepsT, OutputDataT]: """Build a Vercel AI event stream transformer.""" return VercelAIEventStream(self.run_input, accept=self.accept, sdk_version=self.sdk_version) ``` #### messages ```python messages: list[ModelMessage] ``` Pydantic AI messages from the Vercel AI run input. #### load_messages ```python load_messages( messages: Sequence[UIMessage], ) -> list[ModelMessage] ``` Transform Vercel AI messages into Pydantic AI messages. 
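As a quick orientation before the source listing, here's a minimal sketch of round-tripping messages through the adapter. It assumes `VercelAIAdapter` is re-exported from `pydantic_ai.ui.vercel_ai` and that the UI message types live in `pydantic_ai.ui.vercel_ai.request_types`, as the source paths above suggest; the message content is made up.

```python
from pydantic_ai.ui.vercel_ai import VercelAIAdapter
from pydantic_ai.ui.vercel_ai.request_types import TextUIPart, UIMessage

# A single user message, as a Vercel AI frontend would send it.
ui_messages = [
    UIMessage(
        id='msg-1',
        role='user',
        parts=[TextUIPart(text='What is the capital of France?')],
    ),
]

# Convert Vercel AI UI messages into Pydantic AI ModelMessage objects,
# e.g. to pass as `message_history` when running an agent.
model_messages = VercelAIAdapter.load_messages(ui_messages)

# ...and convert Pydantic AI messages back into UI messages for the frontend.
round_tripped = VercelAIAdapter.dump_messages(model_messages)
```

Because both `load_messages` and `dump_messages` are classmethods, no adapter instance (and hence no request body) is needed for plain message conversion.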
Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/_adapter.py` ```python @classmethod def load_messages(cls, messages: Sequence[UIMessage]) -> list[ModelMessage]: # noqa: C901 """Transform Vercel AI messages into Pydantic AI messages.""" builder = MessagesBuilder() for msg in messages: if msg.role == 'system': for part in msg.parts: if isinstance(part, TextUIPart): builder.add(SystemPromptPart(content=part.text)) else: # pragma: no cover raise ValueError(f'Unsupported system message part type: {type(part)}') elif msg.role == 'user': user_prompt_content: str | list[UserContent] = [] for part in msg.parts: if isinstance(part, TextUIPart): user_prompt_content.append(part.text) elif isinstance(part, FileUIPart): try: file = BinaryContent.from_data_uri(part.url) except ValueError: media_type_prefix = part.media_type.split('/', 1)[0] match media_type_prefix: case 'image': file = ImageUrl(url=part.url, media_type=part.media_type) case 'video': file = VideoUrl(url=part.url, media_type=part.media_type) case 'audio': file = AudioUrl(url=part.url, media_type=part.media_type) case _: file = DocumentUrl(url=part.url, media_type=part.media_type) user_prompt_content.append(file) else: # pragma: no cover raise ValueError(f'Unsupported user message part type: {type(part)}') if user_prompt_content: # pragma: no branch if len(user_prompt_content) == 1 and isinstance(user_prompt_content[0], str): user_prompt_content = user_prompt_content[0] builder.add(UserPromptPart(content=user_prompt_content)) elif msg.role == 'assistant': for part in msg.parts: if isinstance(part, TextUIPart): provider_meta = load_provider_metadata(part.provider_metadata) builder.add( TextPart( content=part.text, id=provider_meta.get('id'), provider_name=provider_meta.get('provider_name'), provider_details=provider_meta.get('provider_details'), ) ) elif isinstance(part, ReasoningUIPart): provider_meta = load_provider_metadata(part.provider_metadata) builder.add( ThinkingPart( content=part.text, id=provider_meta.get('id'), signature=provider_meta.get('signature'), provider_name=provider_meta.get('provider_name'), provider_details=provider_meta.get('provider_details'), ) ) elif isinstance(part, FileUIPart): try: file = BinaryContent.from_data_uri(part.url) except ValueError as e: # pragma: no cover # We don't yet handle non-data-URI file URLs returned by assistants, as no Pydantic AI models do this. raise ValueError( 'Vercel AI integration can currently only handle assistant file parts with data URIs.' 
) from e provider_meta = load_provider_metadata(part.provider_metadata) builder.add( FilePart( content=file, id=provider_meta.get('id'), provider_name=provider_meta.get('provider_name'), provider_details=provider_meta.get('provider_details'), ) ) elif isinstance(part, ToolUIPart | DynamicToolUIPart): if isinstance(part, DynamicToolUIPart): tool_name = part.tool_name builtin_tool = False else: tool_name = part.type.removeprefix('tool-') builtin_tool = part.provider_executed tool_call_id = part.tool_call_id args: str | dict[str, Any] | None = part.input if isinstance(args, str): try: parsed = json.loads(args) if isinstance(parsed, dict): args = cast(dict[str, Any], parsed) except json.JSONDecodeError: pass elif isinstance(args, dict) or args is None: pass else: assert_never(args) provider_meta = load_provider_metadata(part.call_provider_metadata) part_id = provider_meta.get('id') provider_name = provider_meta.get('provider_name') provider_details = provider_meta.get('provider_details') if builtin_tool: # For builtin tools, we need to create 2 parts (BuiltinToolCall & BuiltinToolReturn) for a single Vercel ToolOutput # The call and return metadata are combined in the output part. # So we extract and return them to the respective parts call_meta = return_meta = {} has_tool_output = isinstance(part, (ToolOutputAvailablePart, ToolOutputErrorPart)) if has_tool_output: call_meta, return_meta = cls._load_builtin_tool_meta(provider_meta) builder.add( BuiltinToolCallPart( tool_name=tool_name, tool_call_id=tool_call_id, args=args, id=call_meta.get('id') or part_id, provider_name=call_meta.get('provider_name') or provider_name, provider_details=call_meta.get('provider_details') or provider_details, ) ) if has_tool_output: output: Any | None = None if isinstance(part, ToolOutputAvailablePart): output = part.output elif isinstance(part, ToolOutputErrorPart): # pragma: no branch output = {'error_text': part.error_text, 'is_error': True} builder.add( BuiltinToolReturnPart( tool_name=tool_name, tool_call_id=tool_call_id, content=output, provider_name=return_meta.get('provider_name') or provider_name, provider_details=return_meta.get('provider_details') or provider_details, ) ) else: builder.add( ToolCallPart( tool_name=tool_name, tool_call_id=tool_call_id, args=args, id=part_id, provider_name=provider_name, provider_details=provider_details, ) ) if part.state == 'output-available': builder.add( ToolReturnPart(tool_name=tool_name, tool_call_id=tool_call_id, content=part.output) ) elif part.state == 'output-error': builder.add( RetryPromptPart( tool_name=tool_name, tool_call_id=tool_call_id, content=part.error_text ) ) elif isinstance(part, DataUIPart): # pragma: no cover # Contains custom data that shouldn't be sent to the model pass elif isinstance(part, SourceUrlUIPart): # pragma: no cover # TODO: Once we support citations: https://github.com/pydantic/pydantic-ai/issues/3126 pass elif isinstance(part, SourceDocumentUIPart): # pragma: no cover # TODO: Once we support citations: https://github.com/pydantic/pydantic-ai/issues/3126 pass elif isinstance(part, StepStartUIPart): # pragma: no cover # Nothing to do here pass else: assert_never(part) else: assert_never(msg.role) return builder.messages ``` #### dump_messages ```python dump_messages( messages: Sequence[ModelMessage], ) -> list[UIMessage] ``` Transform Pydantic AI messages into Vercel AI messages. 
Parameters: | Name | Type | Description | Default | | ---------- | ------------------------ | --------------------------------------------- | ---------- | | `messages` | `Sequence[ModelMessage]` | A sequence of ModelMessage objects to convert | *required* | Returns: | Type | Description | | ----------------- | ----------------------------------------------- | | `list[UIMessage]` | A list of UIMessage objects in Vercel AI format | Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/_adapter.py` ```python @classmethod def dump_messages( cls, messages: Sequence[ModelMessage], ) -> list[UIMessage]: """Transform Pydantic AI messages into Vercel AI messages. Args: messages: A sequence of ModelMessage objects to convert Returns: A list of UIMessage objects in Vercel AI format """ tool_results: dict[str, ToolReturnPart | RetryPromptPart] = {} for msg in messages: if isinstance(msg, ModelRequest): for part in msg.parts: if isinstance(part, ToolReturnPart): tool_results[part.tool_call_id] = part elif isinstance(part, RetryPromptPart) and part.tool_name: tool_results[part.tool_call_id] = part result: list[UIMessage] = [] for msg in messages: if isinstance(msg, ModelRequest): system_ui_parts, user_ui_parts = cls._dump_request_message(msg) if system_ui_parts: result.append(UIMessage(id=str(uuid.uuid4()), role='system', parts=system_ui_parts)) if user_ui_parts: result.append(UIMessage(id=str(uuid.uuid4()), role='user', parts=user_ui_parts)) elif isinstance( # pragma: no branch msg, ModelResponse ): ui_parts: list[UIMessagePart] = cls._dump_response_message(msg, tool_results) if ui_parts: # pragma: no branch result.append(UIMessage(id=str(uuid.uuid4()), role='assistant', parts=ui_parts)) else: assert_never(msg) return result ``` ### VercelAIEventStream Bases: `UIEventStream[RequestData, BaseChunk, AgentDepsT, OutputDataT]` UI event stream transformer for the Vercel AI protocol. 
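To make the wire format concrete before the source listing, here's a small sketch of how an individual response chunk serializes. It assumes the module path `pydantic_ai.ui.vercel_ai.response_types` implied by this reference, and the chunk contents are made up; `encode_event` (shown below) wraps each encoded chunk as an SSE `data:` line.

```python
from pydantic_ai.ui.vercel_ai.response_types import TextDeltaChunk

chunk = TextDeltaChunk(id='msg-1', delta='Hello')

# BaseChunk.encode dumps the model as camelCase JSON and omits None fields,
# so the unset provider_metadata is dropped here.
print(chunk.encode(5))
#> {"type":"text-delta","delta":"Hello","id":"msg-1"}

# The event stream then emits it as a server-sent event:
print(f'data: {chunk.encode(5)}\n\n')
```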
Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/_event_stream.py` ```python @dataclass class VercelAIEventStream(UIEventStream[RequestData, BaseChunk, AgentDepsT, OutputDataT]): """UI event stream transformer for the Vercel AI protocol.""" _: KW_ONLY sdk_version: Literal[5, 6] = 5 """Vercel AI SDK version to target.""" _step_started: bool = False _finish_reason: FinishReason = None @property def response_headers(self) -> Mapping[str, str] | None: return VERCEL_AI_DSP_HEADERS def encode_event(self, event: BaseChunk) -> str: return f'data: {event.encode(self.sdk_version)}\n\n' async def before_stream(self) -> AsyncIterator[BaseChunk]: yield StartChunk() async def before_response(self) -> AsyncIterator[BaseChunk]: if self._step_started: yield FinishStepChunk() self._step_started = True yield StartStepChunk() async def after_stream(self) -> AsyncIterator[BaseChunk]: yield FinishStepChunk() yield FinishChunk(finish_reason=self._finish_reason) yield DoneChunk() async def handle_run_result(self, event: AgentRunResultEvent) -> AsyncIterator[BaseChunk]: pydantic_reason = event.result.response.finish_reason if pydantic_reason: self._finish_reason = _FINISH_REASON_MAP.get(pydantic_reason, 'other') return yield async def on_error(self, error: Exception) -> AsyncIterator[BaseChunk]: self._finish_reason = 'error' yield ErrorChunk(error_text=str(error)) async def handle_text_start(self, part: TextPart, follows_text: bool = False) -> AsyncIterator[BaseChunk]: provider_metadata = dump_provider_metadata( id=part.id, provider_name=part.provider_name, provider_details=part.provider_details ) if follows_text: message_id = self.message_id else: message_id = self.new_message_id() yield TextStartChunk(id=message_id, provider_metadata=provider_metadata) if part.content: yield TextDeltaChunk(id=message_id, delta=part.content, provider_metadata=provider_metadata) async def handle_text_delta(self, delta: TextPartDelta) -> AsyncIterator[BaseChunk]: if delta.content_delta: # pragma: no branch provider_metadata = dump_provider_metadata( provider_name=delta.provider_name, provider_details=delta.provider_details ) yield TextDeltaChunk(id=self.message_id, delta=delta.content_delta, provider_metadata=provider_metadata) async def handle_text_end(self, part: TextPart, followed_by_text: bool = False) -> AsyncIterator[BaseChunk]: if not followed_by_text: provider_metadata = dump_provider_metadata( id=part.id, provider_name=part.provider_name, provider_details=part.provider_details ) yield TextEndChunk(id=self.message_id, provider_metadata=provider_metadata) async def handle_thinking_start( self, part: ThinkingPart, follows_thinking: bool = False ) -> AsyncIterator[BaseChunk]: message_id = self.new_message_id() provider_metadata = dump_provider_metadata( id=part.id, signature=part.signature, provider_name=part.provider_name, provider_details=part.provider_details, ) yield ReasoningStartChunk(id=message_id, provider_metadata=provider_metadata) if part.content: yield ReasoningDeltaChunk(id=message_id, delta=part.content, provider_metadata=provider_metadata) async def handle_thinking_delta(self, delta: ThinkingPartDelta) -> AsyncIterator[BaseChunk]: if delta.content_delta: # pragma: no branch provider_metadata = dump_provider_metadata( provider_name=delta.provider_name, signature=delta.signature_delta, provider_details=delta.provider_details, ) yield ReasoningDeltaChunk( id=self.message_id, delta=delta.content_delta, provider_metadata=provider_metadata ) async def handle_thinking_end( self, part: ThinkingPart, 
followed_by_thinking: bool = False ) -> AsyncIterator[BaseChunk]: provider_metadata = dump_provider_metadata( id=part.id, signature=part.signature, provider_name=part.provider_name, provider_details=part.provider_details, ) yield ReasoningEndChunk(id=self.message_id, provider_metadata=provider_metadata) def handle_tool_call_start(self, part: ToolCallPart | BuiltinToolCallPart) -> AsyncIterator[BaseChunk]: return self._handle_tool_call_start(part) def handle_builtin_tool_call_start(self, part: BuiltinToolCallPart) -> AsyncIterator[BaseChunk]: return self._handle_tool_call_start(part, provider_executed=True) async def _handle_tool_call_start( self, part: ToolCallPart | BuiltinToolCallPart, tool_call_id: str | None = None, provider_executed: bool | None = None, ) -> AsyncIterator[BaseChunk]: tool_call_id = tool_call_id or part.tool_call_id yield ToolInputStartChunk( tool_call_id=tool_call_id, tool_name=part.tool_name, provider_executed=provider_executed, provider_metadata=dump_provider_metadata( id=part.id, provider_name=part.provider_name, provider_details=part.provider_details ), ) if part.args: yield ToolInputDeltaChunk(tool_call_id=tool_call_id, input_text_delta=part.args_as_json_str()) async def handle_tool_call_delta(self, delta: ToolCallPartDelta) -> AsyncIterator[BaseChunk]: tool_call_id = delta.tool_call_id or '' assert tool_call_id, '`ToolCallPartDelta.tool_call_id` must be set' yield ToolInputDeltaChunk( tool_call_id=tool_call_id, input_text_delta=delta.args_delta if isinstance(delta.args_delta, str) else _json_dumps(delta.args_delta), ) async def handle_tool_call_end(self, part: ToolCallPart) -> AsyncIterator[BaseChunk]: yield ToolInputAvailableChunk( tool_call_id=part.tool_call_id, tool_name=part.tool_name, input=part.args_as_dict(), provider_metadata=dump_provider_metadata( id=part.id, provider_name=part.provider_name, provider_details=part.provider_details ), ) async def handle_builtin_tool_call_end(self, part: BuiltinToolCallPart) -> AsyncIterator[BaseChunk]: yield ToolInputAvailableChunk( tool_call_id=part.tool_call_id, tool_name=part.tool_name, input=part.args_as_dict(), provider_executed=True, provider_metadata=dump_provider_metadata( id=part.id, provider_name=part.provider_name, provider_details=part.provider_details ), ) async def handle_builtin_tool_return(self, part: BuiltinToolReturnPart) -> AsyncIterator[BaseChunk]: yield ToolOutputAvailableChunk( tool_call_id=part.tool_call_id, output=self._tool_return_output(part), provider_executed=True, ) async def handle_file(self, part: FilePart) -> AsyncIterator[BaseChunk]: file = part.content yield FileChunk(url=file.data_uri, media_type=file.media_type) async def handle_function_tool_result(self, event: FunctionToolResultEvent) -> AsyncIterator[BaseChunk]: part = event.result if isinstance(part, RetryPromptPart): yield ToolOutputErrorChunk(tool_call_id=part.tool_call_id, error_text=part.model_response()) else: yield ToolOutputAvailableChunk(tool_call_id=part.tool_call_id, output=self._tool_return_output(part)) # ToolCallResultEvent.content may hold user parts (e.g. text, images) that Vercel AI does not currently have events for def _tool_return_output(self, part: BaseToolReturnPart) -> Any: output = part.model_response_object() # Unwrap the return value from the output dictionary if it exists return output.get('return_value', output) ``` #### sdk_version ```python sdk_version: Literal[5, 6] = 5 ``` Vercel AI SDK version to target. Vercel AI request types (UI messages). 
Converted to Python from: https://github.com/vercel/ai/blob/ai%406.0.57/packages/ai/src/ui/ui-messages.ts ### ProviderMetadata ```python ProviderMetadata = dict[str, dict[str, JSONValue]] ``` Provider metadata. ### BaseUIPart Bases: `CamelBaseModel`, `ABC` Abstract base class for all UI parts. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class BaseUIPart(CamelBaseModel, ABC): """Abstract base class for all UI parts.""" ``` ### TextUIPart Bases: `BaseUIPart` A text part of a message. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class TextUIPart(BaseUIPart): """A text part of a message.""" type: Literal['text'] = 'text' text: str """The text content.""" state: Literal['streaming', 'done'] | None = None """The state of the text part.""" provider_metadata: ProviderMetadata | None = None """The provider metadata.""" ``` #### text ```python text: str ``` The text content. #### state ```python state: Literal['streaming', 'done'] | None = None ``` The state of the text part. #### provider_metadata ```python provider_metadata: ProviderMetadata | None = None ``` The provider metadata. ### ReasoningUIPart Bases: `BaseUIPart` A reasoning part of a message. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class ReasoningUIPart(BaseUIPart): """A reasoning part of a message.""" type: Literal['reasoning'] = 'reasoning' text: str """The reasoning text.""" state: Literal['streaming', 'done'] | None = None """The state of the reasoning part.""" provider_metadata: ProviderMetadata | None = None """The provider metadata.""" ``` #### text ```python text: str ``` The reasoning text. #### state ```python state: Literal['streaming', 'done'] | None = None ``` The state of the reasoning part. #### provider_metadata ```python provider_metadata: ProviderMetadata | None = None ``` The provider metadata. ### SourceUrlUIPart Bases: `BaseUIPart` A source part of a message. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class SourceUrlUIPart(BaseUIPart): """A source part of a message.""" type: Literal['source-url'] = 'source-url' source_id: str url: str title: str | None = None provider_metadata: ProviderMetadata | None = None ``` ### SourceDocumentUIPart Bases: `BaseUIPart` A document source part of a message. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class SourceDocumentUIPart(BaseUIPart): """A document source part of a message.""" type: Literal['source-document'] = 'source-document' source_id: str media_type: str title: str filename: str | None = None provider_metadata: ProviderMetadata | None = None ``` ### FileUIPart Bases: `BaseUIPart` A file part of a message. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class FileUIPart(BaseUIPart): """A file part of a message.""" type: Literal['file'] = 'file' media_type: str """ IANA media type of the file. @see https://www.iana.org/assignments/media-types/media-types.xhtml """ filename: str | None = None """Optional filename of the file.""" url: str """ The URL of the file. It can either be a URL to a hosted file or a [Data URL](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs). """ provider_metadata: ProviderMetadata | None = None """The provider metadata.""" ``` #### media_type ```python media_type: str ``` IANA media type of the file. 
@see https://www.iana.org/assignments/media-types/media-types.xhtml #### filename ```python filename: str | None = None ``` Optional filename of the file. #### url ```python url: str ``` The URL of the file. It can either be a URL to a hosted file or a [Data URL](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs). #### provider_metadata ```python provider_metadata: ProviderMetadata | None = None ``` The provider metadata. ### StepStartUIPart Bases: `BaseUIPart` A step boundary part of a message. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class StepStartUIPart(BaseUIPart): """A step boundary part of a message.""" type: Literal['step-start'] = 'step-start' ``` ### DataUIPart Bases: `BaseUIPart` Data part with dynamic type based on data name. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class DataUIPart(BaseUIPart): """Data part with dynamic type based on data name.""" type: Annotated[str, Field(pattern=r'^data-')] id: str | None = None data: Any ``` ### ToolInputStreamingPart Bases: `BaseUIPart` Tool part in input-streaming state. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class ToolInputStreamingPart(BaseUIPart): """Tool part in input-streaming state.""" type: Annotated[str, Field(pattern=r'^tool-')] tool_call_id: str state: Literal['input-streaming'] = 'input-streaming' input: Any | None = None provider_executed: bool | None = None call_provider_metadata: ProviderMetadata | None = None ``` ### ToolInputAvailablePart Bases: `BaseUIPart` Tool part in input-available state. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class ToolInputAvailablePart(BaseUIPart): """Tool part in input-available state.""" type: Annotated[str, Field(pattern=r'^tool-')] tool_call_id: str state: Literal['input-available'] = 'input-available' input: Any | None = None provider_executed: bool | None = None call_provider_metadata: ProviderMetadata | None = None ``` ### ToolOutputAvailablePart Bases: `BaseUIPart` Tool part in output-available state. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class ToolOutputAvailablePart(BaseUIPart): """Tool part in output-available state.""" type: Annotated[str, Field(pattern=r'^tool-')] tool_call_id: str state: Literal['output-available'] = 'output-available' input: Any | None = None output: Any | None = None provider_executed: bool | None = None call_provider_metadata: ProviderMetadata | None = None preliminary: bool | None = None ``` ### ToolOutputErrorPart Bases: `BaseUIPart` Tool part in output-error state. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class ToolOutputErrorPart(BaseUIPart): """Tool part in output-error state.""" type: Annotated[str, Field(pattern=r'^tool-')] tool_call_id: str state: Literal['output-error'] = 'output-error' input: Any | None = None raw_input: Any | None = None error_text: str provider_executed: bool | None = None call_provider_metadata: ProviderMetadata | None = None ``` ### ToolUIPart ```python ToolUIPart = ( ToolInputStreamingPart | ToolInputAvailablePart | ToolOutputAvailablePart | ToolOutputErrorPart ) ``` Union of all tool part types. ### DynamicToolInputStreamingPart Bases: `BaseUIPart` Dynamic tool part in input-streaming state. 
Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class DynamicToolInputStreamingPart(BaseUIPart): """Dynamic tool part in input-streaming state.""" type: Literal['dynamic-tool'] = 'dynamic-tool' tool_name: str tool_call_id: str state: Literal['input-streaming'] = 'input-streaming' input: Any | None = None call_provider_metadata: ProviderMetadata | None = None ``` ### DynamicToolInputAvailablePart Bases: `BaseUIPart` Dynamic tool part in input-available state. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class DynamicToolInputAvailablePart(BaseUIPart): """Dynamic tool part in input-available state.""" type: Literal['dynamic-tool'] = 'dynamic-tool' tool_name: str tool_call_id: str state: Literal['input-available'] = 'input-available' input: Any call_provider_metadata: ProviderMetadata | None = None ``` ### DynamicToolOutputAvailablePart Bases: `BaseUIPart` Dynamic tool part in output-available state. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class DynamicToolOutputAvailablePart(BaseUIPart): """Dynamic tool part in output-available state.""" type: Literal['dynamic-tool'] = 'dynamic-tool' tool_name: str tool_call_id: str state: Literal['output-available'] = 'output-available' input: Any output: Any call_provider_metadata: ProviderMetadata | None = None preliminary: bool | None = None ``` ### DynamicToolOutputErrorPart Bases: `BaseUIPart` Dynamic tool part in output-error state. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class DynamicToolOutputErrorPart(BaseUIPart): """Dynamic tool part in output-error state.""" type: Literal['dynamic-tool'] = 'dynamic-tool' tool_name: str tool_call_id: str state: Literal['output-error'] = 'output-error' input: Any error_text: str call_provider_metadata: ProviderMetadata | None = None ``` ### DynamicToolUIPart ```python DynamicToolUIPart = ( DynamicToolInputStreamingPart | DynamicToolInputAvailablePart | DynamicToolOutputAvailablePart | DynamicToolOutputErrorPart ) ``` Union of all dynamic tool part types. ### UIMessagePart ```python UIMessagePart = ( TextUIPart | ReasoningUIPart | ToolUIPart | DynamicToolUIPart | SourceUrlUIPart | SourceDocumentUIPart | FileUIPart | DataUIPart | StepStartUIPart ) ``` Union of all message part types. ### UIMessage Bases: `CamelBaseModel` A message as displayed in the UI by Vercel AI Elements. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class UIMessage(CamelBaseModel): """A message as displayed in the UI by Vercel AI Elements.""" id: str """A unique identifier for the message.""" role: Literal['system', 'user', 'assistant'] """The role of the message.""" metadata: Any | None = None """The metadata of the message.""" parts: list[UIMessagePart] """ The parts of the message. Use this for rendering the message in the UI. System messages should be avoided (set the system prompt on the server instead). They can have text parts. User messages can have text parts and file parts. Assistant messages can have text, reasoning, tool invocation, and file parts. """ ``` #### id ```python id: str ``` A unique identifier for the message. #### role ```python role: Literal['system', 'user', 'assistant'] ``` The role of the message. #### metadata ```python metadata: Any | None = None ``` The metadata of the message. #### parts ```python parts: list[UIMessagePart] ``` The parts of the message. Use this for rendering the message in the UI. 
System messages should be avoided (set the system prompt on the server instead). They can have text parts. User messages can have text parts and file parts. Assistant messages can have text, reasoning, tool invocation, and file parts. ### SubmitMessage Bases: `CamelBaseModel` Submit message request. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class SubmitMessage(CamelBaseModel, extra='allow'): """Submit message request.""" trigger: Literal['submit-message'] = 'submit-message' id: str messages: list[UIMessage] ``` ### RegenerateMessage Bases: `CamelBaseModel` Ask the agent to regenerate a message. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/request_types.py` ```python class RegenerateMessage(CamelBaseModel, extra='allow'): """Ask the agent to regenerate a message.""" trigger: Literal['regenerate-message'] id: str messages: list[UIMessage] message_id: str ``` ### RequestData ```python RequestData = Annotated[ SubmitMessage | RegenerateMessage, Discriminator("trigger"), ] ``` Union of all request data types. Vercel AI response types (SSE chunks). Converted to Python from: https://github.com/vercel/ai/blob/ai%406.0.57/packages/ai/src/ui-message-stream/ui-message-chunks.ts ### ProviderMetadata ```python ProviderMetadata = dict[str, dict[str, JSONValue]] ``` Provider metadata. ### FinishReason ```python FinishReason = ( Literal[ "stop", "length", "content-filter", "tool-calls", "error", "other", ] | None ) ``` Reason why the model finished generating. ### BaseChunk Bases: `CamelBaseModel`, `ABC` Abstract base class for response SSE events. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class BaseChunk(CamelBaseModel, ABC): """Abstract base class for response SSE events.""" def encode(self, sdk_version: int) -> str: return self.model_dump_json(by_alias=True, exclude_none=True) ``` ### TextStartChunk Bases: `BaseChunk` Text start chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class TextStartChunk(BaseChunk): """Text start chunk.""" type: Literal['text-start'] = 'text-start' id: str provider_metadata: ProviderMetadata | None = None ``` ### TextDeltaChunk Bases: `BaseChunk` Text delta chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class TextDeltaChunk(BaseChunk): """Text delta chunk.""" type: Literal['text-delta'] = 'text-delta' delta: str id: str provider_metadata: ProviderMetadata | None = None ``` ### TextEndChunk Bases: `BaseChunk` Text end chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class TextEndChunk(BaseChunk): """Text end chunk.""" type: Literal['text-end'] = 'text-end' id: str provider_metadata: ProviderMetadata | None = None ``` ### ReasoningStartChunk Bases: `BaseChunk` Reasoning start chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ReasoningStartChunk(BaseChunk): """Reasoning start chunk.""" type: Literal['reasoning-start'] = 'reasoning-start' id: str provider_metadata: ProviderMetadata | None = None ``` ### ReasoningDeltaChunk Bases: `BaseChunk` Reasoning delta chunk. 
Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ReasoningDeltaChunk(BaseChunk): """Reasoning delta chunk.""" type: Literal['reasoning-delta'] = 'reasoning-delta' id: str delta: str provider_metadata: ProviderMetadata | None = None ``` ### ReasoningEndChunk Bases: `BaseChunk` Reasoning end chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ReasoningEndChunk(BaseChunk): """Reasoning end chunk.""" type: Literal['reasoning-end'] = 'reasoning-end' id: str provider_metadata: ProviderMetadata | None = None ``` ### ErrorChunk Bases: `BaseChunk` Error chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ErrorChunk(BaseChunk): """Error chunk.""" type: Literal['error'] = 'error' error_text: str ``` ### ToolInputStartChunk Bases: `BaseChunk` Tool input start chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ToolInputStartChunk(BaseChunk): """Tool input start chunk.""" type: Literal['tool-input-start'] = 'tool-input-start' tool_call_id: str tool_name: str provider_executed: bool | None = None provider_metadata: ProviderMetadata | None = None dynamic: bool | None = None def encode(self, sdk_version: int) -> str: exclude = {'provider_metadata'} if sdk_version < 6 else None return self.model_dump_json(by_alias=True, exclude_none=True, exclude=exclude) ``` ### ToolInputDeltaChunk Bases: `BaseChunk` Tool input delta chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ToolInputDeltaChunk(BaseChunk): """Tool input delta chunk.""" type: Literal['tool-input-delta'] = 'tool-input-delta' tool_call_id: str input_text_delta: str ``` ### ToolOutputAvailableChunk Bases: `BaseChunk` Tool output available chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ToolOutputAvailableChunk(BaseChunk): """Tool output available chunk.""" type: Literal['tool-output-available'] = 'tool-output-available' tool_call_id: str output: Any provider_executed: bool | None = None dynamic: bool | None = None preliminary: bool | None = None ``` ### ToolInputAvailableChunk Bases: `BaseChunk` Tool input available chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ToolInputAvailableChunk(BaseChunk): """Tool input available chunk.""" type: Literal['tool-input-available'] = 'tool-input-available' tool_call_id: str tool_name: str input: Any provider_executed: bool | None = None provider_metadata: ProviderMetadata | None = None dynamic: bool | None = None ``` ### ToolInputErrorChunk Bases: `BaseChunk` Tool input error chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ToolInputErrorChunk(BaseChunk): """Tool input error chunk.""" type: Literal['tool-input-error'] = 'tool-input-error' tool_call_id: str tool_name: str input: Any provider_executed: bool | None = None provider_metadata: ProviderMetadata | None = None dynamic: bool | None = None error_text: str ``` ### ToolOutputErrorChunk Bases: `BaseChunk` Tool output error chunk. 
Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ToolOutputErrorChunk(BaseChunk): """Tool output error chunk.""" type: Literal['tool-output-error'] = 'tool-output-error' tool_call_id: str error_text: str provider_executed: bool | None = None dynamic: bool | None = None ``` ### ToolApprovalRequestChunk Bases: `BaseChunk` Tool approval request chunk for human-in-the-loop approval. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ToolApprovalRequestChunk(BaseChunk): """Tool approval request chunk for human-in-the-loop approval.""" type: Literal['tool-approval-request'] = 'tool-approval-request' approval_id: str tool_call_id: str ``` ### ToolOutputDeniedChunk Bases: `BaseChunk` Tool output denied chunk when user denies tool execution. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class ToolOutputDeniedChunk(BaseChunk): """Tool output denied chunk when user denies tool execution.""" type: Literal['tool-output-denied'] = 'tool-output-denied' tool_call_id: str ``` ### SourceUrlChunk Bases: `BaseChunk` Source URL chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class SourceUrlChunk(BaseChunk): """Source URL chunk.""" type: Literal['source-url'] = 'source-url' source_id: str url: str title: str | None = None provider_metadata: ProviderMetadata | None = None ``` ### SourceDocumentChunk Bases: `BaseChunk` Source document chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class SourceDocumentChunk(BaseChunk): """Source document chunk.""" type: Literal['source-document'] = 'source-document' source_id: str media_type: str title: str filename: str | None = None provider_metadata: ProviderMetadata | None = None ``` ### FileChunk Bases: `BaseChunk` File chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class FileChunk(BaseChunk): """File chunk.""" type: Literal['file'] = 'file' url: str media_type: str ``` ### DataChunk Bases: `BaseChunk` Data chunk with dynamic type. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class DataChunk(BaseChunk): """Data chunk with dynamic type.""" type: Annotated[str, Field(pattern=r'^data-')] id: str | None = None data: Any transient: bool | None = None ``` ### StartStepChunk Bases: `BaseChunk` Start step chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class StartStepChunk(BaseChunk): """Start step chunk.""" type: Literal['start-step'] = 'start-step' ``` ### FinishStepChunk Bases: `BaseChunk` Finish step chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class FinishStepChunk(BaseChunk): """Finish step chunk.""" type: Literal['finish-step'] = 'finish-step' ``` ### StartChunk Bases: `BaseChunk` Start chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class StartChunk(BaseChunk): """Start chunk.""" type: Literal['start'] = 'start' message_id: str | None = None message_metadata: Any | None = None ``` ### FinishChunk Bases: `BaseChunk` Finish chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class FinishChunk(BaseChunk): """Finish chunk.""" type: Literal['finish'] = 'finish' finish_reason: FinishReason = None message_metadata: Any | None = None ``` ### AbortChunk Bases: `BaseChunk` Abort chunk. 
Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class AbortChunk(BaseChunk): """Abort chunk.""" type: Literal['abort'] = 'abort' reason: str | None = None ``` ### MessageMetadataChunk Bases: `BaseChunk` Message metadata chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class MessageMetadataChunk(BaseChunk): """Message metadata chunk.""" type: Literal['message-metadata'] = 'message-metadata' message_metadata: Any ``` ### DoneChunk Bases: `BaseChunk` Done chunk. Source code in `pydantic_ai_slim/pydantic_ai/ui/vercel_ai/response_types.py` ```python class DoneChunk(BaseChunk): """Done chunk.""" type: Literal['done'] = 'done' def encode(self, sdk_version: int) -> str: return '[DONE]' ``` # Evals # Pydantic Evals **Pydantic Evals** is a powerful evaluation framework for systematically testing and evaluating AI systems, from simple LLM calls to complex multi-agent applications. ## Design Philosophy Code-First Approach Pydantic Evals follows a code-first philosophy where all evaluation components are defined in Python. This differs from platforms with web-based configuration. You write and run evals in code, and can write the results to disk or view them in your terminal or in [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/). Evals are an Emerging Practice Unlike unit tests, evals are an emerging art/science. Anyone who claims to know exactly how your evals should be defined can safely be ignored. We've designed Pydantic Evals to be flexible and useful without being too opinionated. ## Quick Navigation **Getting Started:** - [Installation](#installation) - [Quick Start](https://ai.pydantic.dev/evals/quick-start/index.md) - [Core Concepts](https://ai.pydantic.dev/evals/core-concepts/index.md) **Evaluators:** - [Evaluators Overview](https://ai.pydantic.dev/evals/evaluators/overview/index.md) - Compare evaluator types and learn when to use each approach - [Built-in Evaluators](https://ai.pydantic.dev/evals/evaluators/built-in/index.md) - Complete reference for exact match, instance checks, and other ready-to-use evaluators - [LLM as a Judge](https://ai.pydantic.dev/evals/evaluators/llm-judge/index.md) - Use LLMs to evaluate subjective qualities, complex criteria, and natural language outputs - [Custom Evaluators](https://ai.pydantic.dev/evals/evaluators/custom/index.md) - Implement domain-specific scoring logic and custom evaluation metrics - [Span-Based Evaluation](https://ai.pydantic.dev/evals/evaluators/span-based/index.md) - Evaluate internal agent behavior (tool calls, execution flow) using OpenTelemetry traces. Essential for complex agents where correctness depends on *how* the answer was reached, not just the final output. Also ensures eval assertions align with production telemetry. 
**How-To Guides:** - [Logfire Integration](https://ai.pydantic.dev/evals/how-to/logfire-integration/index.md) - Visualize results - [Dataset Management](https://ai.pydantic.dev/evals/how-to/dataset-management/index.md) - Save, load, generate - [Concurrency & Performance](https://ai.pydantic.dev/evals/how-to/concurrency/index.md) - Control parallel execution - [Retry Strategies](https://ai.pydantic.dev/evals/how-to/retry-strategies/index.md) - Handle transient failures - [Metrics & Attributes](https://ai.pydantic.dev/evals/how-to/metrics-attributes/index.md) - Track custom data **Examples:** - [Simple Validation](https://ai.pydantic.dev/evals/examples/simple-validation/index.md) - Basic example **Reference:** - [API Documentation](https://ai.pydantic.dev/api/pydantic_evals/dataset/index.md) ## Code-First Evaluation Pydantic Evals follows a **code-first approach** where you define all evaluation components (datasets, experiments, tasks, cases, and evaluators) in Python code, or as serialized data loaded by Python code. This differs from platforms with fully web-based configuration. When you run an *Experiment* you'll see a progress indicator and can print the results wherever you run your Python code (IDE, terminal, etc.). You also get a report object back that you can serialize and store or send to a notebook or other application for further visualization and analysis. If you are using [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/), your experiment results automatically appear in the Logfire web interface for visualization, comparison, and collaborative analysis. Logfire serves as an observability layer: you write and run evals in code, then view and analyze results in the web UI. ## Installation To install the Pydantic Evals package, run: ```bash pip install pydantic-evals ``` ```bash uv add pydantic-evals ``` `pydantic-evals` does not depend on `pydantic-ai`, but has an optional dependency on `logfire` if you'd like to use OpenTelemetry traces in your evals, or send evaluation results to [logfire](https://pydantic.dev/logfire). ```bash pip install 'pydantic-evals[logfire]' ``` ```bash uv add 'pydantic-evals[logfire]' ``` ## Pydantic Evals Data Model Pydantic Evals is built around a simple data model: ### Data Model Diagram ```text Dataset (1) ──────────── (Many) Case │ │ │ │ └─── (Many) Experiment ──┴─── (Many) Case results │ └─── (1) Task │ └─── (Many) Evaluator ``` ### Key Relationships 1. **Dataset → Cases**: One Dataset contains many Cases 1. **Dataset → Experiments**: One Dataset can be used across many Experiments over time 1. **Experiment → Case results**: One Experiment generates results by executing each Case 1. **Experiment → Task**: One Experiment evaluates one defined Task 1. **Experiment → Evaluators**: One Experiment uses multiple Evaluators. Dataset-wide Evaluators are run against all Cases, and Case-specific Evaluators against their respective Cases ### Data Flow 1. **Dataset creation**: Define cases and evaluators in YAML/JSON, or directly in Python 1. **Experiment execution**: Run `dataset.evaluate_sync(task_function)` 1. **Cases run**: Each Case is executed against the Task 1. **Evaluation**: Evaluators score the Task outputs for each Case 1.
**Results**: All Case results are collected into a summary report A metaphor A useful metaphor (although not perfect) is to think of evals like a **Unit Testing** framework: - **Cases + Evaluators** are your individual unit tests - each one defines a specific scenario you want to test, complete with inputs and expected outcomes. Just like a unit test, a case asks: *"Given this input, does my system produce the right output?"* - **Datasets** are like test suites - they are the scaffolding that holds your unit tests together. They group related cases and define shared evaluation criteria that should apply across all tests in the suite. - **Experiments** are like running your entire test suite and getting a report. When you execute `dataset.evaluate_sync(my_ai_function)`, you're running all your cases against your AI system and collecting the results - just like running `pytest` and getting a summary of passes, failures, and performance metrics. The key difference from traditional unit testing is that AI systems are probabilistic. If you're type checking you'll still get a simple pass/fail, but scores for text outputs are likely qualitative and/or categorical, and more open to interpretation. For a deeper understanding, see [Core Concepts](https://ai.pydantic.dev/evals/core-concepts/index.md). ## Datasets and Cases In Pydantic Evals, everything begins with Datasets and Cases: - **Dataset**: A collection of test Cases designed for the evaluation of a specific task or function - **Case**: A single test scenario corresponding to Task inputs, with optional expected outputs, metadata, and case-specific evaluators simple_eval_dataset.py ```python from pydantic_evals import Case, Dataset case1 = Case( name='simple_case', inputs='What is the capital of France?', expected_output='Paris', metadata={'difficulty': 'easy'}, ) dataset = Dataset(cases=[case1]) ``` *(This example is complete, it can be run "as is")* See [Dataset Management](https://ai.pydantic.dev/evals/how-to/dataset-management/index.md) to learn about saving, loading, and generating datasets. ## Evaluators Evaluators analyze and score the results of your Task when tested against a Case. These can be deterministic, code-based checks (such as testing model output format with a regex, or checking for the appearance of PII or sensitive data), or they can assess non-deterministic model outputs for qualities like accuracy, precision/recall, hallucinations, or instruction-following. While both kinds of testing are useful in LLM systems, classical code-based tests are cheaper and easier than tests which require either human or machine review of model outputs. Pydantic Evals includes several [built-in evaluators](https://ai.pydantic.dev/evals/evaluators/built-in/index.md) and allows you to define [custom evaluators](https://ai.pydantic.dev/evals/evaluators/custom/index.md): simple_eval_evaluator.py ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext from pydantic_evals.evaluators.common import IsInstance from simple_eval_dataset import dataset dataset.add_evaluator(IsInstance(type_name='str')) # (1)! @dataclass class MyEvaluator(Evaluator): async def evaluate(self, ctx: EvaluatorContext[str, str]) -> float: # (2)! if ctx.output == ctx.expected_output: return 1.0 elif ( isinstance(ctx.output, str) and ctx.expected_output.lower() in ctx.output.lower() ): return 0.8 else: return 0.0 dataset.add_evaluator(MyEvaluator()) ``` 1. 
You can add built-in evaluators to a dataset using the add_evaluator method. 1. This custom evaluator returns a simple score based on whether the output matches the expected output. *(This example is complete, it can be run "as is")* Learn more: - [Evaluators Overview](https://ai.pydantic.dev/evals/evaluators/overview/index.md) - When to use different types - [Built-in Evaluators](https://ai.pydantic.dev/evals/evaluators/built-in/index.md) - Complete reference - [LLM Judge](https://ai.pydantic.dev/evals/evaluators/llm-judge/index.md) - Using LLMs as evaluators - [Custom Evaluators](https://ai.pydantic.dev/evals/evaluators/custom/index.md) - Write your own logic - [Span-Based Evaluation](https://ai.pydantic.dev/evals/evaluators/span-based/index.md) - Analyze execution traces ## Running Experiments Performing evaluations involves running a task against all cases in a dataset, also known as running an "experiment". Putting the above two examples together and using the more declarative `evaluators` kwarg to Dataset: simple_eval_complete.py ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext, IsInstance case1 = Case( # (1)! name='simple_case', inputs='What is the capital of France?', expected_output='Paris', metadata={'difficulty': 'easy'}, ) class MyEvaluator(Evaluator[str, str]): def evaluate(self, ctx: EvaluatorContext[str, str]) -> float: if ctx.output == ctx.expected_output: return 1.0 elif ( isinstance(ctx.output, str) and ctx.expected_output.lower() in ctx.output.lower() ): return 0.8 else: return 0.0 dataset = Dataset( cases=[case1], evaluators=[IsInstance(type_name='str'), MyEvaluator()], # (2)! ) async def guess_city(question: str) -> str: # (3)! return 'Paris' report = dataset.evaluate_sync(guess_city) # (4)! report.print(include_input=True, include_output=True, include_durations=False) # (5)! """ Evaluation Summary: guess_city ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ Scores ┃ Assertions ┃ ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ │ simple_case │ What is the capital of France? │ Paris │ MyEvaluator: 1.00 │ ✔ │ ├─────────────┼────────────────────────────────┼─────────┼───────────────────┼────────────┤ │ Averages │ │ │ MyEvaluator: 1.00 │ 100.0% ✔ │ └─────────────┴────────────────────────────────┴─────────┴───────────────────┴────────────┘ """ ``` 1. Create a test case as above 1. Create a Dataset with test cases and evaluators 1. Our function to evaluate. 1. Run the evaluation with evaluate_sync, which runs the function against all test cases in the dataset, and returns an EvaluationReport object. 1. Print the report with print, which shows the results of the evaluation. We have omitted duration here just to keep the printed output from changing from run to run. *(This example is complete, it can be run "as is")* See [Quick Start](https://ai.pydantic.dev/evals/quick-start/index.md) for more examples and [Concurrency & Performance](https://ai.pydantic.dev/evals/how-to/concurrency/index.md) to learn about controlling parallel execution. ## API Reference For comprehensive coverage of all classes, methods, and configuration options, see the detailed [API Reference documentation](https://ai.pydantic.dev/api/pydantic_evals/dataset/). ## Next Steps 1. **Start with simple evaluations** using [Quick Start](https://ai.pydantic.dev/evals/quick-start/index.md) 1. 
**Understand the data model** with [Core Concepts](https://ai.pydantic.dev/evals/core-concepts/index.md) 1. **Explore built-in evaluators** in [Built-in Evaluators](https://ai.pydantic.dev/evals/evaluators/built-in/index.md) 1. **Integrate with Logfire** for visualization: [Logfire Integration](https://ai.pydantic.dev/evals/how-to/logfire-integration/index.md) 1. **Build comprehensive test suites** with [Dataset Management](https://ai.pydantic.dev/evals/how-to/dataset-management/index.md) 1. **Implement custom evaluators** for domain-specific metrics: [Custom Evaluators](https://ai.pydantic.dev/evals/evaluators/custom/index.md) # Durable Execution # Durable Execution with DBOS [DBOS](https://www.dbos.dev/) is a lightweight [durable execution](https://docs.dbos.dev/architecture) library natively integrated with Pydantic AI. ## Durable Execution DBOS workflows make your program **durable** by checkpointing its state in a database. If your program ever fails, when it restarts all your workflows will automatically resume from the last completed step. - **Workflows** must be deterministic and generally cannot include I/O. - **Steps** may perform I/O (network, disk, API calls). If a step fails, it restarts from the beginning. Every workflow input and step output is durably stored in the system database. When workflow execution fails, whether from crashes, network issues, or server restarts, DBOS leverages these checkpoints to recover workflows from their last completed step. DBOS **queues** provide durable, database-backed alternatives to systems like Celery or BullMQ, supporting features such as concurrency limits, rate limits, timeouts, and prioritization. See the [DBOS docs](https://docs.dbos.dev/architecture) for details. The diagram below shows the overall architecture of an agentic application in DBOS. DBOS runs fully in-process as a library. Functions remain normal Python functions but are checkpointed into a database (Postgres or SQLite). ```text Clients (HTTP, RPC, Kafka, etc.) | v +------------------------------------------------------+ | Application Servers | | | | +----------------------------------------------+ | | | Pydantic AI + DBOS Libraries | | | | | | | | [ Workflows (Agent Run Loop) ] | | | | [ Steps (Tool, MCP, Model) ] | | | | [ Queues ] [ Cron Jobs ] [ Messaging ] | | | +----------------------------------------------+ | | | +------------------------------------------------------+ | v +------------------------------------------------------+ | Database | | (Stores workflow and step state, schedules tasks) | +------------------------------------------------------+ ``` See the [DBOS documentation](https://docs.dbos.dev/architecture) for more information. ## Durable Agent Any agent can be wrapped in a DBOSAgent to get durable execution. `DBOSAgent` automatically:, - Wraps `Agent.run` and `Agent.run_sync` as DBOS workflows. - Wraps [model requests](https://ai.pydantic.dev/models/overview/index.md) and [MCP communication](https://ai.pydantic.dev/mcp/client/index.md) as DBOS steps. Custom tool functions and event stream handlers are **not automatically wrapped** by DBOS. If they involve non-deterministic behavior or perform I/O, you should explicitly decorate them with `@DBOS.step`. The original agent, model, and MCP server can still be used as normal outside the DBOS workflow. Here is a simple but complete example of wrapping an agent for durable execution. 
All it requires is to install Pydantic AI with the DBOS [open-source library](https://github.com/dbos-inc/dbos-transact-py): ```bash pip install pydantic-ai[dbos] ``` ```bash uv add pydantic-ai[dbos] ``` Or if you're using the slim package, you can install it with the `dbos` optional group: ```bash pip install pydantic-ai-slim[dbos] ``` ```bash uv add pydantic-ai-slim[dbos] ``` dbos_agent.py ```python from dbos import DBOS, DBOSConfig from pydantic_ai import Agent from pydantic_ai.durable_exec.dbos import DBOSAgent dbos_config: DBOSConfig = { 'name': 'pydantic_dbos_agent', 'system_database_url': 'sqlite:///dbostest.sqlite', # (3)! } DBOS(config=dbos_config) agent = Agent( 'gpt-5.2', instructions="You're an expert in geography.", name='geography', # (4)! ) dbos_agent = DBOSAgent(agent) # (1)! async def main(): DBOS.launch() result = await dbos_agent.run('What is the capital of Mexico?') # (2)! print(result.output) #> Mexico City (Ciudad de México, CDMX) ``` 1. Workflows and `DBOSAgent` must be defined before `DBOS.launch()` so that recovery can correctly find all workflows. 1. DBOSAgent.run() works like Agent.run(), but runs as a DBOS workflow and executes model requests, decorated tool calls, and MCP communication as DBOS steps. 1. This example uses SQLite. Postgres is recommended for production. 1. The agent's `name` is used to uniquely identify its workflows. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* Because DBOS workflows need to be defined before calling `DBOS.launch()` and the `DBOSAgent` instance automatically registers `run` and `run_sync` as workflows, it needs to be defined before calling `DBOS.launch()` as well. For more information on how to use DBOS in Python applications, see their [Python SDK guide](https://docs.dbos.dev/python/programming-guide). ## DBOS Integration Considerations When using DBOS with Pydantic AI agents, there are a few important considerations to ensure workflows and toolsets behave correctly. ### Agent and Toolset Requirements Each agent instance must have a unique `name` so DBOS can correctly resume workflows after a failure or restart. Tools and event stream handlers are not automatically wrapped by DBOS. You can decide how to integrate them: - Decorate with `@DBOS.step` if the function involves non-determinism or I/O. - Skip the decorator if durability isn't needed, so you avoid the extra DB checkpoint write. - If the function needs to enqueue tasks or invoke other DBOS workflows, run it inside the agent's main workflow (not as a step). Other than that, any agent and toolset will just work! ### Agent Run Context and Dependencies DBOS checkpoints workflow inputs/outputs and step outputs into a database using [`pickle`](https://docs.python.org/3/library/pickle.html). This means you need to make sure [dependencies](https://ai.pydantic.dev/dependencies/index.md) object provided to DBOSAgent.run() or DBOSAgent.run_sync(), and tool outputs can be serialized using pickle. You may also want to keep the inputs and outputs small (under ~2 MB). PostgreSQL and SQLite support up to 1 GB per field, but large objects may impact performance. ### Streaming Because DBOS cannot stream output directly to the workflow or step call site, Agent.run_stream() and Agent.run_stream_events() are not supported when running inside of a DBOS workflow. Instead, you can implement streaming by setting an event_stream_handler on the `Agent` or `DBOSAgent` instance and using DBOSAgent.run(). 
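As a rough sketch of what that could look like (assuming, as described in the [streaming docs](https://ai.pydantic.dev/agent/#streaming-all-events), that the handler can be passed to the `Agent` constructor, and that simply printing each event is enough for your use case), the wiring in a hypothetical `dbos_streaming_sketch.py` might be:

dbos_streaming_sketch.py

```python
from collections.abc import AsyncIterable
from typing import Any

from pydantic_ai import Agent, RunContext
from pydantic_ai.durable_exec.dbos import DBOSAgent


async def event_stream_handler(ctx: RunContext[Any], stream: AsyncIterable[Any]) -> None:
    # Forward each streamed event somewhere useful (a log, a queue, a websocket, ...).
    async for event in stream:
        print(event)


agent = Agent(
    'gpt-5.2',
    instructions="You're an expert in geography.",
    name='geography',
    event_stream_handler=event_stream_handler,
)
dbos_agent = DBOSAgent(agent)  # events reach the handler during dbos_agent.run()
```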
The event stream handler function will receive the agent run context and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](https://ai.pydantic.dev/agent/#streaming-all-events). ### Parallel Tool Execution When using `DBOSAgent`, tools are executed in parallel by default to minimize latency. To guarantee deterministic replay and reliable recovery, DBOS waits for all parallel tool calls to complete before emitting events **in order**. It's equivalent to the behavior of agent.parallel_tool_call_execution_mode('parallel_ordered_events'). If you prefer strict ordering, you can configure the agent to run tools sequentially by setting parallel_execution_mode='sequential' when initializing the `DBOSAgent`. ## Step Configuration You can customize DBOS step behavior, such as retries, by passing StepConfig objects to the `DBOSAgent` constructor: - `mcp_step_config`: The DBOS step config to use for MCP server communication. No retries if omitted. - `model_step_config`: The DBOS step config to use for model request steps. No retries if omitted. For custom tools, you can annotate them directly with [`@DBOS.step`](https://docs.dbos.dev/python/reference/decorators#step) or [`@DBOS.workflow`](https://docs.dbos.dev/python/reference/decorators#workflow) decorators as needed. These decorators have no effect outside DBOS workflows, so tools remain usable in non-DBOS agents. ## Step Retries On top of the automatic retries for request failures that DBOS will perform, Pydantic AI and various provider API clients also have their own request retry logic. Enabling these at the same time may cause the request to be retried more often than expected, with improper `Retry-After` handling. When using DBOS, it's recommended to not use [HTTP Request Retries](https://ai.pydantic.dev/retries/index.md) and to turn off your provider API client's own retry logic, for example by setting `max_retries=0` on a [custom `OpenAIProvider` API client](https://ai.pydantic.dev/models/openai/#custom-openai-client). You can customize DBOS's retry policy using [step configuration](#step-configuration). ## Observability with Logfire DBOS can be configured to generate OpenTelemetry spans for each workflow and step execution, and Pydantic AI emits spans for each agent run, model request, and tool invocation. You can send these spans to [Pydantic Logfire](https://ai.pydantic.dev/logfire/index.md) to get a full, end-to-end view of what's happening in your application. For more information about DBOS logging and tracing, see the [DBOS docs](https://docs.dbos.dev/python/tutorials/logging-and-tracing). # Durable Execution Pydantic AI allows you to build durable agents that can preserve their progress across transient API failures and application errors or restarts, and handle long-running, asynchronous, and human-in-the-loop workflows with production-grade reliability. Durable agents have full support for [streaming](https://ai.pydantic.dev/agent/#streaming-all-events) and [MCP](https://ai.pydantic.dev/mcp/client/index.md), with the added benefit of fault tolerance.
Pydantic AI natively supports three durable execution solutions: - [Temporal](https://ai.pydantic.dev/durable_execution/temporal/index.md) - [DBOS](https://ai.pydantic.dev/durable_execution/dbos/index.md) - [Prefect](https://ai.pydantic.dev/durable_execution/prefect/index.md) These integrations only use Pydantic AI's public interface, so they also serve as a reference for integrating with other durable systems. # Durable Execution with Prefect [Prefect](https://www.prefect.io/) is a workflow orchestration framework for building resilient data pipelines in Python, natively integrated with Pydantic AI. ## Durable Execution Prefect 3.0 brings [transactional semantics](https://www.prefect.io/blog/transactional-ml-pipelines-with-prefect-3-0) to your Python workflows, allowing you to group tasks into atomic units and define failure modes. If any part of a transaction fails, the entire transaction can be rolled back to a clean state. - **Flows** are the top-level entry points for your workflow. They can contain tasks and other flows. - **Tasks** are individual units of work that can be retried, cached, and monitored independently. Prefect 3.0's approach to transactional orchestration makes your workflows automatically **idempotent**: rerunnable without duplication or inconsistency across any environment. Every task is executed within a transaction that governs when and where the task's result record is persisted. If the task runs again under an identical context, it will not re-execute but instead load its previous result. The diagram below shows the overall architecture of an agentic application with Prefect. Prefect uses client-side task orchestration by default, with optional server connectivity for advanced features like scheduling and monitoring. ```text +---------------------+ | Prefect Server | (Monitoring, | or Cloud | scheduling, UI, +---------------------+ orchestration) ^ | Flow state, | Schedule flows, metadata, | track execution logs | | +------------------------------------------------------+ | Application Process | | +----------------------------------------------+ | | | Flow (Agent.run) | | | +----------------------------------------------+ | | | | | | | v v v | | +-----------+ +------------+ +-------------+ | | | Task | | Task | | Task | | | | (Tool) | | (MCP Tool) | | (Model API) | | | +-----------+ +------------+ +-------------+ | | | | | | | Cache & Cache & Cache & | | persist persist persist | | to to to | | v v v | | +----------------------------------------------+ | | | Result Storage (Local FS, S3, etc.) | | | +----------------------------------------------+ | +------------------------------------------------------+ | | | v v v [External APIs, services, databases, etc.] ``` See the [Prefect documentation](https://docs.prefect.io/) for more information. ## Durable Agent Any agent can be wrapped in a PrefectAgent to get durable execution. `PrefectAgent` automatically: - Wraps Agent.run and Agent.run_sync as Prefect flows. - Wraps [model requests](https://ai.pydantic.dev/models/overview/index.md) as Prefect tasks. - Wraps [tool calls](https://ai.pydantic.dev/tools/index.md) as Prefect tasks (configurable per-tool). - Wraps [MCP communication](https://ai.pydantic.dev/mcp/client/index.md) as Prefect tasks. Event stream handlers are **automatically wrapped** by Prefect when running inside a Prefect flow. Each event from the stream is processed in a separate Prefect task for durability. 
You can customize the task behavior using the `event_stream_handler_task_config` parameter when creating the `PrefectAgent`. Do **not** manually decorate event stream handlers with `@task`. For examples, see the [streaming docs](https://ai.pydantic.dev/agent/#streaming-all-events) The original agent, model, and MCP server can still be used as normal outside the Prefect flow. Here is a simple but complete example of wrapping an agent for durable execution. All it requires is to install Pydantic AI with Prefect: ```bash pip install pydantic-ai[prefect] ``` ```bash uv add pydantic-ai[prefect] ``` Or if you're using the slim package, you can install it with the `prefect` optional group: ```bash pip install pydantic-ai-slim[prefect] ``` ```bash uv add pydantic-ai-slim[prefect] ``` prefect_agent.py ```python from pydantic_ai import Agent from pydantic_ai.durable_exec.prefect import PrefectAgent agent = Agent( 'gpt-5.2', instructions="You're an expert in geography.", name='geography', # (1)! ) prefect_agent = PrefectAgent(agent) # (2)! async def main(): result = await prefect_agent.run('What is the capital of Mexico?') # (3)! print(result.output) #> Mexico City (Ciudad de México, CDMX) ``` 1. The agent's `name` is used to uniquely identify its flows and tasks. 1. Wrapping the agent with `PrefectAgent` enables durable execution for all agent runs. 1. PrefectAgent.run() works like Agent.run(), but runs as a Prefect flow and executes model requests, decorated tool calls, and MCP communication as Prefect tasks. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* For more information on how to use Prefect in Python applications, see their [Python documentation](https://docs.prefect.io/v3/how-to-guides/workflows/write-and-run). ## Prefect Integration Considerations When using Prefect with Pydantic AI agents, there are a few important considerations to ensure workflows behave correctly. ### Agent Requirements Each agent instance must have a unique `name` so Prefect can correctly identify and track its flows and tasks. ### Tool Wrapping Agent tools are automatically wrapped as Prefect tasks, which means they benefit from: - **Retry logic**: Failed tool calls can be retried automatically - **Caching**: Tool results are cached based on their inputs - **Observability**: Tool execution is tracked in the Prefect UI You can customize tool task behavior using `tool_task_config` (applies to all tools) or `tool_task_config_by_name` (per-tool configuration): prefect_agent_config.py ```python from pydantic_ai import Agent from pydantic_ai.durable_exec.prefect import PrefectAgent, TaskConfig agent = Agent('gpt-5.2', name='my_agent') @agent.tool_plain def fetch_data(url: str) -> str: # This tool will be wrapped as a Prefect task ... prefect_agent = PrefectAgent( agent, tool_task_config=TaskConfig(retries=3), # Default for all tools tool_task_config_by_name={ 'fetch_data': TaskConfig(timeout_seconds=10.0), # Specific to fetch_data 'simple_tool': None, # Disable task wrapping for simple_tool }, ) ``` Set a tool's config to `None` in `tool_task_config_by_name` to disable task wrapping for that specific tool. ### Streaming When running inside a Prefect flow, Agent.run_stream() works but doesn't provide real-time streaming because Prefect tasks consume their entire execution before returning results. The method will execute fully and return the complete result at once. 
For real-time streaming behavior inside Prefect flows, you can set an event_stream_handler on the `Agent` or `PrefectAgent` instance and use PrefectAgent.run(). **Note**: Event stream handlers behave differently when running inside a Prefect flow versus outside: - **Outside a flow**: The handler receives events as they stream from the model - **Inside a flow**: Each event is wrapped as a Prefect task for durability, which may affect timing but ensures reliability The event stream handler function will receive the agent run context and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](https://ai.pydantic.dev/agent/#streaming-all-events). ## Task Configuration You can customize Prefect task behavior, such as retries and timeouts, by passing TaskConfig objects to the `PrefectAgent` constructor: - `mcp_task_config`: Configuration for MCP server communication tasks - `model_task_config`: Configuration for model request tasks - `tool_task_config`: Default configuration for all tool calls - `tool_task_config_by_name`: Per-tool task configuration (overrides `tool_task_config`) - `event_stream_handler_task_config`: Configuration for event stream handler tasks (applies when running inside a Prefect flow) Available `TaskConfig` options: - `retries`: Maximum number of retries for the task (default: `0`) - `retry_delay_seconds`: Delay between retries in seconds (can be a single value or list for exponential backoff, default: `1.0`) - `timeout_seconds`: Maximum time in seconds for the task to complete - `cache_policy`: Custom Prefect cache policy for the task - `persist_result`: Whether to persist the task result - `result_storage`: Prefect result storage for the task (e.g., `'s3-bucket/my-storage'` or a `WritableFileSystem` block) - `log_prints`: Whether to log print statements from the task (default: `False`) Example: prefect_agent_config.py ```python from pydantic_ai import Agent from pydantic_ai.durable_exec.prefect import PrefectAgent, TaskConfig agent = Agent( 'gpt-5.2', instructions="You're an expert in geography.", name='geography', ) prefect_agent = PrefectAgent( agent, model_task_config=TaskConfig( retries=3, retry_delay_seconds=[1.0, 2.0, 4.0], # Exponential backoff timeout_seconds=30.0, ), ) async def main(): result = await prefect_agent.run('What is the capital of France?') print(result.output) #> Paris ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### Retry Considerations Pydantic AI and provider API clients have their own retry logic. When using Prefect, you may want to: - Disable [HTTP Request Retries](https://ai.pydantic.dev/retries/index.md) in Pydantic AI - Turn off your provider API client's retry logic (e.g., `max_retries=0` on a [custom OpenAI client](https://ai.pydantic.dev/models/openai/#custom-openai-client)) - Rely on Prefect's task-level retry configuration for consistency This prevents requests from being retried multiple times at different layers. ## Caching and Idempotency Prefect 3.0 provides built-in caching and transactional semantics. Tasks with identical inputs will not re-execute if their results are already cached, making workflows naturally idempotent and resilient to failures. - **Task inputs**: Messages, settings, parameters, tool arguments, and serializable dependencies **Note**: For user dependencies to be included in cache keys, they must be serializable (e.g., Pydantic models or basic Python types). 
Non-serializable dependencies are automatically excluded from cache computation. ## Observability with Prefect and Logfire Prefect provides a built-in UI for monitoring flow runs, task executions, and failures. You can: - View real-time flow run status - Debug failures with full stack traces - Set up alerts and notifications To access the Prefect UI, you can either: 1. Use [Prefect Cloud](https://www.prefect.io/cloud) (managed service) 1. Run a local [Prefect server](https://docs.prefect.io/v3/how-to-guides/self-hosted/server-cli) with `prefect server start` You can also use [Pydantic Logfire](https://ai.pydantic.dev/logfire/index.md) for detailed observability. When using both Prefect and Logfire, you'll get complementary views: - **Prefect**: Workflow-level orchestration, task status, and retry history - **Logfire**: Fine-grained tracing of agent runs, model requests, and tool invocations When using Logfire with Prefect, you can enable distributed tracing to see spans for your Prefect runs included with your agent runs, model requests, and tool invocations. For more information about Prefect monitoring, see the [Prefect documentation](https://docs.prefect.io/). ## Deployments and Scheduling To deploy and schedule a `PrefectAgent`, wrap it in a Prefect flow and use the flow's [`serve()`](https://docs.prefect.io/v3/how-to-guides/deployments/create-deployments#create-a-deployment-with-serve) or [`deploy()`](https://docs.prefect.io/v3/how-to-guides/deployments/deploy-via-python) methods: [Learn about Gateway](https://ai.pydantic.dev/gateway) serve_agent.py ```python from prefect import flow from pydantic_ai import Agent from pydantic_ai.durable_exec.prefect import PrefectAgent @flow async def daily_report_flow(user_prompt: str): """Generate a daily report using the agent.""" agent = Agent( # (1)! 'gateway/openai:gpt-5.2', name='daily_report_agent', instructions='Generate a daily summary report.', ) prefect_agent = PrefectAgent(agent) result = await prefect_agent.run(user_prompt) return result.output # Serve the flow with a daily schedule if __name__ == '__main__': daily_report_flow.serve( name='daily-report-deployment', cron='0 9 * * *', # Run daily at 9am parameters={'user_prompt': "Generate today's report"}, tags=['production', 'reports'], ) ``` 1. Each flow run executes in an isolated process, and all inputs and dependencies must be serializable. Because Agent instances cannot be serialized, instantiate the agent inside the flow rather than at the module level. serve_agent.py ```python from prefect import flow from pydantic_ai import Agent from pydantic_ai.durable_exec.prefect import PrefectAgent @flow async def daily_report_flow(user_prompt: str): """Generate a daily report using the agent.""" agent = Agent( # (1)! 'openai:gpt-5.2', name='daily_report_agent', instructions='Generate a daily summary report.', ) prefect_agent = PrefectAgent(agent) result = await prefect_agent.run(user_prompt) return result.output # Serve the flow with a daily schedule if __name__ == '__main__': daily_report_flow.serve( name='daily-report-deployment', cron='0 9 * * *', # Run daily at 9am parameters={'user_prompt': "Generate today's report"}, tags=['production', 'reports'], ) ``` 1. Each flow run executes in an isolated process, and all inputs and dependencies must be serializable. Because Agent instances cannot be serialized, instantiate the agent inside the flow rather than at the module level. 
The `serve()` method accepts scheduling options: - **`cron`**: Cron schedule string (e.g., `'0 9 * * *'` for daily at 9am) - **`interval`**: Schedule interval in seconds or as a timedelta - **`rrule`**: iCalendar RRule schedule string For production deployments with Docker, Kubernetes, or other infrastructure, use the flow's [`deploy()`](https://docs.prefect.io/v3/how-to-guides/deployments/deploy-via-python) method. See the [Prefect deployment documentation](https://docs.prefect.io/v3/how-to-guides/deployments/create-deployments) for more information. # Durable Execution with Temporal [Temporal](https://temporal.io) is a popular [durable execution](https://docs.temporal.io/evaluate/understanding-temporal#durable-execution) platform that's natively supported by Pydantic AI. ## Durable Execution In Temporal's durable execution implementation, a program that crashes or encounters an exception while interacting with a model or API will retry until it can successfully complete. Temporal relies primarily on a replay mechanism to recover from failures. As the program makes progress, Temporal saves key inputs and decisions, allowing a re-started program to pick up right where it left off. The key to making this work is to separate the application's repeatable (deterministic) and non-repeatable (non-deterministic) parts: 1. Deterministic pieces, termed [**workflows**](https://docs.temporal.io/workflow-definition), execute the same way when re-run with the same inputs. 1. Non-deterministic pieces, termed [**activities**](https://docs.temporal.io/activities), can run arbitrary code, performing I/O and any other operations. Workflow code can run for extended periods and, if interrupted, resume exactly where it left off. Critically, workflow code generally *cannot* include any kind of I/O, over the network, disk, etc. Activity code faces no restrictions on I/O or external interactions, but if an activity fails part-way through it is restarted from the beginning. Note If you are familiar with celery, it may be helpful to think of Temporal activities as similar to celery tasks, but where you wait for the task to complete and obtain its result before proceeding to the next step in the workflow. However, Temporal workflows and activities offer a great deal more flexibility and functionality than celery tasks. See the [Temporal documentation](https://docs.temporal.io/evaluate/understanding-temporal#temporal-application-the-building-blocks) for more information. In the case of Pydantic AI agents, integration with Temporal means that [model requests](https://ai.pydantic.dev/models/overview/index.md), [tool calls](https://ai.pydantic.dev/tools/index.md) that may require I/O, and [MCP server communication](https://ai.pydantic.dev/mcp/client/index.md) all need to be offloaded to Temporal activities due to their I/O requirements, while the logic that coordinates them (i.e. the agent run) lives in the workflow. Code that handles a scheduled job or web request can then execute the workflow, which will in turn execute the activities as needed. The diagram below shows the overall architecture of an agentic application in Temporal. The Temporal Server is responsible for tracking program execution and making sure the associated state is preserved reliably (i.e., stored to an internal database, and possibly replicated across cloud regions). Temporal Server manages data in encrypted form, so all data processing occurs on the Worker, which runs the workflow and activities.
```text +---------------------+ | Temporal Server | (Stores workflow state, +---------------------+ schedules activities, ^ persists progress) | Save state, | Schedule Tasks, progress, | load state on resume timeouts | | +------------------------------------------------------+ | Worker | | +----------------------------------------------+ | | | Workflow Code | | | | (Agent Run Loop) | | | +----------------------------------------------+ | | | | | | | v v v | | +-----------+ +------------+ +-------------+ | | | Activity | | Activity | | Activity | | | | (Tool) | | (MCP Tool) | | (Model API) | | | +-----------+ +------------+ +-------------+ | | | | | | +------------------------------------------------------+ | | | v v v [External APIs, services, databases, etc.] ``` See the [Temporal documentation](https://docs.temporal.io/evaluate/understanding-temporal#temporal-application-the-building-blocks) for more information. ## Durable Agent Any agent can be wrapped in a TemporalAgent to get a durable agent that can be used inside a deterministic Temporal workflow, by automatically offloading all work that requires I/O (namely model requests, tool calls, and MCP server communication) to non-deterministic activities. At the time of wrapping, the agent's [model](https://ai.pydantic.dev/models/overview/index.md) and [toolsets](https://ai.pydantic.dev/toolsets/index.md) (including function tools registered on the agent and MCP servers) are frozen, activities are dynamically created for each, and the original model and toolsets are wrapped to call on the worker to execute the corresponding activities instead of directly performing the actions inside the workflow. The original agent can still be used as normal outside the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Here is a simple but complete example of wrapping an agent for durable execution, creating a Temporal workflow with durable execution logic, connecting to a Temporal server, and running the workflow from non-durable code. All it requires is a Temporal server to be [running locally](https://github.com/temporalio/temporal#download-and-start-temporal-server-locally): ```sh brew install temporal temporal server start-dev ``` [Learn about Gateway](https://ai.pydantic.dev/gateway) temporal_agent.py ```python import uuid from temporalio import workflow from temporalio.client import Client from temporalio.worker import Worker from pydantic_ai import Agent from pydantic_ai.durable_exec.temporal import ( PydanticAIPlugin, PydanticAIWorkflow, TemporalAgent, ) agent = Agent( 'gateway/openai:gpt-5.2', instructions="You're an expert in geography.", name='geography', # (10)! ) temporal_agent = TemporalAgent(agent) # (1)! @workflow.defn class GeographyWorkflow(PydanticAIWorkflow): # (2)! __pydantic_ai_agents__ = [temporal_agent] # (3)! @workflow.run async def run(self, prompt: str) -> str: result = await temporal_agent.run(prompt) # (4)! return result.output async def main(): client = await Client.connect( # (5)! 'localhost:7233', # (6)! plugins=[PydanticAIPlugin()], # (7)! ) async with Worker( # (8)! client, task_queue='geography', workflows=[GeographyWorkflow], ): output = await client.execute_workflow( # (10)! GeographyWorkflow.run, args=['What is the capital of Mexico?'], id=f'geography-{uuid.uuid4()}', task_queue='geography', ) print(output) #> Mexico City (Ciudad de México, CDMX) ``` 1. 
The original `Agent` cannot be used inside a deterministic Temporal workflow, but the `TemporalAgent` can. 1. As explained above, the workflow represents a deterministic piece of code that can use non-deterministic activities for operations that require I/O. Subclassing PydanticAIWorkflow is optional but provides proper typing for the `__pydantic_ai_agents__` class variable. 1. List the `TemporalAgent`s used by this workflow. The PydanticAIPlugin will automatically register their activities with the worker. Alternatively, if modifying the worker initialization is easier than the workflow class, you can use AgentPlugin to register agents directly on the worker. 1. TemporalAgent.run() works just like Agent.run(), but it will automatically offload model requests, tool calls, and MCP server communication to Temporal activities. 1. We connect to the Temporal server which keeps track of workflow and activity execution. 1. This assumes the Temporal server is [running locally](https://github.com/temporalio/temporal#download-and-start-temporal-server-locally). 1. The PydanticAIPlugin tells Temporal to use Pydantic for serialization and deserialization, treats UserError exceptions as non-retryable, and automatically registers activities for agents listed in `__pydantic_ai_agents__`. 1. We start the worker that will listen on the specified task queue and run workflows and activities. In a real world application, this might be run in a separate service. 1. The agent's `name` is used to uniquely identify its activities. 1. We call on the server to execute the workflow on a worker that's listening on the specified task queue. temporal_agent.py ```python import uuid from temporalio import workflow from temporalio.client import Client from temporalio.worker import Worker from pydantic_ai import Agent from pydantic_ai.durable_exec.temporal import ( PydanticAIPlugin, PydanticAIWorkflow, TemporalAgent, ) agent = Agent( 'openai:gpt-5.2', instructions="You're an expert in geography.", name='geography', # (10)! ) temporal_agent = TemporalAgent(agent) # (1)! @workflow.defn class GeographyWorkflow(PydanticAIWorkflow): # (2)! __pydantic_ai_agents__ = [temporal_agent] # (3)! @workflow.run async def run(self, prompt: str) -> str: result = await temporal_agent.run(prompt) # (4)! return result.output async def main(): client = await Client.connect( # (5)! 'localhost:7233', # (6)! plugins=[PydanticAIPlugin()], # (7)! ) async with Worker( # (8)! client, task_queue='geography', workflows=[GeographyWorkflow], ): output = await client.execute_workflow( # (10)! GeographyWorkflow.run, args=['What is the capital of Mexico?'], id=f'geography-{uuid.uuid4()}', task_queue='geography', ) print(output) #> Mexico City (Ciudad de México, CDMX) ``` 1. The original `Agent` cannot be used inside a deterministic Temporal workflow, but the `TemporalAgent` can. 1. As explained above, the workflow represents a deterministic piece of code that can use non-deterministic activities for operations that require I/O. Subclassing PydanticAIWorkflow is optional but provides proper typing for the `__pydantic_ai_agents__` class variable. 1. List the `TemporalAgent`s used by this workflow. The PydanticAIPlugin will automatically register their activities with the worker. Alternatively, if modifying the worker initialization is easier than the workflow class, you can use AgentPlugin to register agents directly on the worker. 1. 
TemporalAgent.run() works just like Agent.run(), but it will automatically offload model requests, tool calls, and MCP server communication to Temporal activities. 1. We connect to the Temporal server which keeps track of workflow and activity execution. 1. This assumes the Temporal server is [running locally](https://github.com/temporalio/temporal#download-and-start-temporal-server-locally). 1. The PydanticAIPlugin tells Temporal to use Pydantic for serialization and deserialization, treats UserError exceptions as non-retryable, and automatically registers activities for agents listed in `__pydantic_ai_agents__`. 1. We start the worker that will listen on the specified task queue and run workflows and activities. In a real world application, this might be run in a separate service. 1. The agent's `name` is used to uniquely identify its activities. 1. We call on the server to execute the workflow on a worker that's listening on the specified task queue. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* In a real world application, the agent, workflow, and worker are typically defined separately from the code that calls for a workflow to be executed. Because Temporal workflows need to be defined at the top level of the file and the `TemporalAgent` instance is needed inside the workflow and when starting the worker (to register the activities), it needs to be defined at the top level of the file as well. For more information on how to use Temporal in Python applications, see their [Python SDK guide](https://docs.temporal.io/develop/python). ## Temporal Integration Considerations There are a few considerations specific to agents and toolsets when using Temporal for durable execution. These are important to understand to ensure that your agents and toolsets work correctly with Temporal's workflow and activity model. ### Agent Names and Toolset IDs To ensure that Temporal knows what code to run when an activity fails or is interrupted and then restarted, even if your code is changed in between, each activity needs to have a name that's stable and unique. When `TemporalAgent` dynamically creates activities for the wrapped agent's model requests and toolsets (specifically those that implement their own tool listing and calling, i.e. FunctionToolset and MCPServer), their names are derived from the agent's name and the toolsets' ids. These fields are normally optional, but are required to be set when using Temporal. They should not be changed once the durable agent has been deployed to production as this would break active workflows. For dynamic toolsets created with the @agent.toolset decorator, the `id` parameter must be set explicitly. Note that with Temporal, `per_run_step=False` is not respected, as the toolset always needs to be created on-the-fly in the activity. Other than that, any agent and toolset will just work! ### Agent Run Context and Dependencies As workflows and activities run in separate processes, any values passed between them need to be serializable. As these payloads are stored in the workflow execution event history, Temporal limits their size to 2MB. To account for these limitations, tool functions and the [event stream handler](#streaming) running inside activities receive a limited version of the agent's RunContext, and it's your responsibility to make sure that the [dependencies](https://ai.pydantic.dev/dependencies/index.md) object provided to TemporalAgent.run() can be serialized using Pydantic. 
Specifically, only the `deps`, `run_id`, `metadata`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `tool_call_metadata`, `retry`, `max_retries`, `run_step`, `usage`, and `partial_output` fields are available by default, and trying to access `model`, `prompt`, `messages`, or `tracer` will raise an error. If you need one or more of these attributes to be available inside activities, you can create a TemporalRunContext subclass with custom `serialize_run_context` and `deserialize_run_context` class methods and pass it to TemporalAgent as `run_context_type`. ### Streaming Because Temporal activities cannot stream output directly to the activity call site, Agent.run_stream(), Agent.run_stream_events(), and Agent.iter() are not supported. Instead, you can implement streaming by setting an event_stream_handler on the `Agent` or `TemporalAgent` instance and using TemporalAgent.run() inside the workflow. The event stream handler function will receive the agent run context and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](https://ai.pydantic.dev/agent/#streaming-all-events). As the streaming model request activity, workflow, and workflow execution call all take place in separate processes, passing data between them requires some care: - To get data from the workflow call site or workflow to the event stream handler, you can use a [dependencies object](#agent-run-context-and-dependencies). - To get data from the event stream handler to the workflow, workflow call site, or a frontend, you need to use an external system that the event stream handler can write to and the event consumer can read from, like a message queue. You can use the dependency object to make sure the same connection string or other unique ID is available in all the places that need it. ### Model Selection at Runtime Agent.run(model=...) normally supports both model strings (like `'openai:gpt-5.2'`) and model instances. However, `TemporalAgent` does not support arbitrary model instances because they cannot be serialized for Temporal's replay mechanism. To use model instances with `TemporalAgent`, you need to pre-register them by passing a dict of model instances to `TemporalAgent(models={...})`. You can then reference them by name or by passing the registered instance directly. If the wrapped agent doesn't have a model set, the first registered model will be used as the default. Model strings work as expected. For scenarios where you need to customize the provider used by the model string (e.g., inject API keys from deps), you can pass a `provider_factory` to `TemporalAgent`, which is passed the RunContext and provider name. 
Here's an example showing how to pre-register and use multiple models: multi_model_temporal.py ```python from dataclasses import dataclass from typing import Any from temporalio import workflow from pydantic_ai import Agent from pydantic_ai.durable_exec.temporal import TemporalAgent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.models.google import GoogleModel from pydantic_ai.models.openai import OpenAIResponsesModel from pydantic_ai.providers import Provider from pydantic_ai.tools import RunContext @dataclass class Deps: openai_api_key: str | None = None anthropic_api_key: str | None = None # Create models from different providers default_model = OpenAIResponsesModel('gpt-5.2') fast_model = AnthropicModel('claude-sonnet-4-5') reasoning_model = GoogleModel('gemini-3-pro-preview') # Optional: provider factory for dynamic model configuration def my_provider_factory(run_context: RunContext[Deps], provider_name: str) -> Provider[Any]: """Create providers with custom configuration based on run context.""" if provider_name == 'openai': from pydantic_ai.providers.openai import OpenAIProvider return OpenAIProvider(api_key=run_context.deps.openai_api_key) elif provider_name == 'anthropic': from pydantic_ai.providers.anthropic import AnthropicProvider return AnthropicProvider(api_key=run_context.deps.anthropic_api_key) else: raise ValueError(f'Unknown provider: {provider_name}') agent = Agent(default_model, name='multi_model_agent', deps_type=Deps) temporal_agent = TemporalAgent( agent, models={ 'fast': fast_model, 'reasoning': reasoning_model, }, provider_factory=my_provider_factory, # Optional ) @workflow.defn class MultiModelWorkflow: @workflow.run async def run(self, prompt: str, use_reasoning: bool, use_fast: bool) -> str: if use_reasoning: # Select by registered name result = await temporal_agent.run(prompt, model='reasoning') elif use_fast: # Or pass the registered instance directly result = await temporal_agent.run(prompt, model=fast_model) else: # Or pass a model string (uses provider_factory if set) result = await temporal_agent.run(prompt, model='openai:gpt-5-mini') return result.output ``` ## Activity Configuration Temporal activity configuration, like timeouts and retry policies, can be customized by passing [`temporalio.workflow.ActivityConfig`](https://python.temporal.io/temporalio.workflow.ActivityConfig.html) objects to the `TemporalAgent` constructor: - `activity_config`: The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used. - `model_activity_config`: The Temporal activity config to use for model request activities. This is merged with the base activity config. - `toolset_activity_config`: The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. - `tool_activity_config`: The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use I/O, you can specify `False` to disable using an activity. Note that the tool is required to be defined as an `async` function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities. 
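As a sketch of what this configuration might look like (the timeout and retry values below are illustrative rather than recommendations, and the file name is hypothetical):

temporal_activity_config.py

```python
from datetime import timedelta

from temporalio.common import RetryPolicy
from temporalio.workflow import ActivityConfig

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import TemporalAgent

agent = Agent(
    'openai:gpt-5.2',
    instructions="You're an expert in geography.",
    name='geography',
)

temporal_agent = TemporalAgent(
    agent,
    # Base config applied to every activity created for this agent.
    activity_config=ActivityConfig(start_to_close_timeout=timedelta(seconds=60)),
    # Model request activities get a longer timeout and an explicit retry policy;
    # this is merged with the base config above.
    model_activity_config=ActivityConfig(
        start_to_close_timeout=timedelta(minutes=2),
        retry_policy=RetryPolicy(maximum_attempts=3),
    ),
)
```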
## Activity Retries On top of the automatic retries for request failures that Temporal will perform, Pydantic AI and various provider API clients also have their own request retry logic. Enabling these at the same time may cause the request to be retried more often than expected, with improper `Retry-After` handling. When using Temporal, it's recommended to not use [HTTP Request Retries](https://ai.pydantic.dev/retries/index.md) and to turn off your provider API client's own retry logic, for example by setting `max_retries=0` on a [custom `OpenAIProvider` API client](https://ai.pydantic.dev/models/openai/#custom-openai-client). You can customize Temporal's retry policy using [activity configuration](#activity-configuration). ## Observability with Logfire Temporal generates telemetry events and metrics for each workflow and activity execution, and Pydantic AI generates events for each agent run, model request and tool call. These can be sent to [Pydantic Logfire](https://ai.pydantic.dev/logfire/index.md) to get a complete picture of what's happening in your application. To use Logfire with Temporal, you need to pass a LogfirePlugin object to Temporal's `Client.connect()`: logfire_plugin.py ```python from temporalio.client import Client from pydantic_ai.durable_exec.temporal import LogfirePlugin, PydanticAIPlugin async def main(): client = await Client.connect( 'localhost:7233', plugins=[PydanticAIPlugin(), LogfirePlugin()], ) ``` By default, the `LogfirePlugin` will instrument Temporal (including metrics) and Pydantic AI and send all data to Logfire. To customize Logfire configuration and instrumentation, you can pass a `logfire_setup` function to the `LogfirePlugin` constructor and return a custom `Logfire` instance (i.e. the result of `logfire.configure()`). To disable sending Temporal metrics to Logfire, you can pass `metrics=False` to the `LogfirePlugin` constructor. ## Known Issues ### Pandas When `logfire.info` is used inside an activity and the `pandas` package is among your project's dependencies, you may encounter the following error which seems to be the result of an import race condition: ```text AttributeError: partially initialized module 'pandas' has no attribute '_pandas_parser_CAPI' (most likely due to a circular import) ``` To fix this, you can use the [`temporalio.workflow.unsafe.imports_passed_through()`](https://python.temporal.io/temporalio.workflow.unsafe.html#imports_passed_through) context manager to proactively import the package and not have it be reloaded in the workflow sandbox: temporal_activity.py ```python from temporalio import workflow with workflow.unsafe.imports_passed_through(): import pandas ``` # MCP # Client Pydantic AI can act as an [MCP client](https://modelcontextprotocol.io/quickstart/client), connecting to MCP servers to use their tools. 
## Install You need to either install [`pydantic-ai`](https://ai.pydantic.dev/install/index.md), or [`pydantic-ai-slim`](https://ai.pydantic.dev/install/#slim-install) with the `mcp` optional group: ```bash pip install "pydantic-ai-slim[mcp]" ``` ```bash uv add "pydantic-ai-slim[mcp]" ``` ## Usage Pydantic AI comes with three ways to connect to MCP servers: - MCPServerStreamableHTTP which connects to an MCP server using the [Streamable HTTP](https://modelcontextprotocol.io/introduction#streamable-http) transport - MCPServerSSE which connects to an MCP server using the [HTTP SSE](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) transport - MCPServerStdio which runs the server as a subprocess and connects to it using the [stdio](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) transport Examples of all three are shown below. Each MCP server instance is a [toolset](https://ai.pydantic.dev/toolsets/index.md) and can be registered with an Agent using the `toolsets` argument. You can use the async with agent context manager to open and close connections to all registered servers (and in the case of stdio servers, start and stop the subprocesses) around the context where they'll be used in agent runs. You can also use async with server to manage the connection or subprocess of a specific server, for example if you'd like to use it with multiple agents. If you don't explicitly enter one of these context managers to set up the server, this will be done automatically when it's needed (e.g. to list the available tools or call a specific tool), but it's more efficient to do so around the entire context where you expect the servers to be used. ### Streamable HTTP Client MCPServerStreamableHTTP connects over HTTP using the [Streamable HTTP](https://modelcontextprotocol.io/introduction#streamable-http) transport to a server. Note MCPServerStreamableHTTP requires an MCP server to be running and accepting HTTP connections before running the agent. Running the server is not managed by Pydantic AI. Before creating the Streamable HTTP client, we need to run a server that supports the Streamable HTTP transport. streamable_http_server.py ```python from mcp.server.fastmcp import FastMCP app = FastMCP() @app.tool() def add(a: int, b: int) -> int: return a + b if __name__ == '__main__': app.run(transport='streamable-http') ``` Then we can create the client: [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_streamable_http_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') # (1)! agent = Agent('gateway/openai:gpt-5.2', toolsets=[server]) # (2)! async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` 1. Define the MCP server with the URL used to connect. 1. Create an agent with the MCP server attached. mcp_streamable_http_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') # (1)! agent = Agent('openai:gpt-5.2', toolsets=[server]) # (2)! async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` 1. Define the MCP server with the URL used to connect. 1. Create an agent with the MCP server attached. 
*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* **What's happening here?** - The model receives the prompt "What is 7 plus 5?" - The model decides "Oh, I've got this `add` tool, that will be a good way to answer this question" - The model returns a tool call - Pydantic AI sends the tool call to the MCP server using the Streamable HTTP transport - The model is called again with the return value of running the `add` tool (12) - The model returns the final answer You can visualise this clearly, and even see the tool call, by adding three lines of code to instrument the example with [logfire](https://logfire.pydantic.dev/docs): mcp_sse_client_logfire.py ```python import logfire logfire.configure() logfire.instrument_pydantic_ai() ``` ### SSE Client MCPServerSSE connects over HTTP using the [HTTP + Server Sent Events transport](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) to a server. Note The SSE transport in MCP is deprecated, you should use Streamable HTTP instead. Before creating the SSE client, we need to run a server that supports the SSE transport. sse_server.py ```python from mcp.server.fastmcp import FastMCP app = FastMCP() @app.tool() def add(a: int, b: int) -> int: return a + b if __name__ == '__main__': app.run(transport='sse') ``` Then we can create the client: [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_sse_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') # (1)! agent = Agent('gateway/openai:gpt-5.2', toolsets=[server]) # (2)! async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` 1. Define the MCP server with the URL used to connect. 1. Create an agent with the MCP server attached. mcp_sse_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') # (1)! agent = Agent('openai:gpt-5.2', toolsets=[server]) # (2)! async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` 1. Define the MCP server with the URL used to connect. 1. Create an agent with the MCP server attached. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### MCP "stdio" Server MCP also offers [stdio transport](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) where the server is run as a subprocess and communicates with the client over `stdin` and `stdout`. In this case, you'd use the MCPServerStdio class. In this example we use a simple MCP server that provides weather tools. [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_stdio_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio('python', args=['mcp_server.py'], timeout=10) agent = Agent('gateway/openai:gpt-5.2', toolsets=[server]) async def main(): result = await agent.run('What is the weather in Paris?') print(result.output) #> The weather in Paris is sunny and 26 degrees Celsius. 
``` mcp_stdio_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio('python', args=['mcp_server.py'], timeout=10) agent = Agent('openai:gpt-5.2', toolsets=[server]) async def main(): result = await agent.run('What is the weather in Paris?') print(result.output) #> The weather in Paris is sunny and 26 degrees Celsius. ``` ## Loading MCP Servers from Configuration Instead of creating MCP server instances individually in code, you can load multiple servers from a JSON configuration file using load_mcp_servers(). This is particularly useful when you need to manage multiple MCP servers or want to configure servers externally without modifying code. ### Configuration Format The configuration file should be a JSON file with an `mcpServers` object containing server definitions. Each server is identified by a unique key and contains the configuration for that server type: mcp_config.json ```json { "mcpServers": { "python-runner": { "command": "uv", "args": ["run", "mcp-run-python", "stdio"] }, "weather": { "command": "python", "args": ["mcp_server.py"] }, "weather-api": { "url": "http://localhost:3001/sse" }, "calculator": { "url": "http://localhost:8000/mcp" } } } ``` Note The MCP server is only inferred to be an SSE server because of the `/sse` suffix. Any other server with the "url" field will be inferred to be a Streamable HTTP server. We made this decision given that the SSE transport is deprecated. ### Environment Variables The configuration file supports environment variable expansion using the `${VAR}` and `${VAR:-default}` syntax, [like Claude Code](https://code.claude.com/docs/en/mcp#environment-variable-expansion-in-mcp-json). This is useful for keeping sensitive information like API keys or host names out of your configuration files: mcp_config_with_env.json ```json { "mcpServers": { "python-runner": { "command": "${PYTHON_CMD:-python3}", "args": ["run", "${MCP_MODULE}", "stdio"], "env": { "API_KEY": "${MY_API_KEY}" } }, "weather-api": { "url": "https://${SERVER_HOST:-localhost}:${SERVER_PORT:-8080}/sse" } } } ``` When loading this configuration with load_mcp_servers(): - `${VAR}` references will be replaced with the corresponding environment variable values. - `${VAR:-default}` references will use the environment variable value if set, otherwise the default value. Warning If a referenced environment variable using `${VAR}` syntax is not defined, a `ValueError` will be raised. Use the `${VAR:-default}` syntax to provide a fallback value. 
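As a small sketch of how the expansion behaves with the hypothetical variable names from `mcp_config_with_env.json` above:

mcp_config_env_example.py

```python
import os

from pydantic_ai.mcp import load_mcp_servers

# These variable names come from the hypothetical config above; adjust them to your own.
os.environ['MCP_MODULE'] = 'mcp-run-python'
os.environ['MY_API_KEY'] = 'my-secret-key'
# PYTHON_CMD, SERVER_HOST, and SERVER_PORT are left unset, so their
# ${VAR:-default} fallbacks ('python3', 'localhost', '8080') are used instead.

servers = load_mcp_servers('mcp_config_with_env.json')
# If MY_API_KEY were not set, the bare ${MY_API_KEY} reference above would
# raise a ValueError when the configuration is loaded.
```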
### Usage [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_config_loader.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import load_mcp_servers # Load all servers from configuration file servers = load_mcp_servers('mcp_config.json') # Create agent with all loaded servers agent = Agent('gateway/openai:gpt-5.2', toolsets=servers) async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) ``` mcp_config_loader.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import load_mcp_servers # Load all servers from configuration file servers = load_mcp_servers('mcp_config.json') # Create agent with all loaded servers agent = Agent('openai:gpt-5.2', toolsets=servers) async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Tool call customization The MCP servers provide the ability to set a `process_tool_call` which allows the customization of tool call requests and their responses. A common use case for this is to inject metadata to the requests which the server call needs: mcp_process_tool_call.py ```python from typing import Any from pydantic_ai import Agent, RunContext from pydantic_ai.mcp import CallToolFunc, MCPServerStdio, ToolResult from pydantic_ai.models.test import TestModel async def process_tool_call( ctx: RunContext[int], call_tool: CallToolFunc, name: str, tool_args: dict[str, Any], ) -> ToolResult: """A tool call processor that passes along the deps.""" return await call_tool(name, tool_args, {'deps': ctx.deps}) server = MCPServerStdio('python', args=['mcp_server.py'], process_tool_call=process_tool_call) agent = Agent( model=TestModel(call_tools=['echo_deps']), deps_type=int, toolsets=[server] ) async def main(): result = await agent.run('Echo with deps set to 42', deps=42) print(result.output) #> {"echo_deps":{"echo":"This is an echo message","deps":42}} ``` How to access the metadata is MCP server SDK specific. For example with the [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk), it is accessible via the [`ctx: Context`](https://github.com/modelcontextprotocol/python-sdk#context) argument that can be included on tool call handlers: mcp_server.py ```python from typing import Any from mcp.server.fastmcp import Context, FastMCP from mcp.server.session import ServerSession mcp = FastMCP('Pydantic AI MCP Server') log_level = 'unset' @mcp.tool() async def echo_deps(ctx: Context[ServerSession, None]) -> dict[str, Any]: """Echo the run context. Args: ctx: Context object containing request and session information. Returns: Dictionary with an echo message and the deps. """ await ctx.info('This is an info message') deps: Any = getattr(ctx.request_context.meta, 'deps') return {'echo': 'This is an echo message', 'deps': deps} if __name__ == '__main__': mcp.run() ``` ## Using Tool Prefixes to Avoid Naming Conflicts When connecting to multiple MCP servers that might provide tools with the same name, you can use the `tool_prefix` parameter to avoid naming conflicts. This parameter adds a prefix to all tool names from a specific server. 
This allows you to use multiple servers that might have overlapping tool names without conflicts: [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_tool_prefix_http_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE # Create two servers with different prefixes weather_server = MCPServerSSE( 'http://localhost:3001/sse', tool_prefix='weather' # Tools will be prefixed with 'weather_' ) calculator_server = MCPServerSSE( 'http://localhost:3002/sse', tool_prefix='calc' # Tools will be prefixed with 'calc_' ) # Both servers might have a tool named 'get_data', but they'll be exposed as: # - 'weather_get_data' # - 'calc_get_data' agent = Agent('gateway/openai:gpt-5.2', toolsets=[weather_server, calculator_server]) ``` mcp_tool_prefix_http_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE # Create two servers with different prefixes weather_server = MCPServerSSE( 'http://localhost:3001/sse', tool_prefix='weather' # Tools will be prefixed with 'weather_' ) calculator_server = MCPServerSSE( 'http://localhost:3002/sse', tool_prefix='calc' # Tools will be prefixed with 'calc_' ) # Both servers might have a tool named 'get_data', but they'll be exposed as: # - 'weather_get_data' # - 'calc_get_data' agent = Agent('openai:gpt-5.2', toolsets=[weather_server, calculator_server]) ``` ## Server Instructions MCP servers can provide instructions during initialization that give context about how to best interact with the server's tools. These instructions are accessible via the instructions property after the server connection is established. [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_server_instructions.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') agent = Agent('gateway/openai:gpt-5.2', toolsets=[server]) @agent.instructions async def mcp_server_instructions(): return server.instructions # (1)! async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` 1. The server connection is guaranteed to be established by this point, so `server.instructions` is available. mcp_server_instructions.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') agent = Agent('openai:gpt-5.2', toolsets=[server]) @agent.instructions async def mcp_server_instructions(): return server.instructions # (1)! async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` 1. The server connection is guaranteed to be established by this point, so `server.instructions` is available. ## Tool metadata MCP tools can include metadata that provides additional information about the tool's characteristics, which can be useful when filtering tools. The `meta`, `annotations`, and `output_schema` fields can be found on the `metadata` dict on the ToolDefinition object that's passed to filter functions. ## Resources MCP servers can provide [resources](https://modelcontextprotocol.io/docs/concepts/resources) - files, data, or content that can be accessed by the client. Resources in MCP are application-driven, with host applications determining how to incorporate context manually, based on their needs. This means they will *not* be exposed to the LLM automatically (unless a tool returns a `ResourceLink` or `EmbeddedResource`). 
Pydantic AI provides methods to discover and read resources from MCP servers: - list_resources() - List all available resources on the server - list_resource_templates() - List resource templates with parameter placeholders - read_resource(uri) - Read the contents of a specific resource by URI Resources are automatically converted: text content is returned as `str`, and binary content is returned as BinaryContent. Before consuming resources, we need to run a server that exposes some: mcp_resource_server.py ```python from mcp.server.fastmcp import FastMCP mcp = FastMCP('Pydantic AI MCP Server') log_level = 'unset' @mcp.resource('resource://user_name.txt', mime_type='text/plain') async def user_name_resource() -> str: return 'Alice' if __name__ == '__main__': mcp.run() ``` Then we can create the client: mcp_resources.py ```python import asyncio from pydantic_ai.mcp import MCPServerStdio async def main(): server = MCPServerStdio('python', args=['-m', 'mcp_resource_server']) async with server: # List all available resources resources = await server.list_resources() for resource in resources: print(f' - {resource.name}: {resource.uri} ({resource.mime_type})') #> - user_name_resource: resource://user_name.txt (text/plain) # Read a text resource user_name = await server.read_resource('resource://user_name.txt') print(f'Text content: {user_name}') #> Text content: Alice if __name__ == '__main__': asyncio.run(main()) ``` *(This example is complete, it can be run "as is")* ## Custom TLS / SSL configuration In some environments you need to tweak how HTTPS connections are established – for example to trust an internal Certificate Authority, present a client certificate for **mTLS**, or (during local development only!) disable certificate verification altogether. All HTTP-based MCP client classes (MCPServerStreamableHTTP and MCPServerSSE) expose an `http_client` parameter that lets you pass your own pre-configured [`httpx.AsyncClient`](https://www.python-httpx.org/async/). [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_custom_tls_client.py ```python import ssl import httpx from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE # Trust an internal / self-signed CA ssl_ctx = ssl.create_default_context(cafile='/etc/ssl/private/my_company_ca.pem') # OPTIONAL: if the server requires **mutual TLS** load your client certificate ssl_ctx.load_cert_chain(certfile='/etc/ssl/certs/client.crt', keyfile='/etc/ssl/private/client.key',) http_client = httpx.AsyncClient( verify=ssl_ctx, timeout=httpx.Timeout(10.0), ) server = MCPServerSSE( 'http://localhost:3001/sse', http_client=http_client, # (1)! ) agent = Agent('gateway/openai:gpt-5.2', toolsets=[server]) async def main(): result = await agent.run('How many days between 2000-01-01 and 2025-03-18?') print(result.output) #> There are 9,208 days between January 1, 2000, and March 18, 2025. ``` 1. When you supply `http_client`, Pydantic AI re-uses this client for every request. Anything supported by **httpx** (`verify`, `cert`, custom proxies, timeouts, etc.) therefore applies to all MCP traffic. 
mcp_custom_tls_client.py ```python import ssl import httpx from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE # Trust an internal / self-signed CA ssl_ctx = ssl.create_default_context(cafile='/etc/ssl/private/my_company_ca.pem') # OPTIONAL: if the server requires **mutual TLS** load your client certificate ssl_ctx.load_cert_chain(certfile='/etc/ssl/certs/client.crt', keyfile='/etc/ssl/private/client.key',) http_client = httpx.AsyncClient( verify=ssl_ctx, timeout=httpx.Timeout(10.0), ) server = MCPServerSSE( 'http://localhost:3001/sse', http_client=http_client, # (1)! ) agent = Agent('openai:gpt-5.2', toolsets=[server]) async def main(): result = await agent.run('How many days between 2000-01-01 and 2025-03-18?') print(result.output) #> There are 9,208 days between January 1, 2000, and March 18, 2025. ``` 1. When you supply `http_client`, Pydantic AI re-uses this client for every request. Anything supported by **httpx** (`verify`, `cert`, custom proxies, timeouts, etc.) therefore applies to all MCP traffic. ## Client Identification When connecting to an MCP server, you can optionally specify an [Implementation](https://modelcontextprotocol.io/specification/2025-11-25/schema#implementation) object as client information that will be sent to the server during initialization. This is useful for: - Identifying your application in server logs - Allowing servers to provide custom behavior based on the client - Debugging and monitoring MCP connections - Version-specific feature negotiation All MCP client classes (MCPServerStdio, MCPServerStreamableHTTP, and MCPServerSSE) support the `client_info` parameter: mcp_client_with_name.py ```python from mcp import types as mcp_types from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE( 'http://localhost:3001/sse', client_info=mcp_types.Implementation( name='MyApplication', version='2.1.0', ), ) ``` ## MCP Sampling What is MCP Sampling? In MCP [sampling](https://modelcontextprotocol.io/docs/concepts/sampling) is a system by which an MCP server can make LLM calls via the MCP client - effectively proxying requests to an LLM via the client over whatever transport is being used. Sampling is extremely useful when MCP servers need to use Gen AI but you don't want to provision them each with their own LLM credentials or when a public MCP server would like the connecting client to pay for LLM calls. Confusingly it has nothing to do with the concept of "sampling" in observability, or frankly the concept of "sampling" in any other domain. Sampling Diagram Here's a mermaid diagram that may or may not make the data flow clearer: ``` sequenceDiagram participant LLM participant MCP_Client as MCP client participant MCP_Server as MCP server MCP_Client->>LLM: LLM call LLM->>MCP_Client: LLM tool call response MCP_Client->>MCP_Server: tool call MCP_Server->>MCP_Client: sampling "create message" MCP_Client->>LLM: LLM call LLM->>MCP_Client: LLM text response MCP_Client->>MCP_Server: sampling response MCP_Server->>MCP_Client: tool call response ``` Pydantic AI supports sampling as both a client and server. See the [server](https://ai.pydantic.dev/mcp/server/#mcp-sampling) documentation for details on how to use sampling within a server. Sampling is automatically supported by Pydantic AI agents when they act as a client. To be able to use sampling, an MCP server instance needs to have a sampling_model set. 
This can be done either directly on the server, using the constructor keyword argument or the property, or by calling agent.set_mcp_sampling_model(), which sets the agent's model (or one specified as an argument) as the sampling model on all MCP servers registered with that agent. Let's say we have an MCP server that wants to use sampling (in this case to generate an SVG as per the tool arguments). Sampling MCP Server generate_svg.py ````python import re from pathlib import Path from mcp import SamplingMessage from mcp.server.fastmcp import Context, FastMCP from mcp.types import TextContent app = FastMCP() @app.tool() async def image_generator(ctx: Context, subject: str, style: str) -> str: prompt = f'{subject=} {style=}' # `ctx.session.create_message` is the sampling call result = await ctx.session.create_message( [SamplingMessage(role='user', content=TextContent(type='text', text=prompt))], max_tokens=1_024, system_prompt='Generate an SVG image as per the user input', ) assert isinstance(result.content, TextContent) path = Path(f'{subject}_{style}.svg') # remove triple backticks if the svg was returned within markdown if m := re.search(r'^```\w*$(.+?)```$', result.content.text, re.S | re.M): path.write_text(m.group(1), encoding='utf-8') else: path.write_text(result.content.text, encoding='utf-8') return f'See {path}' if __name__ == '__main__': # run the server via stdio app.run() ```` Using this server with an `Agent` will automatically allow sampling: sampling_mcp_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio('python', args=['generate_svg.py']) agent = Agent('openai:gpt-5.2', toolsets=[server]) async def main(): agent.set_mcp_sampling_model() result = await agent.run('Create an image of a robot in a punk style.') print(result.output) #> Image file written to robot_punk.svg. ``` *(This example is complete, it can be run "as is")* You can disallow sampling by setting allow_sampling=False when creating the server reference, e.g.: sampling_disallowed.py ```python from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( 'python', args=['generate_svg.py'], allow_sampling=False, ) ``` ## Elicitation In MCP, [elicitation](https://modelcontextprotocol.io/docs/concepts/elicitation) allows a server to request [structured input](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#supported-schema-types) from the client for missing or additional context during a session. Elicitation lets the server essentially say "Hold on - I need to know X before I can continue" rather than requiring everything upfront or taking a shot in the dark. ### How Elicitation works Elicitation introduces a new protocol message type called [`ElicitRequest`](https://modelcontextprotocol.io/specification/2025-06-18/schema#elicitrequest), which is sent from the server to the client when it needs additional information. The client can then respond with an [`ElicitResult`](https://modelcontextprotocol.io/specification/2025-06-18/schema#elicitresult) or an `ErrorData` message. Here's a typical interaction: - User makes a request to the MCP server (e.g. "Book a table at that Italian place") - The server identifies that it needs more information (e.g. "Which Italian place?", "What date and time?") - The server sends an `ElicitRequest` to the client asking for the missing information. - The client receives the request, presents it to the user (e.g. via a terminal prompt, GUI dialog, or web interface).
- User provides the requested information, or chooses to `decline` or `cancel` the request. - The client sends an `ElicitResult` back to the server with the user's response. - With the structured data, the server can continue processing the original request. This allows for a more interactive and user-friendly experience, especially for multi-stage workflows. Instead of requiring all information upfront, the server can ask for it as needed, making the interaction feel more natural. ### Setting up Elicitation To enable elicitation, provide an elicitation_callback function when creating your MCP server instance: restaurant_server.py ```python from mcp.server.fastmcp import Context, FastMCP from pydantic import BaseModel, Field mcp = FastMCP(name='Restaurant Booking') class BookingDetails(BaseModel): """Schema for restaurant booking information.""" restaurant: str = Field(description='Choose a restaurant') party_size: int = Field(description='Number of people', ge=1, le=8) date: str = Field(description='Reservation date (DD-MM-YYYY)') @mcp.tool() async def book_table(ctx: Context) -> str: """Book a restaurant table with user input.""" # Ask user for booking details using Pydantic schema result = await ctx.elicit(message='Please provide your booking details:', schema=BookingDetails) if result.action == 'accept' and result.data: booking = result.data return f'✅ Booked table for {booking.party_size} at {booking.restaurant} on {booking.date}' elif result.action == 'decline': return 'No problem! Maybe another time.' else: # cancel return 'Booking cancelled.' if __name__ == '__main__': mcp.run(transport='stdio') ``` This server demonstrates elicitation by requesting structured booking details from the client when the `book_table` tool is called. Here's how to create a client that handles these elicitation requests: [Learn about Gateway](https://ai.pydantic.dev/gateway) client_example.py ```python import asyncio from typing import Any from mcp.client.session import ClientSession from mcp.shared.context import RequestContext from mcp.types import ElicitRequestParams, ElicitResult from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio async def handle_elicitation( context: RequestContext[ClientSession, Any, Any], params: ElicitRequestParams, ) -> ElicitResult: """Handle elicitation requests from MCP server.""" print(f'\n{params.message}') if not params.requestedSchema: response = input('Response: ') return ElicitResult(action='accept', content={'response': response}) # Collect data for each field properties = params.requestedSchema['properties'] data = {} for field, info in properties.items(): description = info.get('description', field) value = input(f'{description}: ') # Convert to proper type based on JSON schema if info.get('type') == 'integer': data[field] = int(value) else: data[field] = value # Confirm confirm = input('\nConfirm booking? 
(y/n/c): ').lower() if confirm == 'y': print('Booking details:', data) return ElicitResult(action='accept', content=data) elif confirm == 'n': return ElicitResult(action='decline') else: return ElicitResult(action='cancel') # Set up MCP server connection restaurant_server = MCPServerStdio( 'python', args=['restaurant_server.py'], elicitation_callback=handle_elicitation ) # Create agent agent = Agent('gateway/openai:gpt-5.2', toolsets=[restaurant_server]) async def main(): """Run the agent to book a restaurant table.""" result = await agent.run('Book me a table') print(f'\nResult: {result.output}') if __name__ == '__main__': asyncio.run(main()) ``` client_example.py ```python import asyncio from typing import Any from mcp.client.session import ClientSession from mcp.shared.context import RequestContext from mcp.types import ElicitRequestParams, ElicitResult from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio async def handle_elicitation( context: RequestContext[ClientSession, Any, Any], params: ElicitRequestParams, ) -> ElicitResult: """Handle elicitation requests from MCP server.""" print(f'\n{params.message}') if not params.requestedSchema: response = input('Response: ') return ElicitResult(action='accept', content={'response': response}) # Collect data for each field properties = params.requestedSchema['properties'] data = {} for field, info in properties.items(): description = info.get('description', field) value = input(f'{description}: ') # Convert to proper type based on JSON schema if info.get('type') == 'integer': data[field] = int(value) else: data[field] = value # Confirm confirm = input('\nConfirm booking? (y/n/c): ').lower() if confirm == 'y': print('Booking details:', data) return ElicitResult(action='accept', content=data) elif confirm == 'n': return ElicitResult(action='decline') else: return ElicitResult(action='cancel') # Set up MCP server connection restaurant_server = MCPServerStdio( 'python', args=['restaurant_server.py'], elicitation_callback=handle_elicitation ) # Create agent agent = Agent('openai:gpt-5.2', toolsets=[restaurant_server]) async def main(): """Run the agent to book a restaurant table.""" result = await agent.run('Book me a table') print(f'\nResult: {result.output}') if __name__ == '__main__': asyncio.run(main()) ``` ### Supported Schema Types MCP elicitation supports string, number, boolean, and enum types with flat object structures only. These limitations ensure reliable cross-client compatibility. See [supported schema types](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#supported-schema-types) for details. ### Security MCP Elicitation requires careful handling - servers must not request sensitive information, and clients must implement user approval controls with clear explanations. See [security considerations](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#security-considerations) for details. # FastMCP Client [FastMCP](https://gofastmcp.com/) is a higher-level MCP framework that bills itself as "The fast, Pythonic way to build MCP servers and clients." It supports additional capabilities on top of the MCP specification like [Tool Transformation](https://gofastmcp.com/patterns/tool-transformation), [OAuth](https://gofastmcp.com/clients/auth/oauth), and more. 
As an alternative to Pydantic AI's standard [`MCPServer` MCP client](https://ai.pydantic.dev/mcp/client/index.md) built on the [MCP SDK](https://github.com/modelcontextprotocol/python-sdk), you can use the FastMCPToolset [toolset](https://ai.pydantic.dev/toolsets/index.md) that leverages the [FastMCP Client](https://gofastmcp.com/clients/) to connect to local and remote MCP servers, whether or not they're built using [FastMCP Server](https://gofastmcp.com/servers/). Note that it does not yet support elicitation or sampling, which are supported by the [standard `MCPServer` client](https://ai.pydantic.dev/mcp/client/index.md). ## Install To use the `FastMCPToolset`, you will need to install [`pydantic-ai-slim`](https://ai.pydantic.dev/install/#slim-install) with the `fastmcp` optional group: ```bash pip install "pydantic-ai-slim[fastmcp]" ``` ```bash uv add "pydantic-ai-slim[fastmcp]" ``` ## Usage A `FastMCPToolset` can then be created from: - A FastMCP Server: `FastMCPToolset(fastmcp.FastMCP('my_server'))` - A FastMCP Client: `FastMCPToolset(fastmcp.Client(...))` - A FastMCP Transport: `FastMCPToolset(fastmcp.StdioTransport(command='python', args=['mcp_server.py']))` - A Streamable HTTP URL: `FastMCPToolset('http://localhost:8000/mcp')` - An HTTP SSE URL: `FastMCPToolset('http://localhost:8000/sse')` - A Python Script: `FastMCPToolset('my_server.py')` - A Node.js Script: `FastMCPToolset('my_server.js')` - A JSON MCP Configuration: `FastMCPToolset({'mcpServers': {'my_server': {'command': 'python', 'args': ['mcp_server.py']}}})` If you already have a [FastMCP Server](https://gofastmcp.com/servers) in the same codebase as your Pydantic AI agent, you can create a `FastMCPToolset` directly from it and save the agent a network round trip: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from fastmcp import FastMCP from pydantic_ai import Agent from pydantic_ai.toolsets.fastmcp import FastMCPToolset fastmcp_server = FastMCP('my_server') @fastmcp_server.tool() async def add(a: int, b: int) -> int: return a + b toolset = FastMCPToolset(fastmcp_server) agent = Agent('gateway/openai:gpt-5.2', toolsets=[toolset]) async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` ```python from fastmcp import FastMCP from pydantic_ai import Agent from pydantic_ai.toolsets.fastmcp import FastMCPToolset fastmcp_server = FastMCP('my_server') @fastmcp_server.tool() async def add(a: int, b: int) -> int: return a + b toolset = FastMCPToolset(fastmcp_server) agent = Agent('openai:gpt-5.2', toolsets=[toolset]) async def main(): result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. 
``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* Connecting your agent to a Streamable HTTP MCP Server is as simple as: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent from pydantic_ai.toolsets.fastmcp import FastMCPToolset toolset = FastMCPToolset('http://localhost:8000/mcp') agent = Agent('gateway/openai:gpt-5.2', toolsets=[toolset]) ``` ```python from pydantic_ai import Agent from pydantic_ai.toolsets.fastmcp import FastMCPToolset toolset = FastMCPToolset('http://localhost:8000/mcp') agent = Agent('openai:gpt-5.2', toolsets=[toolset]) ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* You can also create a `FastMCPToolset` from a JSON MCP Configuration: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent from pydantic_ai.toolsets.fastmcp import FastMCPToolset mcp_config = { 'mcpServers': { 'time_mcp_server': { 'command': 'uvx', 'args': ['mcp-run-python', 'stdio'] }, 'weather_server': { 'command': 'python', 'args': ['mcp_server.py'] } } } toolset = FastMCPToolset(mcp_config) agent = Agent('gateway/openai:gpt-5.2', toolsets=[toolset]) ``` ```python from pydantic_ai import Agent from pydantic_ai.toolsets.fastmcp import FastMCPToolset mcp_config = { 'mcpServers': { 'time_mcp_server': { 'command': 'uvx', 'args': ['mcp-run-python', 'stdio'] }, 'weather_server': { 'command': 'python', 'args': ['mcp_server.py'] } } } toolset = FastMCPToolset(mcp_config) agent = Agent('openai:gpt-5.2', toolsets=[toolset]) ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* # Model Context Protocol (MCP) Pydantic AI supports [Model Context Protocol (MCP)](https://modelcontextprotocol.io) in multiple ways: 1. [Agents](https://ai.pydantic.dev/agent/index.md) can connect to MCP servers and use their tools using three different methods: 1. Pydantic AI can act as an MCP client and connect directly to local and remote MCP servers. [Learn more](https://ai.pydantic.dev/mcp/client/index.md) about MCPServer. 1. Pydantic AI can use the [FastMCP Client](https://gofastmcp.com/clients/client/) to connect to local and remote MCP servers, whether or not they're built using [FastMCP Server](https://gofastmcp.com/servers). [Learn more](https://ai.pydantic.dev/mcp/fastmcp-client/index.md) about FastMCPToolset. 1. Some model providers can themselves connect to remote MCP servers using a "built-in tool". [Learn more](https://ai.pydantic.dev/builtin-tools/#mcp-server-tool) about MCPServerTool. 1. Agents can be used within MCP servers. [Learn more](https://ai.pydantic.dev/mcp/server/index.md) ## What is MCP? The Model Context Protocol is a standardized protocol that allows AI applications (including programmatic agents like Pydantic AI, coding agents like [cursor](https://www.cursor.com/), and desktop applications like [Claude Desktop](https://claude.ai/download)) to connect to external tools and services using a common interface. As with other protocols, the dream of MCP is that a wide range of applications can speak to each other without the need for specific integrations. There is a great list of MCP servers at [github.com/modelcontextprotocol/servers](https://github.com/modelcontextprotocol/servers). 
Some examples of what this means: - Pydantic AI could use a web search service implemented as an MCP server to implement a deep research agent - Cursor could connect to the [Pydantic Logfire](https://github.com/pydantic/logfire-mcp) MCP server to search logs, traces and metrics to gain context while fixing a bug - Pydantic AI, or any other MCP client could connect to our [Run Python](https://github.com/pydantic/mcp-run-python) MCP server to run arbitrary Python code in a sandboxed environment # Server Pydantic AI models can also be used within MCP Servers. ## MCP Server Here's a simple example of a [Python MCP server](https://github.com/modelcontextprotocol/python-sdk) using Pydantic AI within a tool call: [Learn about Gateway](https://ai.pydantic.dev/gateway) mcp_server.py ```python from mcp.server.fastmcp import FastMCP from pydantic_ai import Agent server = FastMCP('Pydantic AI Server') server_agent = Agent( 'gateway/anthropic:claude-haiku-4-5', instructions='always reply in rhyme' ) @server.tool() async def poet(theme: str) -> str: """Poem generator""" r = await server_agent.run(f'write a poem about {theme}') return r.output if __name__ == '__main__': server.run() ``` mcp_server.py ```python from mcp.server.fastmcp import FastMCP from pydantic_ai import Agent server = FastMCP('Pydantic AI Server') server_agent = Agent( 'anthropic:claude-haiku-4-5', instructions='always reply in rhyme' ) @server.tool() async def poet(theme: str) -> str: """Poem generator""" r = await server_agent.run(f'write a poem about {theme}') return r.output if __name__ == '__main__': server.run() ``` ## Simple client This server can be queried with any MCP client. Here is an example using the Python SDK directly: mcp_client.py ```python import asyncio import os from mcp import ClientSession, StdioServerParameters from mcp.client.stdio import stdio_client async def client(): server_params = StdioServerParameters( command='python', args=['mcp_server.py'], env=os.environ ) async with stdio_client(server_params) as (read, write): async with ClientSession(read, write) as session: await session.initialize() result = await session.call_tool('poet', {'theme': 'socks'}) print(result.content[0].text) """ Oh, socks, those garments soft and sweet, That nestle softly 'round our feet, From cotton, wool, or blended thread, They keep our toes from feeling dread. """ if __name__ == '__main__': asyncio.run(client()) ``` ## MCP Sampling What is MCP Sampling? See the [MCP client docs](https://ai.pydantic.dev/mcp/client/#mcp-sampling) for details of what MCP sampling is, and how you can support it when using Pydantic AI as an MCP client. When Pydantic AI agents are used within MCP servers, they can use sampling via MCPSamplingModel. We can extend the above example to use sampling so instead of connecting directly to the LLM, the agent calls back through the MCP client to make LLM calls. 
mcp_server_sampling.py ```python from mcp.server.fastmcp import Context, FastMCP from pydantic_ai import Agent from pydantic_ai.models.mcp_sampling import MCPSamplingModel server = FastMCP('Pydantic AI Server with sampling') server_agent = Agent(instructions='always reply in rhyme') @server.tool() async def poet(ctx: Context, theme: str) -> str: """Poem generator""" r = await server_agent.run(f'write a poem about {theme}', model=MCPSamplingModel(session=ctx.session)) return r.output if __name__ == '__main__': server.run() # run the server over stdio ``` The [above](#simple-client) client does not support sampling, so if you tried to use it with this server you'd get an error. The simplest way to support sampling in an MCP client is to [use](https://ai.pydantic.dev/mcp/client/#mcp-sampling) a Pydantic AI agent as the client, but if you wanted to support sampling with the vanilla MCP SDK, you could do so like this: mcp_client_sampling.py ```python import asyncio from typing import Any from mcp import ClientSession, StdioServerParameters from mcp.client.stdio import stdio_client from mcp.shared.context import RequestContext from mcp.types import ( CreateMessageRequestParams, CreateMessageResult, ErrorData, TextContent, ) async def sampling_callback( context: RequestContext[ClientSession, Any], params: CreateMessageRequestParams ) -> CreateMessageResult | ErrorData: print('sampling system prompt:', params.systemPrompt) #> sampling system prompt: always reply in rhyme print('sampling messages:', params.messages) """ sampling messages: [ SamplingMessage( role='user', content=TextContent( type='text', text='write a poem about socks', annotations=None, meta=None, ), meta=None, ) ] """ # TODO get the response content by calling an LLM... response_content = 'Socks for a fox.' return CreateMessageResult( role='assistant', content=TextContent(type='text', text=response_content), model='fictional-llm', ) async def client(): server_params = StdioServerParameters(command='python', args=['mcp_server_sampling.py']) async with stdio_client(server_params) as (read, write): async with ClientSession(read, write, sampling_callback=sampling_callback) as session: await session.initialize() result = await session.call_tool('poet', {'theme': 'socks'}) print(result.content[0].text) #> Socks for a fox. if __name__ == '__main__': asyncio.run(client()) ``` *(This example is complete, it can be run "as is")* # UI Event Streams # Agent-User Interaction (AG-UI) Protocol The [Agent-User Interaction (AG-UI) Protocol](https://docs.ag-ui.com/introduction) is an open standard introduced by the [CopilotKit](https://webflow.copilotkit.ai/blog/introducing-ag-ui-the-protocol-where-agents-meet-users) team that standardises how frontend applications communicate with AI agents, with support for streaming, frontend tools, shared state, and custom events. Note The AG-UI integration was originally built by the team at [Rocket Science](https://www.rocketscience.gg/) and contributed in collaboration with the Pydantic AI and CopilotKit teams. Thanks Rocket Science! ## Installation The only dependencies are: - [ag-ui-protocol](https://docs.ag-ui.com/introduction): to provide the AG-UI types and encoder. - [starlette](https://www.starlette.io): to handle [ASGI](https://asgi.readthedocs.io/en/latest/) requests from a framework like FastAPI. 
You can install Pydantic AI with the `ag-ui` extra to ensure you have all the required AG-UI dependencies: ```bash pip install 'pydantic-ai-slim[ag-ui]' ``` ```bash uv add 'pydantic-ai-slim[ag-ui]' ``` To run the examples you'll also need: - [uvicorn](https://www.uvicorn.org/) or another ASGI compatible server ```bash pip install uvicorn ``` ```bash uv add uvicorn ``` ## Usage There are three ways to run a Pydantic AI agent based on AG-UI run input with streamed AG-UI events as output, from most to least flexible. If you're using a Starlette-based web framework like FastAPI, you'll typically want to use the second method. 1. The AGUIAdapter.run_stream() method, when called on an AGUIAdapter instantiated with an agent and an AG-UI [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object, will run the agent and return a stream of AG-UI events. It also takes optional Agent.iter() arguments including `deps`. Use this if you're using a web framework not based on Starlette (e.g. Django or Flask) or want to modify the input or output some way. 1. The AGUIAdapter.dispatch_request() class method takes an agent and a Starlette request (e.g. from FastAPI) coming from an AG-UI frontend, and returns a streaming Starlette response of AG-UI events that you can return directly from your endpoint. It also takes optional Agent.iter() arguments including `deps`, that you can vary for each request (e.g. based on the authenticated user). This is a convenience method that combines AGUIAdapter.from_request(), AGUIAdapter.run_stream(), and AGUIAdapter.streaming_response(). 1. AGUIApp represents an ASGI application that handles every AG-UI request by running the agent. It also takes optional Agent.iter() arguments including `deps`, but these will be the same for each request, with the exception of the AG-UI state that's injected as described under [state management](#state-management). This ASGI app can be [mounted](https://fastapi.tiangolo.com/advanced/sub-applications/) at a given path in an existing FastAPI app. ### Handle run input and output directly This example uses AGUIAdapter.run_stream() and performs its own request parsing and response generation. This can be modified to work with any web framework. [Learn about Gateway](https://ai.pydantic.dev/gateway) run_ag_ui.py ```python import json from http import HTTPStatus from fastapi import FastAPI from fastapi.requests import Request from fastapi.responses import Response, StreamingResponse from pydantic import ValidationError from pydantic_ai import Agent from pydantic_ai.ui import SSE_CONTENT_TYPE from pydantic_ai.ui.ag_ui import AGUIAdapter agent = Agent('gateway/openai:gpt-5.2', instructions='Be fun!') app = FastAPI() @app.post('/') async def run_agent(request: Request) -> Response: accept = request.headers.get('accept', SSE_CONTENT_TYPE) try: run_input = AGUIAdapter.build_run_input(await request.body()) # (1) except ValidationError as e: return Response( content=json.dumps(e.json()), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) adapter = AGUIAdapter(agent=agent, run_input=run_input, accept=accept) event_stream = adapter.run_stream() # (2) sse_event_stream = adapter.encode_stream(event_stream) return StreamingResponse(sse_event_stream, media_type=accept) # (3) ``` 1. AGUIAdapter.build_run_input() takes the request body as bytes and returns an AG-UI [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object. 
You can also use the AGUIAdapter.from_request() class method to build an adapter directly from a request. 1. AGUIAdapter.run_stream() runs the agent and returns a stream of AG-UI events. It supports the same optional arguments as [`Agent.run_stream_events()`](https://ai.pydantic.dev/agent/#running-agents), including `deps`. You can also use AGUIAdapter.run_stream_native() to run the agent and return a stream of Pydantic AI events instead, which can then be transformed into AG-UI events using AGUIAdapter.transform_stream(). 1. AGUIAdapter.encode_stream() encodes the stream of AG-UI events as strings according to the accept header value. You can also use AGUIAdapter.streaming_response() to generate a streaming response directly from the AG-UI event stream returned by `run_stream()`. run_ag_ui.py ```python import json from http import HTTPStatus from fastapi import FastAPI from fastapi.requests import Request from fastapi.responses import Response, StreamingResponse from pydantic import ValidationError from pydantic_ai import Agent from pydantic_ai.ui import SSE_CONTENT_TYPE from pydantic_ai.ui.ag_ui import AGUIAdapter agent = Agent('openai:gpt-5.2', instructions='Be fun!') app = FastAPI() @app.post('/') async def run_agent(request: Request) -> Response: accept = request.headers.get('accept', SSE_CONTENT_TYPE) try: run_input = AGUIAdapter.build_run_input(await request.body()) # (1) except ValidationError as e: return Response( content=json.dumps(e.json()), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) adapter = AGUIAdapter(agent=agent, run_input=run_input, accept=accept) event_stream = adapter.run_stream() # (2) sse_event_stream = adapter.encode_stream(event_stream) return StreamingResponse(sse_event_stream, media_type=accept) # (3) ``` 1. AGUIAdapter.build_run_input() takes the request body as bytes and returns an AG-UI [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object. You can also use the AGUIAdapter.from_request() class method to build an adapter directly from a request. 1. AGUIAdapter.run_stream() runs the agent and returns a stream of AG-UI events. It supports the same optional arguments as [`Agent.run_stream_events()`](https://ai.pydantic.dev/agent/#running-agents), including `deps`. You can also use AGUIAdapter.run_stream_native() to run the agent and return a stream of Pydantic AI events instead, which can then be transformed into AG-UI events using AGUIAdapter.transform_stream(). 1. AGUIAdapter.encode_stream() encodes the stream of AG-UI events as strings according to the accept header value. You can also use AGUIAdapter.streaming_response() to generate a streaming response directly from the AG-UI event stream returned by `run_stream()`. Since `app` is an ASGI application, it can be used with any ASGI server: ```shell uvicorn run_ag_ui:app ``` This will expose the agent as an AG-UI server, and your frontend can start sending requests to it. ### Handle a Starlette request This example uses AGUIAdapter.dispatch_request() to directly handle a FastAPI request and return a response. Something analogous to this will work with any Starlette-based web framework. 
[Learn about Gateway](https://ai.pydantic.dev/gateway) handle_ag_ui_request.py ```python from fastapi import FastAPI from starlette.requests import Request from starlette.responses import Response from pydantic_ai import Agent from pydantic_ai.ui.ag_ui import AGUIAdapter agent = Agent('gateway/openai:gpt-5.2', instructions='Be fun!') app = FastAPI() @app.post('/') async def run_agent(request: Request) -> Response: return await AGUIAdapter.dispatch_request(request, agent=agent) # (1) ``` 1. This method essentially does the same as the previous example, but it's more convenient to use when you're already using a Starlette/FastAPI app. handle_ag_ui_request.py ```python from fastapi import FastAPI from starlette.requests import Request from starlette.responses import Response from pydantic_ai import Agent from pydantic_ai.ui.ag_ui import AGUIAdapter agent = Agent('openai:gpt-5.2', instructions='Be fun!') app = FastAPI() @app.post('/') async def run_agent(request: Request) -> Response: return await AGUIAdapter.dispatch_request(request, agent=agent) # (1) ``` 1. This method essentially does the same as the previous example, but it's more convenient to use when you're already using a Starlette/FastAPI app. Since `app` is an ASGI application, it can be used with any ASGI server: ```shell uvicorn handle_ag_ui_request:app ``` This will expose the agent as an AG-UI server, and your frontend can start sending requests to it. ### Stand-alone ASGI app This example uses AGUIApp to turn the agent into a stand-alone ASGI application: [Learn about Gateway](https://ai.pydantic.dev/gateway) ag_ui_app.py ```python from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent('gateway/openai:gpt-5.2', instructions='Be fun!') app = AGUIApp(agent) ``` ag_ui_app.py ```python from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent('openai:gpt-5.2', instructions='Be fun!') app = AGUIApp(agent) ``` Since `app` is an ASGI application, it can be used with any ASGI server: ```shell uvicorn ag_ui_app:app ``` This will expose the agent as an AG-UI server, and your frontend can start sending requests to it. ## Design The Pydantic AI AG-UI integration supports all features of the spec: - [Events](https://docs.ag-ui.com/concepts/events) - [Messages](https://docs.ag-ui.com/concepts/messages) - [State Management](https://docs.ag-ui.com/concepts/state) - [Tools](https://docs.ag-ui.com/concepts/tools) The integration receives messages in the form of a [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object that describes the details of the requested agent run including message history, state, and available tools. These are converted to Pydantic AI types and passed to the agent's run method. Events from the agent, including tool calls, are converted to AG-UI events and streamed back to the caller as Server-Sent Events (SSE). A user request may require multiple round trips between client UI and Pydantic AI server, depending on the tools and events needed. ## Features ### State management The integration provides full support for [AG-UI state management](https://docs.ag-ui.com/concepts/state), which enables real-time synchronization between agents and frontend applications. 
In the example below, document state is shared between the UI and server using the StateDeps [dependencies type](https://ai.pydantic.dev/dependencies/index.md), which automatically validates the state contained in [`RunAgentInput.state`](https://docs.ag-ui.com/sdk/js/core/types#runagentinput) against the Pydantic `BaseModel` specified as a generic parameter. Custom dependencies type with AG-UI state If you want to use your own dependencies type to hold AG-UI state as well as other things, it needs to implement the StateHandler protocol, meaning it needs to be a [dataclass](https://docs.python.org/3/library/dataclasses.html) with a non-optional `state` field. This lets Pydantic AI ensure that state is properly isolated between requests by building a new dependencies object each time. If the `state` field's type is a Pydantic `BaseModel` subclass, the raw state dictionary on the request is automatically validated. If not, you can validate the raw value yourself in your dependencies dataclass's `__post_init__` method. If AG-UI state is provided but your dependencies do not implement StateHandler, Pydantic AI will emit a warning and ignore the state. Use StateDeps or a custom StateHandler implementation to receive and validate the incoming state. [Learn about Gateway](https://ai.pydantic.dev/gateway) ag_ui_state.py ```python from pydantic import BaseModel from pydantic_ai import Agent from pydantic_ai.ui import StateDeps from pydantic_ai.ui.ag_ui.app import AGUIApp class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent( 'gateway/openai:gpt-5.2', instructions='Be fun!', deps_type=StateDeps[DocumentState], ) app = AGUIApp(agent, deps=StateDeps(DocumentState())) ``` ag_ui_state.py ```python from pydantic import BaseModel from pydantic_ai import Agent from pydantic_ai.ui import StateDeps from pydantic_ai.ui.ag_ui.app import AGUIApp class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent( 'openai:gpt-5.2', instructions='Be fun!', deps_type=StateDeps[DocumentState], ) app = AGUIApp(agent, deps=StateDeps(DocumentState())) ``` Since `app` is an ASGI application, it can be used with any ASGI server: ```bash uvicorn ag_ui_state:app --host 0.0.0.0 --port 9000 ``` ### Tools AG-UI frontend tools are seamlessly provided to the Pydantic AI agent, enabling rich user experiences with frontend user interfaces. ### Events Pydantic AI tools can send [AG-UI events](https://docs.ag-ui.com/concepts/events) simply by returning a [`ToolReturn`](https://ai.pydantic.dev/tools-advanced/#advanced-tool-returns) object with a [`BaseEvent`](https://docs.ag-ui.com/sdk/python/core/events#baseevent) (or a list of events) as `metadata`, which allows for custom events and state updates. 
[Learn about Gateway](https://ai.pydantic.dev/gateway) ag_ui_tool_events.py ```python from ag_ui.core import CustomEvent, EventType, StateSnapshotEvent from pydantic import BaseModel from pydantic_ai import Agent, RunContext, ToolReturn from pydantic_ai.ui import StateDeps from pydantic_ai.ui.ag_ui.app import AGUIApp class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent( 'gateway/openai:gpt-5.2', instructions='Be fun!', deps_type=StateDeps[DocumentState], ) app = AGUIApp(agent, deps=StateDeps(DocumentState())) @agent.tool async def update_state(ctx: RunContext[StateDeps[DocumentState]]) -> ToolReturn: return ToolReturn( return_value='State updated', metadata=[ StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot=ctx.deps.state, ), ], ) @agent.tool_plain async def custom_events() -> ToolReturn: return ToolReturn( return_value='Count events sent', metadata=[ CustomEvent( type=EventType.CUSTOM, name='count', value=1, ), CustomEvent( type=EventType.CUSTOM, name='count', value=2, ), ] ) ``` ag_ui_tool_events.py ```python from ag_ui.core import CustomEvent, EventType, StateSnapshotEvent from pydantic import BaseModel from pydantic_ai import Agent, RunContext, ToolReturn from pydantic_ai.ui import StateDeps from pydantic_ai.ui.ag_ui.app import AGUIApp class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent( 'openai:gpt-5.2', instructions='Be fun!', deps_type=StateDeps[DocumentState], ) app = AGUIApp(agent, deps=StateDeps(DocumentState())) @agent.tool async def update_state(ctx: RunContext[StateDeps[DocumentState]]) -> ToolReturn: return ToolReturn( return_value='State updated', metadata=[ StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot=ctx.deps.state, ), ], ) @agent.tool_plain async def custom_events() -> ToolReturn: return ToolReturn( return_value='Count events sent', metadata=[ CustomEvent( type=EventType.CUSTOM, name='count', value=1, ), CustomEvent( type=EventType.CUSTOM, name='count', value=2, ), ] ) ``` Since `app` is an ASGI application, it can be used with any ASGI server: ```bash uvicorn ag_ui_tool_events:app --host 0.0.0.0 --port 9000 ``` ## Examples For more examples of how to use AGUIApp see [`pydantic_ai_examples.ag_ui`](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples/ag_ui), which includes a server for use with the [AG-UI Dojo](https://docs.ag-ui.com/tutorials/debugging#the-ag-ui-dojo). # UI Event Streams If you're building a chat app or other interactive frontend for an AI agent, your backend will need to receive agent run input (like a chat message or complete [message history](https://ai.pydantic.dev/message-history/index.md)) from the frontend, and will need to stream the [agent's events](https://ai.pydantic.dev/agent/#streaming-all-events) (like text, thinking, and tool calls) to the frontend so that the user knows what's happening in real time. While your frontend could use Pydantic AI's ModelRequest and AgentStreamEvent directly, you'll typically want to use a UI event stream protocol that's natively supported by your frontend framework. 
Pydantic AI natively supports two UI event stream protocols: - [Agent-User Interaction (AG-UI) Protocol](https://ai.pydantic.dev/ui/ag-ui/index.md) - [Vercel AI Data Stream Protocol](https://ai.pydantic.dev/ui/vercel-ai/index.md) These integrations are implemented as subclasses of the abstract UIAdapter class, so they also serve as a reference for integrating with other UI event stream protocols. ## Usage The protocol-specific UIAdapter subclass (i.e. AGUIAdapter or VercelAIAdapter) is responsible for transforming agent run input received from the frontend into arguments for [`Agent.run_stream_events()`](https://ai.pydantic.dev/agent/#running-agents), running the agent, and then transforming Pydantic AI events into protocol-specific events. The event stream transformation is handled by a protocol-specific UIEventStream subclass, but you typically won't use this directly. If you're using a Starlette-based web framework like FastAPI, you can use the UIAdapter.dispatch_request() class method from an endpoint function to directly handle a request and return a streaming response of protocol-specific events. This is demonstrated in the next section. If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `UIAdapter` instance and directly use its methods. This is demonstrated in "Advanced Usage" section below. ### Usage with Starlette/FastAPI Besides the request, UIAdapter.dispatch_request() takes the agent, the same optional arguments as [`Agent.run_stream_events()`](https://ai.pydantic.dev/agent/#running-agents), and an optional `on_complete` callback function that receives the completed AgentRunResult and can optionally yield additional protocol-specific events. Note These examples use the `VercelAIAdapter`, but the same patterns apply to all `UIAdapter` subclasses. [Learn about Gateway](https://ai.pydantic.dev/gateway) dispatch_request.py ```python from fastapi import FastAPI from starlette.requests import Request from starlette.responses import Response from pydantic_ai import Agent from pydantic_ai.ui.vercel_ai import VercelAIAdapter agent = Agent('gateway/openai:gpt-5.2') app = FastAPI() @app.post('/chat') async def chat(request: Request) -> Response: return await VercelAIAdapter.dispatch_request(request, agent=agent) ``` dispatch_request.py ```python from fastapi import FastAPI from starlette.requests import Request from starlette.responses import Response from pydantic_ai import Agent from pydantic_ai.ui.vercel_ai import VercelAIAdapter agent = Agent('openai:gpt-5.2') app = FastAPI() @app.post('/chat') async def chat(request: Request) -> Response: return await VercelAIAdapter.dispatch_request(request, agent=agent) ``` ### Advanced Usage If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `UIAdapter` instance and directly use its methods, which can be chained to accomplish the same thing as the `UIAdapter.dispatch_request()` class method shown above: 1. The UIAdapter.build_run_input() class method takes the request body as bytes and returns a protocol-specific run input object, which you can then pass to the UIAdapter() constructor along with the agent. - You can also use the UIAdapter.from_request() class method to build an adapter directly from a Starlette/FastAPI request. 1. The UIAdapter.run_stream() method runs the agent and returns a stream of protocol-specific events. 
It supports the same optional arguments as [`Agent.run_stream_events()`](https://ai.pydantic.dev/agent/#running-agents) and an optional `on_complete` callback function that receives the completed AgentRunResult and can optionally yield additional protocol-specific events. - You can also use UIAdapter.run_stream_native() to run the agent and return a stream of Pydantic AI events instead, which can then be transformed into protocol-specific events using UIAdapter.transform_stream(). 1. The UIAdapter.encode_stream() method encodes the stream of protocol-specific events as SSE (HTTP Server-Sent Events) strings, which you can then return as a streaming response. - You can also use UIAdapter.streaming_response() to generate a Starlette/FastAPI streaming response directly from the protocol-specific event stream returned by `run_stream()`. Note This example uses FastAPI, but can be modified to work with any web framework. [Learn about Gateway](https://ai.pydantic.dev/gateway) run_stream.py ```python import json from http import HTTPStatus from fastapi import FastAPI from fastapi.requests import Request from fastapi.responses import Response, StreamingResponse from pydantic import ValidationError from pydantic_ai import Agent from pydantic_ai.ui import SSE_CONTENT_TYPE from pydantic_ai.ui.vercel_ai import VercelAIAdapter agent = Agent('gateway/openai:gpt-5.2') app = FastAPI() @app.post('/chat') async def chat(request: Request) -> Response: accept = request.headers.get('accept', SSE_CONTENT_TYPE) try: run_input = VercelAIAdapter.build_run_input(await request.body()) except ValidationError as e: return Response( content=json.dumps(e.json()), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) adapter = VercelAIAdapter(agent=agent, run_input=run_input, accept=accept) event_stream = adapter.run_stream() sse_event_stream = adapter.encode_stream(event_stream) return StreamingResponse(sse_event_stream, media_type=accept) ``` run_stream.py ```python import json from http import HTTPStatus from fastapi import FastAPI from fastapi.requests import Request from fastapi.responses import Response, StreamingResponse from pydantic import ValidationError from pydantic_ai import Agent from pydantic_ai.ui import SSE_CONTENT_TYPE from pydantic_ai.ui.vercel_ai import VercelAIAdapter agent = Agent('openai:gpt-5.2') app = FastAPI() @app.post('/chat') async def chat(request: Request) -> Response: accept = request.headers.get('accept', SSE_CONTENT_TYPE) try: run_input = VercelAIAdapter.build_run_input(await request.body()) except ValidationError as e: return Response( content=json.dumps(e.json()), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) adapter = VercelAIAdapter(agent=agent, run_input=run_input, accept=accept) event_stream = adapter.run_stream() sse_event_stream = adapter.encode_stream(event_stream) return StreamingResponse(sse_event_stream, media_type=accept) ``` # Vercel AI Data Stream Protocol Pydantic AI natively supports the [Vercel AI Data Stream Protocol](https://ai-sdk.dev/docs/ai-sdk-ui/stream-protocol#data-stream-protocol) to receive agent run input from, and stream events to, a [Vercel AI Elements](https://ai-sdk.dev/elements) frontend. Note By default, the adapter targets AI SDK v5 for backwards compatibility. To use features introduced in AI SDK v6, set `sdk_version=6` on the adapter. 
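As a rough sketch of what that might look like, assuming `sdk_version` is accepted by the adapter constructor used in the Advanced Usage example below (the `make_adapter` helper is purely illustrative):

```python
from pydantic_ai import Agent
from pydantic_ai.ui.vercel_ai import VercelAIAdapter

agent = Agent('openai:gpt-5.2')


def make_adapter(run_input, accept: str) -> VercelAIAdapter:
    """Illustrative helper that builds an adapter targeting AI SDK v6."""
    return VercelAIAdapter(
        agent=agent,
        run_input=run_input,
        accept=accept,
        sdk_version=6,  # assumed keyword argument, per the note above
    )
```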
## Usage The VercelAIAdapter class is responsible for transforming agent run input received from the frontend into arguments for [`Agent.run_stream_events()`](https://ai.pydantic.dev/agent/#running-agents), running the agent, and then transforming Pydantic AI events into Vercel AI events. The event stream transformation is handled by the VercelAIEventStream class, but you typically won't use this directly. If you're using a Starlette-based web framework like FastAPI, you can use the VercelAIAdapter.dispatch_request() class method from an endpoint function to directly handle a request and return a streaming response of Vercel AI events. This is demonstrated in the next section. If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `VercelAIAdapter` instance and directly use its methods. This is demonstrated in "Advanced Usage" section below. ### Usage with Starlette/FastAPI Besides the request, VercelAIAdapter.dispatch_request() takes the agent, the same optional arguments as [`Agent.run_stream_events()`](https://ai.pydantic.dev/agent/#running-agents), and an optional `on_complete` callback function that receives the completed AgentRunResult and can optionally yield additional Vercel AI events. [Learn about Gateway](https://ai.pydantic.dev/gateway) dispatch_request.py ```python from fastapi import FastAPI from starlette.requests import Request from starlette.responses import Response from pydantic_ai import Agent from pydantic_ai.ui.vercel_ai import VercelAIAdapter agent = Agent('gateway/openai:gpt-5.2') app = FastAPI() @app.post('/chat') async def chat(request: Request) -> Response: return await VercelAIAdapter.dispatch_request(request, agent=agent) ``` dispatch_request.py ```python from fastapi import FastAPI from starlette.requests import Request from starlette.responses import Response from pydantic_ai import Agent from pydantic_ai.ui.vercel_ai import VercelAIAdapter agent = Agent('openai:gpt-5.2') app = FastAPI() @app.post('/chat') async def chat(request: Request) -> Response: return await VercelAIAdapter.dispatch_request(request, agent=agent) ``` ### Advanced Usage If you're using a web framework not based on Starlette (e.g. Django or Flask) or need fine-grained control over the input or output, you can create a `VercelAIAdapter` instance and directly use its methods, which can be chained to accomplish the same thing as the `VercelAIAdapter.dispatch_request()` class method shown above: 1. The VercelAIAdapter.build_run_input() class method takes the request body as bytes and returns a Vercel AI RequestData run input object, which you can then pass to the VercelAIAdapter() constructor along with the agent. - You can also use the VercelAIAdapter.from_request() class method to build an adapter directly from a Starlette/FastAPI request. 1. The VercelAIAdapter.run_stream() method runs the agent and returns a stream of Vercel AI events. It supports the same optional arguments as [`Agent.run_stream_events()`](https://ai.pydantic.dev/agent/#running-agents) and an optional `on_complete` callback function that receives the completed AgentRunResult and can optionally yield additional Vercel AI events. - You can also use VercelAIAdapter.run_stream_native() to run the agent and return a stream of Pydantic AI events instead, which can then be transformed into Vercel AI events using VercelAIAdapter.transform_stream(). 1. 
The VercelAIAdapter.encode_stream() method encodes the stream of Vercel AI events as SSE (HTTP Server-Sent Events) strings, which you can then return as a streaming response. - You can also use VercelAIAdapter.streaming_response() to generate a Starlette/FastAPI streaming response directly from the Vercel AI event stream returned by `run_stream()`. Note This example uses FastAPI, but can be modified to work with any web framework. [Learn about Gateway](https://ai.pydantic.dev/gateway) run_stream.py ```python import json from http import HTTPStatus from fastapi import FastAPI from fastapi.requests import Request from fastapi.responses import Response, StreamingResponse from pydantic import ValidationError from pydantic_ai import Agent from pydantic_ai.ui import SSE_CONTENT_TYPE from pydantic_ai.ui.vercel_ai import VercelAIAdapter agent = Agent('gateway/openai:gpt-5.2') app = FastAPI() @app.post('/chat') async def chat(request: Request) -> Response: accept = request.headers.get('accept', SSE_CONTENT_TYPE) try: run_input = VercelAIAdapter.build_run_input(await request.body()) except ValidationError as e: return Response( content=json.dumps(e.json()), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) adapter = VercelAIAdapter(agent=agent, run_input=run_input, accept=accept) event_stream = adapter.run_stream() sse_event_stream = adapter.encode_stream(event_stream) return StreamingResponse(sse_event_stream, media_type=accept) ``` run_stream.py ```python import json from http import HTTPStatus from fastapi import FastAPI from fastapi.requests import Request from fastapi.responses import Response, StreamingResponse from pydantic import ValidationError from pydantic_ai import Agent from pydantic_ai.ui import SSE_CONTENT_TYPE from pydantic_ai.ui.vercel_ai import VercelAIAdapter agent = Agent('openai:gpt-5.2') app = FastAPI() @app.post('/chat') async def chat(request: Request) -> Response: accept = request.headers.get('accept', SSE_CONTENT_TYPE) try: run_input = VercelAIAdapter.build_run_input(await request.body()) except ValidationError as e: return Response( content=json.dumps(e.json()), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) adapter = VercelAIAdapter(agent=agent, run_input=run_input, accept=accept) event_stream = adapter.run_stream() sse_event_stream = adapter.encode_stream(event_stream) return StreamingResponse(sse_event_stream, media_type=accept) ``` # Optional # Unit testing Writing unit tests for Pydantic AI code is just like unit tests for any other Python code. Because for the most part they're nothing new, we have pretty well established tools and patterns for writing and running these kinds of tests. 
Unless you're really sure you know better, you'll probably want to follow roughly this strategy: - Use [`pytest`](https://docs.pytest.org/en/stable/) as your test harness - If you find yourself typing out long assertions, use [inline-snapshot](https://15r10nk.github.io/inline-snapshot/latest/) - Similarly, [dirty-equals](https://dirty-equals.helpmanual.io/latest/) can be useful for comparing large data structures - Use TestModel or FunctionModel in place of your actual model to avoid the usage, latency and variability of real LLM calls - Use Agent.override to replace an agent's model, dependencies, or toolsets inside your application logic - Set ALLOW_MODEL_REQUESTS=False globally to block any requests from being made to non-test models accidentally ### Unit testing with `TestModel` The simplest and fastest way to exercise most of your application code is using TestModel, this will (by default) call all tools in the agent, then return either plain text or a structured response depending on the return type of the agent. `TestModel` is not magic The "clever" (but not too clever) part of `TestModel` is that it will attempt to generate valid structured data for [function tools](https://ai.pydantic.dev/tools/index.md) and [output types](https://ai.pydantic.dev/output/#structured-output) based on the schema of the registered tools. There's no ML or AI in `TestModel`, it's just plain old procedural Python code that tries to generate data that satisfies the JSON schema of a tool. The resulting data won't look pretty or relevant, but it should pass Pydantic's validation in most cases. If you want something more sophisticated, use FunctionModel and write your own data generation logic. Let's write unit tests for the following application code: [Learn about Gateway](https://ai.pydantic.dev/gateway) weather_app.py ```python import asyncio from datetime import date from pydantic_ai import Agent, RunContext from fake_database import DatabaseConn # (1)! from weather_service import WeatherService # (2)! weather_agent = Agent( 'gateway/openai:gpt-5.2', deps_type=WeatherService, instructions='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool def weather_forecast( ctx: RunContext[WeatherService], location: str, forecast_date: date ) -> str: if forecast_date < date.today(): # (3)! return ctx.deps.get_historic_weather(location, forecast_date) else: return ctx.deps.get_forecast(location, forecast_date) async def run_weather_forecast( # (4)! user_prompts: list[tuple[str, int]], conn: DatabaseConn ): """Run weather forecast for a list of user prompts and save.""" async with WeatherService() as weather_service: async def run_forecast(prompt: str, user_id: int): result = await weather_agent.run(prompt, deps=weather_service) await conn.store_forecast(user_id, result.output) # run all prompts in parallel await asyncio.gather( *(run_forecast(prompt, user_id) for (prompt, user_id) in user_prompts) ) ``` 1. `DatabaseConn` is a class that holds a database connection 1. `WeatherService` has methods to get weather forecasts and historic data about the weather 1. We need to call a different endpoint depending on whether the date is in the past or the future, you'll see why this nuance is important below 1. This function is the code we want to test, together with the agent it uses weather_app.py ```python import asyncio from datetime import date from pydantic_ai import Agent, RunContext from fake_database import DatabaseConn # (1)! from weather_service import WeatherService # (2)! 
weather_agent = Agent( 'openai:gpt-5.2', deps_type=WeatherService, instructions='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool def weather_forecast( ctx: RunContext[WeatherService], location: str, forecast_date: date ) -> str: if forecast_date < date.today(): # (3)! return ctx.deps.get_historic_weather(location, forecast_date) else: return ctx.deps.get_forecast(location, forecast_date) async def run_weather_forecast( # (4)! user_prompts: list[tuple[str, int]], conn: DatabaseConn ): """Run weather forecast for a list of user prompts and save.""" async with WeatherService() as weather_service: async def run_forecast(prompt: str, user_id: int): result = await weather_agent.run(prompt, deps=weather_service) await conn.store_forecast(user_id, result.output) # run all prompts in parallel await asyncio.gather( *(run_forecast(prompt, user_id) for (prompt, user_id) in user_prompts) ) ``` 1. `DatabaseConn` is a class that holds a database connection 1. `WeatherService` has methods to get weather forecasts and historic data about the weather 1. We need to call a different endpoint depending on whether the date is in the past or the future, you'll see why this nuance is important below 1. This function is the code we want to test, together with the agent it uses Here we have a function that takes a list of `(user_prompt, user_id)` tuples, gets a weather forecast for each prompt, and stores the result in the database. **We want to test this code without having to mock certain objects or modify our code so we can pass test objects in.** Here's how we would write tests using TestModel: test_weather_app.py ```python from datetime import timezone import pytest from dirty_equals import IsNow, IsStr from pydantic_ai import models, capture_run_messages, RequestUsage from pydantic_ai.models.test import TestModel from pydantic_ai import ( ModelResponse, TextPart, ToolCallPart, ToolReturnPart, UserPromptPart, ModelRequest, ) from fake_database import DatabaseConn from weather_app import run_weather_forecast, weather_agent pytestmark = pytest.mark.anyio # (1)! models.ALLOW_MODEL_REQUESTS = False # (2)! async def test_forecast(): conn = DatabaseConn() user_id = 1 with capture_run_messages() as messages: with weather_agent.override(model=TestModel()): # (3)! prompt = 'What will the weather be like in London on 2024-11-28?' await run_weather_forecast([(prompt, user_id)], conn) # (4)! forecast = await conn.get_forecast(user_id) assert forecast == '{"weather_forecast":"Sunny with a chance of rain"}' # (5)! assert messages == [ # (6)! ModelRequest( parts=[ UserPromptPart( content='What will the weather be like in London on 2024-11-28?', timestamp=IsNow(tz=timezone.utc), # (7)! ), ], instructions='Providing a weather forecast at the locations the user provides.', timestamp=IsNow(tz=timezone.utc), run_id=IsStr(), ), ModelResponse( parts=[ ToolCallPart( tool_name='weather_forecast', args={ 'location': 'a', 'forecast_date': '2024-01-01', # (8)! 
}, tool_call_id=IsStr(), ) ], usage=RequestUsage( input_tokens=60, output_tokens=7, ), model_name='test', timestamp=IsNow(tz=timezone.utc), run_id=IsStr(), ), ModelRequest( parts=[ ToolReturnPart( tool_name='weather_forecast', content='Sunny with a chance of rain', tool_call_id=IsStr(), timestamp=IsNow(tz=timezone.utc), ), ], instructions='Providing a weather forecast at the locations the user provides.', timestamp=IsNow(tz=timezone.utc), run_id=IsStr(), ), ModelResponse( parts=[ TextPart( content='{"weather_forecast":"Sunny with a chance of rain"}', ) ], usage=RequestUsage( input_tokens=66, output_tokens=16, ), model_name='test', timestamp=IsNow(tz=timezone.utc), run_id=IsStr(), ), ] ``` 1. We're using [anyio](https://anyio.readthedocs.io/en/stable/) to run async tests. 1. This is a safety measure to make sure we don't accidentally make real requests to the LLM while testing; see ALLOW_MODEL_REQUESTS for more details. 1. We're using Agent.override to replace the agent's model with TestModel; the nice thing about `override` is that we can replace the model inside our application code without needing access to the agent's `run*` call sites. 1. Now we call the function we want to test inside the `override` context manager. 1. By default, `TestModel` will return a JSON string summarising the tool calls made and what was returned. If you wanted to customise the response to something more closely aligned with the domain, you could add custom_output_text='Sunny' when defining `TestModel`. 1. So far we don't actually know which tools were called and with which values; we can use capture_run_messages to inspect messages from the most recent run and assert the exchange between the agent and the model occurred as expected. 1. The IsNow helper allows us to use declarative asserts even with data which will contain timestamps that change over time. 1. `TestModel` isn't doing anything clever to extract values from the prompt, so these values are hardcoded. ### Unit testing with `FunctionModel` The above tests are a great start, but careful readers will notice that `WeatherService.get_forecast` is never called since `TestModel` calls `weather_forecast` with a date in the past. To fully exercise `weather_forecast`, we need to use FunctionModel to customise how the tool is called. Here's an example of using `FunctionModel` to test the `weather_forecast` tool with custom inputs: test_weather_app2.py ```python import re import pytest from pydantic_ai import models from pydantic_ai import ( ModelMessage, ModelResponse, TextPart, ToolCallPart, ) from pydantic_ai.models.function import AgentInfo, FunctionModel from fake_database import DatabaseConn from weather_app import run_weather_forecast, weather_agent pytestmark = pytest.mark.anyio models.ALLOW_MODEL_REQUESTS = False def call_weather_forecast( # (1)! messages: list[ModelMessage], info: AgentInfo ) -> ModelResponse: if len(messages) == 1: # first call, call the weather forecast tool user_prompt = messages[0].parts[-1] m = re.search(r'\d{4}-\d{2}-\d{2}', user_prompt.content) assert m is not None args = {'location': 'London', 'forecast_date': m.group()} # (2)! return ModelResponse(parts=[ToolCallPart('weather_forecast', args)]) else: # second call, return the forecast msg = messages[-1].parts[0] assert msg.part_kind == 'tool-return' return ModelResponse(parts=[TextPart(f'The forecast is: {msg.content}')]) async def test_forecast_future(): conn = DatabaseConn() user_id = 1 with weather_agent.override(model=FunctionModel(call_weather_forecast)): # (3)!
prompt = 'What will the weather be like in London on 2032-01-01?' await run_weather_forecast([(prompt, user_id)], conn) forecast = await conn.get_forecast(user_id) assert forecast == 'The forecast is: Rainy with a chance of sun' ``` 1. We define a function `call_weather_forecast` that will be called by `FunctionModel` in place of the LLM; this function has access to the list of ModelMessages that make up the run, and AgentInfo which contains information about the agent and the function tools and return tools. 1. Our function is slightly intelligent in that it tries to extract a date from the prompt, but just hard-codes the location. 1. We use FunctionModel to replace the agent's model with our custom function. ### Overriding model via pytest fixtures If you're writing lots of tests that all require the model to be overridden, you can use [pytest fixtures](https://docs.pytest.org/en/6.2.x/fixture.html) to override the model with TestModel or FunctionModel in a reusable way. Here's an example of a fixture that overrides the model with `TestModel`: test_agent.py ```python import pytest from pydantic_ai.models.test import TestModel from weather_app import weather_agent @pytest.fixture def override_weather_agent(): with weather_agent.override(model=TestModel()): yield async def test_forecast(override_weather_agent: None): ... # test code here ``` # Command Line Interface (CLI) **Pydantic AI** comes with a CLI, `clai` (pronounced "clay"). You can use it to chat with various LLMs and quickly get answers, right from the command line, or spin up a uvicorn server to chat with your Pydantic AI agents from your browser. ## Installation You can run `clai` using [`uvx`](https://docs.astral.sh/uv/guides/tools/): ```bash uvx clai ``` Or install `clai` globally [with `uv`](https://docs.astral.sh/uv/guides/tools/#installing-tools): ```bash uv tool install clai ... clai ``` Or with `pip`: ```bash pip install clai ... clai ``` ## CLI Usage You'll need to set an environment variable depending on the provider you intend to use. E.g. if you're using OpenAI, set the `OPENAI_API_KEY` environment variable: ```bash export OPENAI_API_KEY='your-api-key-here' ``` Then running `clai` will start an interactive session where you can chat with the AI model. Special commands available in interactive mode: - `/exit`: Exit the session - `/markdown`: Show the last response in markdown format - `/multiline`: Toggle multiline input mode (use Ctrl+D to submit) - `/cp`: Copy the last response to clipboard ### CLI Options | Option | Description | | --------------------- | ---------------------------------------------------------------------------------------------- | | `prompt` | AI prompt for one-shot mode (positional). If omitted, starts interactive mode.
| | `-m`, `--model` | Model to use in `provider:model` format (e.g., `openai:gpt-5.2`) | | `-a`, `--agent` | Custom agent in `module:variable` format | | `-t`, `--code-theme` | Syntax highlighting theme (`dark`, `light`, or [pygments theme](https://pygments.org/styles/)) | | `--no-stream` | Disable streaming from the model | | `-l`, `--list-models` | List all available models and exit | | `--version` | Show version and exit | ### Choose a model You can specify which model to use with the `--model` flag: ```bash clai --model anthropic:claude-sonnet-4-5 ``` (a full list of models available can be printed with `clai --list-models`) ### Custom Agents You can specify a custom agent using the `--agent` flag with a module path and variable name: [Learn about Gateway](https://ai.pydantic.dev/gateway) custom_agent.py ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2', instructions='You always respond in Italian.') ``` custom_agent.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') ``` Then run: ```bash clai --agent custom_agent:agent "What's the weather today?" ``` The format must be `module:variable` where: - `module` is the importable Python module path - `variable` is the name of the Agent instance in that module Additionally, you can directly launch CLI mode from an `Agent` instance using `Agent.to_cli_sync()`: [Learn about Gateway](https://ai.pydantic.dev/gateway) agent_to_cli_sync.py ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2', instructions='You always respond in Italian.') agent.to_cli_sync() ``` agent_to_cli_sync.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') agent.to_cli_sync() ``` You can also use the async interface with `Agent.to_cli()`: [Learn about Gateway](https://ai.pydantic.dev/gateway) agent_to_cli.py ```python from pydantic_ai import Agent agent = Agent('gateway/openai:gpt-5.2', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` agent_to_cli.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5.2', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` *(You'll need to add `asyncio.run(main())` to run `main`)* ### Message History Both `Agent.to_cli()` and `Agent.to_cli_sync()` support a `message_history` parameter, allowing you to continue an existing conversation or provide conversation context: [Learn about Gateway](https://ai.pydantic.dev/gateway) agent_with_history.py ```python from pydantic_ai import ( Agent, ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) agent = Agent('gateway/openai:gpt-5.2') # Create some conversation history message_history: list[ModelMessage] = [ ModelRequest([UserPromptPart(content='What is 2+2?')]), ModelResponse([TextPart(content='2+2 equals 4.')]) ] # Start CLI with existing conversation context agent.to_cli_sync(message_history=message_history) ``` agent_with_history.py ```python from pydantic_ai import ( Agent, ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) agent = Agent('openai:gpt-5.2') # Create some conversation history message_history: list[ModelMessage] = [ ModelRequest([UserPromptPart(content='What is 2+2?')]), ModelResponse([TextPart(content='2+2 equals 4.')]) ] # Start CLI with existing conversation context agent.to_cli_sync(message_history=message_history) ``` The CLI will start with the provided 
conversation history, allowing the agent to refer back to previous exchanges and maintain context throughout the session. ## Web Chat UI Launch a web-based chat interface by running: ```bash clai web -m openai:gpt-5.2 ``` This will start a web server (default: http://127.0.0.1:7932) with a chat interface. You can also serve an existing agent. For example, if you have an agent defined in `my_agent.py`: [Learn about Gateway](https://ai.pydantic.dev/gateway) ```python from pydantic_ai import Agent my_agent = Agent('gateway/openai:gpt-5.2', instructions='You are a helpful assistant.') ``` ```python from pydantic_ai import Agent my_agent = Agent('openai:gpt-5.2', instructions='You are a helpful assistant.') ``` Launch the web UI: ```bash # With a custom agent clai web --agent my_module:my_agent # With specific models (first is default when no --agent) clai web -m openai:gpt-5.2 -m anthropic:claude-sonnet-4-5 # With builtin tools clai web -m openai:gpt-5.2 -t web_search -t code_execution # Generic agent with system instructions clai web -m openai:gpt-5.2 -i 'You are a helpful coding assistant' # Custom agent with extra instructions for each run clai web --agent my_module:my_agent -i 'Always respond in Spanish' ``` Memory Tool The [`memory`](https://ai.pydantic.dev/builtin-tools/#memory-tool) builtin tool cannot be enabled via `-t memory`. If your agent needs memory, configure the MemoryTool directly on the agent and provide it via `--agent`. ### Web UI Options | Option | Description | | ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `--agent`, `-a` | Agent to serve in [`module:variable` format](#custom-agents) | | `--model`, `-m` | Models to list as options in the UI (repeatable) | | `--tool`, `-t` | [Builtin tool](https://ai.pydantic.dev/builtin-tools/index.md)s to list as options in the UI (repeatable). See [available tools](https://ai.pydantic.dev/web/#builtin-tool-support). | | `--instructions`, `-i` | System instructions. When `--agent` is specified, these are additional to the agent's existing instructions. | | `--host` | Host to bind server (default: 127.0.0.1) | | `--port` | Port to bind server (default: 7932) | | `--html-source` | URL or file path for the chat UI HTML. | When using `--agent`, the agent's configured model becomes the default. CLI models (`-m`) are additional options. Without `--agent`, the first `-m` model is the default. The web chat UI can also be launched programmatically using Agent.to_web(), see the [Web UI documentation](https://ai.pydantic.dev/web/index.md). Run the `web` command with `--help` to see all available options: ```bash clai web --help ``` # Pydantic Logfire Debugging and Monitoring Applications that use LLMs have some challenges that are well known and understood: LLMs are **slow**, **unreliable** and **expensive**. These applications also have some challenges that most developers have encountered much less often: LLMs are **fickle** and **non-deterministic**. Subtle changes in a prompt can completely change a model's performance, and there's no `EXPLAIN` query you can run to understand why. Warning From a software engineers point of view, you can think of LLMs as the worst database you've ever heard of, but worse. If LLMs weren't so bloody useful, we'd never touch them. 
To build successful applications with LLMs, we need new tools to understand both model performance, and the behavior of applications that rely on them. LLM Observability tools that just let you understand how your model is performing are useless: making API calls to an LLM is easy, it's building that into an application that's hard. ## Pydantic Logfire [Pydantic Logfire](https://pydantic.dev/logfire) is an observability platform developed by the team who created and maintain Pydantic Validation and Pydantic AI. Logfire aims to let you understand your entire application: Gen AI, classic predictive AI, HTTP traffic, database queries and everything else a modern application needs, all using OpenTelemetry. Pydantic Logfire is a commercial product Logfire is a commercially supported, hosted platform with an extremely generous and perpetual [free tier](https://pydantic.dev/pricing/). You can sign up and start using Logfire in a couple of minutes. Logfire can also be self-hosted on the enterprise tier. Pydantic AI has built-in (but optional) support for Logfire. That means if the `logfire` package is installed and configured and agent instrumentation is enabled then detailed information about agent runs is sent to Logfire. Otherwise there's virtually no overhead and nothing is sent. Here's an example showing details of running the [Weather Agent](https://ai.pydantic.dev/examples/weather-agent/index.md) in Logfire: A trace is generated for the agent run, and spans are emitted for each model request and tool call. ## Using Logfire To use Logfire, you'll need a Logfire [account](https://logfire.pydantic.dev). The Logfire Python SDK is included with `pydantic-ai`: ```bash pip install pydantic-ai ``` ```bash uv add pydantic-ai ``` Or if you're using the slim package, you can install it with the `logfire` optional group: ```bash pip install "pydantic-ai-slim[logfire]" ``` ```bash uv add "pydantic-ai-slim[logfire]" ``` Then authenticate your local environment with Logfire: ```bash logfire auth ``` ```bash uv run logfire auth ``` And configure a project to send data to: ```bash logfire projects new ``` ```bash uv run logfire projects new ``` (Or use an existing project with `logfire projects use`) This will write to a `.logfire` directory in the current working directory, which the Logfire SDK will use for configuration at run time. With that, you can start using Logfire to instrument Pydantic AI code: [Learn about Gateway](https://ai.pydantic.dev/gateway) instrument_pydantic_ai.py ```python import logfire from pydantic_ai import Agent logfire.configure() # (1)! logfire.instrument_pydantic_ai() # (2)! agent = Agent('gateway/openai:gpt-5.2', instructions='Be concise, reply with one sentence.') result = agent.run_sync('Where does "hello world" come from?') # (3)! print(result.output) """ The first known use of "hello, world" was in a 1974 textbook about the C programming language. """ ``` 1. logfire.configure() configures the SDK, by default it will find the write token from the `.logfire` directory, but you can also pass a token directly. 1. logfire.instrument_pydantic_ai() enables instrumentation of Pydantic AI. 1. Since we've enabled instrumentation, a trace will be generated for each run, with spans emitted for models calls and tool function execution instrument_pydantic_ai.py ```python import logfire from pydantic_ai import Agent logfire.configure() # (1)! logfire.instrument_pydantic_ai() # (2)! 
agent = Agent('openai:gpt-5.2', instructions='Be concise, reply with one sentence.') result = agent.run_sync('Where does "hello world" come from?') # (3)! print(result.output) """ The first known use of "hello, world" was in a 1974 textbook about the C programming language. """ ``` 1. logfire.configure() configures the SDK, by default it will find the write token from the `.logfire` directory, but you can also pass a token directly. 1. logfire.instrument_pydantic_ai() enables instrumentation of Pydantic AI. 1. Since we've enabled instrumentation, a trace will be generated for each run, with spans emitted for model calls and tool function execution. *(This example is complete, it can be run "as is")* Which will display in Logfire thus: The [Logfire documentation](https://logfire.pydantic.dev/docs/) has more details on how to use Logfire, including how to instrument other libraries like [HTTPX](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) and [FastAPI](https://logfire.pydantic.dev/docs/integrations/web-frameworks/fastapi/). Since Logfire is built on [OpenTelemetry](https://opentelemetry.io/), you can use the Logfire Python SDK to send data to any OpenTelemetry collector; see [below](#using-opentelemetry). ### Debugging To demonstrate how Logfire can let you visualise the flow of a Pydantic AI run, here's the view you get from Logfire while running the [chat app examples](https://ai.pydantic.dev/examples/chat-app/index.md): ### Monitoring Performance We can also query data with SQL in Logfire to monitor the performance of an application. Here's a real world example of using Logfire to monitor Pydantic AI runs inside Logfire itself: ### Monitoring HTTP Requests As per Hamel Husain's influential 2024 blog post ["Fuck You, Show Me The Prompt."](https://hamel.dev/blog/posts/prompt/) (bear with the capitalization, the point is valid), it's often useful to be able to view the raw HTTP requests and responses made to model providers. To observe raw HTTP requests made to model providers, you can use Logfire's [HTTPX instrumentation](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) since all provider SDKs (except for [Bedrock](https://ai.pydantic.dev/models/bedrock/index.md)) use the [HTTPX](https://www.python-httpx.org/) library internally: [Learn about Gateway](https://ai.pydantic.dev/gateway) with_logfire_instrument_httpx.py ```python import logfire from pydantic_ai import Agent logfire.configure() logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) # (1)! agent = Agent('gateway/openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` 1. See the logfire.instrument_httpx docs for more details; `capture_all=True` means both headers and body are captured for both the request and response. with_logfire_instrument_httpx.py ```python import logfire from pydantic_ai import Agent logfire.configure() logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) # (1)! agent = Agent('openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` 1. See the logfire.instrument_httpx docs for more details; `capture_all=True` means both headers and body are captured for both the request and response. ## Using OpenTelemetry Pydantic AI's instrumentation uses [OpenTelemetry](https://opentelemetry.io/) (OTel), which Logfire is based on.
This means you can debug and monitor Pydantic AI with any OpenTelemetry backend. Pydantic AI follows the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/), so while we think you'll have the best experience using the Logfire platform , you should be able to use any OTel service with GenAI support. ### Logfire with an alternative OTel backend You can use the Logfire SDK completely freely and send the data to any OpenTelemetry backend. Here's an example of configuring the Logfire library to send data to the excellent [otel-tui](https://github.com/ymtdzzz/otel-tui) — an open source terminal based OTel backend and viewer (no association with Pydantic Validation). Run `otel-tui` with docker (see [the otel-tui readme](https://github.com/ymtdzzz/otel-tui) for more instructions): Terminal ```text docker run --rm -it -p 4318:4318 --name otel-tui ymtdzzz/otel-tui:latest ``` then run, [Learn about Gateway](https://ai.pydantic.dev/gateway) otel_tui.py ```python import os import logfire from pydantic_ai import Agent os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318' # (1)! logfire.configure(send_to_logfire=False) # (2)! logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) agent = Agent('gateway/openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> Paris ``` 1. Set the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable to the URL of your OpenTelemetry backend. If you're using a backend that requires authentication, you may need to set [other environment variables](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/). Of course, these can also be set outside the process, e.g. with `export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318`. 1. We configure Logfire to disable sending data to the Logfire OTel backend itself. If you removed `send_to_logfire=False`, data would be sent to both Logfire and your OpenTelemetry backend. otel_tui.py ```python import os import logfire from pydantic_ai import Agent os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318' # (1)! logfire.configure(send_to_logfire=False) # (2)! logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) agent = Agent('openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> Paris ``` 1. Set the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable to the URL of your OpenTelemetry backend. If you're using a backend that requires authentication, you may need to set [other environment variables](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/). Of course, these can also be set outside the process, e.g. with `export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318`. 1. We configure Logfire to disable sending data to the Logfire OTel backend itself. If you removed `send_to_logfire=False`, data would be sent to both Logfire and your OpenTelemetry backend. Running the above code will send tracing data to `otel-tui`, which will display like this: Running the [weather agent](https://ai.pydantic.dev/examples/weather-agent/index.md) example connected to `otel-tui` shows how it can be used to visualise a more complex trace: For more information on using the Logfire SDK to send data to alternative backends, see [the Logfire documentation](https://logfire.pydantic.dev/docs/how-to-guides/alternative-backends/). 
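If your alternative backend requires authentication, the standard OTLP exporter environment variables can carry the endpoint and credentials alongside the configuration shown above. A minimal sketch, where the endpoint, header name, and token are placeholders and the exact values depend on your backend:

```python
import os

# Standard OTLP exporter settings from the OpenTelemetry SDK configuration spec;
# set these before configuring instrumentation (e.g. before logfire.configure()).
os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'https://otel-collector.example.com'
# Header values are URL-encoded per the spec, hence %20 for the space after "Bearer".
os.environ['OTEL_EXPORTER_OTLP_HEADERS'] = 'Authorization=Bearer%20YOUR_TOKEN'
```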
### OTel without Logfire You can also emit OpenTelemetry data from Pydantic AI without using Logfire at all. To do this, you'll need to install and configure the OpenTelemetry packages you need. To run the following examples, use Terminal ```text uv run \ --with 'pydantic-ai-slim[openai]' \ --with opentelemetry-sdk \ --with opentelemetry-exporter-otlp \ raw_otel.py ``` [Learn about Gateway](https://ai.pydantic.dev/gateway) raw_otel.py ```python import os from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter from opentelemetry.sdk.trace import TracerProvider from opentelemetry.sdk.trace.export import BatchSpanProcessor from opentelemetry.trace import set_tracer_provider from pydantic_ai import Agent os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318' exporter = OTLPSpanExporter() span_processor = BatchSpanProcessor(exporter) tracer_provider = TracerProvider() tracer_provider.add_span_processor(span_processor) set_tracer_provider(tracer_provider) Agent.instrument_all() agent = Agent('gateway/openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> Paris ``` raw_otel.py ```python import os from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter from opentelemetry.sdk.trace import TracerProvider from opentelemetry.sdk.trace.export import BatchSpanProcessor from opentelemetry.trace import set_tracer_provider from pydantic_ai import Agent os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318' exporter = OTLPSpanExporter() span_processor = BatchSpanProcessor(exporter) tracer_provider = TracerProvider() tracer_provider.add_span_processor(span_processor) set_tracer_provider(tracer_provider) Agent.instrument_all() agent = Agent('openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> Paris ``` ### Alternative Observability backends Because Pydantic AI uses OpenTelemetry for observability, you can easily configure it to send data to any OpenTelemetry-compatible backend, not just our observability platform [Pydantic Logfire](#pydantic-logfire). The following providers have dedicated documentation on Pydantic AI: - [Langfuse](https://langfuse.com/docs/integrations/pydantic-ai) - [W&B Weave](https://weave-docs.wandb.ai/guides/integrations/pydantic_ai/) - [Arize](https://arize.com/docs/ax/observe/tracing-integrations-auto/pydantic-ai) - [Openlayer](https://www.openlayer.com/docs/integrations/pydantic-ai) - [OpenLIT](https://docs.openlit.io/latest/integrations/pydantic) - [LangWatch](https://docs.langwatch.ai/integration/python/integrations/pydantic-ai) - [Patronus AI](https://docs.patronus.ai/docs/percival/pydantic) - [Opik](https://www.comet.com/docs/opik/tracing/integrations/pydantic-ai) - [mlflow](https://mlflow.org/docs/latest/genai/tracing/integrations/listing/pydantic_ai) - [Agenta](https://docs.agenta.ai/observability/integrations/pydanticai) - [Confident AI](https://documentation.confident-ai.com/docs/llm-tracing/integrations/pydanticai) - [Braintrust](https://www.braintrust.dev/docs/integrations/sdk-integrations/pydantic-ai) - [SigNoz](https://signoz.io/docs/pydantic-ai-observability/) ## Advanced usage ### Configuring data format Pydantic AI follows the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/).
Specifically, it follows version 1.37.0 of the conventions by default, with a few exceptions. Certain span and attribute names are not spec compliant by default for compatibility reasons, but can be made compliant by passing InstrumentationSettings(version=3) (the default is currently `version=2`). This will change the following: - The span name `agent run` becomes `invoke_agent {gen_ai.agent.name}` (with the agent name filled in) - The span name `running tool` becomes `execute_tool {gen_ai.tool.name}` (with the tool name filled in) - The attribute name `tool_arguments` becomes `gen_ai.tool.call.arguments` - The attribute name `tool_response` becomes `gen_ai.tool.call.result` To use [OpenTelemetry semantic conventions version 1.36.0](https://github.com/open-telemetry/semantic-conventions/blob/v1.36.0/docs/gen-ai/README.md) or older, pass InstrumentationSettings(version=1). Moreover, those semantic conventions specify that messages should be captured as individual events (logs) that are children of the request span, whereas by default, Pydantic AI instead collects these events into a JSON array which is set as a single large attribute called `events` on the request span. To change this, use `event_mode='logs'`: [Learn about Gateway](https://ai.pydantic.dev/gateway) instrumentation_settings_event_mode.py ```python import logfire from pydantic_ai import Agent logfire.configure() logfire.instrument_pydantic_ai(version=1, event_mode='logs') agent = Agent('gateway/openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` instrumentation_settings_event_mode.py ```python import logfire from pydantic_ai import Agent logfire.configure() logfire.instrument_pydantic_ai(version=1, event_mode='logs') agent = Agent('openai:gpt-5.2') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` This won't look as good in the Logfire UI, and will also be removed from Pydantic AI in a future release, but may be useful for backwards compatibility. Note that the OpenTelemetry Semantic Conventions are still experimental and are likely to change. ### Setting OpenTelemetry SDK providers By default, the global `TracerProvider` and `LoggerProvider` are used. These are set automatically by `logfire.configure()`. They can also be set by the `set_tracer_provider` and `set_logger_provider` functions in the OpenTelemetry Python SDK. You can set custom providers with InstrumentationSettings. 
[Learn about Gateway](https://ai.pydantic.dev/gateway) instrumentation_settings_providers.py ```python from opentelemetry.sdk._logs import LoggerProvider from opentelemetry.sdk.trace import TracerProvider from pydantic_ai import Agent, InstrumentationSettings instrumentation_settings = InstrumentationSettings( tracer_provider=TracerProvider(), logger_provider=LoggerProvider(), ) agent = Agent('gateway/openai:gpt-5.2', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` instrumentation_settings_providers.py ```python from opentelemetry.sdk._logs import LoggerProvider from opentelemetry.sdk.trace import TracerProvider from pydantic_ai import Agent, InstrumentationSettings instrumentation_settings = InstrumentationSettings( tracer_provider=TracerProvider(), logger_provider=LoggerProvider(), ) agent = Agent('openai:gpt-5.2', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` ### Instrumenting a specific `Model` instrumented_model_example.py ```python from pydantic_ai import Agent from pydantic_ai.models.instrumented import InstrumentationSettings, InstrumentedModel settings = InstrumentationSettings() model = InstrumentedModel('openai:gpt-5.2', settings) agent = Agent(model) ``` ### Excluding binary content [Learn about Gateway](https://ai.pydantic.dev/gateway) excluding_binary_content.py ```python from pydantic_ai import Agent, InstrumentationSettings instrumentation_settings = InstrumentationSettings(include_binary_content=False) agent = Agent('gateway/openai:gpt-5.2', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` excluding_binary_content.py ```python from pydantic_ai import Agent, InstrumentationSettings instrumentation_settings = InstrumentationSettings(include_binary_content=False) agent = Agent('openai:gpt-5.2', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` ### Excluding prompts and completions For privacy and security reasons, you may want to monitor your agent's behavior and performance without exposing sensitive user data or proprietary prompts in your observability platform. Pydantic AI allows you to exclude the actual content from telemetry while preserving the structural information needed for debugging and monitoring. When `include_content=False` is set, Pydantic AI will exclude sensitive content from telemetry, including user prompts and model completions, tool call arguments and responses, and any other message content. 
[Learn about Gateway](https://ai.pydantic.dev/gateway) excluding_sensitive_content.py ```python from pydantic_ai import Agent from pydantic_ai.models.instrumented import InstrumentationSettings instrumentation_settings = InstrumentationSettings(include_content=False) agent = Agent('gateway/openai:gpt-5.2', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` excluding_sensitive_content.py ```python from pydantic_ai import Agent from pydantic_ai.models.instrumented import InstrumentationSettings instrumentation_settings = InstrumentationSettings(include_content=False) agent = Agent('openai:gpt-5.2', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` This setting is particularly useful in production environments where compliance requirements or data sensitivity concerns make it necessary to limit what content is sent to your observability platform. ### Adding Custom Metadata Use the agent's `metadata` parameter to attach additional data to the agent's span. When instrumentation is enabled, the computed metadata is recorded on the agent span under the `metadata` attribute. See the [usage and metadata example in the agents guide](https://ai.pydantic.dev/agent/#run-metadata) for details and usage. # Contributing We'd love you to contribute to Pydantic AI! ## Installation and Setup Clone your fork and cd into the repo directory: ```bash git clone git@github.com:<your-username>/pydantic-ai.git cd pydantic-ai ``` Install `uv` (version 0.4.30 or later) and `pre-commit`: - [`uv` install docs](https://docs.astral.sh/uv/getting-started/installation/) - [`pre-commit` install docs](https://pre-commit.com/#install) To install `pre-commit` you can run the following command: ```bash uv tool install pre-commit ``` Install `pydantic-ai`, all dependencies and pre-commit hooks: ```bash make install ``` ## Running Tests etc. We use `make` to manage most commands you'll need to run. For details on available commands, run: ```bash make help ``` To run code formatting, linting, static type checks, and tests with coverage report generation, run: ```bash make ``` ## Documentation Changes To run the documentation page locally, run: ```bash uv run mkdocs serve ``` ## Rules for adding new models to Pydantic AI To avoid an excessive workload for the maintainers of Pydantic AI, we can't accept all model contributions, so we're setting the following rules for when we'll accept new models and when we won't. This should hopefully reduce the chances of disappointment and wasted work. - To add a new model with an extra dependency, that dependency needs > 500k monthly downloads from PyPI consistently over 3 months or more - To add a new model which uses another model's logic internally and has no extra dependencies, that model's GitHub org needs > 20k stars in total - For any other model that's just a custom URL and API key, we're happy to add a one-paragraph description with a link and instructions on the URL to use - For any other model that requires more logic, we recommend you release your own Python package `pydantic-ai-xxx`, which depends on [`pydantic-ai-slim`](https://ai.pydantic.dev/install/#slim-install) and implements a model that inherits from our Model ABC If you're unsure about adding a model, please [create an issue](https://github.com/pydantic/pydantic-ai/issues).
# Upgrade Guide In September 2025, Pydantic AI reached V1, which means we're committed to API stability: we will not introduce changes that break your code until V2. For more information, review our [Version Policy](https://ai.pydantic.dev/version-policy/index.md). ## Breaking Changes Here's a filtered list of the breaking changes for each version to help you upgrade Pydantic AI. ### v1.0.1 (2025-09-05) The following breaking change was accidentally left out of v1.0.0: - See [#2808](https://github.com/pydantic/pydantic-ai/pull/2808) - Remove `Python` evaluator from `pydantic_evals` for security reasons ### v1.0.0 (2025-09-04) - See [#2725](https://github.com/pydantic/pydantic-ai/pull/2725) - Drop support for Python 3.9 - See [#2738](https://github.com/pydantic/pydantic-ai/pull/2738) - Make many dataclasses require keyword arguments - See [#2715](https://github.com/pydantic/pydantic-ai/pull/2715) - Remove `cases` and `averages` attributes from `pydantic_evals` spans - See [#2798](https://github.com/pydantic/pydantic-ai/pull/2798) - Change `ModelRequest.parts` and `ModelResponse.parts` types from `list` to `Sequence` - See [#2726](https://github.com/pydantic/pydantic-ai/pull/2726) - Default `InstrumentationSettings` version to 2 - See [#2717](https://github.com/pydantic/pydantic-ai/pull/2717) - Remove errors when passing `AsyncRetrying` or `Retrying` object to `AsyncTenacityTransport` or `TenacityTransport` instead of `RetryConfig` ### v0.x.x Before V1, minor versions were used to introduce breaking changes: **v0.8.0 (2025-08-26)** See [#2689](https://github.com/pydantic/pydantic-ai/pull/2689) - `AgentStreamEvent` was expanded to be a union of `ModelResponseStreamEvent` and `HandleResponseEvent`, simplifying the `event_stream_handler` function signature. Existing code accepting `AgentStreamEvent | HandleResponseEvent` will continue to work. **v0.7.6 (2025-08-26)** The following breaking change was inadvertently released in a patch version rather than a minor version: See [#2670](https://github.com/pydantic/pydantic-ai/pull/2670) - `TenacityTransport` and `AsyncTenacityTransport` now require the use of `pydantic_ai.retries.RetryConfig` (which is just a `TypedDict` containing the kwargs to `tenacity.retry`) instead of `tenacity.Retrying` or `tenacity.AsyncRetrying`. **v0.7.0 (2025-08-12)** See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.StreamedResponse` now yields a `FinalResultEvent` along with the existing `PartStartEvent` and `PartDeltaEvent`. If you're using `pydantic_ai.direct.model_request_stream` or `pydantic_ai.direct.model_request_stream_sync`, you may need to update your code to account for this. See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.Model.request_stream` now receives a `run_context` argument. If you've implemented a custom `Model` subclass, you will need to account for this. See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.StreamedResponse` now requires a `model_request_parameters` field and constructor argument. If you've implemented a custom `Model` subclass and implemented `request_stream`, you will need to account for this. **v0.6.0 (2025-08-06)** This release was meant to clean some old deprecated code, so we can get a step closer to V1. See [#2440](https://github.com/pydantic/pydantic-ai/pull/2440) - The `next` method was removed from the `Graph` class. Use `async with graph.iter(...) as run: run.next()` instead. 
See [#2441](https://github.com/pydantic/pydantic-ai/pull/2441) - The `result_type`, `result_tool_name` and `result_tool_description` arguments were removed from the `Agent` class. Use `output_type` instead. See [#2441](https://github.com/pydantic/pydantic-ai/pull/2441) - The `result_retries` argument was also removed from the `Agent` class. Use `output_retries` instead. See [#2443](https://github.com/pydantic/pydantic-ai/pull/2443) - The `data` property was removed from the `FinalResult` class. Use `output` instead. See [#2445](https://github.com/pydantic/pydantic-ai/pull/2445) - The `get_data` and `validate_structured_result` methods were removed from the `StreamedRunResult` class. Use `get_output` and `validate_structured_output` instead. See [#2446](https://github.com/pydantic/pydantic-ai/pull/2446) - The `format_as_xml` function was moved to the `pydantic_ai.format_as_xml` module. Import it via `from pydantic_ai import format_as_xml` instead. See [#2451](https://github.com/pydantic/pydantic-ai/pull/2451) - Removed deprecated `Agent.result_validator` method, `Agent.last_run_messages` property, `AgentRunResult.data` property, and `result_tool_return_content` parameters from result classes. **v0.5.0 (2025-08-04)** See [#2388](https://github.com/pydantic/pydantic-ai/pull/2388) - The `source` field of an `EvaluationResult` is now of type `EvaluatorSpec` rather than the actual source `Evaluator` instance, to help with serialization/deserialization. See [#2163](https://github.com/pydantic/pydantic-ai/pull/2163) - The `EvaluationReport.print` and `EvaluationReport.console_table` methods now require most arguments to be passed by keyword. **v0.4.0 (2025-07-08)** See [#1799](https://github.com/pydantic/pydantic-ai/pull/1799) - Pydantic Evals `EvaluationReport` and `ReportCase` are now generic dataclasses instead of Pydantic models. If you were serializing them using `model_dump()`, you will now need to use the `EvaluationReportAdapter` and `ReportCaseAdapter` type adapters instead. See [#1507](https://github.com/pydantic/pydantic-ai/pull/1507) - The `ToolDefinition` `description` argument is now optional and the order of positional arguments has changed from `name, description, parameters_json_schema, ...` to `name, parameters_json_schema, description, ...` to account for this. **v0.3.0 (2025-06-18)** See [#1142](https://github.com/pydantic/pydantic-ai/pull/1142) — Adds support for thinking parts. We now convert the thinking blocks in provider-specific text parts to Pydantic AI `ThinkingPart`s. Also, as part of this release, we made the choice to not send back the `ThinkingPart`s to the provider - the idea is to save costs on behalf of the user. In the future, we intend to add a setting to customize this behavior. **v0.2.0 (2025-05-12)** See [#1647](https://github.com/pydantic/pydantic-ai/pull/1647) — usage makes sense as part of `ModelResponse`, and could be really useful in "messages" (really a sequence of requests and responses). In this PR: - Adds `usage` to `ModelResponse` (the field has a default factory of `Usage()` so it'll work to load data that doesn't have usage) - Changes the return type of `Model.request` to just `ModelResponse` instead of `tuple[ModelResponse, Usage]` **v0.1.0 (2025-04-15)** See [#1248](https://github.com/pydantic/pydantic-ai/pull/1248) — the attribute/parameter name `result` was renamed to `output` in many places. Hopefully all changes keep a deprecated attribute or parameter with the old name, so you should get many deprecation warnings.
See [#1484](https://github.com/pydantic/pydantic-ai/pull/1484) — `format_as_xml` was moved and made available to import from the package root, e.g. `from pydantic_ai import format_as_xml`. ## Full Changelog For the full changelog, see [GitHub Releases](https://github.com/pydantic/pydantic-ai/releases). ## Version Policy We will not intentionally make breaking changes in minor releases of V1. V2 will be released in April 2026 at the earliest, 6 months after the release of V1 in September 2025. Once we release V2, we'll continue to provide security fixes for V1 for another 6 months minimum, so you have time to upgrade your applications. Functionality marked as deprecated will not be removed until V2. Of course, some apparently safe changes and bug fixes will inevitably break some users' code — obligatory link to [xkcd](https://xkcd.com/1172/). The following changes will **NOT** be considered breaking changes, and may occur in minor releases: - Bug fixes that may result in existing code breaking, provided that such code was relying on undocumented features/constructs/assumptions. - Adding new message parts, stream events, or optional fields (including fields with default values) on existing message (part) and event types. Always code defensively when consuming message parts or event streams, and use the ModelMessagesTypeAdapter to (de)serialize message histories. - Changing OpenTelemetry span attributes. Because different [observability platforms](https://ai.pydantic.dev/logfire/#using-opentelemetry) support different versions of the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/), Pydantic AI lets you configure the [instrumentation version](https://ai.pydantic.dev/logfire/#configuring-data-format), but the default version may change in a minor release. Span attributes for [Pydantic Evals](https://ai.pydantic.dev/evals/index.md) may also change as we iterate on Evals support in [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/). - Changing how `__repr__` behaves, even of public classes. In all cases we will aim to minimize churn and do so only when justified by an increase in the quality of Pydantic AI for users. ## Beta Features At Pydantic, we like to move quickly and innovate! To that end, minor releases may introduce beta features (indicated by a `beta` module) that are active works in progress. While in its beta phase, a feature's API and behaviors may not be stable, and it's very possible that changes made to the feature will not be backward-compatible. We aim to move beta features out of beta within a few months after initial release, once users have had a chance to provide feedback and test the feature in production. ## Support for Python versions Pydantic AI will drop support for a Python version when the following conditions are met: - The Python version has reached its [expected end of life](https://devguide.python.org/versions/). - Less than 5% of downloads of the most recent minor release are using that version. # Examples # Agent User Interaction (AG-UI) Example of using Pydantic AI agents with the [AG-UI Dojo](https://github.com/ag-ui-protocol/ag-ui/tree/main/apps/dojo) example app. See the [AG-UI docs](https://ai.pydantic.dev/ui/ag-ui/index.md) for more information about the AG-UI integration.
Demonstrates: - [AG-UI](https://ai.pydantic.dev/ui/ag-ui/index.md) - [Tools](https://ai.pydantic.dev/tools/index.md) ## Prerequisites - An [OpenAI API key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key) ## Running the Example With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage) you will need two command line windows. ### Pydantic AI AG-UI backend Setup your OpenAI API Key ```bash export OPENAI_API_KEY= ``` Start the Pydantic AI AG-UI example backend. ```bash python -m pydantic_ai_examples.ag_ui ``` ```bash uv run -m pydantic_ai_examples.ag_ui ``` ### AG-UI Dojo example frontend Next run the AG-UI Dojo example frontend. 1. Clone the [AG-UI repository](https://github.com/ag-ui-protocol/ag-ui) ```shell git clone https://github.com/ag-ui-protocol/ag-ui.git ``` 1. Change into to the `ag-ui/typescript-sdk` directory ```shell cd ag-ui/sdks/typescript ``` 1. Run the Dojo app following the [official instructions](https://github.com/ag-ui-protocol/ag-ui/tree/main/apps/dojo#development-setup) 1. Visit 1. Select View `Pydantic AI` from the sidebar ## Feature Examples ### Agentic Chat This demonstrates a basic agent interaction including Pydantic AI server side tools and AG-UI client side tools. If you've [run the example](#running-the-example), you can view it at . #### Agent Tools - `time` - Pydantic AI tool to check the current time for a time zone - `background` - AG-UI tool to set the background color of the client window #### Agent Prompts ```text What is the time in New York? ``` ```text Change the background to blue ``` A complex example which mixes both AG-UI and Pydantic AI tools: ```text Perform the following steps, waiting for the response of each step before continuing: 1. Get the time 2. Set the background to red 3. Get the time 4. Report how long the background set took by diffing the two times ``` #### Agentic Chat - Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [ag_ui/api/agentic_chat.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/agentic_chat.py) ```python """Agentic Chat feature.""" from __future__ import annotations from datetime import datetime from zoneinfo import ZoneInfo from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent('gateway/openai:gpt-5-mini') @agent.tool_plain async def current_time(timezone: str = 'UTC') -> str: """Get the current time in ISO format. Args: timezone: The timezone to use. Returns: The current time in ISO format string. """ tz: ZoneInfo = ZoneInfo(timezone) return datetime.now(tz=tz).isoformat() app = AGUIApp(agent) ``` [ag_ui/api/agentic_chat.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/agentic_chat.py) ```python """Agentic Chat feature.""" from __future__ import annotations from datetime import datetime from zoneinfo import ZoneInfo from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent('openai:gpt-5-mini') @agent.tool_plain async def current_time(timezone: str = 'UTC') -> str: """Get the current time in ISO format. Args: timezone: The timezone to use. Returns: The current time in ISO format string. """ tz: ZoneInfo = ZoneInfo(timezone) return datetime.now(tz=tz).isoformat() app = AGUIApp(agent) ``` ### Agentic Generative UI Demonstrates a long running task where the agent sends updates to the frontend to let the user know what's happening. 
If you've [run the example](#running-the-example), you can view it at . #### Plan Prompts ```text Create a plan for breakfast and execute it ``` #### Agentic Generative UI - Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [ag_ui/api/agentic_generative_ui.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/agentic_generative_ui.py) ```python """Agentic Generative UI feature.""" from __future__ import annotations from textwrap import dedent from typing import Any, Literal from pydantic import BaseModel, Field from ag_ui.core import EventType, StateDeltaEvent, StateSnapshotEvent from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp StepStatus = Literal['pending', 'completed'] class Step(BaseModel): """Represents a step in a plan.""" description: str = Field(description='The description of the step') status: StepStatus = Field( default='pending', description='The status of the step (e.g., pending, completed)', ) class Plan(BaseModel): """Represents a plan with multiple steps.""" steps: list[Step] = Field( default_factory=list[Step], description='The steps in the plan' ) class JSONPatchOp(BaseModel): """A class representing a JSON Patch operation (RFC 6902).""" op: Literal['add', 'remove', 'replace', 'move', 'copy', 'test'] = Field( description='The operation to perform: add, remove, replace, move, copy, or test', ) path: str = Field(description='JSON Pointer (RFC 6901) to the target location') value: Any = Field( default=None, description='The value to apply (for add, replace operations)', ) from_: str | None = Field( default=None, alias='from', description='Source path (for move, copy operations)', ) agent = Agent( 'gateway/openai:gpt-5-mini', instructions=dedent( """ When planning use tools only, without any other messages. IMPORTANT: - Use the `create_plan` tool to set the initial state of the steps - Use the `update_plan_step` tool to update the status of each step - Do NOT repeat the plan or summarise it in a message - Do NOT confirm the creation or updates in a message - Do NOT ask the user for additional information or next steps Only one plan can be active at a time, so do not call the `create_plan` tool again until all the steps in current plan are completed. """ ), ) @agent.tool_plain async def create_plan(steps: list[str]) -> StateSnapshotEvent: """Create a plan with multiple steps. Args: steps: List of step descriptions to create the plan. Returns: StateSnapshotEvent containing the initial state of the steps. """ plan: Plan = Plan( steps=[Step(description=step) for step in steps], ) return StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot=plan.model_dump(), ) @agent.tool_plain async def update_plan_step( index: int, description: str | None = None, status: StepStatus | None = None ) -> StateDeltaEvent: """Update the plan with new steps or changes. Args: index: The index of the step to update. description: The new description for the step. status: The new status for the step. Returns: StateDeltaEvent containing the changes made to the plan. 
""" changes: list[JSONPatchOp] = [] if description is not None: changes.append( JSONPatchOp( op='replace', path=f'/steps/{index}/description', value=description ) ) if status is not None: changes.append( JSONPatchOp(op='replace', path=f'/steps/{index}/status', value=status) ) return StateDeltaEvent( type=EventType.STATE_DELTA, delta=changes, ) app = AGUIApp(agent) ``` [ag_ui/api/agentic_generative_ui.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/agentic_generative_ui.py) ```python """Agentic Generative UI feature.""" from __future__ import annotations from textwrap import dedent from typing import Any, Literal from pydantic import BaseModel, Field from ag_ui.core import EventType, StateDeltaEvent, StateSnapshotEvent from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp StepStatus = Literal['pending', 'completed'] class Step(BaseModel): """Represents a step in a plan.""" description: str = Field(description='The description of the step') status: StepStatus = Field( default='pending', description='The status of the step (e.g., pending, completed)', ) class Plan(BaseModel): """Represents a plan with multiple steps.""" steps: list[Step] = Field( default_factory=list[Step], description='The steps in the plan' ) class JSONPatchOp(BaseModel): """A class representing a JSON Patch operation (RFC 6902).""" op: Literal['add', 'remove', 'replace', 'move', 'copy', 'test'] = Field( description='The operation to perform: add, remove, replace, move, copy, or test', ) path: str = Field(description='JSON Pointer (RFC 6901) to the target location') value: Any = Field( default=None, description='The value to apply (for add, replace operations)', ) from_: str | None = Field( default=None, alias='from', description='Source path (for move, copy operations)', ) agent = Agent( 'openai:gpt-5-mini', instructions=dedent( """ When planning use tools only, without any other messages. IMPORTANT: - Use the `create_plan` tool to set the initial state of the steps - Use the `update_plan_step` tool to update the status of each step - Do NOT repeat the plan or summarise it in a message - Do NOT confirm the creation or updates in a message - Do NOT ask the user for additional information or next steps Only one plan can be active at a time, so do not call the `create_plan` tool again until all the steps in current plan are completed. """ ), ) @agent.tool_plain async def create_plan(steps: list[str]) -> StateSnapshotEvent: """Create a plan with multiple steps. Args: steps: List of step descriptions to create the plan. Returns: StateSnapshotEvent containing the initial state of the steps. """ plan: Plan = Plan( steps=[Step(description=step) for step in steps], ) return StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot=plan.model_dump(), ) @agent.tool_plain async def update_plan_step( index: int, description: str | None = None, status: StepStatus | None = None ) -> StateDeltaEvent: """Update the plan with new steps or changes. Args: index: The index of the step to update. description: The new description for the step. status: The new status for the step. Returns: StateDeltaEvent containing the changes made to the plan. 
""" changes: list[JSONPatchOp] = [] if description is not None: changes.append( JSONPatchOp( op='replace', path=f'/steps/{index}/description', value=description ) ) if status is not None: changes.append( JSONPatchOp(op='replace', path=f'/steps/{index}/status', value=status) ) return StateDeltaEvent( type=EventType.STATE_DELTA, delta=changes, ) app = AGUIApp(agent) ``` ### Human in the Loop Demonstrates simple human in the loop workflow where the agent comes up with a plan and the user can approve it using checkboxes. #### Task Planning Tools - `generate_task_steps` - AG-UI tool to generate and confirm steps #### Task Planning Prompt ```text Generate a list of steps for cleaning a car for me to review ``` #### Human in the Loop - Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [ag_ui/api/human_in_the_loop.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/human_in_the_loop.py) ```python """Human in the Loop Feature. No special handling is required for this feature. """ from __future__ import annotations from textwrap import dedent from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent( 'gateway/openai:gpt-5-mini', instructions=dedent( """ When planning tasks use tools only, without any other messages. IMPORTANT: - Use the `generate_task_steps` tool to display the suggested steps to the user - Never repeat the plan, or send a message detailing steps - If accepted, confirm the creation of the plan and the number of selected (enabled) steps only - If not accepted, ask the user for more information, DO NOT use the `generate_task_steps` tool again """ ), ) app = AGUIApp(agent) ``` [ag_ui/api/human_in_the_loop.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/human_in_the_loop.py) ```python """Human in the Loop Feature. No special handling is required for this feature. """ from __future__ import annotations from textwrap import dedent from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent( 'openai:gpt-5-mini', instructions=dedent( """ When planning tasks use tools only, without any other messages. IMPORTANT: - Use the `generate_task_steps` tool to display the suggested steps to the user - Never repeat the plan, or send a message detailing steps - If accepted, confirm the creation of the plan and the number of selected (enabled) steps only - If not accepted, ask the user for more information, DO NOT use the `generate_task_steps` tool again """ ), ) app = AGUIApp(agent) ``` ### Predictive State Updates Demonstrates how to use the predictive state updates feature to update the state of the UI based on agent responses, including user interaction via user confirmation. If you've [run the example](#running-the-example), you can view it at . #### Story Tools - `write_document` - AG-UI tool to write the document to a window - `document_predict_state` - Pydantic AI tool that enables document state prediction for the `write_document` tool This also shows how to use custom instructions based on shared state information. #### Story Example Starting document text ```markdown Bruce was a good dog, ``` Agent prompt ```text Help me complete my story about bruce the dog, is should be no longer than a sentence. 
``` #### Predictive State Updates - Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [ag_ui/api/predictive_state_updates.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/predictive_state_updates.py) ```python """Predictive State feature.""" from __future__ import annotations from textwrap import dedent from pydantic import BaseModel from ag_ui.core import CustomEvent, EventType from pydantic_ai import Agent, RunContext from pydantic_ai.ui import StateDeps from pydantic_ai.ui.ag_ui.app import AGUIApp class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent('gateway/openai:gpt-5-mini', deps_type=StateDeps[DocumentState]) # Tools which return AG-UI events will be sent to the client as part of the # event stream, single events and iterables of events are supported. @agent.tool_plain async def document_predict_state() -> list[CustomEvent]: """Enable document state prediction. Returns: CustomEvent containing the event to enable state prediction. """ return [ CustomEvent( type=EventType.CUSTOM, name='PredictState', value=[ { 'state_key': 'document', 'tool': 'write_document', 'tool_argument': 'document', }, ], ), ] @agent.instructions() async def story_instructions(ctx: RunContext[StateDeps[DocumentState]]) -> str: """Provide instructions for writing document if present. Args: ctx: The run context containing document state information. Returns: Instructions string for the document writing agent. """ return dedent( f"""You are a helpful assistant for writing documents. Before you start writing, you MUST call the `document_predict_state` tool to enable state prediction. To present the document to the user for review, you MUST use the `write_document` tool. When you have written the document, DO NOT repeat it as a message. If accepted briefly summarize the changes you made, 2 sentences max, otherwise ask the user to clarify what they want to change. This is the current document: {ctx.deps.state.document} """ ) app = AGUIApp(agent, deps=StateDeps(DocumentState())) ``` [ag_ui/api/predictive_state_updates.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/predictive_state_updates.py) ```python """Predictive State feature.""" from __future__ import annotations from textwrap import dedent from pydantic import BaseModel from ag_ui.core import CustomEvent, EventType from pydantic_ai import Agent, RunContext from pydantic_ai.ui import StateDeps from pydantic_ai.ui.ag_ui.app import AGUIApp class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent('openai:gpt-5-mini', deps_type=StateDeps[DocumentState]) # Tools which return AG-UI events will be sent to the client as part of the # event stream, single events and iterables of events are supported. @agent.tool_plain async def document_predict_state() -> list[CustomEvent]: """Enable document state prediction. Returns: CustomEvent containing the event to enable state prediction. """ return [ CustomEvent( type=EventType.CUSTOM, name='PredictState', value=[ { 'state_key': 'document', 'tool': 'write_document', 'tool_argument': 'document', }, ], ), ] @agent.instructions() async def story_instructions(ctx: RunContext[StateDeps[DocumentState]]) -> str: """Provide instructions for writing document if present. Args: ctx: The run context containing document state information. Returns: Instructions string for the document writing agent. 
""" return dedent( f"""You are a helpful assistant for writing documents. Before you start writing, you MUST call the `document_predict_state` tool to enable state prediction. To present the document to the user for review, you MUST use the `write_document` tool. When you have written the document, DO NOT repeat it as a message. If accepted briefly summarize the changes you made, 2 sentences max, otherwise ask the user to clarify what they want to change. This is the current document: {ctx.deps.state.document} """ ) app = AGUIApp(agent, deps=StateDeps(DocumentState())) ``` ### Shared State Demonstrates how to use the shared state between the UI and the agent. State sent to the agent is detected by a function based instruction. This then validates the data using a custom pydantic model before using to create the instructions for the agent to follow and send to the client using a AG-UI tool. If you've [run the example](#running-the-example), you can view it at . #### Recipe Tools - `display_recipe` - AG-UI tool to display the recipe in a graphical format #### Recipe Example 1. Customise the basic settings of your recipe 1. Click `Improve with AI` #### Shared State - Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [ag_ui/api/shared_state.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/shared_state.py) ```python """Shared State feature.""" from __future__ import annotations from enum import Enum from textwrap import dedent from pydantic import BaseModel, Field from ag_ui.core import EventType, StateSnapshotEvent from pydantic_ai import Agent, RunContext from pydantic_ai.ui import StateDeps from pydantic_ai.ui.ag_ui.app import AGUIApp class SkillLevel(str, Enum): """The level of skill required for the recipe.""" BEGINNER = 'Beginner' INTERMEDIATE = 'Intermediate' ADVANCED = 'Advanced' class SpecialPreferences(str, Enum): """Special preferences for the recipe.""" HIGH_PROTEIN = 'High Protein' LOW_CARB = 'Low Carb' SPICY = 'Spicy' BUDGET_FRIENDLY = 'Budget-Friendly' ONE_POT_MEAL = 'One-Pot Meal' VEGETARIAN = 'Vegetarian' VEGAN = 'Vegan' class CookingTime(str, Enum): """The cooking time of the recipe.""" FIVE_MIN = '5 min' FIFTEEN_MIN = '15 min' THIRTY_MIN = '30 min' FORTY_FIVE_MIN = '45 min' SIXTY_PLUS_MIN = '60+ min' class Ingredient(BaseModel): """A class representing an ingredient in a recipe.""" icon: str = Field( default='ingredient', description="The icon emoji (not emoji code like '\x1f35e', but the actual emoji like 🥕) of the ingredient", ) name: str amount: str class Recipe(BaseModel): """A class representing a recipe.""" skill_level: SkillLevel = Field( default=SkillLevel.BEGINNER, description='The skill level required for the recipe', ) special_preferences: list[SpecialPreferences] = Field( default_factory=list[SpecialPreferences], description='Any special preferences for the recipe', ) cooking_time: CookingTime = Field( default=CookingTime.FIVE_MIN, description='The cooking time of the recipe' ) ingredients: list[Ingredient] = Field( default_factory=list[Ingredient], description='Ingredients for the recipe', ) instructions: list[str] = Field( default_factory=list[str], description='Instructions for the recipe' ) class RecipeSnapshot(BaseModel): """A class representing the state of the recipe.""" recipe: Recipe = Field( default_factory=Recipe, description='The current state of the recipe' ) agent = Agent('gateway/openai:gpt-5-mini', deps_type=StateDeps[RecipeSnapshot]) @agent.tool_plain async def display_recipe(recipe: Recipe) 
-> StateSnapshotEvent: """Display the recipe to the user. Args: recipe: The recipe to display. Returns: StateSnapshotEvent containing the recipe snapshot. """ return StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot={'recipe': recipe}, ) @agent.instructions async def recipe_instructions(ctx: RunContext[StateDeps[RecipeSnapshot]]) -> str: """Instructions for the recipe generation agent. Args: ctx: The run context containing recipe state information. Returns: Instructions string for the recipe generation agent. """ return dedent( f""" You are a helpful assistant for creating recipes. IMPORTANT: - Create a complete recipe using the existing ingredients - Append new ingredients to the existing ones - Use the `display_recipe` tool to present the recipe to the user - Do NOT repeat the recipe in the message, use the tool instead - Do NOT run the `display_recipe` tool multiple times in a row Once you have created the updated recipe and displayed it to the user, summarise the changes in one sentence, don't describe the recipe in detail or send it as a message to the user. The current state of the recipe is: {ctx.deps.state.recipe.model_dump_json(indent=2)} """, ) app = AGUIApp(agent, deps=StateDeps(RecipeSnapshot())) ``` [ag_ui/api/shared_state.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/shared_state.py) ```python """Shared State feature.""" from __future__ import annotations from enum import Enum from textwrap import dedent from pydantic import BaseModel, Field from ag_ui.core import EventType, StateSnapshotEvent from pydantic_ai import Agent, RunContext from pydantic_ai.ui import StateDeps from pydantic_ai.ui.ag_ui.app import AGUIApp class SkillLevel(str, Enum): """The level of skill required for the recipe.""" BEGINNER = 'Beginner' INTERMEDIATE = 'Intermediate' ADVANCED = 'Advanced' class SpecialPreferences(str, Enum): """Special preferences for the recipe.""" HIGH_PROTEIN = 'High Protein' LOW_CARB = 'Low Carb' SPICY = 'Spicy' BUDGET_FRIENDLY = 'Budget-Friendly' ONE_POT_MEAL = 'One-Pot Meal' VEGETARIAN = 'Vegetarian' VEGAN = 'Vegan' class CookingTime(str, Enum): """The cooking time of the recipe.""" FIVE_MIN = '5 min' FIFTEEN_MIN = '15 min' THIRTY_MIN = '30 min' FORTY_FIVE_MIN = '45 min' SIXTY_PLUS_MIN = '60+ min' class Ingredient(BaseModel): """A class representing an ingredient in a recipe.""" icon: str = Field( default='ingredient', description="The icon emoji (not emoji code like '\x1f35e', but the actual emoji like 🥕) of the ingredient", ) name: str amount: str class Recipe(BaseModel): """A class representing a recipe.""" skill_level: SkillLevel = Field( default=SkillLevel.BEGINNER, description='The skill level required for the recipe', ) special_preferences: list[SpecialPreferences] = Field( default_factory=list[SpecialPreferences], description='Any special preferences for the recipe', ) cooking_time: CookingTime = Field( default=CookingTime.FIVE_MIN, description='The cooking time of the recipe' ) ingredients: list[Ingredient] = Field( default_factory=list[Ingredient], description='Ingredients for the recipe', ) instructions: list[str] = Field( default_factory=list[str], description='Instructions for the recipe' ) class RecipeSnapshot(BaseModel): """A class representing the state of the recipe.""" recipe: Recipe = Field( default_factory=Recipe, description='The current state of the recipe' ) agent = Agent('openai:gpt-5-mini', deps_type=StateDeps[RecipeSnapshot]) @agent.tool_plain async def display_recipe(recipe: Recipe) -> 
StateSnapshotEvent: """Display the recipe to the user. Args: recipe: The recipe to display. Returns: StateSnapshotEvent containing the recipe snapshot. """ return StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot={'recipe': recipe}, ) @agent.instructions async def recipe_instructions(ctx: RunContext[StateDeps[RecipeSnapshot]]) -> str: """Instructions for the recipe generation agent. Args: ctx: The run context containing recipe state information. Returns: Instructions string for the recipe generation agent. """ return dedent( f""" You are a helpful assistant for creating recipes. IMPORTANT: - Create a complete recipe using the existing ingredients - Append new ingredients to the existing ones - Use the `display_recipe` tool to present the recipe to the user - Do NOT repeat the recipe in the message, use the tool instead - Do NOT run the `display_recipe` tool multiple times in a row Once you have created the updated recipe and displayed it to the user, summarise the changes in one sentence, don't describe the recipe in detail or send it as a message to the user. The current state of the recipe is: {ctx.deps.state.recipe.model_dump_json(indent=2)} """, ) app = AGUIApp(agent, deps=StateDeps(RecipeSnapshot())) ``` ### Tool Based Generative UI Demonstrates customised rendering for tool output with user confirmation. If you've [run the example](#running-the-example), you can view it at . #### Haiku Tools - `generate_haiku` - AG-UI tool to display a haiku in English and Japanese #### Haiku Prompt ```text Generate a haiku about formula 1 ``` #### Tool Based Generative UI - Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [ag_ui/api/tool_based_generative_ui.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/tool_based_generative_ui.py) ```python """Tool Based Generative UI feature. No special handling is required for this feature. """ from __future__ import annotations from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent('gateway/openai:gpt-5-mini') app = AGUIApp(agent) ``` [ag_ui/api/tool_based_generative_ui.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/tool_based_generative_ui.py) ```python """Tool Based Generative UI feature. No special handling is required for this feature. """ from __future__ import annotations from pydantic_ai import Agent from pydantic_ai.ui.ag_ui.app import AGUIApp agent = Agent('openai:gpt-5-mini') app = AGUIApp(agent) ``` # Bank Support Small but complete example of using Pydantic AI to build a support agent for a bank. Demonstrates: - [dynamic system prompt](https://ai.pydantic.dev/agent/#system-prompts) - [structured `output_type`](https://ai.pydantic.dev/output/#structured-output) - [tools](https://ai.pydantic.dev/tools/index.md) ## Running the Example With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), run: ```bash python -m pydantic_ai_examples.bank_support ``` ```bash uv run -m pydantic_ai_examples.bank_support ``` (or `PYDANTIC_AI_MODEL=gemini-3-flash-preview ...`) ## Example Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [bank_support.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/bank_support.py) ```python """Small but complete example of using Pydantic AI to build a support agent for a bank. 
Run with: uv run -m pydantic_ai_examples.bank_support """ import sqlite3 from dataclasses import dataclass from pydantic import BaseModel from pydantic_ai import Agent, RunContext @dataclass class DatabaseConn: """A wrapper over the SQLite connection.""" sqlite_conn: sqlite3.Connection async def customer_name(self, *, id: int) -> str | None: res = cur.execute('SELECT name FROM customers WHERE id=?', (id,)) row = res.fetchone() if row: return row[0] return None async def customer_balance(self, *, id: int) -> float: res = cur.execute('SELECT balance FROM customers WHERE id=?', (id,)) row = res.fetchone() if row: return row[0] else: raise ValueError('Customer not found') @dataclass class SupportDependencies: customer_id: int db: DatabaseConn class SupportOutput(BaseModel): support_advice: str """Advice returned to the customer""" block_card: bool """Whether to block their card or not""" risk: int """Risk level of query""" support_agent = Agent( 'gateway/openai:gpt-5.2', deps_type=SupportDependencies, output_type=SupportOutput, instructions=( 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query. ' "Reply using the customer's name." ), ) @support_agent.instructions async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str: customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id) return f"The customer's name is {customer_name!r}" @support_agent.tool async def customer_balance(ctx: RunContext[SupportDependencies]) -> str: """Returns the customer's current account balance.""" balance = await ctx.deps.db.customer_balance( id=ctx.deps.customer_id, ) return f'${balance:.2f}' if __name__ == '__main__': with sqlite3.connect(':memory:') as con: cur = con.cursor() cur.execute('CREATE TABLE customers(id, name, balance)') cur.execute(""" INSERT INTO customers VALUES (123, 'John', 123.45) """) con.commit() deps = SupportDependencies(customer_id=123, db=DatabaseConn(sqlite_conn=con)) result = support_agent.run_sync('What is my balance?', deps=deps) print(result.output) """ support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1 """ result = support_agent.run_sync('I just lost my card!', deps=deps) print(result.output) """ support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8 """ ``` [bank_support.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/bank_support.py) ```python """Small but complete example of using Pydantic AI to build a support agent for a bank. 
Run with: uv run -m pydantic_ai_examples.bank_support """ import sqlite3 from dataclasses import dataclass from pydantic import BaseModel from pydantic_ai import Agent, RunContext @dataclass class DatabaseConn: """A wrapper over the SQLite connection.""" sqlite_conn: sqlite3.Connection async def customer_name(self, *, id: int) -> str | None: res = cur.execute('SELECT name FROM customers WHERE id=?', (id,)) row = res.fetchone() if row: return row[0] return None async def customer_balance(self, *, id: int) -> float: res = cur.execute('SELECT balance FROM customers WHERE id=?', (id,)) row = res.fetchone() if row: return row[0] else: raise ValueError('Customer not found') @dataclass class SupportDependencies: customer_id: int db: DatabaseConn class SupportOutput(BaseModel): support_advice: str """Advice returned to the customer""" block_card: bool """Whether to block their card or not""" risk: int """Risk level of query""" support_agent = Agent( 'openai:gpt-5.2', deps_type=SupportDependencies, output_type=SupportOutput, instructions=( 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query. ' "Reply using the customer's name." ), ) @support_agent.instructions async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str: customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id) return f"The customer's name is {customer_name!r}" @support_agent.tool async def customer_balance(ctx: RunContext[SupportDependencies]) -> str: """Returns the customer's current account balance.""" balance = await ctx.deps.db.customer_balance( id=ctx.deps.customer_id, ) return f'${balance:.2f}' if __name__ == '__main__': with sqlite3.connect(':memory:') as con: cur = con.cursor() cur.execute('CREATE TABLE customers(id, name, balance)') cur.execute(""" INSERT INTO customers VALUES (123, 'John', 123.45) """) con.commit() deps = SupportDependencies(customer_id=123, db=DatabaseConn(sqlite_conn=con)) result = support_agent.run_sync('What is my balance?', deps=deps) print(result.output) """ support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1 """ result = support_agent.run_sync('I just lost my card!', deps=deps) print(result.output) """ support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8 """ ``` # Chat App with FastAPI Simple chat app example build with FastAPI. Demonstrates: - [reusing chat history](https://ai.pydantic.dev/message-history/index.md) - [serializing messages](https://ai.pydantic.dev/message-history/#accessing-messages-from-results) - [streaming responses](https://ai.pydantic.dev/output/#streamed-results) This demonstrates storing chat history between requests and using it to give the model context for new responses. Most of the complex logic here is between `chat_app.py` which streams the response to the browser, and `chat_app.ts` which renders messages in the browser. ## Running the Example With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), run: ```bash python -m pydantic_ai_examples.chat_app ``` ```bash uv run -m pydantic_ai_examples.chat_app ``` Then open the app at [localhost:8000](http://localhost:8000). 
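At the heart of the app is the chat-history round-trip: the messages from one run are serialized for storage, then fed back on the next run via `message_history`. Here's a minimal sketch of that pattern on its own, using the same `new_messages_json()` / `ModelMessagesTypeAdapter` calls as the full app below (the model name and prompts are just placeholders; the real app persists the serialized messages in SQLite):

```python
from pydantic_ai import Agent, ModelMessagesTypeAdapter

agent = Agent('openai:gpt-5.2', instructions='Be a helpful chat assistant.')

# First request: run the agent and serialize the new messages, e.g. into a database row.
result = agent.run_sync('What is the capital of France?')
stored: bytes = result.new_messages_json()

# A later request: deserialize the stored history and pass it back so the model has context.
history = ModelMessagesTypeAdapter.validate_json(stored)
followup = agent.run_sync('And what is its population?', message_history=history)
print(followup.output)
```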
## Example Code Python code that runs the chat app: [Learn about Gateway](https://ai.pydantic.dev/gateway) [chat_app.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/chat_app.py) ```python """Simple chat app example build with FastAPI. Run with: uv run -m pydantic_ai_examples.chat_app """ from __future__ import annotations as _annotations import asyncio import json import sqlite3 from collections.abc import AsyncIterator, Callable from concurrent.futures.thread import ThreadPoolExecutor from contextlib import asynccontextmanager from dataclasses import dataclass from datetime import datetime, timezone from functools import partial from pathlib import Path from typing import Annotated, Any, Literal, TypeVar import fastapi import logfire from fastapi import Depends, Request from fastapi.responses import FileResponse, Response, StreamingResponse from typing_extensions import LiteralString, ParamSpec, TypedDict from pydantic_ai import ( Agent, ModelMessage, ModelMessagesTypeAdapter, ModelRequest, ModelResponse, TextPart, UnexpectedModelBehavior, UserPromptPart, ) # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() agent = Agent('gateway/openai:gpt-5.2') THIS_DIR = Path(__file__).parent @asynccontextmanager async def lifespan(_app: fastapi.FastAPI): async with Database.connect() as db: yield {'db': db} app = fastapi.FastAPI(lifespan=lifespan) logfire.instrument_fastapi(app) @app.get('/') async def index() -> FileResponse: return FileResponse((THIS_DIR / 'chat_app.html'), media_type='text/html') @app.get('/chat_app.ts') async def main_ts() -> FileResponse: """Get the raw typescript code, it's compiled in the browser, forgive me.""" return FileResponse((THIS_DIR / 'chat_app.ts'), media_type='text/plain') async def get_db(request: Request) -> Database: return request.state.db @app.get('/chat/') async def get_chat(database: Database = Depends(get_db)) -> Response: msgs = await database.get_messages() return Response( b'\n'.join(json.dumps(to_chat_message(m)).encode('utf-8') for m in msgs), media_type='text/plain', ) class ChatMessage(TypedDict): """Format of messages sent to the browser.""" role: Literal['user', 'model'] timestamp: str content: str def to_chat_message(m: ModelMessage) -> ChatMessage: first_part = m.parts[0] if isinstance(m, ModelRequest): if isinstance(first_part, UserPromptPart): assert isinstance(first_part.content, str) return { 'role': 'user', 'timestamp': first_part.timestamp.isoformat(), 'content': first_part.content, } elif isinstance(m, ModelResponse): if isinstance(first_part, TextPart): return { 'role': 'model', 'timestamp': m.timestamp.isoformat(), 'content': first_part.content, } raise UnexpectedModelBehavior(f'Unexpected message type for chat app: {m}') @app.post('/chat/') async def post_chat( prompt: Annotated[str, fastapi.Form()], database: Database = Depends(get_db) ) -> StreamingResponse: async def stream_messages(): """Streams new line delimited JSON `Message`s to the client.""" # stream the user prompt so that can be displayed straight away yield ( json.dumps( { 'role': 'user', 'timestamp': datetime.now(tz=timezone.utc).isoformat(), 'content': prompt, } ).encode('utf-8') + b'\n' ) # get the chat history so far to pass as context to the agent messages = await database.get_messages() # run the agent with the user prompt and the chat history async with agent.run_stream(prompt, 
message_history=messages) as result: async for text in result.stream_output(debounce_by=0.01): # text here is a `str` and the frontend wants # JSON encoded ModelResponse, so we create one m = ModelResponse(parts=[TextPart(text)], timestamp=result.timestamp()) yield json.dumps(to_chat_message(m)).encode('utf-8') + b'\n' # add new messages (e.g. the user prompt and the agent response in this case) to the database await database.add_messages(result.new_messages_json()) return StreamingResponse(stream_messages(), media_type='text/plain') P = ParamSpec('P') R = TypeVar('R') @dataclass class Database: """Rudimentary database to store chat messages in SQLite. The SQLite standard library package is synchronous, so we use a thread pool executor to run queries asynchronously. """ con: sqlite3.Connection _loop: asyncio.AbstractEventLoop _executor: ThreadPoolExecutor @classmethod @asynccontextmanager async def connect( cls, file: Path = THIS_DIR / '.chat_app_messages.sqlite' ) -> AsyncIterator[Database]: with logfire.span('connect to DB'): loop = asyncio.get_event_loop() executor = ThreadPoolExecutor(max_workers=1) con = await loop.run_in_executor(executor, cls._connect, file) slf = cls(con, loop, executor) try: yield slf finally: await slf._asyncify(con.close) @staticmethod def _connect(file: Path) -> sqlite3.Connection: con = sqlite3.connect(str(file)) con = logfire.instrument_sqlite3(con) cur = con.cursor() cur.execute( 'CREATE TABLE IF NOT EXISTS messages (id INT PRIMARY KEY, message_list TEXT);' ) con.commit() return con async def add_messages(self, messages: bytes): await self._asyncify( self._execute, 'INSERT INTO messages (message_list) VALUES (?);', messages, commit=True, ) await self._asyncify(self.con.commit) async def get_messages(self) -> list[ModelMessage]: c = await self._asyncify( self._execute, 'SELECT message_list FROM messages order by id' ) rows = await self._asyncify(c.fetchall) messages: list[ModelMessage] = [] for row in rows: messages.extend(ModelMessagesTypeAdapter.validate_json(row[0])) return messages def _execute( self, sql: LiteralString, *args: Any, commit: bool = False ) -> sqlite3.Cursor: cur = self.con.cursor() cur.execute(sql, args) if commit: self.con.commit() return cur async def _asyncify( self, func: Callable[P, R], *args: P.args, **kwargs: P.kwargs ) -> R: return await self._loop.run_in_executor( # type: ignore self._executor, partial(func, **kwargs), *args, # type: ignore ) if __name__ == '__main__': import uvicorn uvicorn.run( 'pydantic_ai_examples.chat_app:app', reload=True, reload_dirs=[str(THIS_DIR)] ) ``` [chat_app.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/chat_app.py) ```python """Simple chat app example build with FastAPI. 
Run with: uv run -m pydantic_ai_examples.chat_app """ from __future__ import annotations as _annotations import asyncio import json import sqlite3 from collections.abc import AsyncIterator, Callable from concurrent.futures.thread import ThreadPoolExecutor from contextlib import asynccontextmanager from dataclasses import dataclass from datetime import datetime, timezone from functools import partial from pathlib import Path from typing import Annotated, Any, Literal, TypeVar import fastapi import logfire from fastapi import Depends, Request from fastapi.responses import FileResponse, Response, StreamingResponse from typing_extensions import LiteralString, ParamSpec, TypedDict from pydantic_ai import ( Agent, ModelMessage, ModelMessagesTypeAdapter, ModelRequest, ModelResponse, TextPart, UnexpectedModelBehavior, UserPromptPart, ) # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() agent = Agent('openai:gpt-5.2') THIS_DIR = Path(__file__).parent @asynccontextmanager async def lifespan(_app: fastapi.FastAPI): async with Database.connect() as db: yield {'db': db} app = fastapi.FastAPI(lifespan=lifespan) logfire.instrument_fastapi(app) @app.get('/') async def index() -> FileResponse: return FileResponse((THIS_DIR / 'chat_app.html'), media_type='text/html') @app.get('/chat_app.ts') async def main_ts() -> FileResponse: """Get the raw typescript code, it's compiled in the browser, forgive me.""" return FileResponse((THIS_DIR / 'chat_app.ts'), media_type='text/plain') async def get_db(request: Request) -> Database: return request.state.db @app.get('/chat/') async def get_chat(database: Database = Depends(get_db)) -> Response: msgs = await database.get_messages() return Response( b'\n'.join(json.dumps(to_chat_message(m)).encode('utf-8') for m in msgs), media_type='text/plain', ) class ChatMessage(TypedDict): """Format of messages sent to the browser.""" role: Literal['user', 'model'] timestamp: str content: str def to_chat_message(m: ModelMessage) -> ChatMessage: first_part = m.parts[0] if isinstance(m, ModelRequest): if isinstance(first_part, UserPromptPart): assert isinstance(first_part.content, str) return { 'role': 'user', 'timestamp': first_part.timestamp.isoformat(), 'content': first_part.content, } elif isinstance(m, ModelResponse): if isinstance(first_part, TextPart): return { 'role': 'model', 'timestamp': m.timestamp.isoformat(), 'content': first_part.content, } raise UnexpectedModelBehavior(f'Unexpected message type for chat app: {m}') @app.post('/chat/') async def post_chat( prompt: Annotated[str, fastapi.Form()], database: Database = Depends(get_db) ) -> StreamingResponse: async def stream_messages(): """Streams new line delimited JSON `Message`s to the client.""" # stream the user prompt so that can be displayed straight away yield ( json.dumps( { 'role': 'user', 'timestamp': datetime.now(tz=timezone.utc).isoformat(), 'content': prompt, } ).encode('utf-8') + b'\n' ) # get the chat history so far to pass as context to the agent messages = await database.get_messages() # run the agent with the user prompt and the chat history async with agent.run_stream(prompt, message_history=messages) as result: async for text in result.stream_output(debounce_by=0.01): # text here is a `str` and the frontend wants # JSON encoded ModelResponse, so we create one m = ModelResponse(parts=[TextPart(text)], timestamp=result.timestamp()) yield 
json.dumps(to_chat_message(m)).encode('utf-8') + b'\n' # add new messages (e.g. the user prompt and the agent response in this case) to the database await database.add_messages(result.new_messages_json()) return StreamingResponse(stream_messages(), media_type='text/plain') P = ParamSpec('P') R = TypeVar('R') @dataclass class Database: """Rudimentary database to store chat messages in SQLite. The SQLite standard library package is synchronous, so we use a thread pool executor to run queries asynchronously. """ con: sqlite3.Connection _loop: asyncio.AbstractEventLoop _executor: ThreadPoolExecutor @classmethod @asynccontextmanager async def connect( cls, file: Path = THIS_DIR / '.chat_app_messages.sqlite' ) -> AsyncIterator[Database]: with logfire.span('connect to DB'): loop = asyncio.get_event_loop() executor = ThreadPoolExecutor(max_workers=1) con = await loop.run_in_executor(executor, cls._connect, file) slf = cls(con, loop, executor) try: yield slf finally: await slf._asyncify(con.close) @staticmethod def _connect(file: Path) -> sqlite3.Connection: con = sqlite3.connect(str(file)) con = logfire.instrument_sqlite3(con) cur = con.cursor() cur.execute( 'CREATE TABLE IF NOT EXISTS messages (id INT PRIMARY KEY, message_list TEXT);' ) con.commit() return con async def add_messages(self, messages: bytes): await self._asyncify( self._execute, 'INSERT INTO messages (message_list) VALUES (?);', messages, commit=True, ) await self._asyncify(self.con.commit) async def get_messages(self) -> list[ModelMessage]: c = await self._asyncify( self._execute, 'SELECT message_list FROM messages order by id' ) rows = await self._asyncify(c.fetchall) messages: list[ModelMessage] = [] for row in rows: messages.extend(ModelMessagesTypeAdapter.validate_json(row[0])) return messages def _execute( self, sql: LiteralString, *args: Any, commit: bool = False ) -> sqlite3.Cursor: cur = self.con.cursor() cur.execute(sql, args) if commit: self.con.commit() return cur async def _asyncify( self, func: Callable[P, R], *args: P.args, **kwargs: P.kwargs ) -> R: return await self._loop.run_in_executor( # type: ignore self._executor, partial(func, **kwargs), *args, # type: ignore ) if __name__ == '__main__': import uvicorn uvicorn.run( 'pydantic_ai_examples.chat_app:app', reload=True, reload_dirs=[str(THIS_DIR)] ) ``` Simple HTML page to render the app: [chat_app.html](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/chat_app.html) ```html Chat App

Chat App

Ask me anything...

Error occurred, check the browser developer console for more information.
``` TypeScript to handle rendering the messages, to keep this simple (and at the risk of offending frontend developers) the typescript code is passed to the browser as plain text and transpiled in the browser. [chat_app.ts](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/chat_app.ts) ```ts // BIG FAT WARNING: to avoid the complexity of npm, this typescript is compiled in the browser // there's currently no static type checking import { marked } from 'https://cdnjs.cloudflare.com/ajax/libs/marked/15.0.0/lib/marked.esm.js' const convElement = document.getElementById('conversation') const promptInput = document.getElementById('prompt-input') as HTMLInputElement const spinner = document.getElementById('spinner') // stream the response and render messages as each chunk is received // data is sent as newline-delimited JSON async function onFetchResponse(response: Response): Promise { let text = '' let decoder = new TextDecoder() if (response.ok) { const reader = response.body.getReader() while (true) { const {done, value} = await reader.read() if (done) { break } text += decoder.decode(value) addMessages(text) spinner.classList.remove('active') } addMessages(text) promptInput.disabled = false promptInput.focus() } else { const text = await response.text() console.error(`Unexpected response: ${response.status}`, {response, text}) throw new Error(`Unexpected response: ${response.status}`) } } // The format of messages, this matches pydantic-ai both for brevity and understanding // in production, you might not want to keep this format all the way to the frontend interface Message { role: string content: string timestamp: string } // take raw response text and render messages into the `#conversation` element // Message timestamp is assumed to be a unique identifier of a message, and is used to deduplicate // hence you can send data about the same message multiple times, and it will be updated // instead of creating a new message elements function addMessages(responseText: string) { const lines = responseText.split('\n') const messages: Message[] = lines.filter(line => line.length > 1).map(j => JSON.parse(j)) for (const message of messages) { // we use the timestamp as a crude element id const {timestamp, role, content} = message const id = `msg-${timestamp}` let msgDiv = document.getElementById(id) if (!msgDiv) { msgDiv = document.createElement('div') msgDiv.id = id msgDiv.title = `${role} at ${timestamp}` msgDiv.classList.add('border-top', 'pt-2', role) convElement.appendChild(msgDiv) } msgDiv.innerHTML = marked.parse(content) } window.scrollTo({ top: document.body.scrollHeight, behavior: 'smooth' }) } function onError(error: any) { console.error(error) document.getElementById('error').classList.remove('d-none') document.getElementById('spinner').classList.remove('active') } async function onSubmit(e: SubmitEvent): Promise { e.preventDefault() spinner.classList.add('active') const body = new FormData(e.target as HTMLFormElement) promptInput.value = '' promptInput.disabled = true const response = await fetch('/chat/', {method: 'POST', body}) await onFetchResponse(response) } // call onSubmit when the form is submitted (e.g. 
user clicks the send button or hits Enter) document.querySelector('form').addEventListener('submit', (e) => onSubmit(e).catch(onError)) // load messages on page load fetch('/chat/').then(onFetchResponse).catch(onError) ``` # Data Analyst Sometimes in an agent workflow, the agent does not need to know the exact tool output, but still needs to process the tool output in some ways. This is especially common in data analytics: the agent needs to know that the result of a query tool is a `DataFrame` with certain named columns, but not necessarily the content of every single row. With Pydantic AI, you can use a [dependencies object](https://ai.pydantic.dev/dependencies/index.md) to store the result from one tool and use it in another tool. In this example, we'll build an agent that analyzes the [Rotten Tomatoes movie review dataset from Cornell](https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes). Demonstrates: - [agent dependencies](https://ai.pydantic.dev/dependencies/index.md) ## Running the Example With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), run: ```bash python -m pydantic_ai_examples.data_analyst ``` ```bash uv run -m pydantic_ai_examples.data_analyst ``` Output (debug): > Based on my analysis of the Cornell Movie Review dataset (rotten_tomatoes), there are **4,265 negative comments** in the training split. These are the reviews labeled as 'neg' (represented by 0 in the dataset). ## Example Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [data_analyst.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/data_analyst.py) ```python from dataclasses import dataclass, field import datasets import duckdb import pandas as pd from pydantic_ai import Agent, ModelRetry, RunContext @dataclass class AnalystAgentDeps: output: dict[str, pd.DataFrame] = field(default_factory=dict[str, pd.DataFrame]) def store(self, value: pd.DataFrame) -> str: """Store the output in deps and return the reference such as Out[1] to be used by the LLM.""" ref = f'Out[{len(self.output) + 1}]' self.output[ref] = value return ref def get(self, ref: str) -> pd.DataFrame: if ref not in self.output: raise ModelRetry( f'Error: {ref} is not a valid variable reference. Check the previous messages and try again.' ) return self.output[ref] analyst_agent = Agent( 'gateway/openai:gpt-5.2', deps_type=AnalystAgentDeps, instructions='You are a data analyst and your job is to analyze the data according to the user request.', ) @analyst_agent.tool def load_dataset( ctx: RunContext[AnalystAgentDeps], path: str, split: str = 'train', ) -> str: """Load the `split` of dataset `dataset_name` from huggingface. Args: ctx: Pydantic AI agent RunContext path: name of the dataset in the form of `/` split: load the split of the dataset (default: "train") """ # begin load data from hf builder = datasets.load_dataset_builder(path) # pyright: ignore[reportUnknownMemberType] splits: dict[str, datasets.SplitInfo] = builder.info.splits or {} if split not in splits: raise ModelRetry( f'{split} is not valid for dataset {path}. 
Valid splits are {",".join(splits.keys())}' ) builder.download_and_prepare() # pyright: ignore[reportUnknownMemberType] dataset = builder.as_dataset(split=split) assert isinstance(dataset, datasets.Dataset) dataframe = dataset.to_pandas() assert isinstance(dataframe, pd.DataFrame) # end load data from hf # store the dataframe in the deps and get a ref like "Out[1]" ref = ctx.deps.store(dataframe) # construct a summary of the loaded dataset output = [ f'Loaded the dataset as `{ref}`.', f'Description: {dataset.info.description}' if dataset.info.description else None, f'Features: {dataset.info.features!r}' if dataset.info.features else None, ] return '\n'.join(filter(None, output)) @analyst_agent.tool def run_duckdb(ctx: RunContext[AnalystAgentDeps], dataset: str, sql: str) -> str: """Run DuckDB SQL query on the DataFrame. Note that the virtual table name used in DuckDB SQL must be `dataset`. Args: ctx: Pydantic AI agent RunContext dataset: reference string to the DataFrame sql: the query to be executed using DuckDB """ data = ctx.deps.get(dataset) result = duckdb.query_df(df=data, virtual_table_name='dataset', sql_query=sql) # pass the result as ref (because DuckDB SQL can select many rows, creating another huge dataframe) ref = ctx.deps.store(result.df()) return f'Executed SQL, result is `{ref}`' @analyst_agent.tool def display(ctx: RunContext[AnalystAgentDeps], name: str) -> str: """Display at most 5 rows of the dataframe.""" dataset = ctx.deps.get(name) return dataset.head().to_string() # pyright: ignore[reportUnknownMemberType] if __name__ == '__main__': deps = AnalystAgentDeps() result = analyst_agent.run_sync( user_prompt='Count how many negative comments are there in the dataset `cornell-movie-review-data/rotten_tomatoes`', deps=deps, ) print(result.output) ``` [data_analyst.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/data_analyst.py) ```python from dataclasses import dataclass, field import datasets import duckdb import pandas as pd from pydantic_ai import Agent, ModelRetry, RunContext @dataclass class AnalystAgentDeps: output: dict[str, pd.DataFrame] = field(default_factory=dict[str, pd.DataFrame]) def store(self, value: pd.DataFrame) -> str: """Store the output in deps and return the reference such as Out[1] to be used by the LLM.""" ref = f'Out[{len(self.output) + 1}]' self.output[ref] = value return ref def get(self, ref: str) -> pd.DataFrame: if ref not in self.output: raise ModelRetry( f'Error: {ref} is not a valid variable reference. Check the previous messages and try again.' ) return self.output[ref] analyst_agent = Agent( 'openai:gpt-5.2', deps_type=AnalystAgentDeps, instructions='You are a data analyst and your job is to analyze the data according to the user request.', ) @analyst_agent.tool def load_dataset( ctx: RunContext[AnalystAgentDeps], path: str, split: str = 'train', ) -> str: """Load the `split` of dataset `dataset_name` from huggingface. Args: ctx: Pydantic AI agent RunContext path: name of the dataset in the form of `/` split: load the split of the dataset (default: "train") """ # begin load data from hf builder = datasets.load_dataset_builder(path) # pyright: ignore[reportUnknownMemberType] splits: dict[str, datasets.SplitInfo] = builder.info.splits or {} if split not in splits: raise ModelRetry( f'{split} is not valid for dataset {path}. 
Valid splits are {",".join(splits.keys())}' ) builder.download_and_prepare() # pyright: ignore[reportUnknownMemberType] dataset = builder.as_dataset(split=split) assert isinstance(dataset, datasets.Dataset) dataframe = dataset.to_pandas() assert isinstance(dataframe, pd.DataFrame) # end load data from hf # store the dataframe in the deps and get a ref like "Out[1]" ref = ctx.deps.store(dataframe) # construct a summary of the loaded dataset output = [ f'Loaded the dataset as `{ref}`.', f'Description: {dataset.info.description}' if dataset.info.description else None, f'Features: {dataset.info.features!r}' if dataset.info.features else None, ] return '\n'.join(filter(None, output)) @analyst_agent.tool def run_duckdb(ctx: RunContext[AnalystAgentDeps], dataset: str, sql: str) -> str: """Run DuckDB SQL query on the DataFrame. Note that the virtual table name used in DuckDB SQL must be `dataset`. Args: ctx: Pydantic AI agent RunContext dataset: reference string to the DataFrame sql: the query to be executed using DuckDB """ data = ctx.deps.get(dataset) result = duckdb.query_df(df=data, virtual_table_name='dataset', sql_query=sql) # pass the result as ref (because DuckDB SQL can select many rows, creating another huge dataframe) ref = ctx.deps.store(result.df()) return f'Executed SQL, result is `{ref}`' @analyst_agent.tool def display(ctx: RunContext[AnalystAgentDeps], name: str) -> str: """Display at most 5 rows of the dataframe.""" dataset = ctx.deps.get(name) return dataset.head().to_string() # pyright: ignore[reportUnknownMemberType] if __name__ == '__main__': deps = AnalystAgentDeps() result = analyst_agent.run_sync( user_prompt='Count how many negative comments are there in the dataset `cornell-movie-review-data/rotten_tomatoes`', deps=deps, ) print(result.output) ``` ## Appendix ### Choosing a Model This example requires using a model that understands DuckDB SQL. You can check with `clai`: ```sh > clai -m bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0 clai - Pydantic AI CLI v0.0.1.dev920+41dd069 with bedrock:us.anthropic.claude-sonnet-4-5-20250929-v1:0 clai ➤ do you understand duckdb sql? # DuckDB SQL Yes, I understand DuckDB SQL. DuckDB is an in-process analytical SQL database that uses syntax similar to PostgreSQL. It specializes in analytical queries and is designed for high-performance analysis of structured data. Some key features of DuckDB SQL include: • OLAP (Online Analytical Processing) optimized • Columnar-vectorized query execution • Standard SQL support with PostgreSQL compatibility • Support for complex analytical queries • Efficient handling of CSV/Parquet/JSON files I can help you with DuckDB SQL queries, schema design, optimization, or other DuckDB-related questions. ``` Example of a multi-agent flow where one agent delegates work to another, then hands off control to a third agent. Demonstrates: - [agent delegation](https://ai.pydantic.dev/multi-agent-applications/#agent-delegation) - [programmatic agent hand-off](https://ai.pydantic.dev/multi-agent-applications/#programmatic-agent-hand-off) - [usage limits](https://ai.pydantic.dev/agent/#usage-limits) In this scenario, a group of agents work together to find the best flight for a user. 
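Stripped of the booking details, the core delegation mechanic looks roughly like this (a sketch with placeholder agent names and prompts; note how the delegate shares the parent run's usage so a single `UsageLimits` budget covers both agents, as the full example below does):

```python
import asyncio

from pydantic_ai import Agent, RunContext, UsageLimits

# The controlling agent owns the conversation with the user...
router_agent = Agent('openai:gpt-5.2', instructions='Answer using the research tool.')
# ...and delegates one sub-task to a second agent with structured output.
research_agent = Agent('openai:gpt-5.2', output_type=list[str])


@router_agent.tool
async def research(ctx: RunContext[None], topic: str) -> list[str]:
    """Delegate research on a topic to the second agent."""
    # Passing ctx.usage makes the delegate's requests count against the same budget.
    result = await research_agent.run(topic, usage=ctx.usage)
    return result.output


async def main():
    result = await router_agent.run(
        'Give me three facts about Anchorage.',
        usage_limits=UsageLimits(request_limit=10),  # caps requests across both agents
    )
    print(result.output)


asyncio.run(main())
```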
The control flow for this example can be summarised as follows: ``` graph TD START --> search_agent("search agent") search_agent --> extraction_agent("extraction agent") extraction_agent --> search_agent search_agent --> human_confirm("human confirm") human_confirm --> search_agent search_agent --> FAILED human_confirm --> find_seat_function("find seat function") find_seat_function --> human_seat_choice("human seat choice") human_seat_choice --> find_seat_agent("find seat agent") find_seat_agent --> find_seat_function find_seat_function --> buy_flights("buy flights") buy_flights --> SUCCESS ``` ## Running the Example With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), run: ```bash python -m pydantic_ai_examples.flight_booking ``` ```bash uv run -m pydantic_ai_examples.flight_booking ``` ## Example Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [flight_booking.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/flight_booking.py) ```python """Example of a multi-agent flow where one agent delegates work to another. In this scenario, a group of agents work together to find flights for a user. """ import datetime from dataclasses import dataclass from typing import Literal import logfire from pydantic import BaseModel, Field from rich.prompt import Prompt from pydantic_ai import ( Agent, ModelMessage, ModelRetry, RunContext, RunUsage, UsageLimits, ) # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class FlightDetails(BaseModel): """Details of the most suitable flight.""" flight_number: str price: int origin: str = Field(description='Three-letter airport code') destination: str = Field(description='Three-letter airport code') date: datetime.date class NoFlightFound(BaseModel): """When no valid flight is found.""" @dataclass class Deps: web_page_text: str req_origin: str req_destination: str req_date: datetime.date # This agent is responsible for controlling the flow of the conversation. search_agent = Agent[Deps, FlightDetails | NoFlightFound]( 'openai:gpt-5.2', output_type=FlightDetails | NoFlightFound, # type: ignore retries=4, system_prompt=( 'Your job is to find the cheapest flight for the user on the given date. ' ), ) # This agent is responsible for extracting flight details from web page text. 
extraction_agent = Agent( 'gateway/openai:gpt-5.2', output_type=list[FlightDetails], system_prompt='Extract all the flight details from the given text.', ) @search_agent.tool async def extract_flights(ctx: RunContext[Deps]) -> list[FlightDetails]: """Get details of all flights.""" # we pass the usage to the search agent so requests within this agent are counted result = await extraction_agent.run(ctx.deps.web_page_text, usage=ctx.usage) logfire.info('found {flight_count} flights', flight_count=len(result.output)) return result.output @search_agent.output_validator async def validate_output( ctx: RunContext[Deps], output: FlightDetails | NoFlightFound ) -> FlightDetails | NoFlightFound: """Procedural validation that the flight meets the constraints.""" if isinstance(output, NoFlightFound): return output errors: list[str] = [] if output.origin != ctx.deps.req_origin: errors.append( f'Flight should have origin {ctx.deps.req_origin}, not {output.origin}' ) if output.destination != ctx.deps.req_destination: errors.append( f'Flight should have destination {ctx.deps.req_destination}, not {output.destination}' ) if output.date != ctx.deps.req_date: errors.append(f'Flight should be on {ctx.deps.req_date}, not {output.date}') if errors: raise ModelRetry('\n'.join(errors)) else: return output class SeatPreference(BaseModel): row: int = Field(ge=1, le=30) seat: Literal['A', 'B', 'C', 'D', 'E', 'F'] class Failed(BaseModel): """Unable to extract a seat selection.""" # This agent is responsible for extracting the user's seat selection seat_preference_agent = Agent[None, SeatPreference | Failed]( 'openai:gpt-5.2', output_type=SeatPreference | Failed, system_prompt=( "Extract the user's seat preference. " 'Seats A and F are window seats. ' 'Row 1 is the front row and has extra leg room. ' 'Rows 14, and 20 also have extra leg room. ' ), ) # in reality this would be downloaded from a booking site, # potentially using another agent to navigate the site flights_web_page = """ 1. Flight SFO-AK123 - Price: $350 - Origin: San Francisco International Airport (SFO) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 10, 2025 2. Flight SFO-AK456 - Price: $370 - Origin: San Francisco International Airport (SFO) - Destination: Fairbanks International Airport (FAI) - Date: January 10, 2025 3. Flight SFO-AK789 - Price: $400 - Origin: San Francisco International Airport (SFO) - Destination: Juneau International Airport (JNU) - Date: January 20, 2025 4. Flight NYC-LA101 - Price: $250 - Origin: San Francisco International Airport (SFO) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 10, 2025 5. Flight CHI-MIA202 - Price: $200 - Origin: Chicago O'Hare International Airport (ORD) - Destination: Miami International Airport (MIA) - Date: January 12, 2025 6. Flight BOS-SEA303 - Price: $120 - Origin: Boston Logan International Airport (BOS) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 12, 2025 7. Flight DFW-DEN404 - Price: $150 - Origin: Dallas/Fort Worth International Airport (DFW) - Destination: Denver International Airport (DEN) - Date: January 10, 2025 8. 
Flight ATL-HOU505 - Price: $180 - Origin: Hartsfield-Jackson Atlanta International Airport (ATL) - Destination: George Bush Intercontinental Airport (IAH) - Date: January 10, 2025 """ # restrict how many requests this app can make to the LLM usage_limits = UsageLimits(request_limit=15) async def main(): deps = Deps( web_page_text=flights_web_page, req_origin='SFO', req_destination='ANC', req_date=datetime.date(2025, 1, 10), ) message_history: list[ModelMessage] | None = None usage: RunUsage = RunUsage() # run the agent until a satisfactory flight is found while True: result = await search_agent.run( f'Find me a flight from {deps.req_origin} to {deps.req_destination} on {deps.req_date}', deps=deps, usage=usage, message_history=message_history, usage_limits=usage_limits, ) if isinstance(result.output, NoFlightFound): print('No flight found') break else: flight = result.output print(f'Flight found: {flight}') answer = Prompt.ask( 'Do you want to buy this flight, or keep searching? (buy/*search)', choices=['buy', 'search', ''], show_choices=False, ) if answer == 'buy': seat = await find_seat(usage) await buy_tickets(flight, seat) break else: message_history = result.all_messages( output_tool_return_content='Please suggest another flight' ) async def find_seat(usage: RunUsage) -> SeatPreference: message_history: list[ModelMessage] | None = None while True: answer = Prompt.ask('What seat would you like?') result = await seat_preference_agent.run( answer, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, SeatPreference): return result.output else: print('Could not understand seat preference. Please try again.') message_history = result.all_messages() async def buy_tickets(flight_details: FlightDetails, seat: SeatPreference): print(f'Purchasing flight {flight_details=!r} {seat=!r}...') if __name__ == '__main__': import asyncio asyncio.run(main()) ``` [flight_booking.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/flight_booking.py) ```python """Example of a multi-agent flow where one agent delegates work to another. In this scenario, a group of agents work together to find flights for a user. """ import datetime from dataclasses import dataclass from typing import Literal import logfire from pydantic import BaseModel, Field from rich.prompt import Prompt from pydantic_ai import ( Agent, ModelMessage, ModelRetry, RunContext, RunUsage, UsageLimits, ) # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class FlightDetails(BaseModel): """Details of the most suitable flight.""" flight_number: str price: int origin: str = Field(description='Three-letter airport code') destination: str = Field(description='Three-letter airport code') date: datetime.date class NoFlightFound(BaseModel): """When no valid flight is found.""" @dataclass class Deps: web_page_text: str req_origin: str req_destination: str req_date: datetime.date # This agent is responsible for controlling the flow of the conversation. search_agent = Agent[Deps, FlightDetails | NoFlightFound]( 'openai:gpt-5.2', output_type=FlightDetails | NoFlightFound, # type: ignore retries=4, system_prompt=( 'Your job is to find the cheapest flight for the user on the given date. ' ), ) # This agent is responsible for extracting flight details from web page text. 
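# It is called from the search agent's `extract_flights` tool below, which passes along `ctx.usage` so its requests count towards the same usage limits as the controlling agent's run.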
extraction_agent = Agent( 'openai:gpt-5.2', output_type=list[FlightDetails], system_prompt='Extract all the flight details from the given text.', ) @search_agent.tool async def extract_flights(ctx: RunContext[Deps]) -> list[FlightDetails]: """Get details of all flights.""" # we pass the usage to the search agent so requests within this agent are counted result = await extraction_agent.run(ctx.deps.web_page_text, usage=ctx.usage) logfire.info('found {flight_count} flights', flight_count=len(result.output)) return result.output @search_agent.output_validator async def validate_output( ctx: RunContext[Deps], output: FlightDetails | NoFlightFound ) -> FlightDetails | NoFlightFound: """Procedural validation that the flight meets the constraints.""" if isinstance(output, NoFlightFound): return output errors: list[str] = [] if output.origin != ctx.deps.req_origin: errors.append( f'Flight should have origin {ctx.deps.req_origin}, not {output.origin}' ) if output.destination != ctx.deps.req_destination: errors.append( f'Flight should have destination {ctx.deps.req_destination}, not {output.destination}' ) if output.date != ctx.deps.req_date: errors.append(f'Flight should be on {ctx.deps.req_date}, not {output.date}') if errors: raise ModelRetry('\n'.join(errors)) else: return output class SeatPreference(BaseModel): row: int = Field(ge=1, le=30) seat: Literal['A', 'B', 'C', 'D', 'E', 'F'] class Failed(BaseModel): """Unable to extract a seat selection.""" # This agent is responsible for extracting the user's seat selection seat_preference_agent = Agent[None, SeatPreference | Failed]( 'openai:gpt-5.2', output_type=SeatPreference | Failed, system_prompt=( "Extract the user's seat preference. " 'Seats A and F are window seats. ' 'Row 1 is the front row and has extra leg room. ' 'Rows 14, and 20 also have extra leg room. ' ), ) # in reality this would be downloaded from a booking site, # potentially using another agent to navigate the site flights_web_page = """ 1. Flight SFO-AK123 - Price: $350 - Origin: San Francisco International Airport (SFO) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 10, 2025 2. Flight SFO-AK456 - Price: $370 - Origin: San Francisco International Airport (SFO) - Destination: Fairbanks International Airport (FAI) - Date: January 10, 2025 3. Flight SFO-AK789 - Price: $400 - Origin: San Francisco International Airport (SFO) - Destination: Juneau International Airport (JNU) - Date: January 20, 2025 4. Flight NYC-LA101 - Price: $250 - Origin: San Francisco International Airport (SFO) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 10, 2025 5. Flight CHI-MIA202 - Price: $200 - Origin: Chicago O'Hare International Airport (ORD) - Destination: Miami International Airport (MIA) - Date: January 12, 2025 6. Flight BOS-SEA303 - Price: $120 - Origin: Boston Logan International Airport (BOS) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 12, 2025 7. Flight DFW-DEN404 - Price: $150 - Origin: Dallas/Fort Worth International Airport (DFW) - Destination: Denver International Airport (DEN) - Date: January 10, 2025 8. 
Flight ATL-HOU505 - Price: $180 - Origin: Hartsfield-Jackson Atlanta International Airport (ATL) - Destination: George Bush Intercontinental Airport (IAH) - Date: January 10, 2025 """ # restrict how many requests this app can make to the LLM usage_limits = UsageLimits(request_limit=15) async def main(): deps = Deps( web_page_text=flights_web_page, req_origin='SFO', req_destination='ANC', req_date=datetime.date(2025, 1, 10), ) message_history: list[ModelMessage] | None = None usage: RunUsage = RunUsage() # run the agent until a satisfactory flight is found while True: result = await search_agent.run( f'Find me a flight from {deps.req_origin} to {deps.req_destination} on {deps.req_date}', deps=deps, usage=usage, message_history=message_history, usage_limits=usage_limits, ) if isinstance(result.output, NoFlightFound): print('No flight found') break else: flight = result.output print(f'Flight found: {flight}') answer = Prompt.ask( 'Do you want to buy this flight, or keep searching? (buy/*search)', choices=['buy', 'search', ''], show_choices=False, ) if answer == 'buy': seat = await find_seat(usage) await buy_tickets(flight, seat) break else: message_history = result.all_messages( output_tool_return_content='Please suggest another flight' ) async def find_seat(usage: RunUsage) -> SeatPreference: message_history: list[ModelMessage] | None = None while True: answer = Prompt.ask('What seat would you like?') result = await seat_preference_agent.run( answer, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, SeatPreference): return result.output else: print('Could not understand seat preference. Please try again.') message_history = result.all_messages() async def buy_tickets(flight_details: FlightDetails, seat: SeatPreference): print(f'Purchasing flight {flight_details=!r} {seat=!r}...') if __name__ == '__main__': import asyncio asyncio.run(main()) ``` # Pydantic Model Simple example of using Pydantic AI to construct a Pydantic model from a text input. Demonstrates: - [structured `output_type`](https://ai.pydantic.dev/output/#structured-output) ## Running the Example With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), run: ```bash python -m pydantic_ai_examples.pydantic_model ``` ```bash uv run -m pydantic_ai_examples.pydantic_model ``` This examples uses `openai:gpt-5` by default, but it works well with other models, e.g. you can run it with Gemini using: ```bash PYDANTIC_AI_MODEL=gemini-3-pro-preview python -m pydantic_ai_examples.pydantic_model ``` ```bash PYDANTIC_AI_MODEL=gemini-3-pro-preview uv run -m pydantic_ai_examples.pydantic_model ``` (or `PYDANTIC_AI_MODEL=gemini-3-flash-preview ...`) ## Example Code [pydantic_model.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/pydantic_model.py) ```python """Simple example of using Pydantic AI to construct a Pydantic model from a text input. 
Run with: uv run -m pydantic_ai_examples.pydantic_model """ import os import logfire from pydantic import BaseModel from pydantic_ai import Agent # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class MyModel(BaseModel): city: str country: str model = os.getenv('PYDANTIC_AI_MODEL', 'openai:gpt-5.2') print(f'Using model: {model}') agent = Agent(model, output_type=MyModel) if __name__ == '__main__': result = agent.run_sync('The windy city in the US of A.') print(result.output) print(result.usage()) ``` # Question Graph Example of a graph for asking and evaluating questions. Demonstrates: - [`pydantic_graph`](https://ai.pydantic.dev/graph/index.md) ## Running the Example With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), run: ```bash python -m pydantic_ai_examples.question_graph ``` ```bash uv run -m pydantic_ai_examples.question_graph ``` ## Example Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [question_graph.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/question_graph.py) ```python """Example of a graph for asking and evaluating questions. Run with: uv run -m pydantic_ai_examples.question_graph """ from __future__ import annotations as _annotations from dataclasses import dataclass, field from pathlib import Path import logfire from pydantic import BaseModel from pydantic_ai import Agent, ModelMessage, format_as_xml from pydantic_graph import ( BaseNode, End, Graph, GraphRunContext, ) from pydantic_graph.persistence.file import FileStatePersistence # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() ask_agent = Agent('gateway/openai:gpt-5.2', output_type=str) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list[ModelMessage]) evaluate_agent_messages: list[ModelMessage] = field( default_factory=list[ModelMessage] ) @dataclass class Ask(BaseNode[QuestionState]): async def run(self, ctx: GraphRunContext[QuestionState]) -> Answer: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.all_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationOutput(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'gateway/openai:gpt-5.2', output_type=EvaluationOutput, system_prompt='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> End[str] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages +=
result.all_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) async def run_as_continuous(): state = QuestionState() node = Ask() end = await question_graph.run(node, state=state) print('END:', end.output) async def run_as_cli(answer: str | None): persistence = FileStatePersistence(Path('question_graph.json')) persistence.set_graph_types(question_graph) if snapshot := await persistence.load_next(): state = snapshot.state assert answer is not None, ( 'answer required, usage "uv run -m pydantic_ai_examples.question_graph cli "' ) node = Evaluate(answer) else: state = QuestionState() node = Ask() # debug(state, node) async with question_graph.iter(node, state=state, persistence=persistence) as run: while True: node = await run.next() if isinstance(node, End): print('END:', node.data) history = await persistence.load_all() print('history:', '\n'.join(str(e.node) for e in history), sep='\n') print('Finished!') break elif isinstance(node, Answer): print(node.question) break # otherwise just continue if __name__ == '__main__': import asyncio import sys try: sub_command = sys.argv[1] assert sub_command in ('continuous', 'cli', 'mermaid') except (IndexError, AssertionError): print( 'Usage:\n' ' uv run -m pydantic_ai_examples.question_graph mermaid\n' 'or:\n' ' uv run -m pydantic_ai_examples.question_graph continuous\n' 'or:\n' ' uv run -m pydantic_ai_examples.question_graph cli [answer]', file=sys.stderr, ) sys.exit(1) if sub_command == 'mermaid': print(question_graph.mermaid_code(start_node=Ask)) elif sub_command == 'continuous': asyncio.run(run_as_continuous()) else: a = sys.argv[2] if len(sys.argv) > 2 else None asyncio.run(run_as_cli(a)) ``` [question_graph.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/question_graph.py) ```python """Example of a graph for asking and evaluating questions. 
Run with: uv run -m pydantic_ai_examples.question_graph """ from __future__ import annotations as _annotations from dataclasses import dataclass, field from pathlib import Path import logfire from pydantic import BaseModel from pydantic_ai import Agent, ModelMessage, format_as_xml from pydantic_graph import ( BaseNode, End, Graph, GraphRunContext, ) from pydantic_graph.persistence.file import FileStatePersistence # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() ask_agent = Agent('openai:gpt-5.2', output_type=str) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list[ModelMessage]) evaluate_agent_messages: list[ModelMessage] = field( default_factory=list[ModelMessage] ) @dataclass class Ask(BaseNode[QuestionState]): async def run(self, ctx: GraphRunContext[QuestionState]) -> Answer: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.all_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationOutput(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-5.2', output_type=EvaluationOutput, system_prompt='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> End[str] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.all_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) async def run_as_continuous(): state = QuestionState() node = Ask() end = await question_graph.run(node, state=state) print('END:', end.output) async def run_as_cli(answer: str | None): persistence = FileStatePersistence(Path('question_graph.json')) persistence.set_graph_types(question_graph) if snapshot := await persistence.load_next(): state = snapshot.state assert answer is not None, ( 'answer required, usage "uv run -m pydantic_ai_examples.question_graph cli "' ) node = Evaluate(answer) else: state = QuestionState() node = Ask() # debug(state, node) async with question_graph.iter(node, state=state, persistence=persistence) as run: while True: node = await run.next() if isinstance(node, End): print('END:', node.data) history = await persistence.load_all() print('history:', '\n'.join(str(e.node) for e in history), sep='\n') print('Finished!') break elif isinstance(node, Answer): print(node.question)
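# stop here so the user can answer on the command line; the next `cli [answer]` invocation resumes from the state saved by FileStatePersistence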
break # otherwise just continue if __name__ == '__main__': import asyncio import sys try: sub_command = sys.argv[1] assert sub_command in ('continuous', 'cli', 'mermaid') except (IndexError, AssertionError): print( 'Usage:\n' ' uv run -m pydantic_ai_examples.question_graph mermaid\n' 'or:\n' ' uv run -m pydantic_ai_examples.question_graph continuous\n' 'or:\n' ' uv run -m pydantic_ai_examples.question_graph cli [answer]', file=sys.stderr, ) sys.exit(1) if sub_command == 'mermaid': print(question_graph.mermaid_code(start_node=Ask)) elif sub_command == 'continuous': asyncio.run(run_as_continuous()) else: a = sys.argv[2] if len(sys.argv) > 2 else None asyncio.run(run_as_cli(a)) ``` The mermaid diagram generated in this example looks like this: ``` --- title: question_graph --- stateDiagram-v2 [*] --> Ask Ask --> Answer: ask the question Answer --> Evaluate: answer the question Evaluate --> [*]: success Evaluate --> Reprimand Reprimand --> Ask: try again ``` # RAG RAG search example. This demo allows you to ask questions of the [logfire](https://pydantic.dev/logfire) documentation. Demonstrates: - [tools](https://ai.pydantic.dev/tools/index.md) - [agent dependencies](https://ai.pydantic.dev/dependencies/index.md) - RAG search This is done by creating a database containing each section of the markdown documentation, then registering the search tool with the Pydantic AI agent. Logic for extracting sections from markdown files and a JSON file with that data is available in [this gist](https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992). [PostgreSQL with pgvector](https://github.com/pgvector/pgvector) is used as the search database; the easiest way to download and run pgvector is using Docker: ```bash mkdir postgres-data docker run --rm \ -e POSTGRES_PASSWORD=postgres \ -p 54320:5432 \ -v `pwd`/postgres-data:/var/lib/postgresql/data \ pgvector/pgvector:pg17 ``` As with the [SQL gen](https://ai.pydantic.dev/examples/sql-gen/index.md) example, we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running. We also mount the PostgreSQL `data` directory locally to persist the data if you need to stop and restart the container. With that running and [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), we can build the search database with (**WARNING**: this requires the `OPENAI_API_KEY` env variable and will call the OpenAI embedding API around 300 times to generate embeddings for each section of the documentation): ```bash python -m pydantic_ai_examples.rag build ``` ```bash uv run -m pydantic_ai_examples.rag build ``` (Note: building the database doesn't use Pydantic AI right now; instead it uses the OpenAI SDK directly.) You can then ask the agent a question with: ```bash python -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" ``` ```bash uv run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" ``` ## Example Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [rag.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/rag.py) ```python """RAG example with pydantic-ai — using vector search to augment a chat agent.
Run pgvector with: mkdir postgres-data docker run --rm -e POSTGRES_PASSWORD=postgres \ -p 54320:5432 \ -v `pwd`/postgres-data:/var/lib/postgresql/data \ pgvector/pgvector:pg17 Build the search DB with: uv run -m pydantic_ai_examples.rag build Ask the agent a question with: uv run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" """ from __future__ import annotations as _annotations import asyncio import re import sys import unicodedata from contextlib import asynccontextmanager from dataclasses import dataclass import asyncpg import httpx import logfire import pydantic_core from anyio import create_task_group from openai import AsyncOpenAI from pydantic import TypeAdapter from typing_extensions import AsyncGenerator from pydantic_ai import Agent, RunContext # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_asyncpg() logfire.instrument_pydantic_ai() @dataclass class Deps: openai: AsyncOpenAI pool: asyncpg.Pool agent = Agent('gateway/openai:gpt-5.2', deps_type=Deps) @agent.tool async def retrieve(context: RunContext[Deps], search_query: str) -> str: """Retrieve documentation sections based on a search query. Args: context: The call context. search_query: The search query. """ with logfire.span( 'create embedding for {search_query=}', search_query=search_query ): embedding = await context.deps.openai.embeddings.create( input=search_query, model='text-embedding-3-small', ) assert len(embedding.data) == 1, ( f'Expected 1 embedding, got {len(embedding.data)}, doc query: {search_query!r}' ) embedding = embedding.data[0].embedding embedding_json = pydantic_core.to_json(embedding).decode() rows = await context.deps.pool.fetch( 'SELECT url, title, content FROM doc_sections ORDER BY embedding <-> $1 LIMIT 8', embedding_json, ) return '\n\n'.join( f'# {row["title"]}\nDocumentation URL:{row["url"]}\n\n{row["content"]}\n' for row in rows ) async def run_agent(question: str): """Entry point to run the agent and perform RAG based question answering.""" openai = AsyncOpenAI() logfire.instrument_openai(openai) logfire.info('Asking "{question}"', question=question) async with database_connect(False) as pool: deps = Deps(openai=openai, pool=pool) answer = await agent.run(question, deps=deps) print(answer.output) ####################################################### # The rest of this file is dedicated to preparing the # # search database, and some utilities. 
# ####################################################### # JSON document from # https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992 DOCS_JSON = ( 'https://gist.githubusercontent.com/' 'samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992/raw/' '80c5925c42f1442c24963aaf5eb1a324d47afe95/logfire_docs.json' ) async def build_search_db(): """Build the search database.""" async with httpx.AsyncClient() as client: response = await client.get(DOCS_JSON) response.raise_for_status() sections = sections_ta.validate_json(response.content) openai = AsyncOpenAI() logfire.instrument_openai(openai) async with database_connect(True) as pool: with logfire.span('create schema'): async with pool.acquire() as conn: async with conn.transaction(): await conn.execute(DB_SCHEMA) sem = asyncio.Semaphore(10) async with create_task_group() as tg: for section in sections: tg.start_soon(insert_doc_section, sem, openai, pool, section) async def insert_doc_section( sem: asyncio.Semaphore, openai: AsyncOpenAI, pool: asyncpg.Pool, section: DocsSection, ) -> None: async with sem: url = section.url() exists = await pool.fetchval('SELECT 1 FROM doc_sections WHERE url = $1', url) if exists: logfire.info('Skipping {url=}', url=url) return with logfire.span('create embedding for {url=}', url=url): embedding = await openai.embeddings.create( input=section.embedding_content(), model='text-embedding-3-small', ) assert len(embedding.data) == 1, ( f'Expected 1 embedding, got {len(embedding.data)}, doc section: {section}' ) embedding = embedding.data[0].embedding embedding_json = pydantic_core.to_json(embedding).decode() await pool.execute( 'INSERT INTO doc_sections (url, title, content, embedding) VALUES ($1, $2, $3, $4)', url, section.title, section.content, embedding_json, ) @dataclass class DocsSection: id: int parent: int | None path: str level: int title: str content: str def url(self) -> str: url_path = re.sub(r'\.md$', '', self.path) return ( f'https://logfire.pydantic.dev/docs/{url_path}/#{slugify(self.title, "-")}' ) def embedding_content(self) -> str: return '\n\n'.join((f'path: {self.path}', f'title: {self.title}', self.content)) sections_ta = TypeAdapter(list[DocsSection]) # pyright: reportUnknownMemberType=false # pyright: reportUnknownVariableType=false @asynccontextmanager async def database_connect( create_db: bool = False, ) -> AsyncGenerator[asyncpg.Pool, None]: server_dsn, database = ( 'postgresql://postgres:postgres@localhost:54320', 'pydantic_ai_rag', ) if create_db: with logfire.span('check and create DB'): conn = await asyncpg.connect(server_dsn) try: db_exists = await conn.fetchval( 'SELECT 1 FROM pg_database WHERE datname = $1', database ) if not db_exists: await conn.execute(f'CREATE DATABASE {database}') finally: await conn.close() pool = await asyncpg.create_pool(f'{server_dsn}/{database}') try: yield pool finally: await pool.close() DB_SCHEMA = """ CREATE EXTENSION IF NOT EXISTS vector; CREATE TABLE IF NOT EXISTS doc_sections ( id serial PRIMARY KEY, url text NOT NULL UNIQUE, title text NOT NULL, content text NOT NULL, -- text-embedding-3-small returns a vector of 1536 floats embedding vector(1536) NOT NULL ); CREATE INDEX IF NOT EXISTS idx_doc_sections_embedding ON doc_sections USING hnsw (embedding vector_l2_ops); """ def slugify(value: str, separator: str, unicode: bool = False) -> str: """Slugify a string, to make it URL friendly.""" # Taken unchanged from https://github.com/Python-Markdown/markdown/blob/3.7/markdown/extensions/toc.py#L38 if not unicode: # Replace Extended Latin characters 
with ASCII, i.e. `žlutý` => `zluty` value = unicodedata.normalize('NFKD', value) value = value.encode('ascii', 'ignore').decode('ascii') value = re.sub(r'[^\w\s-]', '', value).strip().lower() return re.sub(rf'[{separator}\s]+', separator, value) if __name__ == '__main__': action = sys.argv[1] if len(sys.argv) > 1 else None if action == 'build': asyncio.run(build_search_db()) elif action == 'search': if len(sys.argv) == 3: q = sys.argv[2] else: q = 'How do I configure logfire to work with FastAPI?' asyncio.run(run_agent(q)) else: print( 'uv run --extra examples -m pydantic_ai_examples.rag build|search', file=sys.stderr, ) sys.exit(1) ``` [rag.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/rag.py) ```python """RAG example with pydantic-ai — using vector search to augment a chat agent. Run pgvector with: mkdir postgres-data docker run --rm -e POSTGRES_PASSWORD=postgres \ -p 54320:5432 \ -v `pwd`/postgres-data:/var/lib/postgresql/data \ pgvector/pgvector:pg17 Build the search DB with: uv run -m pydantic_ai_examples.rag build Ask the agent a question with: uv run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" """ from __future__ import annotations as _annotations import asyncio import re import sys import unicodedata from contextlib import asynccontextmanager from dataclasses import dataclass import asyncpg import httpx import logfire import pydantic_core from anyio import create_task_group from openai import AsyncOpenAI from pydantic import TypeAdapter from typing_extensions import AsyncGenerator from pydantic_ai import Agent, RunContext # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_asyncpg() logfire.instrument_pydantic_ai() @dataclass class Deps: openai: AsyncOpenAI pool: asyncpg.Pool agent = Agent('openai:gpt-5.2', deps_type=Deps) @agent.tool async def retrieve(context: RunContext[Deps], search_query: str) -> str: """Retrieve documentation sections based on a search query. Args: context: The call context. search_query: The search query. """ with logfire.span( 'create embedding for {search_query=}', search_query=search_query ): embedding = await context.deps.openai.embeddings.create( input=search_query, model='text-embedding-3-small', ) assert len(embedding.data) == 1, ( f'Expected 1 embedding, got {len(embedding.data)}, doc query: {search_query!r}' ) embedding = embedding.data[0].embedding embedding_json = pydantic_core.to_json(embedding).decode() rows = await context.deps.pool.fetch( 'SELECT url, title, content FROM doc_sections ORDER BY embedding <-> $1 LIMIT 8', embedding_json, ) return '\n\n'.join( f'# {row["title"]}\nDocumentation URL:{row["url"]}\n\n{row["content"]}\n' for row in rows ) async def run_agent(question: str): """Entry point to run the agent and perform RAG based question answering.""" openai = AsyncOpenAI() logfire.instrument_openai(openai) logfire.info('Asking "{question}"', question=question) async with database_connect(False) as pool: deps = Deps(openai=openai, pool=pool) answer = await agent.run(question, deps=deps) print(answer.output) ####################################################### # The rest of this file is dedicated to preparing the # # search database, and some utilities. 
# ####################################################### # JSON document from # https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992 DOCS_JSON = ( 'https://gist.githubusercontent.com/' 'samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992/raw/' '80c5925c42f1442c24963aaf5eb1a324d47afe95/logfire_docs.json' ) async def build_search_db(): """Build the search database.""" async with httpx.AsyncClient() as client: response = await client.get(DOCS_JSON) response.raise_for_status() sections = sections_ta.validate_json(response.content) openai = AsyncOpenAI() logfire.instrument_openai(openai) async with database_connect(True) as pool: with logfire.span('create schema'): async with pool.acquire() as conn: async with conn.transaction(): await conn.execute(DB_SCHEMA) sem = asyncio.Semaphore(10) async with create_task_group() as tg: for section in sections: tg.start_soon(insert_doc_section, sem, openai, pool, section) async def insert_doc_section( sem: asyncio.Semaphore, openai: AsyncOpenAI, pool: asyncpg.Pool, section: DocsSection, ) -> None: async with sem: url = section.url() exists = await pool.fetchval('SELECT 1 FROM doc_sections WHERE url = $1', url) if exists: logfire.info('Skipping {url=}', url=url) return with logfire.span('create embedding for {url=}', url=url): embedding = await openai.embeddings.create( input=section.embedding_content(), model='text-embedding-3-small', ) assert len(embedding.data) == 1, ( f'Expected 1 embedding, got {len(embedding.data)}, doc section: {section}' ) embedding = embedding.data[0].embedding embedding_json = pydantic_core.to_json(embedding).decode() await pool.execute( 'INSERT INTO doc_sections (url, title, content, embedding) VALUES ($1, $2, $3, $4)', url, section.title, section.content, embedding_json, ) @dataclass class DocsSection: id: int parent: int | None path: str level: int title: str content: str def url(self) -> str: url_path = re.sub(r'\.md$', '', self.path) return ( f'https://logfire.pydantic.dev/docs/{url_path}/#{slugify(self.title, "-")}' ) def embedding_content(self) -> str: return '\n\n'.join((f'path: {self.path}', f'title: {self.title}', self.content)) sections_ta = TypeAdapter(list[DocsSection]) # pyright: reportUnknownMemberType=false # pyright: reportUnknownVariableType=false @asynccontextmanager async def database_connect( create_db: bool = False, ) -> AsyncGenerator[asyncpg.Pool, None]: server_dsn, database = ( 'postgresql://postgres:postgres@localhost:54320', 'pydantic_ai_rag', ) if create_db: with logfire.span('check and create DB'): conn = await asyncpg.connect(server_dsn) try: db_exists = await conn.fetchval( 'SELECT 1 FROM pg_database WHERE datname = $1', database ) if not db_exists: await conn.execute(f'CREATE DATABASE {database}') finally: await conn.close() pool = await asyncpg.create_pool(f'{server_dsn}/{database}') try: yield pool finally: await pool.close() DB_SCHEMA = """ CREATE EXTENSION IF NOT EXISTS vector; CREATE TABLE IF NOT EXISTS doc_sections ( id serial PRIMARY KEY, url text NOT NULL UNIQUE, title text NOT NULL, content text NOT NULL, -- text-embedding-3-small returns a vector of 1536 floats embedding vector(1536) NOT NULL ); CREATE INDEX IF NOT EXISTS idx_doc_sections_embedding ON doc_sections USING hnsw (embedding vector_l2_ops); """ def slugify(value: str, separator: str, unicode: bool = False) -> str: """Slugify a string, to make it URL friendly.""" # Taken unchanged from https://github.com/Python-Markdown/markdown/blob/3.7/markdown/extensions/toc.py#L38 if not unicode: # Replace Extended Latin characters 
with ASCII, i.e. `žlutý` => `zluty` value = unicodedata.normalize('NFKD', value) value = value.encode('ascii', 'ignore').decode('ascii') value = re.sub(r'[^\w\s-]', '', value).strip().lower() return re.sub(rf'[{separator}\s]+', separator, value) if __name__ == '__main__': action = sys.argv[1] if len(sys.argv) > 1 else None if action == 'build': asyncio.run(build_search_db()) elif action == 'search': if len(sys.argv) == 3: q = sys.argv[2] else: q = 'How do I configure logfire to work with FastAPI?' asyncio.run(run_agent(q)) else: print( 'uv run --extra examples -m pydantic_ai_examples.rag build|search', file=sys.stderr, ) sys.exit(1) ``` # Examples Here we include some examples of how to use Pydantic AI and what it can do. ## Usage These examples are distributed with `pydantic-ai` so you can run them either by cloning the [pydantic-ai repo](https://github.com/pydantic/pydantic-ai) or by simply installing `pydantic-ai` from PyPI with `pip` or `uv`. ### Installing required dependencies Either way you'll need to install extra dependencies to run some examples, you just need to install the `examples` optional dependency group. If you've installed `pydantic-ai` via pip/uv, you can install the extra dependencies with: ```bash pip install "pydantic-ai[examples]" ``` ```bash uv add "pydantic-ai[examples]" ``` If you clone the repo, you should instead use `uv sync --extra examples` to install extra dependencies. ### Setting model environment variables These examples will need you to set up authentication with one or more of the LLMs, see the [model configuration](https://ai.pydantic.dev/models/overview/index.md) docs for details on how to do this. TL;DR: in most cases you'll need to set one of the following environment variables: ```bash export OPENAI_API_KEY=your-api-key ``` ```bash export GEMINI_API_KEY=your-api-key ``` ### Running Examples To run the examples (this will work whether you installed `pydantic_ai`, or cloned the repo), run: ```bash python -m pydantic_ai_examples. ``` ```bash uv run -m pydantic_ai_examples. ``` For example, to run the very simple [`pydantic_model`](https://ai.pydantic.dev/examples/pydantic-model/index.md) example: ```bash python -m pydantic_ai_examples.pydantic_model ``` ```bash uv run -m pydantic_ai_examples.pydantic_model ``` If you like one-liners and you're using uv, you can run a pydantic-ai example with zero setup: ```bash OPENAI_API_KEY='your-api-key' \ uv run --with "pydantic-ai[examples]" \ -m pydantic_ai_examples.pydantic_model ``` ______________________________________________________________________ You'll probably want to edit examples in addition to just running them. You can copy the examples to a new directory with: ```bash python -m pydantic_ai_examples --copy-to examples/ ``` ```bash uv run -m pydantic_ai_examples --copy-to examples/ ``` # Slack Lead Qualifier with Modal In this example, we're going to build an agentic app that: - automatically researches each new member that joins a company's public Slack community to see how good of a fit they are for the company's commercial product, - sends this analysis into a (private) Slack channel, and - sends a daily summary of the top 5 leads from the previous 24 hours into a (different) Slack channel. We'll be deploying the app on [Modal](https://modal.com), as it lets you use Python to define an app with web endpoints, scheduled functions, and background functions, and deploy them with a CLI, without needing to set up or manage any infrastructure. 
It's a great way to lower the barrier for people in your organization to start building and deploying AI agents to make their jobs easier. We also add [Pydantic Logfire](https://pydantic.dev/logfire) to get observability into the app and agent as they're running in response to webhooks and the schedule. ## Screenshots This is what the analysis sent into Slack will look like: This is what the corresponding trace in [Logfire](https://pydantic.dev/logfire) will look like: All of these entries can be clicked on to get more details about what happened at that step, including the full conversation with the LLM and HTTP requests and responses. ## Prerequisites If you just want to see the code without actually going through the effort of setting up the bits necessary to run it, feel free to [jump ahead](#the-code). ### Slack app You need to have a Slack workspace and the necessary permissions to create apps. 1. Create a new Slack app using the instructions at . 1. In step 2, "Requesting scopes", request the following scopes: - [`users.read`](https://docs.slack.dev/reference/scopes/users.read) - [`users.read.email`](https://docs.slack.dev/reference/scopes/users.read.email) - [`users.profile.read`](https://docs.slack.dev/reference/scopes/users.profile.read) 1. In step 3, "Installing and authorizing the app", note down the Access Token as we're going to need to store it as a Secret in Modal. 1. You can skip steps 4 and 5. We're going to need to subscribe to the `team_join` event, but at this point you don't have a webhook URL yet. 1. Create the channels the app will post into, and add the Slack app to them: - `#new-slack-leads` - `#daily-slack-leads-summary` These names are hard-coded in the example. If you want to use different channels, you can clone the repo and change them in `examples/pydantic_ai_examples/slack_lead_qualifier/functions.py`. ### Logfire Write Token 1. If you don't have a Logfire account yet, create one on . 1. Create a new project named, for example, `slack-lead-qualifier`. 1. Generate a new Write Token and note it down, as we're going to need to store it as a Secret in Modal. ### OpenAI API Key 1. If you don't have an OpenAI account yet, create one on . 1. Create a new API Key in Settings and note it down, as we're going to need to store it as a Secret in Modal. ### Modal account 1. If you don't have a Modal account yet, create one on . 1. Create 3 Secrets of type "Custom" on : - Name: `slack`, key: `SLACK_API_KEY`, value: the Slack Access Token you generated earlier - Name: `logfire`, key: `LOGFIRE_TOKEN`, value: the Logfire Write Token you generated earlier - Name: `openai`, key: `OPENAI_API_KEY`, value: the OpenAI API Key you generated earlier ## Usage 1. Make sure you have the [dependencies installed](https://ai.pydantic.dev/examples/setup/#usage). 1. Authenticate with Modal: ```bash python/uv-run -m modal setup ``` 1. Run the example as an [ephemeral Modal app](https://modal.com/docs/guide/apps#ephemeral-apps), meaning it will only run until you quit it using Ctrl+C: ```bash python/uv-run -m modal serve -m pydantic_ai_examples.slack_lead_qualifier.modal ``` 1. Note down the URL after `Created web function web_app =>`; this is your webhook endpoint URL. 1. Go back to and follow step 4, "Configuring the app for event listening", to subscribe to the `team_join` event with the webhook endpoint URL you noted down as the Request URL.
Now when someone new (possibly you with a throwaway email) joins the Slack workspace, you'll see the webhook event being processed in the terminal where you ran `modal serve` and in the Logfire Live view, and after waiting a few seconds you should see the result appear in the `#new-slack-leads` Slack channel! Faking a Slack signup You can also fake a Slack signup event and try out the agent like this, with any name or email you please: ```bash curl -X POST \ -H "Content-Type: application/json" \ -d '{ "type": "event_callback", "event": { "type": "team_join", "user": { "profile": { "email": "samuel@pydantic.dev", "first_name": "Samuel", "last_name": "Colvin", "display_name": "Samuel Colvin" } } } }' ``` Deploying to production If you'd like to deploy this app into your Modal workspace in a persistent fashion, you can use this command: ```bash python/uv-run -m modal deploy -m pydantic_ai_examples.slack_lead_qualifier.modal ``` You'll likely want to [download the code](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples/slack_lead_qualifier) first, put it in a new repo, and then do [continuous deployment](https://modal.com/docs/guide/continuous-deployment#github-actions) using GitHub Actions. Don't forget to update the Slack event request URL to the new persistent URL! You'll also want to modify the [instructions for the agent](#agent) to your own situation. ## The code We're going to start with the basics, and then gradually build up into the full app. ### Models #### `Profile` First, we define a [Pydantic](https://docs.pydantic.dev) model that represents a Slack user profile. These are the fields we get from the [`team_join`](https://docs.slack.dev/reference/events/team_join) event that's sent to the webhook endpoint that we'll define in a bit. [slack_lead_qualifier/models.py (L11-L15)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L11-L15) ```python ... class Profile(BaseModel): first_name: str | None = None last_name: str | None = None display_name: str | None = None email: str ... ``` We also define a `Profile.as_prompt()` helper method that uses format_as_xml to turn the profile into a string that can be sent to the model. [slack_lead_qualifier/models.py (L7-L19)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L7-L19) ```python ... from pydantic_ai import format_as_xml ... class Profile(BaseModel): ... def as_prompt(self) -> str: return format_as_xml(self, root_tag='profile') ... ``` #### `Analysis` The second model we'll need represents the result of the analysis that the agent will perform. We include docstrings to provide additional context to the model on what these fields should contain. [slack_lead_qualifier/models.py (L23-L31)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L23-L31) ```python ... class Analysis(BaseModel): profile: Profile organization_name: str organization_domain: str job_title: str relevance: Annotated[int, Ge(1), Le(5)] """Estimated fit for Pydantic Logfire: 1 = low, 5 = high""" summary: str """One-sentence welcome note summarising who they are and how we might help""" ... ``` We also define a `Analysis.as_slack_blocks()` helper method that turns the analysis into some [Slack blocks](https://api.slack.com/reference/block-kit/blocks) that can be sent to the Slack API to post a new message. 
[slack_lead_qualifier/models.py (L23-L46)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L23-L46) ```python ... class Analysis(BaseModel): ... def as_slack_blocks(self, include_relevance: bool = False) -> list[dict[str, Any]]: profile = self.profile relevance = f'({self.relevance}/5)' if include_relevance else '' return [ { 'type': 'markdown', 'text': f'[{profile.display_name}](mailto:{profile.email}), {self.job_title} at [**{self.organization_name}**](https://{self.organization_domain}) {relevance}', }, { 'type': 'markdown', 'text': self.summary, }, ] ``` ### Agent Now it's time to get into Pydantic AI and define the agent that will do the actual analysis! We specify the model we'll use (`openai:gpt-5`), provide [instructions](https://ai.pydantic.dev/agent/#instructions), give the agent access to the [DuckDuckGo search tool](https://ai.pydantic.dev/common-tools/#duckduckgo-search-tool), and tell it to output either an `Analysis` or `None` using the [Native Output](https://ai.pydantic.dev/output/#native-output) structured output mode. The real meat of the app is in the instructions that tell the agent how to evaluate each new Slack member. If you plan to use this app yourself, you'll of course want to modify them to your own situation. [Learn about Gateway](https://ai.pydantic.dev/gateway) [slack_lead_qualifier/agent.py (L7-L40)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/agent.py#L7-L40) ```python ... from pydantic_ai import Agent, NativeOutput from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool ... agent = Agent( 'gateway/openai:gpt-5.2', instructions=dedent( """ When a new person joins our public Slack, please put together a brief snapshot so we can be most useful to them. **What to include** 1. **Who they are:** Any details about their professional role or projects (e.g. LinkedIn, GitHub, company bio). 2. **Where they work:** Name of the organisation and its domain. 3. **How we can help:** On a scale of 1–5, estimate how likely they are to benefit from **Pydantic Logfire** (our paid observability tool) based on factors such as company size, product maturity, or AI usage. *1 = probably not relevant, 5 = very strong fit.* **Our products (for context only)** • **Pydantic Validation** – Python data-validation (open source) • **Pydantic AI** – Python agent framework (open source) • **Pydantic Logfire** – Observability for traces, logs & metrics with first-class AI support (commercial) **How to research** • Use the provided DuckDuckGo search tool to research the person and the organization they work for, based on the email domain or what you find on e.g. LinkedIn and GitHub. • If you can't find enough to form a reasonable view, return **None**. """ ), tools=[duckduckgo_search_tool()], output_type=NativeOutput([Analysis, NoneType]), ) ... ``` [slack_lead_qualifier/agent.py (L7-L40)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/agent.py#L7-L40) ```python ... from pydantic_ai import Agent, NativeOutput from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool ... agent = Agent( 'openai:gpt-5.2', instructions=dedent( """ When a new person joins our public Slack, please put together a brief snapshot so we can be most useful to them. **What to include** 1. **Who they are:** Any details about their professional role or projects (e.g. LinkedIn, GitHub, company bio). 2. 
**Where they work:** Name of the organisation and its domain. 3. **How we can help:** On a scale of 1–5, estimate how likely they are to benefit from **Pydantic Logfire** (our paid observability tool) based on factors such as company size, product maturity, or AI usage. *1 = probably not relevant, 5 = very strong fit.* **Our products (for context only)** • **Pydantic Validation** – Python data-validation (open source) • **Pydantic AI** – Python agent framework (open source) • **Pydantic Logfire** – Observability for traces, logs & metrics with first-class AI support (commercial) **How to research** • Use the provided DuckDuckGo search tool to research the person and the organization they work for, based on the email domain or what you find on e.g. LinkedIn and GitHub. • If you can't find enough to form a reasonable view, return **None**. """ ), tools=[duckduckgo_search_tool()], output_type=NativeOutput([Analysis, NoneType]), ) ... ``` #### `analyze_profile` We also define a `analyze_profile` helper function that takes a `Profile`, runs the agent, and returns an `Analysis` (or `None`), and instrument it using [Logfire](https://ai.pydantic.dev/logfire/index.md). [slack_lead_qualifier/agent.py (L44-L47)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/agent.py#L44-L47) ```python ... @logfire.instrument('Analyze profile') async def analyze_profile(profile: Profile) -> Analysis | None: result = await agent.run(profile.as_prompt()) return result.output ``` ### Analysis store The next building block we'll need is a place to store all the analyses that have been done so that we can look them up when we send the daily summary. Fortunately, Modal provides us with a convenient way to store some data that can be read back in a subsequent Modal run (webhook or scheduled): [`modal.Dict`](https://modal.com/docs/reference/modal.Dict). We define some convenience methods to easily add, list, and clear analyses. [slack_lead_qualifier/store.py (L4-L31)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/store.py#L4-L31) ```python ... import modal ... class AnalysisStore: @classmethod @logfire.instrument('Add analysis to store') async def add(cls, analysis: Analysis): await cls._get_store().put.aio(analysis.profile.email, analysis.model_dump()) @classmethod @logfire.instrument('List analyses from store') async def list(cls) -> list[Analysis]: return [ Analysis.model_validate(analysis) async for analysis in cls._get_store().values.aio() ] @classmethod @logfire.instrument('Clear analyses from store') async def clear(cls): await cls._get_store().clear.aio() @classmethod def _get_store(cls) -> modal.Dict: return modal.Dict.from_name('analyses', create_if_missing=True) # type: ignore ``` Note Note that `# type: ignore` on the last line -- unfortunately `modal` does not fully define its types, so we need this to stop our static type checker `pyright`, which we run over all Pydantic AI code including examples, from complaining. ### Send Slack message Next, we'll need a way to actually send a Slack message, so we define a simple function that uses Slack's [`chat.postMessage`](https://api.slack.com/methods/chat.postMessage) API. [slack_lead_qualifier/slack.py (L8-L30)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/slack.py#L8-L30) ```python ... 
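# SLACK_API_KEY is expected to come from the `slack` Modal secret set up in the prerequisites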
API_KEY = os.getenv('SLACK_API_KEY') assert API_KEY, 'SLACK_API_KEY is not set' @logfire.instrument('Send Slack message') async def send_slack_message(channel: str, blocks: list[dict[str, Any]]): client = httpx.AsyncClient() response = await client.post( 'https://slack.com/api/chat.postMessage', json={ 'channel': channel, 'blocks': blocks, }, headers={ 'Authorization': f'Bearer {API_KEY}', }, timeout=5, ) response.raise_for_status() result = response.json() if not result.get('ok', False): error = result.get('error', 'Unknown error') raise Exception(f'Failed to send to Slack: {error}') ``` ### Features Now we can start putting these building blocks together to implement the actual features we want! #### `process_slack_member` This function takes a [`Profile`](#profile), [analyzes](#analyze_profile) it using the agent, adds it to the [`AnalysisStore`](#analysis-store), and [sends](#send-slack-message) the analysis into the `#new-slack-leads` channel. [slack_lead_qualifier/functions.py (L4-L45)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/functions.py#L4-L45) ```python ... from .agent import analyze_profile from .models import Profile from .slack import send_slack_message from .store import AnalysisStore ... NEW_LEAD_CHANNEL = '#new-slack-leads' ... @logfire.instrument('Process Slack member') async def process_slack_member(profile: Profile): analysis = await analyze_profile(profile) logfire.info('Analysis', analysis=analysis) if analysis is None: return await AnalysisStore().add(analysis) await send_slack_message( NEW_LEAD_CHANNEL, [ { 'type': 'header', 'text': { 'type': 'plain_text', 'text': f'New Slack member with score {analysis.relevance}/5', }, }, { 'type': 'divider', }, *analysis.as_slack_blocks(), ], ) ... ``` #### `send_daily_summary` This function list all of the analyses in the [`AnalysisStore`](#analysis-store), takes the top 5 by relevance, [sends](#send-slack-message) them into the `#daily-slack-leads-summary` channel, and clears the `AnalysisStore` so that the next daily run won't process these analyses again. [slack_lead_qualifier/functions.py (L8-L85)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/functions.py#L8-L85) ```python ... from .slack import send_slack_message from .store import AnalysisStore ... DAILY_SUMMARY_CHANNEL = '#daily-slack-leads-summary' ... @logfire.instrument('Send daily summary') async def send_daily_summary(): analyses = await AnalysisStore().list() logfire.info('Analyses', analyses=analyses) if len(analyses) == 0: return sorted_analyses = sorted(analyses, key=lambda x: x.relevance, reverse=True) top_analyses = sorted_analyses[:5] blocks = [ { 'type': 'header', 'text': { 'type': 'plain_text', 'text': f'Top {len(top_analyses)} new Slack members from the last 24 hours', }, }, ] for analysis in top_analyses: blocks.extend( [ { 'type': 'divider', }, *analysis.as_slack_blocks(include_relevance=True), ] ) await send_slack_message( DAILY_SUMMARY_CHANNEL, blocks, ) await AnalysisStore().clear() ``` ### Web app As it stands, neither of these functions are actually being called from anywhere. Let's implement a [FastAPI](https://fastapi.tiangolo.com/) endpoint to handle the `team_join` Slack webhook (also known as the [Slack Events API](https://docs.slack.dev/apis/events-api)) and call the [`process_slack_member`](#process_slack_member) function we just defined. We also instrument FastAPI using Logfire for good measure. 
[slack_lead_qualifier/app.py (L20-L36)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/app.py#L20-L36) ```python ... app = FastAPI() logfire.instrument_fastapi(app, capture_headers=True) @app.post('/') async def process_webhook(payload: dict[str, Any]) -> dict[str, Any]: if payload['type'] == 'url_verification': return {'challenge': payload['challenge']} elif ( payload['type'] == 'event_callback' and payload['event']['type'] == 'team_join' ): profile = Profile.model_validate(payload['event']['user']['profile']) process_slack_member(profile) return {'status': 'OK'} raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY) ``` #### `process_slack_member` with Modal I was a little sneaky there -- we're not actually calling the [`process_slack_member`](#process_slack_member) function we defined in `functions.py` directly, as Slack requires webhooks to respond within 3 seconds, and we need a bit more time than that to talk to the LLM, do some web searches, and send the Slack message. Instead, we're calling the following function defined alongside the app, which uses Modal's [`modal.Function.spawn`](https://modal.com/docs/reference/modal.Function#spawn) feature to run a function in the background. (If you're curious what the Modal side of this function looks like, you can [jump ahead](#backgrounded-process_slack_member).) Because `modal.py` (which we'll see in the next section) imports `app.py`, we import from `modal.py` inside the function definition because doing so at the top level would have resulted in a circular import error. We also pass along the current Logfire context to get [Distributed Tracing](https://logfire.pydantic.dev/docs/how-to-guides/distributed-tracing/), meaning that the background function execution will show up nested under the webhook request trace, so that we have everything related to that request in one place. [slack_lead_qualifier/app.py (L11-L16)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/app.py#L11-L16) ```python ... def process_slack_member(profile: Profile): from .modal import process_slack_member as _process_slack_member _process_slack_member.spawn( profile.model_dump(), logfire_ctx=get_context() ) ... ``` ### Modal app Now let's see how easy Modal makes it to deploy all of this. #### Set up Modal The first thing we do is define the Modal app, by specifying the base image to use (Debian with Python 3.13), all the Python packages it needs, and all of the secrets defined in the Modal interface that need to be made available during runtime. [slack_lead_qualifier/modal.py (L4-L21)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L4-L21) ```python ... import modal image = modal.Image.debian_slim(python_version='3.13').pip_install( 'pydantic', 'pydantic_ai_slim[openai,duckduckgo]', 'logfire[httpx,fastapi]', 'fastapi[standard]', 'httpx', ) app = modal.App( name='slack-lead-qualifier', image=image, secrets=[ modal.Secret.from_name('logfire'), modal.Secret.from_name('openai'), modal.Secret.from_name('slack'), ], ) ... ``` #### Set up Logfire Next, we define a function to set up Logfire instrumentation for Pydantic AI and HTTPX. We cannot do this at the top level of the file, as the requested packages (like `logfire`) will only be available within functions running on Modal (like the ones we'll define next). 
This file, `modal.py`, runs on your local machine and only has access to the `modal` package. [slack_lead_qualifier/modal.py (L25-L30)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L25-L30) ```python ... def setup_logfire(): import logfire logfire.configure(service_name=app.name) logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) ... ``` #### Web app To deploy a [web endpoint](https://modal.com/docs/guide/webhooks) on Modal, we simply define a function that returns an ASGI app (like FastAPI) and decorate it with `@app.function()` and `@modal.asgi_app()`. This `web_app` function will be run on Modal, so inside the function we can call the `setup_logfire` function that requires the `logfire` package, and import `app.py` which uses the other requested packages. By default, Modal spins up a container to handle a function call (like a web request) on-demand, meaning there's a little bit of startup time for each request. However, Slack requires webhooks to respond within 3 seconds, so we specify `min_containers=1` to keep the web endpoint running and ready to answer requests at all times. This is a bit annoying and wasteful, but fortunately [Modal's pricing](https://modal.com/pricing) is pretty reasonable: you get $30 of free monthly compute, and they offer up to $50k in free credits for startups and academic researchers. [slack_lead_qualifier/modal.py (L34-L41)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L34-L41) ```python ... @app.function(min_containers=1) @modal.asgi_app() # type: ignore def web_app(): setup_logfire() from .app import app as _app return _app ... ``` Note Note the `# type: ignore` on the `@modal.asgi_app()` line -- unfortunately `modal` does not fully define its types, so we need this to stop our static type checker `pyright`, which we run over all Pydantic AI code including examples, from complaining. #### Scheduled `send_daily_summary` To define a [scheduled function](https://modal.com/docs/guide/cron), we can use the `@app.function()` decorator with a `schedule` argument. This Modal function will call our imported [`send_daily_summary`](#send_daily_summary) function every day at 8 am UTC. [slack_lead_qualifier/modal.py (L60-L66)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L60-L66) ```python ... @app.function(schedule=modal.Cron('0 8 * * *')) # Every day at 8am UTC async def send_daily_summary(): setup_logfire() from .functions import send_daily_summary as _send_daily_summary await _send_daily_summary() ``` #### Backgrounded `process_slack_member` Finally, we define a Modal function that wraps our [`process_slack_member`](#process_slack_member) function, so that it can run in the background. As you'll remember from when we [spawned this function from the web app](#process_slack_member-with-modal), we passed along the Logfire context to get [Distributed Tracing](https://logfire.pydantic.dev/docs/how-to-guides/distributed-tracing/), so we need to attach it here. [slack_lead_qualifier/modal.py (L45-L56)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L45-L56) ```python ...
@app.function() async def process_slack_member(profile_raw: dict[str, Any], logfire_ctx: Any): setup_logfire() from logfire.propagate import attach_context from .functions import process_slack_member as _process_slack_member from .models import Profile with attach_context(logfire_ctx): profile = Profile.model_validate(profile_raw) await _process_slack_member(profile) ... ``` ## Conclusion And that's it! Now, assuming you've met the [prerequisites](#prerequisites), you can run or deploy the app using the commands under [usage](#usage). # SQL Generation Example demonstrating how to use Pydantic AI to generate SQL queries based on user input. Demonstrates: - [dynamic system prompt](https://ai.pydantic.dev/agent/#system-prompts) - [structured `output_type`](https://ai.pydantic.dev/output/#structured-output) - [output validation](https://ai.pydantic.dev/output/#output-validator-functions) - [agent dependencies](https://ai.pydantic.dev/dependencies/index.md) ## Running the Example The resulting SQL is validated by running it as an `EXPLAIN` query on PostgreSQL. To run the example, you first need to run PostgreSQL, e.g. via Docker: ```bash docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres ``` *(we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running)* With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), run: ```bash python -m pydantic_ai_examples.sql_gen ``` ```bash uv run -m pydantic_ai_examples.sql_gen ``` or to use a custom prompt: ```bash python -m pydantic_ai_examples.sql_gen "find me errors" ``` ```bash uv run -m pydantic_ai_examples.sql_gen "find me errors" ``` This model uses `gemini-3-flash-preview` by default since Gemini is good at single shot queries of this kind. ## Example Code [sql_gen.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/sql_gen.py) ```python """Example demonstrating how to use Pydantic AI to generate SQL queries based on user input. Run postgres with: mkdir postgres-data docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres Run with: uv run -m pydantic_ai_examples.sql_gen "show me logs from yesterday, with level 'error'" """ import asyncio import sys from collections.abc import AsyncGenerator from contextlib import asynccontextmanager from dataclasses import dataclass from datetime import date from typing import Annotated, Any, TypeAlias import asyncpg import logfire from annotated_types import MinLen from devtools import debug from pydantic import BaseModel, Field from pydantic_ai import Agent, ModelRetry, RunContext, format_as_xml # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_asyncpg() logfire.instrument_pydantic_ai() DB_SCHEMA = """ CREATE TABLE records ( created_at timestamptz, start_timestamp timestamptz, end_timestamp timestamptz, trace_id text, span_id text, parent_span_id text, level log_level, span_name text, message text, attributes_json_schema text, attributes jsonb, tags text[], is_exception boolean, otel_status_message text, service_name text ); """ SQL_EXAMPLES = [ { 'request': 'show me records where foobar is false', 'response': "SELECT * FROM records WHERE attributes->>'foobar' = false", }, { 'request': 'show me records where attributes include the key "foobar"', 'response': "SELECT * FROM records WHERE attributes ? 
'foobar'", }, { 'request': 'show me records from yesterday', 'response': "SELECT * FROM records WHERE start_timestamp::date > CURRENT_TIMESTAMP - INTERVAL '1 day'", }, { 'request': 'show me error records with the tag "foobar"', 'response': "SELECT * FROM records WHERE level = 'error' and 'foobar' = ANY(tags)", }, ] @dataclass class Deps: conn: asyncpg.Connection class Success(BaseModel): """Response when SQL could be successfully generated.""" sql_query: Annotated[str, MinLen(1)] explanation: str = Field( '', description='Explanation of the SQL query, as markdown' ) class InvalidRequest(BaseModel): """Response the user input didn't include enough information to generate SQL.""" error_message: str Response: TypeAlias = Success | InvalidRequest agent = Agent[Deps, Response]( 'google-gla:gemini-3-flash-preview', # Type ignore while we wait for PEP-0747, nonetheless unions will work fine everywhere else output_type=Response, # type: ignore deps_type=Deps, ) @agent.system_prompt async def system_prompt() -> str: return f"""\ Given the following PostgreSQL table of records, your job is to write a SQL query that suits the user's request. Database schema: {DB_SCHEMA} today's date = {date.today()} {format_as_xml(SQL_EXAMPLES)} """ @agent.output_validator async def validate_output(ctx: RunContext[Deps], output: Response) -> Response: if isinstance(output, InvalidRequest): return output # gemini often adds extraneous backslashes to SQL output.sql_query = output.sql_query.replace('\\', '') if not output.sql_query.upper().startswith('SELECT'): raise ModelRetry('Please create a SELECT query') try: await ctx.deps.conn.execute(f'EXPLAIN {output.sql_query}') except asyncpg.exceptions.PostgresError as e: raise ModelRetry(f'Invalid query: {e}') from e else: return output async def main(): if len(sys.argv) == 1: prompt = 'show me logs from yesterday, with level "error"' else: prompt = sys.argv[1] async with database_connect( 'postgresql://postgres:postgres@localhost:54320', 'pydantic_ai_sql_gen' ) as conn: deps = Deps(conn) result = await agent.run(prompt, deps=deps) debug(result.output) # pyright: reportUnknownMemberType=false # pyright: reportUnknownVariableType=false @asynccontextmanager async def database_connect(server_dsn: str, database: str) -> AsyncGenerator[Any, None]: with logfire.span('check and create DB'): conn = await asyncpg.connect(server_dsn) try: db_exists = await conn.fetchval( 'SELECT 1 FROM pg_database WHERE datname = $1', database ) if not db_exists: await conn.execute(f'CREATE DATABASE {database}') finally: await conn.close() conn = await asyncpg.connect(f'{server_dsn}/{database}') try: with logfire.span('create schema'): async with conn.transaction(): if not db_exists: await conn.execute( "CREATE TYPE log_level AS ENUM ('debug', 'info', 'warning', 'error', 'critical')" ) await conn.execute(DB_SCHEMA) yield conn finally: await conn.close() if __name__ == '__main__': asyncio.run(main()) ``` This example shows how to stream markdown from an agent, using the [`rich`](https://github.com/Textualize/rich) library to highlight the output in the terminal. It'll run the example with both OpenAI and Google Gemini models if the required environment variables are set. 
Demonstrates: - [streaming text responses](https://ai.pydantic.dev/output/#streaming-text) ## Running the Example With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), run: ```bash python -m pydantic_ai_examples.stream_markdown ``` ```bash uv run -m pydantic_ai_examples.stream_markdown ``` ## Example Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [stream_markdown.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/stream_markdown.py) ```python """This example shows how to stream markdown from an agent, using the `rich` library to display the markdown. Run with: uv run -m pydantic_ai_examples.stream_markdown """ import asyncio import os import logfire from rich.console import Console, ConsoleOptions, RenderResult from rich.live import Live from rich.markdown import CodeBlock, Markdown from rich.syntax import Syntax from rich.text import Text from pydantic_ai import Agent from pydantic_ai.models import KnownModelName # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() agent = Agent() # models to try, and the appropriate env var models: list[tuple[KnownModelName, str]] = [ ('gateway/gemini:gemini-3-flash-preview', 'GEMINI_API_KEY'), ('openai:gpt-5-mini', 'OPENAI_API_KEY'), ('groq:llama-3.3-70b-versatile', 'GROQ_API_KEY'), ] async def main(): prettier_code_blocks() console = Console() prompt = 'Show me a short example of using Pydantic.' console.log(f'Asking: {prompt}...', style='cyan') for model, env_var in models: if env_var in os.environ: console.log(f'Using model: {model}') with Live('', console=console, vertical_overflow='visible') as live: async with agent.run_stream(prompt, model=model) as result: async for message in result.stream_output(): live.update(Markdown(message)) console.log(result.usage()) else: console.log(f'{model} requires {env_var} to be set.') def prettier_code_blocks(): """Make rich code blocks prettier and easier to copy. From https://github.com/samuelcolvin/aicli/blob/v0.8.0/samuelcolvin_aicli.py#L22 """ class SimpleCodeBlock(CodeBlock): def __rich_console__( self, console: Console, options: ConsoleOptions ) -> RenderResult: code = str(self.text).rstrip() yield Text(self.lexer_name, style='dim') yield Syntax( code, self.lexer_name, theme=self.theme, background_color='default', word_wrap=True, ) yield Text(f'/{self.lexer_name}', style='dim') Markdown.elements['fence'] = SimpleCodeBlock if __name__ == '__main__': asyncio.run(main()) ``` [stream_markdown.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/stream_markdown.py) ```python """This example shows how to stream markdown from an agent, using the `rich` library to display the markdown. 
Run with: uv run -m pydantic_ai_examples.stream_markdown """ import asyncio import os import logfire from rich.console import Console, ConsoleOptions, RenderResult from rich.live import Live from rich.markdown import CodeBlock, Markdown from rich.syntax import Syntax from rich.text import Text from pydantic_ai import Agent from pydantic_ai.models import KnownModelName # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() agent = Agent() # models to try, and the appropriate env var models: list[tuple[KnownModelName, str]] = [ ('google-gla:gemini-3-flash-preview', 'GEMINI_API_KEY'), ('openai:gpt-5-mini', 'OPENAI_API_KEY'), ('groq:llama-3.3-70b-versatile', 'GROQ_API_KEY'), ] async def main(): prettier_code_blocks() console = Console() prompt = 'Show me a short example of using Pydantic.' console.log(f'Asking: {prompt}...', style='cyan') for model, env_var in models: if env_var in os.environ: console.log(f'Using model: {model}') with Live('', console=console, vertical_overflow='visible') as live: async with agent.run_stream(prompt, model=model) as result: async for message in result.stream_output(): live.update(Markdown(message)) console.log(result.usage()) else: console.log(f'{model} requires {env_var} to be set.') def prettier_code_blocks(): """Make rich code blocks prettier and easier to copy. From https://github.com/samuelcolvin/aicli/blob/v0.8.0/samuelcolvin_aicli.py#L22 """ class SimpleCodeBlock(CodeBlock): def __rich_console__( self, console: Console, options: ConsoleOptions ) -> RenderResult: code = str(self.text).rstrip() yield Text(self.lexer_name, style='dim') yield Syntax( code, self.lexer_name, theme=self.theme, background_color='default', word_wrap=True, ) yield Text(f'/{self.lexer_name}', style='dim') Markdown.elements['fence'] = SimpleCodeBlock if __name__ == '__main__': asyncio.run(main()) ``` Information about whales — an example of streamed structured response validation. Demonstrates: - [streaming structured output](https://ai.pydantic.dev/output/#streaming-structured-output) This script streams structured responses about whales, validates the data and displays it as a dynamic table using [`rich`](https://github.com/Textualize/rich) as the data is received. ## Running the Example With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), run: ```bash python -m pydantic_ai_examples.stream_whales ``` ```bash uv run -m pydantic_ai_examples.stream_whales ``` Should give an output like this: ## Example Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [stream_whales.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/stream_whales.py) ```python """Information about whales — an example of streamed structured response validation. This script streams structured responses about whales, validates the data and displays it as a dynamic table using Rich as the data is received. 
Run with: uv run -m pydantic_ai_examples.stream_whales """ from typing import Annotated import logfire from pydantic import Field from rich.console import Console from rich.live import Live from rich.table import Table from typing_extensions import NotRequired, TypedDict from pydantic_ai import Agent # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class Whale(TypedDict): name: str length: Annotated[ float, Field(description='Average length of an adult whale in meters.') ] weight: NotRequired[ Annotated[ float, Field(description='Average weight of an adult whale in kilograms.', ge=50), ] ] ocean: NotRequired[str] description: NotRequired[Annotated[str, Field(description='Short Description')]] agent = Agent('gateway/openai:gpt-5.2', output_type=list[Whale]) async def main(): console = Console() with Live('\n' * 36, console=console) as live: console.print('Requesting data...', style='cyan') async with agent.run_stream( 'Generate me details of 5 species of Whale.' ) as result: console.print('Response:', style='green') async for whales in result.stream_output(debounce_by=0.01): table = Table( title='Species of Whale', caption='Streaming Structured responses from OpenAI', width=120, ) table.add_column('ID', justify='right') table.add_column('Name') table.add_column('Avg. Length (m)', justify='right') table.add_column('Avg. Weight (kg)', justify='right') table.add_column('Ocean') table.add_column('Description', justify='right') for wid, whale in enumerate(whales, start=1): table.add_row( str(wid), whale['name'], f'{whale["length"]:0.0f}', f'{w:0.0f}' if (w := whale.get('weight')) else '…', whale.get('ocean') or '…', whale.get('description') or '…', ) live.update(table) if __name__ == '__main__': import asyncio asyncio.run(main()) ``` [stream_whales.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/stream_whales.py) ```python """Information about whales — an example of streamed structured response validation. This script streams structured responses about whales, validates the data and displays it as a dynamic table using Rich as the data is received. Run with: uv run -m pydantic_ai_examples.stream_whales """ from typing import Annotated import logfire from pydantic import Field from rich.console import Console from rich.live import Live from rich.table import Table from typing_extensions import NotRequired, TypedDict from pydantic_ai import Agent # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class Whale(TypedDict): name: str length: Annotated[ float, Field(description='Average length of an adult whale in meters.') ] weight: NotRequired[ Annotated[ float, Field(description='Average weight of an adult whale in kilograms.', ge=50), ] ] ocean: NotRequired[str] description: NotRequired[Annotated[str, Field(description='Short Description')]] agent = Agent('openai:gpt-5.2', output_type=list[Whale]) async def main(): console = Console() with Live('\n' * 36, console=console) as live: console.print('Requesting data...', style='cyan') async with agent.run_stream( 'Generate me details of 5 species of Whale.' 
) as result: console.print('Response:', style='green') async for whales in result.stream_output(debounce_by=0.01): table = Table( title='Species of Whale', caption='Streaming Structured responses from OpenAI', width=120, ) table.add_column('ID', justify='right') table.add_column('Name') table.add_column('Avg. Length (m)', justify='right') table.add_column('Avg. Weight (kg)', justify='right') table.add_column('Ocean') table.add_column('Description', justify='right') for wid, whale in enumerate(whales, start=1): table.add_row( str(wid), whale['name'], f'{whale["length"]:0.0f}', f'{w:0.0f}' if (w := whale.get('weight')) else '…', whale.get('ocean') or '…', whale.get('description') or '…', ) live.update(table) if __name__ == '__main__': import asyncio asyncio.run(main()) ``` Example of Pydantic AI with multiple tools which the LLM needs to call in turn to answer a question. Demonstrates: - [tools](https://ai.pydantic.dev/tools/index.md) - [agent dependencies](https://ai.pydantic.dev/dependencies/index.md) - [streaming text responses](https://ai.pydantic.dev/output/#streaming-text) - Building a [Gradio](https://www.gradio.app/) UI for the agent In this case the idea is a "weather" agent — the user can ask for the weather in multiple locations, the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather for those locations. ## Running the Example To run this example properly, you might want to add two extra API keys **(Note if either key is missing, the code will fall back to dummy data, so they're not required)**: - A weather API key from [tomorrow.io](https://www.tomorrow.io/weather-api/) set via `WEATHER_API_KEY` - A geocoding API key from [geocode.maps.co](https://geocode.maps.co/) set via `GEO_API_KEY` With [dependencies installed and environment variables set](https://ai.pydantic.dev/examples/setup/#usage), run: ```bash python -m pydantic_ai_examples.weather_agent ``` ```bash uv run -m pydantic_ai_examples.weather_agent ``` ## Example Code [Learn about Gateway](https://ai.pydantic.dev/gateway) [weather_agent.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/weather_agent.py) ```python """Example of Pydantic AI with multiple tools which the LLM needs to call in turn to answer a question. In this case the idea is a "weather" agent — the user can ask for the weather in multiple cities, the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather. Run with: uv run -m pydantic_ai_examples.weather_agent """ from __future__ import annotations as _annotations import asyncio from dataclasses import dataclass from typing import Any import logfire from httpx import AsyncClient from pydantic import BaseModel from pydantic_ai import Agent, RunContext # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() @dataclass class Deps: client: AsyncClient weather_agent = Agent( 'gateway/openai:gpt-5-mini', # 'Be concise, reply with one sentence.' is enough for some models (like openai) to use # the below tools appropriately, but others like anthropic and gemini require a bit more direction. 
instructions='Be concise, reply with one sentence.', deps_type=Deps, retries=2, ) class LatLng(BaseModel): lat: float lng: float @weather_agent.tool async def get_lat_lng(ctx: RunContext[Deps], location_description: str) -> LatLng: """Get the latitude and longitude of a location. Args: ctx: The context. location_description: A description of a location. """ # NOTE: the response here will be random, and is not related to the location description. r = await ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/latlng', params={'location': location_description}, ) r.raise_for_status() return LatLng.model_validate_json(r.content) @weather_agent.tool async def get_weather(ctx: RunContext[Deps], lat: float, lng: float) -> dict[str, Any]: """Get the weather at a location. Args: ctx: The context. lat: Latitude of the location. lng: Longitude of the location. """ # NOTE: the responses here will be random, and are not related to the lat and lng. temp_response, descr_response = await asyncio.gather( ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/number', params={'min': 10, 'max': 30}, ), ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/weather', params={'lat': lat, 'lng': lng}, ), ) temp_response.raise_for_status() descr_response.raise_for_status() return { 'temperature': f'{temp_response.text} °C', 'description': descr_response.text, } async def main(): async with AsyncClient() as client: logfire.instrument_httpx(client, capture_all=True) deps = Deps(client=client) result = await weather_agent.run( 'What is the weather like in London and in Wiltshire?', deps=deps ) print('Response:', result.output) if __name__ == '__main__': asyncio.run(main()) ``` [weather_agent.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/weather_agent.py) ```python """Example of Pydantic AI with multiple tools which the LLM needs to call in turn to answer a question. In this case the idea is a "weather" agent — the user can ask for the weather in multiple cities, the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather. Run with: uv run -m pydantic_ai_examples.weather_agent """ from __future__ import annotations as _annotations import asyncio from dataclasses import dataclass from typing import Any import logfire from httpx import AsyncClient from pydantic import BaseModel from pydantic_ai import Agent, RunContext # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() @dataclass class Deps: client: AsyncClient weather_agent = Agent( 'openai:gpt-5-mini', # 'Be concise, reply with one sentence.' is enough for some models (like openai) to use # the below tools appropriately, but others like anthropic and gemini require a bit more direction. instructions='Be concise, reply with one sentence.', deps_type=Deps, retries=2, ) class LatLng(BaseModel): lat: float lng: float @weather_agent.tool async def get_lat_lng(ctx: RunContext[Deps], location_description: str) -> LatLng: """Get the latitude and longitude of a location. Args: ctx: The context. location_description: A description of a location. """ # NOTE: the response here will be random, and is not related to the location description. 
r = await ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/latlng', params={'location': location_description}, ) r.raise_for_status() return LatLng.model_validate_json(r.content) @weather_agent.tool async def get_weather(ctx: RunContext[Deps], lat: float, lng: float) -> dict[str, Any]: """Get the weather at a location. Args: ctx: The context. lat: Latitude of the location. lng: Longitude of the location. """ # NOTE: the responses here will be random, and are not related to the lat and lng. temp_response, descr_response = await asyncio.gather( ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/number', params={'min': 10, 'max': 30}, ), ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/weather', params={'lat': lat, 'lng': lng}, ), ) temp_response.raise_for_status() descr_response.raise_for_status() return { 'temperature': f'{temp_response.text} °C', 'description': descr_response.text, } async def main(): async with AsyncClient() as client: logfire.instrument_httpx(client, capture_all=True) deps = Deps(client=client) result = await weather_agent.run( 'What is the weather like in London and in Wiltshire?', deps=deps ) print('Response:', result.output) if __name__ == '__main__': asyncio.run(main()) ``` ## Running the UI You can build multi-turn chat applications for your agent with [Gradio](https://www.gradio.app/), a framework for building AI web applications entirely in python. Gradio comes with built-in chat components and agent support so the entire UI will be implemented in a single python file! Here's what the UI looks like for the weather agent: ```bash pip install gradio>=5.9.0 python/uv-run -m pydantic_ai_examples.weather_agent_gradio ``` ## UI Code [weather_agent_gradio.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/weather_agent_gradio.py) ```python from __future__ import annotations as _annotations import json from httpx import AsyncClient from pydantic import BaseModel from pydantic_ai import ToolCallPart, ToolReturnPart from pydantic_ai_examples.weather_agent import Deps, weather_agent try: import gradio as gr except ImportError as e: raise ImportError( 'Please install gradio with `pip install gradio`. You must use python>=3.10.' 
) from e TOOL_TO_DISPLAY_NAME = {'get_lat_lng': 'Geocoding API', 'get_weather': 'Weather API'} client = AsyncClient() deps = Deps(client=client) async def stream_from_agent(prompt: str, chatbot: list[dict], past_messages: list): chatbot.append({'role': 'user', 'content': prompt}) yield gr.Textbox(interactive=False, value=''), chatbot, gr.skip() async with weather_agent.run_stream( prompt, deps=deps, message_history=past_messages ) as result: for message in result.new_messages(): for call in message.parts: if isinstance(call, ToolCallPart): call_args = call.args_as_json_str() metadata = { 'title': f'🛠️ Using {TOOL_TO_DISPLAY_NAME[call.tool_name]}', } if call.tool_call_id is not None: metadata['id'] = call.tool_call_id gr_message = { 'role': 'assistant', 'content': 'Parameters: ' + call_args, 'metadata': metadata, } chatbot.append(gr_message) if isinstance(call, ToolReturnPart): for gr_message in chatbot: if ( gr_message.get('metadata', {}).get('id', '') == call.tool_call_id ): if isinstance(call.content, BaseModel): json_content = call.content.model_dump_json() else: json_content = json.dumps(call.content) gr_message['content'] += f'\nOutput: {json_content}' yield gr.skip(), chatbot, gr.skip() chatbot.append({'role': 'assistant', 'content': ''}) async for message in result.stream_text(): chatbot[-1]['content'] = message yield gr.skip(), chatbot, gr.skip() past_messages = result.all_messages() yield gr.Textbox(interactive=True), gr.skip(), past_messages async def handle_retry(chatbot, past_messages: list, retry_data: gr.RetryData): new_history = chatbot[: retry_data.index] previous_prompt = chatbot[retry_data.index]['content'] past_messages = past_messages[: retry_data.index] async for update in stream_from_agent(previous_prompt, new_history, past_messages): yield update def undo(chatbot, past_messages: list, undo_data: gr.UndoData): new_history = chatbot[: undo_data.index] past_messages = past_messages[: undo_data.index] return chatbot[undo_data.index]['content'], new_history, past_messages def select_data(message: gr.SelectData) -> str: return message.value['text'] with gr.Blocks() as demo: gr.HTML( """

Weather Assistant

This assistant answers your weather questions.

""" ) past_messages = gr.State([]) chatbot = gr.Chatbot( label='Packing Assistant', avatar_images=(None, 'https://ai.pydantic.dev/img/logo-white.svg'), examples=[ {'text': 'What is the weather like in Miami?'}, {'text': 'What is the weather like in London?'}, ], ) with gr.Row(): prompt = gr.Textbox( lines=1, show_label=False, placeholder='What is the weather like in New York City?', ) generation = prompt.submit( stream_from_agent, inputs=[prompt, chatbot, past_messages], outputs=[prompt, chatbot, past_messages], ) chatbot.example_select(select_data, None, [prompt]) chatbot.retry( handle_retry, [chatbot, past_messages], [prompt, chatbot, past_messages] ) chatbot.undo(undo, [chatbot, past_messages], [prompt, chatbot, past_messages]) if __name__ == '__main__': demo.launch() ```