# Introduction # Pydantic AI *GenAI Agent Framework, the Pydantic way* Pydantic AI is a Python agent framework designed to help you quickly, confidently, and painlessly build production grade applications and workflows with Generative AI. FastAPI revolutionized web development by offering an innovative and ergonomic design, built on the foundation of [Pydantic Validation](https://docs.pydantic.dev) and modern Python features like type hints. Yet despite virtually every Python agent framework and LLM library using Pydantic Validation, when we began to use LLMs in [Pydantic Logfire](https://pydantic.dev/logfire), we couldn't find anything that gave us the same feeling. We built Pydantic AI with one simple aim: to bring that FastAPI feeling to GenAI app and agent development. ## Why use Pydantic AI 1. **Built by the Pydantic Team**: [Pydantic Validation](https://docs.pydantic.dev/latest/) is the validation layer of the OpenAI SDK, the Google ADK, the Anthropic SDK, LangChain, LlamaIndex, AutoGPT, Transformers, CrewAI, Instructor and many more. *Why use the derivative when you can go straight to the source?* 1. **Model-agnostic**: Supports virtually every [model](models/overview/) and provider: OpenAI, Anthropic, Gemini, DeepSeek, Grok, Cohere, Mistral, and Perplexity; Azure AI Foundry, Amazon Bedrock, Google Vertex AI, Ollama, LiteLLM, Groq, OpenRouter, Together AI, Fireworks AI, Cerebras, Hugging Face, GitHub, Heroku, Vercel, Nebius. If your favorite model or provider is not listed, you can easily implement a [custom model](models/overview/#custom-models). 1. **Seamless Observability**: Tightly [integrates](logfire/) with [Pydantic Logfire](https://pydantic.dev/logfire), our general-purpose OpenTelemetry observability platform, for real-time debugging, evals-based performance monitoring, and behavior, tracing, and cost tracking. If you already have an observability platform that supports OTel, you can [use that too](logfire/#alternative-observability-backends). 1. **Fully Type-safe**: Designed to give your IDE or AI coding agent as much context as possible for auto-completion and [type checking](agents/#static-type-checking), moving entire classes of errors from runtime to write-time for a bit of that Rust "if it compiles, it works" feel. 1. **Powerful Evals**: Enables you to systematically test and [evaluate](evals/) the performance and accuracy of the agentic systems you build, and monitor the performance over time in Pydantic Logfire. 1. **MCP, A2A, and AG-UI**: Integrates the [Model Context Protocol](mcp/client/), [Agent2Agent](a2a/), and [AG-UI](ag-ui/) standards to give your agent access to external tools and data, let it interoperate with other agents, and build interactive applications with streaming event-based communication. 1. **Human-in-the-Loop Tool Approval**: Easily lets you flag that certain tool calls [require approval](deferred-tools/#human-in-the-loop-tool-approval) before they can proceed, possibly depending on tool call arguments, conversation history, or user preferences. 1. **Durable Execution**: Enables you to build [durable agents](durable_execution/overview/) that can preserve their progress across transient API failures and application errors or restarts, and handle long-running, asynchronous, and human-in-the-loop workflows with production-grade reliability. 1. 
**Streamed Outputs**: Provides the ability to [stream](output/#streamed-results) structured output continuously, with immediate validation, ensuring real time access to generated data. 1. **Graph Support**: Provides a powerful way to define [graphs](graph/) using type hints, for use in complex applications where standard control flow can degrade to spaghetti code. Realistically though, no list is going to be as convincing as [giving it a try](#next-steps) and seeing how it makes you feel! ## Hello World Example Here's a minimal example of Pydantic AI: hello_world.py ```python from pydantic_ai import Agent agent = Agent( # (1)! 'anthropic:claude-sonnet-4-0', instructions='Be concise, reply with one sentence.', # (2)! ) result = agent.run_sync('Where does "hello world" come from?') # (3)! print(result.output) """ The first known use of "hello, world" was in a 1974 textbook about the C programming language. """ ``` 1. We configure the agent to use [Anthropic's Claude Sonnet 4.0](api/models/anthropic/) model, but you can also set the model when running the agent. 1. Register static [instructions](agents/#instructions) using a keyword argument to the agent. 1. [Run the agent](agents/#running-agents) synchronously, starting a conversation with the LLM. *(This example is complete, it can be run "as is", assuming you've [installed the `pydantic_ai` package](install/))* The exchange will be very short: Pydantic AI will send the instructions and the user prompt to the LLM, and the model will return a text response. Not very interesting yet, but we can easily add [tools](tools/), [dynamic instructions](agents/#instructions), and [structured outputs](output/) to build more powerful agents. ## Tools & Dependency Injection Example Here is a concise example using Pydantic AI to build a support agent for a bank: bank_support.py ```python from dataclasses import dataclass from pydantic import BaseModel, Field from pydantic_ai import Agent, RunContext from bank_database import DatabaseConn @dataclass class SupportDependencies: # (3)! customer_id: int db: DatabaseConn # (12)! class SupportOutput(BaseModel): # (13)! support_advice: str = Field(description='Advice returned to the customer') block_card: bool = Field(description="Whether to block the customer's card") risk: int = Field(description='Risk level of query', ge=0, le=10) support_agent = Agent( # (1)! 'openai:gpt-5', # (2)! deps_type=SupportDependencies, output_type=SupportOutput, # (9)! instructions=( # (4)! 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query.' ), ) @support_agent.instructions # (5)! async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str: customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id) return f"The customer's name is {customer_name!r}" @support_agent.tool # (6)! async def customer_balance( ctx: RunContext[SupportDependencies], include_pending: bool ) -> float: """Returns the customer's current account balance.""" # (7)! return await ctx.deps.db.customer_balance( id=ctx.deps.customer_id, include_pending=include_pending, ) ... # (11)! async def main(): deps = SupportDependencies(customer_id=123, db=DatabaseConn()) result = await support_agent.run('What is my balance?', deps=deps) # (8)! print(result.output) # (10)! 
""" support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1 """ result = await support_agent.run('I just lost my card!', deps=deps) print(result.output) """ support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8 """ ``` 1. This [agent](agents/) will act as first-tier support in a bank. Agents are generic in the type of dependencies they accept and the type of output they return. In this case, the support agent has type `Agent[SupportDependencies, SupportOutput]`. 1. Here we configure the agent to use [OpenAI's GPT-5 model](api/models/openai/), you can also set the model when running the agent. 1. The `SupportDependencies` dataclass is used to pass data, connections, and logic into the model that will be needed when running [instructions](agents/#instructions) and [tool](tools/) functions. Pydantic AI's system of dependency injection provides a [type-safe](agents/#static-type-checking) way to customise the behavior of your agents, and can be especially useful when running [unit tests](testing/) and evals. 1. Static [instructions](agents/#instructions) can be registered with the instructions keyword argument to the agent. 1. Dynamic [instructions](agents/#instructions) can be registered with the @agent.instructions decorator, and can make use of dependency injection. Dependencies are carried via the RunContext argument, which is parameterized with the `deps_type` from above. If the type annotation here is wrong, static type checkers will catch it. 1. The [`@agent.tool`](tools/) decorator let you register functions which the LLM may call while responding to a user. Again, dependencies are carried via RunContext, any other arguments become the tool schema passed to the LLM. Pydantic is used to validate these arguments, and errors are passed back to the LLM so it can retry. 1. The docstring of a tool is also passed to the LLM as the description of the tool. Parameter descriptions are [extracted](tools/#function-tools-and-schema) from the docstring and added to the parameter schema sent to the LLM. 1. [Run the agent](agents/#running-agents) asynchronously, conducting a conversation with the LLM until a final response is reached. Even in this fairly simple case, the agent will exchange multiple messages with the LLM as tools are called to retrieve an output. 1. The response from the agent will be guaranteed to be a `SupportOutput`. If validation fails [reflection](agents/#reflection-and-self-correction), the agent is prompted to try again. 1. The output will be validated with Pydantic to guarantee it is a `SupportOutput`, since the agent is generic, it'll also be typed as a `SupportOutput` to aid with static type checking. 1. In a real use case, you'd add more tools and longer instructions to the agent to extend the context it's equipped with and support it can provide. 1. This is a simple sketch of a database connection, used to keep the example short and readable. In reality, you'd be connecting to an external database (e.g. PostgreSQL) to get information about customers. 1. This [Pydantic](https://docs.pydantic.dev) model is used to constrain the structured data returned by the agent. From this simple definition, Pydantic builds the JSON Schema that tells the LLM how to return the data, and performs validation to guarantee the data is correct at the end of the run. 
Complete `bank_support.py` example The code included here is incomplete for the sake of brevity (the definition of `DatabaseConn` is missing); you can find the complete `bank_support.py` example [here](examples/bank-support/). ## Instrumentation with Pydantic Logfire Even a simple agent with just a handful of tools can result in a lot of back-and-forth with the LLM, making it nearly impossible to be confident of what's going on just from reading the code. To understand the flow of the above runs, we can watch the agent in action using Pydantic Logfire. To do this, we need to [set up Logfire](logfire/#using-logfire), and add the following to our code: bank_support_with_logfire.py ```python ... from pydantic_ai import Agent, RunContext from bank_database import DatabaseConn import logfire logfire.configure() # (1)! logfire.instrument_pydantic_ai() # (2)! logfire.instrument_asyncpg() # (3)! ... support_agent = Agent( 'openai:gpt-5', deps_type=SupportDependencies, output_type=SupportOutput, instructions=( 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query.' ), ) ``` 1. Configure the Logfire SDK; this will fail if a project is not set up. 1. This will instrument all Pydantic AI agents used from here on out. If you want to instrument only a specific agent, you can pass the `instrument=True` keyword argument to the agent. 1. In our demo, `DatabaseConn` uses `asyncpg` to connect to a PostgreSQL database, so [`logfire.instrument_asyncpg()`](https://magicstack.github.io/asyncpg/current/) is used to log the database queries. That's enough to get a detailed view of your agent in action in the Logfire UI. See [Monitoring and Performance](logfire/) to learn more. ## `llms.txt` The Pydantic AI documentation is available in the [llms.txt](https://llmstxt.org/) format. This format is defined in Markdown and suited for LLMs and AI coding assistants and agents. Two formats are available: - [`llms.txt`](https://ai.pydantic.dev/llms.txt): a file containing a brief description of the project, along with links to the different sections of the documentation. The structure of this file is described in detail [here](https://llmstxt.org/#format). - [`llms-full.txt`](https://ai.pydantic.dev/llms-full.txt): Similar to the `llms.txt` file, but the content of every link is included. Note that this file may be too large for some LLMs. As of today, these files are not automatically leveraged by IDEs or coding agents, but they will use them if you provide a link or the full text. ## Next Steps To try Pydantic AI for yourself, [install it](install/) and follow the instructions [in the examples](examples/setup/). Read the [docs](agents/) to learn more about building applications with Pydantic AI. Read the [API Reference](api/agent/) to understand Pydantic AI's interface. Join [Slack](https://logfire.pydantic.dev/docs/join-slack/) or file an issue on [GitHub](https://github.com/pydantic/pydantic-ai/issues) if you have any questions. # Upgrade Guide In September 2025, Pydantic AI reached V1, which means we're committed to API stability: we will not introduce changes that break your code until V2 (if we do, you can shout at us as it's definitely a mistake). Once we release V2, in April 2026 at the earliest, we'll continue to provide security fixes for V1 for another 6 months minimum, so you have time to upgrade your applications. ## Breaking Changes Here's a filtered list of the breaking changes for each version to help you upgrade Pydantic AI. 
### v1.0.1 (2025-09-05) The following breaking change was accidentally left out of v1.0.0: - See [#2808](https://github.com/pydantic/pydantic-ai/pull/2808) - Remove `Python` evaluator from `pydantic_evals` for security reasons ### v1.0.0 (2025-09-04) - See [#2725](https://github.com/pydantic/pydantic-ai/pull/2725) - Drop support for Python 3.9 - See [#2738](https://github.com/pydantic/pydantic-ai/pull/2738) - Make many dataclasses require keyword arguments - See [#2715](https://github.com/pydantic/pydantic-ai/pull/2715) - Remove `cases` and `averages` attributes from `pydantic_evals` spans - See [#2798](https://github.com/pydantic/pydantic-ai/pull/2798) - Change `ModelRequest.parts` and `ModelResponse.parts` types from `list` to `Sequence` - See [#2726](https://github.com/pydantic/pydantic-ai/pull/2726) - Default `InstrumentationSettings` version to 2 - See [#2717](https://github.com/pydantic/pydantic-ai/pull/2717) - Remove errors when passing `AsyncRetrying` or `Retrying` object to `AsyncTenacityTransport` or `TenacityTransport` instead of `RetryConfig` ### v0.x.x Before V1, minor versions were used to introduce breaking changes: **v0.8.0 (2025-08-26)** See [#2689](https://github.com/pydantic/pydantic-ai/pull/2689) - `AgentStreamEvent` was expanded to be a union of `ModelResponseStreamEvent` and `HandleResponseEvent`, simplifying the `event_stream_handler` function signature. Existing code accepting `AgentStreamEvent | HandleResponseEvent` will continue to work. **v0.7.6 (2025-08-26)** The following breaking change was inadvertently released in a patch version rather than a minor version: See [#2670](https://github.com/pydantic/pydantic-ai/pull/2670) - `TenacityTransport` and `AsyncTenacityTransport` now require the use of `pydantic_ai.retries.RetryConfig` (which is just a `TypedDict` containing the kwargs to `tenacity.retry`) instead of `tenacity.Retrying` or `tenacity.AsyncRetrying`. **v0.7.0 (2025-08-12)** See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.StreamedResponse` now yields a `FinalResultEvent` along with the existing `PartStartEvent` and `PartDeltaEvent`. If you're using `pydantic_ai.direct.model_request_stream` or `pydantic_ai.direct.model_request_stream_sync`, you may need to update your code to account for this. See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.Model.request_stream` now receives a `run_context` argument. If you've implemented a custom `Model` subclass, you will need to account for this. See [#2458](https://github.com/pydantic/pydantic-ai/pull/2458) - `pydantic_ai.models.StreamedResponse` now requires a `model_request_parameters` field and constructor argument. If you've implemented a custom `Model` subclass and implemented `request_stream`, you will need to account for this. **v0.6.0 (2025-08-06)** This release was meant to clean some old deprecated code, so we can get a step closer to V1. See [#2440](https://github.com/pydantic/pydantic-ai/pull/2440) - The `next` method was removed from the `Graph` class. Use `async with graph.iter(...) as run: run.next()` instead. See [#2441](https://github.com/pydantic/pydantic-ai/pull/2441) - The `result_type`, `result_tool_name` and `result_tool_description` arguments were removed from the `Agent` class. Use `output_type` instead. See [#2441](https://github.com/pydantic/pydantic-ai/pull/2441) - The `result_retries` argument was also removed from the `Agent` class. Use `output_retries` instead. 
See [#2443](https://github.com/pydantic/pydantic-ai/pull/2443) - The `data` property was removed from the `FinalResult` class. Use `output` instead. See [#2445](https://github.com/pydantic/pydantic-ai/pull/2445) - The `get_data` and `validate_structured_result` methods were removed from the `StreamedRunResult` class. Use `get_output` and `validate_structured_output` instead. See [#2446](https://github.com/pydantic/pydantic-ai/pull/2446) - The `format_as_xml` function was moved to the `pydantic_ai.format_as_xml` module. Import it via `from pydantic_ai import format_as_xml` instead. See [#2451](https://github.com/pydantic/pydantic-ai/pull/2451) - Removed deprecated `Agent.result_validator` method, `Agent.last_run_messages` property, `AgentRunResult.data` property, and `result_tool_return_content` parameters from result classes. **v0.5.0 (2025-08-04)** See [#2388](https://github.com/pydantic/pydantic-ai/pull/2388) - The `source` field of an `EvaluationResult` is now of type `EvaluatorSpec` rather than the actual source `Evaluator` instance, to help with serialization/deserialization. See [#2163](https://github.com/pydantic/pydantic-ai/pull/2163) - The `EvaluationReport.print` and `EvaluationReport.console_table` methods now require most arguments to be passed by keyword. **v0.4.0 (2025-07-08)** See [#1799](https://github.com/pydantic/pydantic-ai/pull/1799) - Pydantic Evals `EvaluationReport` and `ReportCase` are now generic dataclasses instead of Pydantic models. If you were serializing them using `model_dump()`, you will now need to use the `EvaluationReportAdapter` and `ReportCaseAdapter` type adapters instead. See [#1507](https://github.com/pydantic/pydantic-ai/pull/1507) - The `ToolDefinition` `description` argument is now optional and the order of positional arguments has changed from `name, description, parameters_json_schema, ...` to `name, parameters_json_schema, description, ...` to account for this. **v0.3.0 (2025-06-18)** See [#1142](https://github.com/pydantic/pydantic-ai/pull/1142) — Adds support for thinking parts. We now convert the thinking blocks (`"..."`) in provider-specific text parts to Pydantic AI `ThinkingPart`s. Also, as part of this release, we made the choice to not send back the `ThinkingPart`s to the provider - the idea is to save costs on behalf of the user. In the future, we intend to add a setting to customize this behavior. **v0.2.0 (2025-05-12)** See [#1647](https://github.com/pydantic/pydantic-ai/pull/1647) — usage makes sense as part of `ModelResponse`, and could be really useful in "messages" (really a sequence of requests and responses). In this PR: - Adds `usage` to `ModelResponse` (field has a default factory of `Usage()` so it'll work to load data that doesn't have usage) - Changes the return type of `Model.request` to just `ModelResponse` instead of `tuple[ModelResponse, Usage]` **v0.1.0 (2025-04-15)** See [#1248](https://github.com/pydantic/pydantic-ai/pull/1248) — the attribute/parameter name `result` was renamed to `output` in many places. Hopefully all changes keep a deprecated attribute or parameter with the old name, so you should get many deprecation warnings. See [#1484](https://github.com/pydantic/pydantic-ai/pull/1484) — `format_as_xml` was moved and made available to import from the package root, e.g. `from pydantic_ai import format_as_xml`. ## Full Changelog For the full changelog, see [GitHub Releases](https://github.com/pydantic/pydantic-ai/releases). 
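As a concrete illustration of the recurring `result` to `output` rename flagged in v0.1.0 and v0.6.0 above, here's a hedged before/after sketch; the model name and output type are illustrative, not part of the changelog:

```python
from pydantic import BaseModel

from pydantic_ai import Agent


class CityInfo(BaseModel):
    city: str
    country: str


# Before (deprecated in v0.1.0, removed in v0.6.0):
# agent = Agent('openai:gpt-4o', result_type=CityInfo, result_retries=2)
# print(agent.run_sync('What is the windiest city in the world?').data)

# Current API:
agent = Agent('openai:gpt-4o', output_type=CityInfo, output_retries=2)
print(agent.run_sync('What is the windiest city in the world?').output)
```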
# Getting Help If you need help getting started with Pydantic AI or with advanced usage, the following sources may be useful. ## Slack Join the `#pydantic-ai` channel in the [Pydantic Slack](https://logfire.pydantic.dev/docs/join-slack/) to ask questions, get help, and chat about Pydantic AI. There are also channels for Pydantic, Logfire, and FastUI. If you're on a [Logfire](https://pydantic.dev/logfire) Pro plan, you can also get a dedicated private Slack collaboration channel with us. ## GitHub Issues The [Pydantic AI GitHub Issues](https://github.com/pydantic/pydantic-ai/issues) are a great place to ask questions and give us feedback. # Installation Pydantic AI is available on PyPI as [`pydantic-ai`](https://pypi.org/project/pydantic-ai/), so installation is as simple as: ```bash pip install pydantic-ai ``` ```bash uv add pydantic-ai ``` (Requires Python 3.10+) This installs the `pydantic_ai` package, core dependencies, and libraries required to use all the models included in Pydantic AI. If you want to install only those dependencies required to use a specific model, you can install the ["slim"](#slim-install) version of Pydantic AI. ## Use with Pydantic Logfire Pydantic AI has an excellent (but completely optional) integration with [Pydantic Logfire](https://pydantic.dev/logfire) to help you view and understand agent runs. Logfire comes included with `pydantic-ai` (but not the ["slim" version](#slim-install)), so you can typically start using it immediately by following the [Logfire setup docs](../logfire/#using-logfire). ## Running Examples We distribute the [`pydantic_ai_examples`](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples) directory as a separate PyPI package ([`pydantic-ai-examples`](https://pypi.org/project/pydantic-ai-examples/)) to make examples extremely easy to customize and run. To install examples, use the `examples` optional group: ```bash pip install "pydantic-ai[examples]" ``` ```bash uv add "pydantic-ai[examples]" ``` To run the examples, follow the instructions in the [examples docs](../examples/setup/). ## Slim Install If you know which model you're going to use and want to avoid installing superfluous packages, you can use the [`pydantic-ai-slim`](https://pypi.org/project/pydantic-ai-slim/) package. 
For example, if you're using just OpenAIChatModel, you would run: ```bash pip install "pydantic-ai-slim[openai]" ``` ```bash uv add "pydantic-ai-slim[openai]" ``` `pydantic-ai-slim` has the following optional groups: - `logfire` — installs [`logfire`](../logfire/) [PyPI ↗](https://pypi.org/project/logfire) - `evals` — installs [`pydantic-evals`](../evals/) [PyPI ↗](https://pypi.org/project/pydantic-evals) - `openai` — installs `openai` [PyPI ↗](https://pypi.org/project/openai) - `vertexai` — installs `google-auth` [PyPI ↗](https://pypi.org/project/google-auth) and `requests` [PyPI ↗](https://pypi.org/project/requests) - `google` — installs `google-genai` [PyPI ↗](https://pypi.org/project/google-genai) - `anthropic` — installs `anthropic` [PyPI ↗](https://pypi.org/project/anthropic) - `groq` — installs `groq` [PyPI ↗](https://pypi.org/project/groq) - `mistral` — installs `mistralai` [PyPI ↗](https://pypi.org/project/mistralai) - `cohere` - installs `cohere` [PyPI ↗](https://pypi.org/project/cohere) - `bedrock` - installs `boto3` [PyPI ↗](https://pypi.org/project/boto3) - `huggingface` - installs `huggingface-hub[inference]` [PyPI ↗](https://pypi.org/project/huggingface-hub) - `duckduckgo` - installs `ddgs` [PyPI ↗](https://pypi.org/project/ddgs) - `tavily` - installs `tavily-python` [PyPI ↗](https://pypi.org/project/tavily-python) - `cli` - installs `rich` [PyPI ↗](https://pypi.org/project/rich), `prompt-toolkit` [PyPI ↗](https://pypi.org/project/prompt-toolkit), and `argcomplete` [PyPI ↗](https://pypi.org/project/argcomplete) - `mcp` - installs `mcp` [PyPI ↗](https://pypi.org/project/mcp) - `a2a` - installs `fasta2a` [PyPI ↗](https://pypi.org/project/fasta2a) - `ag-ui` - installs `ag-ui-protocol` [PyPI ↗](https://pypi.org/project/ag-ui-protocol) and `starlette` [PyPI ↗](https://pypi.org/project/starlette) - `dbos` - installs [`dbos`](../durable_execution/dbos/) [PyPI ↗](https://pypi.org/project/dbos) - `prefect` - installs [`prefect`](../durable_execution/prefect/) [PyPI ↗](https://pypi.org/project/prefect) See the [models](../models/overview/) documentation for information on which optional dependencies are required for each model. You can also install dependencies for multiple models and use cases, for example: ```bash pip install "pydantic-ai-slim[openai,vertexai,logfire]" ``` ```bash uv add "pydantic-ai-slim[openai,vertexai,logfire]" ``` # Troubleshooting Below are suggestions on how to fix some common errors you might encounter while using Pydantic AI. If the issue you're experiencing is not listed below or addressed in the documentation, please feel free to ask in the [Pydantic Slack](../help/) or create an issue on [GitHub](https://github.com/pydantic/pydantic-ai/issues). ## Jupyter Notebook Errors ### `RuntimeError: This event loop is already running` This error is caused by conflicts between the event loops in Jupyter notebook and Pydantic AI's. One way to manage these conflicts is by using [`nest-asyncio`](https://pypi.org/project/nest-asyncio/). Namely, before you execute any agent runs, do the following: ```python import nest_asyncio nest_asyncio.apply() ``` Note: This fix also applies to Google Colab and [Marimo](https://github.com/marimo-team/marimo). ## API Key Configuration ### `UserError: API key must be provided or set in the [MODEL]_API_KEY environment variable` If you're running into issues with setting the API key for your model, visit the [Models](../models/overview/) page to learn more about how to set an environment variable and/or pass in an `api_key` argument. 
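For example, OpenAI models read the `OPENAI_API_KEY` environment variable; a minimal sketch (the key value and model name are placeholders) looks like this:

```python
import os

from pydantic_ai import Agent

# Normally you'd export the key in your shell or load it from a secrets manager;
# setting it in code here is only for illustration.
os.environ['OPENAI_API_KEY'] = 'sk-...'

agent = Agent('openai:gpt-4o')
result = agent.run_sync('Say hello.')
print(result.output)
```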
## Monitoring HTTPX Requests You can use custom `httpx` clients in your models in order to access specific requests, responses, and headers at runtime. It's particularly helpful to use `logfire`'s [HTTPX integration](../logfire/#monitoring-http-requests) to monitor these requests. # Concepts documentation # Agent2Agent (A2A) Protocol The [Agent2Agent (A2A) Protocol](https://google.github.io/A2A/) is an open standard introduced by Google that enables communication and interoperability between AI agents, regardless of the framework or vendor they are built on. At Pydantic, we built the [FastA2A](#fasta2a) library to make it easier to implement the A2A protocol in Python. We also built a convenience method that exposes Pydantic AI agents as A2A servers - let's have a quick look at how to use it: agent_to_a2a.py ```py from pydantic_ai import Agent agent = Agent('openai:gpt-4.1', instructions='Be fun!') app = agent.to_a2a() ``` *You can run the example with `uvicorn agent_to_a2a:app --host 0.0.0.0 --port 8000`* This will expose the agent as an A2A server, and you can start sending requests to it. See more about [exposing Pydantic AI agents as A2A servers](#pydantic-ai-agent-to-a2a-server). ## FastA2A **FastA2A** is a framework-agnostic implementation of the A2A protocol in Python. The library is designed to be used with any agentic framework, and is **not exclusive to Pydantic AI**. ### Design **FastA2A** is built on top of [Starlette](https://www.starlette.io), which means it's fully compatible with any ASGI server. Given the nature of the A2A protocol, it's important to understand the design before using it. As a developer, you'll need to provide some components: - Storage: to save and load tasks, as well as store context for conversations - Broker: to schedule tasks - Worker: to execute tasks Let's have a look at how those components fit together: ``` flowchart TB Server["HTTP Server"] <--> |Sends Requests/
Receives Results| TM subgraph CC[Core Components] direction RL TM["TaskManager
(coordinates)"] --> |Schedules Tasks| Broker TM <--> Storage Broker["Broker
(queues & schedules)"] <--> Storage["Storage
(persistence)"] Broker --> |Delegates Execution| Worker end Worker["Worker
(implementation)"] ``` FastA2A allows you to bring your own Storage, Broker and Worker. #### Understanding Tasks and Context In the A2A protocol: - **Task**: Represents one complete execution of an agent. When a client sends a message to the agent, a new task is created. The agent runs until completion (or failure), and this entire execution is considered one task. The final output is stored as a task artifact. - **Context**: Represents a conversation thread that can span multiple tasks. The A2A protocol uses a `context_id` to maintain conversation continuity: - When a new message is sent without a `context_id`, the server generates a new one - Subsequent messages can include the same `context_id` to continue the conversation - All tasks sharing the same `context_id` have access to the complete message history #### Storage Architecture The Storage component serves two purposes: 1. **Task Storage**: Stores tasks in A2A protocol format, including their status, artifacts, and message history 1. **Context Storage**: Stores conversation context in a format optimized for the specific agent implementation This design allows for agents to store rich internal state (e.g., tool calls, reasoning traces) as well as store task-specific A2A-formatted messages and artifacts. For example, a Pydantic AI agent might store its complete internal message format (including tool calls and responses) in the context storage, while storing only the A2A-compliant messages in the task history. ### Installation FastA2A is available on PyPI as [`fasta2a`](https://pypi.org/project/fasta2a/) so installation is as simple as: ```bash pip install fasta2a ``` ```bash uv add fasta2a ``` The only dependencies are: - [starlette](https://www.starlette.io): to expose the A2A server as an [ASGI application](https://asgi.readthedocs.io/en/latest/) - [pydantic](https://pydantic.dev): to validate the request/response messages - [opentelemetry-api](https://opentelemetry-python.readthedocs.io/en/latest): to provide tracing capabilities You can install Pydantic AI with the `a2a` extra to include **FastA2A**: ```bash pip install 'pydantic-ai-slim[a2a]' ``` ```bash uv add 'pydantic-ai-slim[a2a]' ``` ### Pydantic AI Agent to A2A Server To expose a Pydantic AI agent as an A2A server, you can use the `to_a2a` method: agent_to_a2a.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4.1', instructions='Be fun!') app = agent.to_a2a() ``` Since `app` is an ASGI application, it can be used with any ASGI server. ```bash uvicorn agent_to_a2a:app --host 0.0.0.0 --port 8000 ``` Since the goal of `to_a2a` is to be a convenience method, it accepts the same arguments as the FastA2A constructor. When using `to_a2a()`, Pydantic AI automatically: - Stores the complete conversation history (including tool calls and responses) in the context storage - Ensures that subsequent messages with the same `context_id` have access to the full conversation history - Persists agent results as A2A artifacts: - String results become `TextPart` artifacts and also appear in the message history - Structured data (Pydantic models, dataclasses, tuples, etc.) 
become `DataPart` artifacts with the data wrapped as `{"result": }` - Artifacts include metadata with type information and JSON schema when available # Agent User Interaction (AG-UI) Protocol The [Agent User Interaction (AG-UI) Protocol](https://docs.ag-ui.com/introduction) is an open standard introduced by the [CopilotKit](https://webflow.copilotkit.ai/blog/introducing-ag-ui-the-protocol-where-agents-meet-users) team that standardises how frontend applications communicate with AI agents, with support for streaming, frontend tools, shared state, and custom events. Note The AG-UI integration was originally built by the team at [Rocket Science](https://www.rocketscience.gg/) and contributed in collaboration with the Pydantic AI and CopilotKit teams. Thanks Rocket Science! ## Installation The only dependencies are: - [ag-ui-protocol](https://docs.ag-ui.com/introduction): to provide the AG-UI types and encoder. - [starlette](https://www.starlette.io): to handle [ASGI](https://asgi.readthedocs.io/en/latest/) requests from a framework like FastAPI. You can install Pydantic AI with the `ag-ui` extra to ensure you have all the required AG-UI dependencies: ```bash pip install 'pydantic-ai-slim[ag-ui]' ``` ```bash uv add 'pydantic-ai-slim[ag-ui]' ``` To run the examples you'll also need: - [uvicorn](https://www.uvicorn.org/) or another ASGI compatible server ```bash pip install uvicorn ``` ```bash uv add uvicorn ``` ## Usage There are three ways to run a Pydantic AI agent based on AG-UI run input with streamed AG-UI events as output, from most to least flexible. If you're using a Starlette-based web framework like FastAPI, you'll typically want to use the second method. 1. run_ag_ui() takes an agent and an AG-UI [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object, and returns a stream of AG-UI events encoded as strings. It also takes optional Agent.iter() arguments including `deps`. Use this if you're using a web framework not based on Starlette (e.g. Django or Flask) or want to modify the input or output some way. 1. handle_ag_ui_request() takes an agent and a Starlette request (e.g. from FastAPI) coming from an AG-UI frontend, and returns a streaming Starlette response of AG-UI events that you can return directly from your endpoint. It also takes optional Agent.iter() arguments including `deps`, that you can vary for each request (e.g. based on the authenticated user). 1. Agent.to_ag_ui() returns an ASGI application that handles every AG-UI request by running the agent. It also takes optional Agent.iter() arguments including `deps`, but these will be the same for each request, with the exception of the AG-UI state that's injected as described under [state management](#state-management). This ASGI app can be [mounted](https://fastapi.tiangolo.com/advanced/sub-applications/) at a given path in an existing FastAPI app. ### Handle run input and output directly This example uses run_ag_ui() and performs its own request parsing and response generation. This can be modified to work with any web framework. 
run_ag_ui.py ```py import json from http import HTTPStatus from ag_ui.core import RunAgentInput from fastapi import FastAPI from fastapi.requests import Request from fastapi.responses import Response, StreamingResponse from pydantic import ValidationError from pydantic_ai import Agent from pydantic_ai.ag_ui import SSE_CONTENT_TYPE, run_ag_ui agent = Agent('openai:gpt-4.1', instructions='Be fun!') app = FastAPI() @app.post('/') async def run_agent(request: Request) -> Response: accept = request.headers.get('accept', SSE_CONTENT_TYPE) try: run_input = RunAgentInput.model_validate(await request.json()) except ValidationError as e: # pragma: no cover return Response( content=json.dumps(e.json()), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) event_stream = run_ag_ui(agent, run_input, accept=accept) return StreamingResponse(event_stream, media_type=accept) ``` Since `app` is an ASGI application, it can be used with any ASGI server: ```shell uvicorn run_ag_ui:app ``` This will expose the agent as an AG-UI server, and your frontend can start sending requests to it. ### Handle a Starlette request This example uses handle_ag_ui_request() to directly handle a FastAPI request and return a response. Something analogous to this will work with any Starlette-based web framework. handle_ag_ui_request.py ```py from fastapi import FastAPI from starlette.requests import Request from starlette.responses import Response from pydantic_ai import Agent from pydantic_ai.ag_ui import handle_ag_ui_request agent = Agent('openai:gpt-4.1', instructions='Be fun!') app = FastAPI() @app.post('/') async def run_agent(request: Request) -> Response: return await handle_ag_ui_request(agent, request) ``` Since `app` is an ASGI application, it can be used with any ASGI server: ```shell uvicorn handle_ag_ui_request:app ``` This will expose the agent as an AG-UI server, and your frontend can start sending requests to it. ### Stand-alone ASGI app This example uses Agent.to_ag_ui() to turn the agent into a stand-alone ASGI application: agent_to_ag_ui.py ```py from pydantic_ai import Agent agent = Agent('openai:gpt-4.1', instructions='Be fun!') app = agent.to_ag_ui() ``` Since `app` is an ASGI application, it can be used with any ASGI server: ```shell uvicorn agent_to_ag_ui:app ``` This will expose the agent as an AG-UI server, and your frontend can start sending requests to it. ## Design The Pydantic AI AG-UI integration supports all features of the spec: - [Events](https://docs.ag-ui.com/concepts/events) - [Messages](https://docs.ag-ui.com/concepts/messages) - [State Management](https://docs.ag-ui.com/concepts/state) - [Tools](https://docs.ag-ui.com/concepts/tools) The integration receives messages in the form of a [`RunAgentInput`](https://docs.ag-ui.com/sdk/python/core/types#runagentinput) object that describes the details of the requested agent run including message history, state, and available tools. These are converted to Pydantic AI types and passed to the agent's run method. Events from the agent, including tool calls, are converted to AG-UI events and streamed back to the caller as Server-Sent Events (SSE). A user request may require multiple round trips between client UI and Pydantic AI server, depending on the tools and events needed. ## Features ### State management The integration provides full support for [AG-UI state management](https://docs.ag-ui.com/concepts/state), which enables real-time synchronization between agents and frontend applications. 
In the example below we have document state which is shared between the UI and server using the StateDeps [dependencies type](../dependencies/) that can be used to automatically validate state contained in [`RunAgentInput.state`](https://docs.ag-ui.com/sdk/js/core/types#runagentinput) using a Pydantic `BaseModel` specified as a generic parameter. Custom dependencies type with AG-UI state If you want to use your own dependencies type to hold AG-UI state as well as other things, it needs to implements the StateHandler protocol, meaning it needs to be a [dataclass](https://docs.python.org/3/library/dataclasses.html) with a non-optional `state` field. This lets Pydantic AI ensure that state is properly isolated between requests by building a new dependencies object each time. If the `state` field's type is a Pydantic `BaseModel` subclass, the raw state dictionary on the request is automatically validated. If not, you can validate the raw value yourself in your dependencies dataclass's `__post_init__` method. ag_ui_state.py ```python from pydantic import BaseModel from pydantic_ai import Agent from pydantic_ai.ag_ui import StateDeps class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent( 'openai:gpt-4.1', instructions='Be fun!', deps_type=StateDeps[DocumentState], ) app = agent.to_ag_ui(deps=StateDeps(DocumentState())) ``` Since `app` is an ASGI application, it can be used with any ASGI server: ```bash uvicorn ag_ui_state:app --host 0.0.0.0 --port 9000 ``` ### Tools AG-UI frontend tools are seamlessly provided to the Pydantic AI agent, enabling rich user experiences with frontend user interfaces. ### Events Pydantic AI tools can send [AG-UI events](https://docs.ag-ui.com/concepts/events) simply by returning a [`ToolReturn`](../tools-advanced/#advanced-tool-returns) object with a [`BaseEvent`](https://docs.ag-ui.com/sdk/python/core/events#baseevent) (or a list of events) as `metadata`, which allows for custom events and state updates. ag_ui_tool_events.py ```python from ag_ui.core import CustomEvent, EventType, StateSnapshotEvent from pydantic import BaseModel from pydantic_ai import Agent, RunContext, ToolReturn from pydantic_ai.ag_ui import StateDeps class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent( 'openai:gpt-4.1', instructions='Be fun!', deps_type=StateDeps[DocumentState], ) app = agent.to_ag_ui(deps=StateDeps(DocumentState())) @agent.tool async def update_state(ctx: RunContext[StateDeps[DocumentState]]) -> ToolReturn: return ToolReturn( return_value='State updated', metadata=[ StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot=ctx.deps.state, ), ], ) @agent.tool_plain async def custom_events() -> ToolReturn: return ToolReturn( return_value='Count events sent', metadata=[ CustomEvent( type=EventType.CUSTOM, name='count', value=1, ), CustomEvent( type=EventType.CUSTOM, name='count', value=2, ), ] ) ``` Since `app` is an ASGI application, it can be used with any ASGI server: ```bash uvicorn ag_ui_tool_events:app --host 0.0.0.0 --port 9000 ``` ## Examples For more examples of how to use to_ag_ui() see [`pydantic_ai_examples.ag_ui`](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples/ag_ui), which includes a server for use with the [AG-UI Dojo](https://docs.ag-ui.com/tutorials/debugging#the-ag-ui-dojo). ## Introduction Agents are Pydantic AI's primary interface for interacting with LLMs. 
In some use cases a single Agent will control an entire application or component, but multiple agents can also interact to embody more complex workflows. The Agent class has full API documentation, but conceptually you can think of an agent as a container for: | **Component** | **Description** | | --- | --- | | [Instructions](#instructions) | A set of instructions for the LLM written by the developer. | | [Function tool(s)](../tools/) and [toolsets](../toolsets/) | Functions that the LLM may call to get information while generating a response. | | [Structured output type](../output/) | The structured datatype the LLM must return at the end of a run, if specified. | | [Dependency type constraint](../dependencies/) | Dynamic instructions functions, tools, and output functions may all use dependencies when they're run. | | [LLM model](../api/models/base/) | Optional default LLM model associated with the agent. Can also be specified when running the agent. | | [Model Settings](#additional-configuration) | Optional default model settings to help fine tune requests. Can also be specified when running the agent. | In typing terms, agents are generic in their dependency and output types, e.g., an agent which required dependencies of type `Foobar` and produced outputs of type `list[str]` would have type `Agent[Foobar, list[str]]`. In practice, you shouldn't need to care about this, it should just mean your IDE can tell you when you have the right type, and if you choose to use [static type checking](#static-type-checking) it should work well with Pydantic AI. Here's a toy example of an agent that simulates a roulette wheel: roulette_wheel.py ```python from pydantic_ai import Agent, RunContext roulette_agent = Agent( # (1)! 'openai:gpt-4o', deps_type=int, output_type=bool, system_prompt=( 'Use the `roulette_wheel` function to see if the ' 'customer has won based on the number they provide.' ), ) @roulette_agent.tool async def roulette_wheel(ctx: RunContext[int], square: int) -> str: # (2)! """check if the square is a winner""" return 'winner' if square == ctx.deps else 'loser' # Run the agent success_number = 18 # (3)! result = roulette_agent.run_sync('Put my money on square eighteen', deps=success_number) print(result.output) # (4)! #> True result = roulette_agent.run_sync('I bet five is the winner', deps=success_number) print(result.output) #> False ``` 1. Create an agent, which expects an integer dependency and produces a boolean output. This agent will have type `Agent[int, bool]`. 1. Define a tool that checks if the square is a winner. Here RunContext is parameterized with the dependency type `int`; if you got the dependency type wrong you'd get a typing error. 1. In reality, you might want to use a random number here e.g. `random.randint(0, 36)`. 1. `result.output` will be a boolean indicating if the square is a winner. Pydantic performs the output validation, and it'll be typed as a `bool` since its type is derived from the `output_type` generic parameter of the agent. Agents are designed for reuse, like FastAPI Apps Agents are intended to be instantiated once (frequently as module globals) and reused throughout your application, similar to a small FastAPI app or an APIRouter. ## Running Agents There are five ways to run an agent: 1. agent.run() — an async function which returns a RunResult containing a completed response. 1. 
agent.run_sync() — a plain, synchronous function which returns a RunResult containing a completed response (internally, this just calls `loop.run_until_complete(self.run())`). 1. agent.run_stream() — an async context manager which returns a StreamedRunResult, which contains methods to stream text and structured output as an async iterable. 1. agent.run_stream_events() — a function which returns an async iterable of AgentStreamEvents and a AgentRunResultEvent containing the final run result. 1. agent.iter() — a context manager which returns an AgentRun, an async iterable over the nodes of the agent's underlying Graph. Here's a simple example demonstrating the first four: run_agent.py ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. async def main(): result = await agent.run('What is the capital of France?') print(result.output) #> The capital of France is Paris. async with agent.run_stream('What is the capital of the UK?') as response: async for text in response.stream_text(): print(text) #> The capital of #> The capital of the UK is #> The capital of the UK is London. events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of Mexico?'): events.append(event) print(events) """ [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='Mexico is Mexico ')), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='City.')), AgentRunResultEvent( result=AgentRunResult(output='The capital of Mexico is Mexico City.') ), ] """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* You can also pass messages from previous runs to continue a conversation or provide context, as described in [Messages and Chat History](../message-history/). ### Streaming Events and Final Output As shown in the example above, run_stream() makes it easy to stream the agent's final output as it comes in. It also takes an optional `event_stream_handler` argument that you can use to gain insight into what is happening during the run before the final output is produced. The example below shows how to stream events and text output. You can also [stream structured output](../output/#streaming-structured-output). Note As the `run_stream()` method will consider the first output matching the [output type](../output/#structured-output) to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream all events from the model's streaming response and the agent's execution of tools, use agent.run_stream_events() or agent.iter() instead, as described in the following sections. 
run_stream_event_stream_handler.py ```python import asyncio from collections.abc import AsyncIterable from datetime import date from pydantic_ai import ( Agent, AgentStreamEvent, FinalResultEvent, FunctionToolCallEvent, FunctionToolResultEvent, PartDeltaEvent, PartStartEvent, RunContext, TextPartDelta, ThinkingPartDelta, ToolCallPartDelta, ) weather_agent = Agent( 'openai:gpt-4o', system_prompt='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool async def weather_forecast( ctx: RunContext, location: str, forecast_date: date, ) -> str: return f'The forecast in {location} on {forecast_date} is 24°C and sunny.' output_messages: list[str] = [] async def handle_event(event: AgentStreamEvent): if isinstance(event, PartStartEvent): output_messages.append(f'[Request] Starting part {event.index}: {event.part!r}') elif isinstance(event, PartDeltaEvent): if isinstance(event.delta, TextPartDelta): output_messages.append(f'[Request] Part {event.index} text delta: {event.delta.content_delta!r}') elif isinstance(event.delta, ThinkingPartDelta): output_messages.append(f'[Request] Part {event.index} thinking delta: {event.delta.content_delta!r}') elif isinstance(event.delta, ToolCallPartDelta): output_messages.append(f'[Request] Part {event.index} args delta: {event.delta.args_delta}') elif isinstance(event, FunctionToolCallEvent): output_messages.append( f'[Tools] The LLM calls tool={event.part.tool_name!r} with args={event.part.args} (tool_call_id={event.part.tool_call_id!r})' ) elif isinstance(event, FunctionToolResultEvent): output_messages.append(f'[Tools] Tool call {event.tool_call_id!r} returned => {event.result.content}') elif isinstance(event, FinalResultEvent): output_messages.append(f'[Result] The model starting producing a final result (tool_name={event.tool_name})') async def event_stream_handler( ctx: RunContext, event_stream: AsyncIterable[AgentStreamEvent], ): async for event in event_stream: await handle_event(event) async def main(): user_prompt = 'What will the weather be like in Paris on Tuesday?' async with weather_agent.run_stream(user_prompt, event_stream_handler=event_stream_handler) as run: async for output in run.stream_text(): output_messages.append(f'[Output] {output}') if __name__ == '__main__': asyncio.run(main()) print(output_messages) """ [ "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')", '[Request] Part 0 args delta: {"location":"Pa', '[Request] Part 0 args delta: ris","forecast_', '[Request] Part 0 args delta: date":"2030-01-', '[Request] Part 0 args delta: 01"}', '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')', "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.", "[Request] Starting part 0: TextPart(content='It will be ')", '[Result] The model starting producing a final result (tool_name=None)', '[Output] It will be ', '[Output] It will be warm and sunny ', '[Output] It will be warm and sunny in Paris on ', '[Output] It will be warm and sunny in Paris on Tuesday.', ] """ ``` ### Streaming All Events Like `agent.run_stream()`, agent.run() takes an optional `event_stream_handler` argument that lets you stream all events from the model's streaming response and the agent's execution of tools. Unlike `run_stream()`, it always runs the agent graph to completion even if text was received ahead of tool calls that looked like it could've been the final result. 
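For example, here's a minimal sketch (reusing `weather_agent` and `event_stream_handler` from the example above) of passing the handler directly to `agent.run()`:

```python
import asyncio

from run_stream_event_stream_handler import event_stream_handler, weather_agent


async def main():
    result = await weather_agent.run(
        'What will the weather be like in Paris on Tuesday?',
        event_stream_handler=event_stream_handler,  # events are passed to the handler as they arrive
    )
    print(result.output)  # the complete final output, available once the run has finished


if __name__ == '__main__':
    asyncio.run(main())
```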
For convenience, a agent.run_stream_events() method is also available as a wrapper around `run(event_stream_handler=...)`, which returns an async iterable of AgentStreamEvents and a AgentRunResultEvent containing the final run result. Note As they return raw events as they come in, the `run_stream_events()` and `run(event_stream_handler=...)` methods require you to piece together the streamed text and structured output yourself from the `PartStartEvent` and subsequent `PartDeltaEvent`s. To get the best of both worlds, at the expense of some additional complexity, you can use agent.iter() as described in the next section, which lets you [iterate over the agent graph](#iterating-over-an-agents-graph) and [stream both events and output](#streaming-all-events-and-output) at every step. run_events.py ```python import asyncio from pydantic_ai import AgentRunResultEvent from run_stream_event_stream_handler import handle_event, output_messages, weather_agent async def main(): user_prompt = 'What will the weather be like in Paris on Tuesday?' async for event in weather_agent.run_stream_events(user_prompt): if isinstance(event, AgentRunResultEvent): output_messages.append(f'[Final Output] {event.result.output}') else: await handle_event(event) if __name__ == '__main__': asyncio.run(main()) print(output_messages) """ [ "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')", '[Request] Part 0 args delta: {"location":"Pa', '[Request] Part 0 args delta: ris","forecast_', '[Request] Part 0 args delta: date":"2030-01-', '[Request] Part 0 args delta: 01"}', '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')', "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.", "[Request] Starting part 0: TextPart(content='It will be ')", '[Result] The model starting producing a final result (tool_name=None)', "[Request] Part 0 text delta: 'warm and sunny '", "[Request] Part 0 text delta: 'in Paris on '", "[Request] Part 0 text delta: 'Tuesday.'", '[Final Output] It will be warm and sunny in Paris on Tuesday.', ] """ ``` *(This example is complete, it can be run "as is")* ### Iterating Over an Agent's Graph Under the hood, each `Agent` in Pydantic AI uses **pydantic-graph** to manage its execution flow. **pydantic-graph** is a generic, type-centric library for building and running finite state machines in Python. It doesn't actually depend on Pydantic AI — you can use it standalone for workflows that have nothing to do with GenAI — but Pydantic AI makes use of it to orchestrate the handling of model requests and model responses in an agent's run. In many scenarios, you don't need to worry about pydantic-graph at all; calling `agent.run(...)` simply traverses the underlying graph from start to finish. However, if you need deeper insight or control — for example to inject your own logic at specific stages — Pydantic AI exposes the lower-level iteration process via Agent.iter. This method returns an AgentRun, which you can async-iterate over, or manually drive node-by-node via the next method. Once the agent's graph returns an End, you have the final result along with a detailed history of all steps. 
#### `async for` iteration Here's an example of using `async for` with `iter` to record each node the agent executes: agent_iter_async_for.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] # Begin an AgentRun, which is an async-iterable over the nodes of the agent's graph async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: # Each node represents a step in the agent's execution nodes.append(node) print(nodes) """ [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] """ print(agent_run.result.output) #> The capital of France is Paris. ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* - The `AgentRun` is an async iterator that yields each node (`BaseNode` or `End`) in the flow. - The run ends when an `End` node is returned. #### Using `.next(...)` manually You can also drive the iteration manually by passing the node you want to run next to the `AgentRun.next(...)` method. This allows you to inspect or modify the node before it executes or skip nodes based on your own logic, and to catch errors in `next()` more easily: agent_iter_next.py ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: node = agent_run.next_node # (1)! all_nodes = [node] # Drive the iteration manually: while not isinstance(node, End): # (2)! node = await agent_run.next(node) # (3)! all_nodes.append(node) # (4)! print(all_nodes) """ [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] """ ``` 1. We start by grabbing the first node that will be run in the agent's graph. 1. The agent run is finished once an `End` node has been produced; instances of `End` cannot be passed to `next`. 1. When you call `await agent_run.next(node)`, it executes that node in the agent's graph, updates the run's history, and returns the *next* node to run. 1. You could also inspect or mutate the new `node` here as needed. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* #### Accessing usage and final output You can retrieve usage statistics (tokens, requests, etc.) at any time from the AgentRun object via `agent_run.usage()`. This method returns a RunUsage object containing the usage data. 
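For example, here's a minimal sketch (the model and prompt are illustrative) of checking usage while iterating over a run:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')


async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            print(agent_run.usage())  # running totals of requests and tokens so far
```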
Once the run finishes, `agent_run.result` becomes a AgentRunResult object containing the final output (and related metadata). #### Streaming All Events and Output Here is an example of streaming an agent run in combination with `async for` iteration: streaming_iter.py ```python import asyncio from dataclasses import dataclass from datetime import date from pydantic_ai import ( Agent, FinalResultEvent, FunctionToolCallEvent, FunctionToolResultEvent, PartDeltaEvent, PartStartEvent, RunContext, TextPartDelta, ThinkingPartDelta, ToolCallPartDelta, ) @dataclass class WeatherService: async def get_forecast(self, location: str, forecast_date: date) -> str: # In real code: call weather API, DB queries, etc. return f'The forecast in {location} on {forecast_date} is 24°C and sunny.' async def get_historic_weather(self, location: str, forecast_date: date) -> str: # In real code: call a historical weather API or DB return f'The weather in {location} on {forecast_date} was 18°C and partly cloudy.' weather_agent = Agent[WeatherService, str]( 'openai:gpt-4o', deps_type=WeatherService, output_type=str, # We'll produce a final answer as plain text system_prompt='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool async def weather_forecast( ctx: RunContext[WeatherService], location: str, forecast_date: date, ) -> str: if forecast_date >= date.today(): return await ctx.deps.get_forecast(location, forecast_date) else: return await ctx.deps.get_historic_weather(location, forecast_date) output_messages: list[str] = [] async def main(): user_prompt = 'What will the weather be like in Paris on Tuesday?' # Begin a node-by-node, streaming iteration async with weather_agent.iter(user_prompt, deps=WeatherService()) as run: async for node in run: if Agent.is_user_prompt_node(node): # A user prompt node => The user has provided input output_messages.append(f'=== UserPromptNode: {node.user_prompt} ===') elif Agent.is_model_request_node(node): # A model request node => We can stream tokens from the model's request output_messages.append('=== ModelRequestNode: streaming partial request tokens ===') async with node.stream(run.ctx) as request_stream: final_result_found = False async for event in request_stream: if isinstance(event, PartStartEvent): output_messages.append(f'[Request] Starting part {event.index}: {event.part!r}') elif isinstance(event, PartDeltaEvent): if isinstance(event.delta, TextPartDelta): output_messages.append( f'[Request] Part {event.index} text delta: {event.delta.content_delta!r}' ) elif isinstance(event.delta, ThinkingPartDelta): output_messages.append( f'[Request] Part {event.index} thinking delta: {event.delta.content_delta!r}' ) elif isinstance(event.delta, ToolCallPartDelta): output_messages.append( f'[Request] Part {event.index} args delta: {event.delta.args_delta}' ) elif isinstance(event, FinalResultEvent): output_messages.append( f'[Result] The model started producing a final result (tool_name={event.tool_name})' ) final_result_found = True break if final_result_found: # Once the final result is found, we can call `AgentStream.stream_text()` to stream the text. # A similar `AgentStream.stream_output()` method is available to stream structured output. 
async for output in request_stream.stream_text(): output_messages.append(f'[Output] {output}') elif Agent.is_call_tools_node(node): # A handle-response node => The model returned some data, potentially calls a tool output_messages.append('=== CallToolsNode: streaming partial response & tool usage ===') async with node.stream(run.ctx) as handle_stream: async for event in handle_stream: if isinstance(event, FunctionToolCallEvent): output_messages.append( f'[Tools] The LLM calls tool={event.part.tool_name!r} with args={event.part.args} (tool_call_id={event.part.tool_call_id!r})' ) elif isinstance(event, FunctionToolResultEvent): output_messages.append( f'[Tools] Tool call {event.tool_call_id!r} returned => {event.result.content}' ) elif Agent.is_end_node(node): # Once an End node is reached, the agent run is complete assert run.result is not None assert run.result.output == node.data.output output_messages.append(f'=== Final Agent Output: {run.result.output} ===') if __name__ == '__main__': asyncio.run(main()) print(output_messages) """ [ '=== UserPromptNode: What will the weather be like in Paris on Tuesday? ===', '=== ModelRequestNode: streaming partial request tokens ===', "[Request] Starting part 0: ToolCallPart(tool_name='weather_forecast', tool_call_id='0001')", '[Request] Part 0 args delta: {"location":"Pa', '[Request] Part 0 args delta: ris","forecast_', '[Request] Part 0 args delta: date":"2030-01-', '[Request] Part 0 args delta: 01"}', '=== CallToolsNode: streaming partial response & tool usage ===', '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')', "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.", '=== ModelRequestNode: streaming partial request tokens ===', "[Request] Starting part 0: TextPart(content='It will be ')", '[Result] The model started producing a final result (tool_name=None)', '[Output] It will be ', '[Output] It will be warm and sunny ', '[Output] It will be warm and sunny in Paris on ', '[Output] It will be warm and sunny in Paris on Tuesday.', '=== CallToolsNode: streaming partial response & tool usage ===', '=== Final Agent Output: It will be warm and sunny in Paris on Tuesday. ===', ] """ ``` *(This example is complete, it can be run "as is")* ### Additional Configuration #### Usage Limits Pydantic AI offers a UsageLimits structure to help you limit your usage (tokens, requests, and tool calls) on model runs. You can apply these settings by passing the `usage_limits` argument to the `run{_sync,_stream}` functions. Consider the following example, where we limit the number of response tokens: ```py from pydantic_ai import Agent, UsageLimitExceeded, UsageLimits agent = Agent('anthropic:claude-3-5-sonnet-latest') result_sync = agent.run_sync( 'What is the capital of Italy? Answer with just the city.', usage_limits=UsageLimits(response_tokens_limit=10), ) print(result_sync.output) #> Rome print(result_sync.usage()) #> RunUsage(input_tokens=62, output_tokens=1, requests=1) try: result_sync = agent.run_sync( 'What is the capital of Italy? 
Answer with a paragraph.', usage_limits=UsageLimits(response_tokens_limit=10), ) except UsageLimitExceeded as e: print(e) #> Exceeded the output_tokens_limit of 10 (output_tokens=32) ``` Restricting the number of requests can be useful in preventing infinite loops or excessive tool calling: ```py from typing_extensions import TypedDict from pydantic_ai import Agent, ModelRetry, UsageLimitExceeded, UsageLimits class NeverOutputType(TypedDict): """ Never ever coerce data to this type. """ never_use_this: str agent = Agent( 'anthropic:claude-3-5-sonnet-latest', retries=3, output_type=NeverOutputType, system_prompt='Any time you get a response, call the `infinite_retry_tool` to produce another response.', ) @agent.tool_plain(retries=5) # (1)! def infinite_retry_tool() -> int: raise ModelRetry('Please try again.') try: result_sync = agent.run_sync( 'Begin infinite retry loop!', usage_limits=UsageLimits(request_limit=3) # (2)! ) except UsageLimitExceeded as e: print(e) #> The next request would exceed the request_limit of 3 ``` 1. This tool has the ability to retry 5 times before erroring, simulating a tool that might get stuck in a loop. 1. This run will error after 3 requests, preventing the infinite tool calling. ##### Capping tool calls If you need a limit on the number of successful tool invocations within a single run, use `tool_calls_limit`: ```py from pydantic_ai import Agent from pydantic_ai.exceptions import UsageLimitExceeded from pydantic_ai.usage import UsageLimits agent = Agent('anthropic:claude-3-5-sonnet-latest') @agent.tool_plain def do_work() -> str: return 'ok' try: # Allow at most one executed tool call in this run agent.run_sync('Please call the tool twice', usage_limits=UsageLimits(tool_calls_limit=1)) except UsageLimitExceeded as e: print(e) #> The next tool call(s) would exceed the tool_calls_limit of 1 (tool_calls=2). ``` Note - Usage limits are especially relevant if you've registered many tools. Use `request_limit` to bound the number of model turns, and `tool_calls_limit` to cap the number of successful tool executions within a run. - The `tool_calls_limit` is checked before executing tool calls. If the model returns parallel tool calls that would exceed the limit, no tools will be executed. #### Model (Run) Settings Pydantic AI offers a settings.ModelSettings structure to help you fine tune your requests. This structure allows you to configure common parameters that influence the model's behavior, such as `temperature`, `max_tokens`, `timeout`, and more. There are three ways to apply these settings, with a clear precedence order: 1. **Model-level defaults** - Set when creating a model instance via the `settings` parameter. These serve as the base defaults for that model. 1. **Agent-level defaults** - Set during Agent initialization via the `model_settings` argument. These are merged with model defaults, with agent settings taking precedence. 1. **Run-time overrides** - Passed to `run{_sync,_stream}` functions via the `model_settings` argument. These have the highest priority and are merged with the combined agent and model defaults. For example, if you'd like to set the `temperature` setting to `0.0` to ensure less random behavior, you can do the following: ```py from pydantic_ai import Agent, ModelSettings from pydantic_ai.models.openai import OpenAIChatModel # 1. Model-level defaults model = OpenAIChatModel( 'gpt-4o', settings=ModelSettings(temperature=0.8, max_tokens=500) # Base defaults ) # 2. 
Agent-level defaults (overrides model defaults by merging) agent = Agent(model, model_settings=ModelSettings(temperature=0.5)) # 3. Run-time overrides (highest priority) result_sync = agent.run_sync( 'What is the capital of Italy?', model_settings=ModelSettings(temperature=0.0) # Final temperature: 0.0 ) print(result_sync.output) #> The capital of Italy is Rome. ``` The final request uses `temperature=0.0` (run-time), `max_tokens=500` (from model), demonstrating how settings merge with run-time taking precedence. Model Settings Support Model-level settings are supported by all concrete model implementations (OpenAI, Anthropic, Google, etc.). Wrapper models like `FallbackModel`, `WrapperModel`, and `InstrumentedModel` don't have their own settings - they use the settings of their underlying models. ### Model specific settings If you wish to further customize model behavior, you can use a subclass of ModelSettings, like GoogleModelSettings, associated with your model of choice. For example: ```py from pydantic_ai import Agent, UnexpectedModelBehavior from pydantic_ai.models.google import GoogleModelSettings agent = Agent('google-gla:gemini-1.5-flash') try: result = agent.run_sync( 'Write a list of 5 very rude things that I might say to the universe after stubbing my toe in the dark:', model_settings=GoogleModelSettings( temperature=0.0, # general model settings can also be specified gemini_safety_settings=[ { 'category': 'HARM_CATEGORY_HARASSMENT', 'threshold': 'BLOCK_LOW_AND_ABOVE', }, { 'category': 'HARM_CATEGORY_HATE_SPEECH', 'threshold': 'BLOCK_LOW_AND_ABOVE', }, ], ), ) except UnexpectedModelBehavior as e: print(e) # (1)! """ Safety settings triggered, body: """ ``` 1. This error is raised because the safety thresholds were exceeded. ## Runs vs. Conversations An agent **run** might represent an entire conversation — there's no limit to how many messages can be exchanged in a single run. However, a **conversation** might also be composed of multiple runs, especially if you need to maintain state between separate interactions or API calls. Here's an example of a conversation comprised of multiple runs: conversation_example.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') # First run result1 = agent.run_sync('Who was Albert Einstein?') print(result1.output) #> Albert Einstein was a German-born theoretical physicist. # Second run, passing previous messages result2 = agent.run_sync( 'What was his most famous equation?', message_history=result1.new_messages(), # (1)! ) print(result2.output) #> Albert Einstein's most famous equation is (E = mc^2). ``` 1. Continue the conversation; without `message_history` the model would not know who "his" was referring to. *(This example is complete, it can be run "as is")* ## Type safe by design Pydantic AI is designed to work well with static type checkers, like mypy and pyright. Typing is (somewhat) optional Pydantic AI is designed to make type checking as useful as possible for you if you choose to use it, but you don't have to use types everywhere all the time. That said, because Pydantic AI uses Pydantic, and Pydantic uses type hints as the definition for schema and validation, some types (specifically type hints on parameters to tools, and the `output_type` arguments to Agent) are used at runtime. We (the library developers) have messed up if type hints are confusing you more than helping you, if you find this, please create an [issue](https://github.com/pydantic/pydantic-ai/issues) explaining what's annoying you! 
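To make that runtime use of type hints concrete, here's a minimal sketch (the `add` tool is made up for illustration) where the parameter annotations become the tool's schema and validation:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')


@agent.tool_plain
def add(a: int, b: int) -> int:
    """Add two integers."""
    # The `int` annotations are read at runtime: they define the tool's
    # JSON schema and validate the arguments the model sends back.
    return a + b
```

If the model supplies arguments that don't validate against these hints, the error is passed back to the model for a retry, as covered under Reflection and self-correction below.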
In particular, agents are generic in both the type of their dependencies and the type of the outputs they return, so you can use the type hints to ensure you're using the right types. Consider the following script with type mistakes: type_mistakes.py ```python from dataclasses import dataclass from pydantic_ai import Agent, RunContext @dataclass class User: name: str agent = Agent( 'test', deps_type=User, # (1)! output_type=bool, ) @agent.system_prompt def add_user_name(ctx: RunContext[str]) -> str: # (2)! return f"The user's name is {ctx.deps}." def foobar(x: bytes) -> None: pass result = agent.run_sync('Does their name start with "A"?', deps=User('Anne')) foobar(result.output) # (3)! ``` 1. The agent is defined as expecting an instance of `User` as `deps`. 1. But here `add_user_name` is defined as taking a `str` as the dependency, not a `User`. 1. Since the agent is defined as returning a `bool`, this will raise a type error since `foobar` expects `bytes`. Running `mypy` on this will give the following output: ```bash ➤ uv run mypy type_mistakes.py type_mistakes.py:18: error: Argument 1 to "system_prompt" of "Agent" has incompatible type "Callable[[RunContext[str]], str]"; expected "Callable[[RunContext[User]], str]" [arg-type] type_mistakes.py:28: error: Argument 1 to "foobar" has incompatible type "bool"; expected "bytes" [arg-type] Found 2 errors in 1 file (checked 1 source file) ``` Running `pyright` would identify the same issues. ## System Prompts System prompts might seem simple at first glance since they're just strings (or sequences of strings that are concatenated), but crafting the right system prompt is key to getting the model to behave as you want. Tip For most use cases, you should use `instructions` instead of "system prompts". If you know what you are doing though and want to preserve system prompt messages in the message history sent to the LLM in subsequent completions requests, you can achieve this using the `system_prompt` argument/decorator. See the section below on [Instructions](#instructions) for more information. Generally, system prompts fall into two categories: 1. **Static system prompts**: These are known when writing the code and can be defined via the `system_prompt` parameter of the Agent constructor. 1. **Dynamic system prompts**: These depend in some way on context that isn't known until runtime, and should be defined via functions decorated with @agent.system_prompt. You can add both to a single agent; they're appended in the order they're defined at runtime. Here's an example using both types of system prompts: system_prompts.py ```python from datetime import date from pydantic_ai import Agent, RunContext agent = Agent( 'openai:gpt-4o', deps_type=str, # (1)! system_prompt="Use the customer's name while replying to them.", # (2)! ) @agent.system_prompt # (3)! def add_the_users_name(ctx: RunContext[str]) -> str: return f"The user's name is {ctx.deps}." @agent.system_prompt def add_the_date() -> str: # (4)! return f'The date is {date.today()}.' result = agent.run_sync('What is the date?', deps='Frank') print(result.output) #> Hello Frank, the date today is 2032-01-02. ``` 1. The agent expects a string dependency. 1. Static system prompt defined at agent creation time. 1. Dynamic system prompt defined via a decorator with RunContext, this is called just after `run_sync`, not when the agent is created, so can benefit from runtime information like the dependencies used on that run. 1. 
Another dynamic system prompt, system prompts don't have to have the `RunContext` parameter. *(This example is complete, it can be run "as is")* ## Instructions Instructions are similar to system prompts. The main difference is that when an explicit `message_history` is provided in a call to `Agent.run` and similar methods, *instructions* from any existing messages in the history are not included in the request to the model — only the instructions of the *current* agent are included. You should use: - `instructions` when you want your request to the model to only include system prompts for the *current* agent - `system_prompt` when you want your request to the model to *retain* the system prompts used in previous requests (possibly made using other agents) In general, we recommend using `instructions` instead of `system_prompt` unless you have a specific reason to use `system_prompt`. Instructions, like system prompts, fall into two categories: 1. **Static instructions**: These are known when writing the code and can be defined via the `instructions` parameter of the Agent constructor. 1. **Dynamic instructions**: These rely on context that is only available at runtime and should be defined using functions decorated with @agent.instructions. Unlike dynamic system prompts, which may be reused when `message_history` is present, dynamic instructions are always reevaluated. Both static and dynamic instructions can be added to a single agent, and they are appended in the order they are defined at runtime. Here's an example using both types of instructions: instructions.py ```python from datetime import date from pydantic_ai import Agent, RunContext agent = Agent( 'openai:gpt-4o', deps_type=str, # (1)! instructions="Use the customer's name while replying to them.", # (2)! ) @agent.instructions # (3)! def add_the_users_name(ctx: RunContext[str]) -> str: return f"The user's name is {ctx.deps}." @agent.instructions def add_the_date() -> str: # (4)! return f'The date is {date.today()}.' result = agent.run_sync('What is the date?', deps='Frank') print(result.output) #> Hello Frank, the date today is 2032-01-02. ``` 1. The agent expects a string dependency. 1. Static instructions defined at agent creation time. 1. Dynamic instructions defined via a decorator with RunContext, this is called just after `run_sync`, not when the agent is created, so can benefit from runtime information like the dependencies used on that run. 1. Another dynamic instruction, instructions don't have to have the `RunContext` parameter. *(This example is complete, it can be run "as is")* Note that returning an empty string will result in no instruction message added. ## Reflection and self-correction Validation errors from both function tool parameter validation and [structured output validation](../output/#structured-output) can be passed back to the model with a request to retry. You can also raise ModelRetry from within a [tool](../tools/) or [output function](../output/#output-functions) to tell the model it should retry generating a response. - The default retry count is **1** but can be altered for the entire agent, a specific tool, or outputs. - You can access the current retry count from within a tool or output function via ctx.retry. 
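As a minimal sketch (the tool and the specific limits are made up for illustration, and `output_retries` is assumed to be the agent parameter controlling output validation retries), these knobs look like this:

```python
from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent(
    'openai:gpt-4o',
    retries=3,         # default retry limit for tools on this agent
    output_retries=2,  # retry limit for output validation (assumed parameter name)
)


@agent.tool(retries=5)  # per-tool override
def flaky_lookup(ctx: RunContext[None], key: str) -> str:
    if ctx.retry < 2:  # ctx.retry is the current retry count for this tool
        raise ModelRetry(f'Lookup of {key!r} failed, please try again.')
    return f'value for {key}'
```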
Here's an example: tool_retry.py ```python from pydantic import BaseModel from pydantic_ai import Agent, RunContext, ModelRetry from fake_database import DatabaseConn class ChatResult(BaseModel): user_id: int message: str agent = Agent( 'openai:gpt-4o', deps_type=DatabaseConn, output_type=ChatResult, ) @agent.tool(retries=2) def get_user_by_name(ctx: RunContext[DatabaseConn], name: str) -> int: """Get a user's ID from their full name.""" print(name) #> John #> John Doe user_id = ctx.deps.users.get(name=name) if user_id is None: raise ModelRetry( f'No user found with name {name!r}, remember to provide their full name' ) return user_id result = agent.run_sync( 'Send a message to John Doe asking for coffee next week', deps=DatabaseConn() ) print(result.output) """ user_id=123 message='Hello John, would you be free for coffee sometime next week? Let me know what works for you!' """ ``` ## Model errors If models behave unexpectedly (e.g., the retry limit is exceeded, or their API returns `503`), agent runs will raise UnexpectedModelBehavior. In these cases, capture_run_messages can be used to access the messages exchanged during the run to help diagnose the issue. agent_model_errors.py ```python from pydantic_ai import Agent, ModelRetry, UnexpectedModelBehavior, capture_run_messages agent = Agent('openai:gpt-4o') @agent.tool_plain def calc_volume(size: int) -> int: # (1)! if size == 42: return size**3 else: raise ModelRetry('Please try again.') with capture_run_messages() as messages: # (2)! try: result = agent.run_sync('Please get me the volume of a box with size 6.') except UnexpectedModelBehavior as e: print('An error occurred:', e) #> An error occurred: Tool 'calc_volume' exceeded max retries count of 1 print('cause:', repr(e.__cause__)) #> cause: ModelRetry('Please try again.') print('messages:', messages) """ messages: [ ModelRequest( parts=[ UserPromptPart( content='Please get me the volume of a box with size 6.', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ ToolCallPart( tool_name='calc_volume', args={'size': 6}, tool_call_id='pyd_ai_tool_call_id', ) ], usage=RequestUsage(input_tokens=62, output_tokens=4), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ RetryPromptPart( content='Please try again.', tool_name='calc_volume', tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ ToolCallPart( tool_name='calc_volume', args={'size': 6}, tool_call_id='pyd_ai_tool_call_id', ) ], usage=RequestUsage(input_tokens=72, output_tokens=8), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ] """ else: print(result.output) ``` 1. Define a tool that will raise `ModelRetry` repeatedly in this case. 1. capture_run_messages is used to capture the messages exchanged during the run. *(This example is complete, it can be run "as is")* Note If you call run, run_sync, or run_stream more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only. # Built-in Tools Built-in tools are native tools provided by LLM providers that can be used to enhance your agent's capabilities. Unlike [common tools](../common-tools/), which are custom implementations that Pydantic AI executes, built-in tools are executed directly by the model provider. 
## Overview Pydantic AI supports the following built-in tools: - **WebSearchTool**: Allows agents to search the web - **CodeExecutionTool**: Enables agents to execute code in a secure environment - **ImageGenerationTool**: Enables agents to generate images - **UrlContextTool**: Enables agents to pull URL contents into their context - **MemoryTool**: Enables agents to use memory These tools are passed to the agent via the `builtin_tools` parameter and are executed by the model provider's infrastructure. Provider Support Not all model providers support built-in tools. If you use a built-in tool with an unsupported provider, Pydantic AI will raise a UserError when you try to run the agent. If a provider supports a built-in tool that is not currently supported by Pydantic AI, please file an issue. ## Web Search Tool The WebSearchTool allows your agent to search the web, making it ideal for queries that require up-to-date data. ### Provider Support | Provider | Supported | Notes | | --- | --- | --- | | OpenAI Responses | ✅ | Full feature support. To include search results on the BuiltinToolReturnPart that's available via ModelResponse.builtin_tool_calls, enable the OpenAIResponsesModelSettings.openai_include_web_search_sources [model setting](../agents/#model-run-settings). | | Anthropic | ✅ | Full feature support | | Google | ✅ | No parameter support. No BuiltinToolCallPart or BuiltinToolReturnPart is generated when streaming. Using built-in tools and user tools (including [output tools](../output/#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](../output/#prompted-output) instead. | | Groq | ✅ | Limited parameter support. To use web search capabilities with Groq, you need to use the [compound models](https://console.groq.com/docs/compound). | | OpenAI Chat Completions | ❌ | Not supported | | Bedrock | ❌ | Not supported | | Mistral | ❌ | Not supported | | Cohere | ❌ | Not supported | | HuggingFace | ❌ | Not supported | ### Usage web_search_anthropic.py ```py from pydantic_ai import Agent, WebSearchTool agent = Agent('anthropic:claude-sonnet-4-0', builtin_tools=[WebSearchTool()]) result = agent.run_sync('Give me a sentence with the biggest news in AI this week.') print(result.output) #> Scientists have developed a universal AI detector that can identify deepfake videos. ``` *(This example is complete, it can be run "as is")* With OpenAI, you must use their responses API to access the web search tool. web_search_openai.py ```py from pydantic_ai import Agent, WebSearchTool agent = Agent('openai-responses:gpt-4.1', builtin_tools=[WebSearchTool()]) result = agent.run_sync('Give me a sentence with the biggest news in AI this week.') print(result.output) #> Scientists have developed a universal AI detector that can identify deepfake videos. 
``` *(This example is complete, it can be run "as is")* ### Configuration Options The `WebSearchTool` supports several configuration parameters: web_search_configured.py ```py from pydantic_ai import Agent, WebSearchTool, WebSearchUserLocation agent = Agent( 'anthropic:claude-sonnet-4-0', builtin_tools=[ WebSearchTool( search_context_size='high', user_location=WebSearchUserLocation( city='San Francisco', country='US', region='CA', timezone='America/Los_Angeles', ), blocked_domains=['example.com', 'spam-site.net'], allowed_domains=None, # Cannot use both blocked_domains and allowed_domains with Anthropic max_uses=5, # Anthropic only: limit tool usage ) ], ) result = agent.run_sync('Use the web to get the current time.') print(result.output) #> In San Francisco, it's 8:21:41 pm PDT on Wednesday, August 6, 2025. ``` *(This example is complete, it can be run "as is")* #### Provider Support | Parameter | OpenAI | Anthropic | Groq | | --- | --- | --- | --- | | `search_context_size` | ✅ | ❌ | ❌ | | `user_location` | ✅ | ✅ | ❌ | | `blocked_domains` | ❌ | ✅ | ✅ | | `allowed_domains` | ❌ | ✅ | ✅ | | `max_uses` | ❌ | ✅ | ❌ | Anthropic Domain Filtering With Anthropic, you can only use either `blocked_domains` or `allowed_domains`, not both. ## Code Execution Tool The CodeExecutionTool enables your agent to execute code in a secure environment, making it perfect for computational tasks, data analysis, and mathematical operations. ### Provider Support | Provider | Supported | Notes | | --- | --- | --- | | OpenAI | ✅ | To include code execution output on the BuiltinToolReturnPart that's available via ModelResponse.builtin_tool_calls, enable the OpenAIResponsesModelSettings.openai_include_code_execution_outputs [model setting](../agents/#model-run-settings). If the code execution generated images, like charts, they will be available on ModelResponse.images as BinaryImage objects. The generated image can also be used as [image output](../output/#image-output) for the agent run. | | Google | ✅ | Using built-in tools and user tools (including [output tools](../output/#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](../output/#prompted-output) instead. | | Anthropic | ✅ | | | Groq | ❌ | | | Bedrock | ❌ | | | Mistral | ❌ | | | Cohere | ❌ | | | HuggingFace | ❌ | | ### Usage code_execution_basic.py ```py from pydantic_ai import Agent, CodeExecutionTool agent = Agent('anthropic:claude-sonnet-4-0', builtin_tools=[CodeExecutionTool()]) result = agent.run_sync('Calculate the factorial of 15.') print(result.output) #> The factorial of 15 is **1,307,674,368,000**. print(result.response.builtin_tool_calls) """ [ ( BuiltinToolCallPart( tool_name='code_execution', args={ 'code': 'import math\n\n# Calculate factorial of 15\nresult = math.factorial(15)\nprint(f"15! = {result}")\n\n# Let\'s also show it in a more readable format with commas\nprint(f"15! = {result:,}")' }, tool_call_id='srvtoolu_017qRH1J3XrhnpjP2XtzPCmJ', provider_name='anthropic', ), BuiltinToolReturnPart( tool_name='code_execution', content={ 'content': [], 'return_code': 0, 'stderr': '', 'stdout': '15! = 1307674368000\n15! = 1,307,674,368,000', 'type': 'code_execution_result', }, tool_call_id='srvtoolu_017qRH1J3XrhnpjP2XtzPCmJ', timestamp=datetime.datetime(...), provider_name='anthropic', ), ) ] """ ``` *(This example is complete, it can be run "as is")* In addition to text output, code execution with OpenAI can generate images as part of their response. 
Accessing this image via ModelResponse.images or [image output](../output/#image-output) requires the OpenAIResponsesModelSettings.openai_include_code_execution_outputs [model setting](../agents/#model-run-settings) to be enabled. code_execution_openai.py ```py from pydantic_ai import Agent, BinaryImage, CodeExecutionTool from pydantic_ai.models.openai import OpenAIResponsesModelSettings agent = Agent( 'openai-responses:gpt-5', builtin_tools=[CodeExecutionTool()], output_type=BinaryImage, model_settings=OpenAIResponsesModelSettings(openai_include_code_execution_outputs=True), ) result = agent.run_sync('Generate a chart of y=x^2 for x=-5 to 5.') assert isinstance(result.output, BinaryImage) ``` *(This example is complete, it can be run "as is")* ## Image Generation Tool The ImageGenerationTool enables your agent to generate images. ### Provider Support | Provider | Supported | Notes | | --- | --- | --- | | OpenAI Responses | ✅ | Full feature support. Only supported by models newer than `gpt-4o`. Metadata about the generated image, like the [`revised_prompt`](https://platform.openai.com/docs/guides/tools-image-generation#revised-prompt) sent to the underlying image model, is available on the BuiltinToolReturnPart that's available via ModelResponse.builtin_tool_calls. | | Google | ✅ | No parameter support. Only supported by [image generation models](https://ai.google.dev/gemini-api/docs/image-generation) like `gemini-2.5-flash-image`. These models do not support [structured output](../output/) or [function tools](../tools/). These models will always generate images, even if this built-in tool is not explicitly specified. | | Anthropic | ❌ | | | Groq | ❌ | | | Bedrock | ❌ | | | Mistral | ❌ | | | Cohere | ❌ | | | HuggingFace | ❌ | | ### Usage Generated images are available on ModelResponse.images as BinaryImage objects: image_generation_openai.py ```py from pydantic_ai import Agent, BinaryImage, ImageGenerationTool agent = Agent('openai-responses:gpt-5', builtin_tools=[ImageGenerationTool()]) result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.') print(result.output) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ assert isinstance(result.response.images[0], BinaryImage) ``` *(This example is complete, it can be run "as is")* Image generation with Google [image generation models](https://ai.google.dev/gemini-api/docs/image-generation) does not require the `ImageGenerationTool` built-in tool to be explicitly specified: image_generation_google.py ```py from pydantic_ai import Agent, BinaryImage agent = Agent('google-gla:gemini-2.5-flash-image') result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.') print(result.output) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ assert isinstance(result.response.images[0], BinaryImage) ``` *(This example is complete, it can be run "as is")* The `ImageGenerationTool` can be used together with `output_type=BinaryImage` to get [image output](../output/#image-output). 
If the `ImageGenerationTool` built-in tool is not explicitly specified, it will be enabled automatically: image_generation_output.py ```py from pydantic_ai import Agent, BinaryImage agent = Agent('openai-responses:gpt-5', output_type=BinaryImage) result = agent.run_sync('Generate an image of an axolotl.') assert isinstance(result.output, BinaryImage) ``` *(This example is complete, it can be run "as is")* ### Configuration Options The `ImageGenerationTool` supports several configuration parameters: image_generation_configured.py ```py from pydantic_ai import Agent, BinaryImage, ImageGenerationTool agent = Agent( 'openai-responses:gpt-5', builtin_tools=[ ImageGenerationTool( background='transparent', input_fidelity='high', moderation='low', output_compression=100, output_format='png', partial_images=3, quality='high', size='1024x1024', ) ], output_type=BinaryImage, ) result = agent.run_sync('Generate an image of an axolotl.') assert isinstance(result.output, BinaryImage) ``` *(This example is complete, it can be run "as is")* For more details, check the API documentation. #### Provider Support | Parameter | OpenAI | Google | | --- | --- | --- | | `background` | ✅ | ❌ | | `input_fidelity` | ✅ | ❌ | | `moderation` | ✅ | ❌ | | `output_compression` | ✅ | ❌ | | `output_format` | ✅ | ❌ | | `partial_images` | ✅ | ❌ | | `quality` | ✅ | ❌ | | `size` | ✅ | ❌ | ## URL Context Tool The UrlContextTool enables your agent to pull URL contents into its context, allowing it to pull up-to-date information from the web. ### Provider Support | Provider | Supported | Notes | | --- | --- | --- | | Google | ✅ | No BuiltinToolCallPart or BuiltinToolReturnPart is currently generated; please submit an issue if you need this. Using built-in tools and user tools (including [output tools](../output/#tool-output)) at the same time is not supported; to use structured output, use [`PromptedOutput`](../output/#prompted-output) instead. | | OpenAI | ❌ | | | Anthropic | ❌ | | | Groq | ❌ | | | Bedrock | ❌ | | | Mistral | ❌ | | | Cohere | ❌ | | | HuggingFace | ❌ | | ### Usage url_context_basic.py ```py from pydantic_ai import Agent, UrlContextTool agent = Agent('google-gla:gemini-2.5-flash', builtin_tools=[UrlContextTool()]) result = agent.run_sync('What is this? https://ai.pydantic.dev') print(result.output) #> A Python agent framework for building Generative AI applications. ``` *(This example is complete, it can be run "as is")* ## Memory Tool The MemoryTool enables your agent to use memory. ### Provider Support | Provider | Supported | Notes | | --- | --- | --- | | Anthropic | ✅ | Requires a tool named `memory` to be defined that implements [specific sub-commands](https://docs.claude.com/en/docs/agents-and-tools/tool-use/memory-tool#tool-commands). You can use a subclass of [`anthropic.lib.tools.BetaAbstractMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/tools/_beta_builtin_memory_tool.py) as documented below. | | Google | ❌ | | | OpenAI | ❌ | | | Groq | ❌ | | | Bedrock | ❌ | | | Mistral | ❌ | | | Cohere | ❌ | | | HuggingFace | ❌ | | ### Usage The Anthropic SDK provides an abstract [`BetaAbstractMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/tools/_beta_builtin_memory_tool.py) class that you can subclass to create your own memory storage solution (e.g., database, cloud storage, encrypted files, etc.). 
Their [`LocalFilesystemMemoryTool`](https://github.com/anthropics/anthropic-sdk-python/blob/main/examples/memory/basic.py) example can serve as a starting point. The following example uses a subclass that hard-codes a specific memory. The bits specific to Pydantic AI are the `MemoryTool` built-in tool and the `memory` tool definition that forwards commands to the `call` method of the `BetaAbstractMemoryTool` subclass. anthropic_memory.py ```py from typing import Any from anthropic.lib.tools import BetaAbstractMemoryTool from anthropic.types.beta import ( BetaMemoryTool20250818CreateCommand, BetaMemoryTool20250818DeleteCommand, BetaMemoryTool20250818InsertCommand, BetaMemoryTool20250818RenameCommand, BetaMemoryTool20250818StrReplaceCommand, BetaMemoryTool20250818ViewCommand, ) from pydantic_ai import Agent, MemoryTool class FakeMemoryTool(BetaAbstractMemoryTool): def view(self, command: BetaMemoryTool20250818ViewCommand) -> str: return 'The user lives in Mexico City.' def create(self, command: BetaMemoryTool20250818CreateCommand) -> str: return f'File created successfully at {command.path}' def str_replace(self, command: BetaMemoryTool20250818StrReplaceCommand) -> str: return f'File {command.path} has been edited' def insert(self, command: BetaMemoryTool20250818InsertCommand) -> str: return f'Text inserted at line {command.insert_line} in {command.path}' def delete(self, command: BetaMemoryTool20250818DeleteCommand) -> str: return f'File deleted: {command.path}' def rename(self, command: BetaMemoryTool20250818RenameCommand) -> str: return f'Renamed {command.old_path} to {command.new_path}' def clear_all_memory(self) -> str: return 'All memory cleared' fake_memory = FakeMemoryTool() agent = Agent('anthropic:claude-sonnet-4-5', builtin_tools=[MemoryTool()]) @agent.tool_plain def memory(**command: Any) -> Any: return fake_memory.call(command) result = agent.run_sync('Remember that I live in Mexico City') print(result.output) """ Got it! I've recorded that you live in Mexico City. I'll remember this for future reference. """ result = agent.run_sync('Where do I live?') print(result.output) #> You live in Mexico City. ``` *(This example is complete, it can be run "as is")* ## API Reference For complete API documentation, see the [API Reference](../api/builtin_tools/). # Common Tools Pydantic AI ships with native tools that can be used to enhance your agent's capabilities. ## DuckDuckGo Search Tool The DuckDuckGo search tool allows you to search the web for information. It is built on top of the [DuckDuckGo API](https://github.com/deedy5/ddgs). ### Installation To use duckduckgo_search_tool, you need to install [`pydantic-ai-slim`](../install/#slim-install) with the `duckduckgo` optional group: ```bash pip install "pydantic-ai-slim[duckduckgo]" ``` ```bash uv add "pydantic-ai-slim[duckduckgo]" ``` ### Usage Here's an example of how you can use the DuckDuckGo search tool with an agent: duckduckgo_search.py ```py from pydantic_ai import Agent from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool agent = Agent( 'openai:o3-mini', tools=[duckduckgo_search_tool()], system_prompt='Search DuckDuckGo for the given query and return the results.', ) result = agent.run_sync( 'Can you list the top five highest-grossing animated films of 2025?' ) print(result.output) """ I looked into several sources on animated box‐office performance in 2025, and while detailed rankings can shift as more money is tallied, multiple independent reports have already highlighted a couple of record‐breaking shows. 
For example: • Ne Zha 2 – News outlets (Variety, Wikipedia's "List of animated feature films of 2025", and others) have reported that this Chinese title not only became the highest‑grossing animated film of 2025 but also broke records as the highest‑grossing non‑English animated film ever. One article noted its run exceeded US$1.7 billion. • Inside Out 2 – According to data shared on Statista and in industry news, this Pixar sequel has been on pace to set new records (with some sources even noting it as the highest‑grossing animated film ever, as of January 2025). Beyond those two, some entertainment trade sites (for example, a Just Jared article titled "Top 10 Highest-Earning Animated Films at the Box Office Revealed") have begun listing a broader top‑10. Although full consolidated figures can sometimes differ by source and are updated daily during a box‑office run, many of the industry trackers have begun to single out five films as the biggest earners so far in 2025. Unfortunately, although multiple articles discuss the "top animated films" of 2025, there isn't yet a single, universally accepted list with final numbers that names the complete top five. (Box‑office rankings, especially mid‑year, can be fluid as films continue to add to their totals.) Based on what several sources note so far, the two undisputed leaders are: 1. Ne Zha 2 2. Inside Out 2 The remaining top spots (3–5) are reported by some outlets in their "Top‑10 Animated Films" lists for 2025 but the titles and order can vary depending on the source and the exact cut‑off date of the data. For the most up‑to‑date and detailed ranking (including the 3rd, 4th, and 5th highest‑grossing films), I recommend checking resources like: • Wikipedia's "List of animated feature films of 2025" page • Box‑office tracking sites (such as Box Office Mojo or The Numbers) • Trade articles like the one on Just Jared To summarize with what is clear from the current reporting: 1. Ne Zha 2 2. Inside Out 2 3–5. Other animated films (yet to be definitively finalized across all reporting outlets) If you're looking for a final, consensus list of the top five, it may be best to wait until the 2025 year‑end box‑office tallies are in or to consult a regularly updated entertainment industry source. Would you like help finding a current source or additional details on where to look for the complete updated list? """ ``` ## Tavily Search Tool Info Tavily is a paid service, but they have free credits to explore their product. You need to [sign up for an account](https://app.tavily.com/home) and get an API key to use the Tavily search tool. The Tavily search tool allows you to search the web for information. It is built on top of the [Tavily API](https://tavily.com/). 
### Installation To use tavily_search_tool, you need to install [`pydantic-ai-slim`](../install/#slim-install) with the `tavily` optional group: ```bash pip install "pydantic-ai-slim[tavily]" ``` ```bash uv add "pydantic-ai-slim[tavily]" ``` ### Usage Here's an example of how you can use the Tavily search tool with an agent: tavily_search.py ```py import os from pydantic_ai import Agent from pydantic_ai.common_tools.tavily import tavily_search_tool api_key = os.getenv('TAVILY_API_KEY') assert api_key is not None agent = Agent( 'openai:o3-mini', tools=[tavily_search_tool(api_key)], system_prompt='Search Tavily for the given query and return the results.', ) result = agent.run_sync('Tell me the top news in the GenAI world, give me links.') print(result.output) """ Here are some of the top recent news articles related to GenAI: 1. How CLEAR users can improve risk analysis with GenAI – Thomson Reuters Read more: https://legal.thomsonreuters.com/blog/how-clear-users-can-improve-risk-analysis-with-genai/ (This article discusses how CLEAR's new GenAI-powered tool streamlines risk analysis by quickly summarizing key information from various public data sources.) 2. TELUS Digital Survey Reveals Enterprise Employees Are Entering Sensitive Data Into AI Assistants More Than You Think – FT.com Read more: https://markets.ft.com/data/announce/detail?dockey=600-202502260645BIZWIRE_USPRX____20250226_BW490609-1 (This news piece highlights findings from a TELUS Digital survey showing that many enterprise employees use public GenAI tools and sometimes even enter sensitive data.) 3. The Essential Guide to Generative AI – Virtualization Review Read more: https://virtualizationreview.com/Whitepapers/2025/02/SNOWFLAKE-The-Essential-Guide-to-Generative-AI.aspx (This guide provides insights into how GenAI is revolutionizing enterprise strategies and productivity, with input from industry leaders.) Feel free to click on the links to dive deeper into each story! """ ``` # Deferred Tools There are a few scenarios where the model should be able to call a tool that should not or cannot be executed during the same agent run inside the same Python process: - it may need to be approved by the user first - it may depend on an upstream service, frontend, or user to provide the result - the result could take longer to generate than it's reasonable to keep the agent process running To support these use cases, Pydantic AI provides the concept of deferred tools, which come in two flavors documented below: - tools that [require approval](#human-in-the-loop-tool-approval) - tools that are [executed externally](#external-tool-execution) When the model calls a deferred tool, the agent run will end with a DeferredToolRequests output object containing information about the deferred tool calls. Once the approvals and/or results are ready, a new agent run can then be started with the original run's [message history](../message-history/) plus a DeferredToolResults object holding results for each tool call in `DeferredToolRequests`, which will continue the original run where it left off. Note that handling deferred tool calls requires `DeferredToolRequests` to be in the `Agent`'s [`output_type`](../output/#structured-output) so that the possible types of the agent run output are correctly inferred. 
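For example, a minimal sketch (the prompt and the handling logic are placeholders) of declaring the union at construction time and branching on the result:

```python
from pydantic_ai import Agent, DeferredToolRequests

agent = Agent('openai:gpt-5', output_type=[str, DeferredToolRequests])

result = agent.run_sync('Do something that may need approval')
if isinstance(result.output, DeferredToolRequests):
    # Gather approvals and/or external results, then start a new run with the
    # original message history plus a DeferredToolResults object.
    ...
else:
    print(result.output)
```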
If your agent can also be used in a context where no deferred tools are available and you don't want to deal with that type everywhere you use the agent, you can instead pass the `output_type` argument when you run the agent using agent.run(), agent.run_sync(), agent.run_stream(), or agent.iter(). Note that the run-time `output_type` overrides the one specified at construction time (for type inference reasons), so you'll need to include the original output type explicitly. ## Human-in-the-Loop Tool Approval If a tool function always requires approval, you can pass the `requires_approval=True` argument to the @agent.tool decorator, @agent.tool_plain decorator, Tool class, FunctionToolset.tool decorator, or FunctionToolset.add_function() method. Inside the function, you can then assume that the tool call has been approved. If whether a tool function requires approval depends on the tool call arguments or the agent run context (e.g. [dependencies](../dependencies/) or message history), you can raise the ApprovalRequired exception from the tool function. The RunContext.tool_call_approved property will be `True` if the tool call has already been approved. To require approval for calls to tools provided by a [toolset](../toolsets/) (like an [MCP server](../mcp/client/)), see the [`ApprovalRequiredToolset` documentation](../toolsets/#requiring-tool-approval). When the model calls a tool that requires approval, the agent run will end with a DeferredToolRequests output object with an `approvals` list holding ToolCallParts containing the tool name, validated arguments, and a unique tool call ID. Once you've gathered the user's approvals or denials, you can build a DeferredToolResults object with an `approvals` dictionary that maps each tool call ID to a boolean, a ToolApproved object (with optional `override_args`), or a ToolDenied object (with an optional custom `message` to provide to the model). This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](../message-history/). 
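As a minimal sketch (the tool call IDs are made up, and the `ToolApproved` import location is assumed to match `ToolDenied`), the approval values can take any of these forms:

```python
from pydantic_ai import DeferredToolResults, ToolApproved, ToolDenied

results = DeferredToolResults()
results.approvals['tool_call_1'] = True  # plain boolean approval
results.approvals['tool_call_2'] = ToolApproved(override_args={'path': 'safe.txt'})  # approve with modified args
results.approvals['tool_call_3'] = ToolDenied('This action is not allowed')  # denial with a message for the model
```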
Here's an example that shows how to require approval for all file deletions, and for updates of specific protected files: tool_requires_approval.py ```python from pydantic_ai import ( Agent, ApprovalRequired, DeferredToolRequests, DeferredToolResults, RunContext, ToolDenied, ) agent = Agent('openai:gpt-5', output_type=[str, DeferredToolRequests]) PROTECTED_FILES = {'.env'} @agent.tool def update_file(ctx: RunContext, path: str, content: str) -> str: if path in PROTECTED_FILES and not ctx.tool_call_approved: raise ApprovalRequired return f'File {path!r} updated: {content!r}' @agent.tool_plain(requires_approval=True) def delete_file(path: str) -> str: return f'File {path!r} deleted' result = agent.run_sync('Delete `__init__.py`, write `Hello, world!` to `README.md`, and clear `.env`') messages = result.all_messages() assert isinstance(result.output, DeferredToolRequests) requests = result.output print(requests) """ DeferredToolRequests( calls=[], approvals=[ ToolCallPart( tool_name='update_file', args={'path': '.env', 'content': ''}, tool_call_id='update_file_dotenv', ), ToolCallPart( tool_name='delete_file', args={'path': '__init__.py'}, tool_call_id='delete_file', ), ], ) """ results = DeferredToolResults() for call in requests.approvals: result = False if call.tool_name == 'update_file': # Approve all updates result = True elif call.tool_name == 'delete_file': # deny all deletes result = ToolDenied('Deleting files is not allowed') results.approvals[call.tool_call_id] = result result = agent.run_sync(message_history=messages, deferred_tool_results=results) print(result.output) """ I successfully updated `README.md` and cleared `.env`, but was not able to delete `__init__.py`. """ print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Delete `__init__.py`, write `Hello, world!` to `README.md`, and clear `.env`', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ ToolCallPart( tool_name='delete_file', args={'path': '__init__.py'}, tool_call_id='delete_file', ), ToolCallPart( tool_name='update_file', args={'path': 'README.md', 'content': 'Hello, world!'}, tool_call_id='update_file_readme', ), ToolCallPart( tool_name='update_file', args={'path': '.env', 'content': ''}, tool_call_id='update_file_dotenv', ), ], usage=RequestUsage(input_tokens=63, output_tokens=21), model_name='gpt-5', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ ToolReturnPart( tool_name='update_file', content="File 'README.md' updated: 'Hello, world!'", tool_call_id='update_file_readme', timestamp=datetime.datetime(...), ) ] ), ModelRequest( parts=[ ToolReturnPart( tool_name='delete_file', content='Deleting files is not allowed', tool_call_id='delete_file', timestamp=datetime.datetime(...), ), ToolReturnPart( tool_name='update_file', content="File '.env' updated: ''", tool_call_id='update_file_dotenv', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ TextPart( content='I successfully updated `README.md` and cleared `.env`, but was not able to delete `__init__.py`.' ) ], usage=RequestUsage(input_tokens=79, output_tokens=39), model_name='gpt-5', timestamp=datetime.datetime(...), ), ] """ ``` *(This example is complete, it can be run "as is")* ## External Tool Execution When the result of a tool call cannot be generated inside the same agent run in which it was called, the tool is considered to be external. 
Examples of external tools are client-side tools implemented by a web or app frontend, and slow tasks that are passed off to a background worker or external service instead of keeping the agent process running. If whether a tool call should be executed externally depends on the tool call arguments, the agent run context (e.g. [dependencies](../dependencies/) or message history), or how long the task is expected to take, you can define a tool function and conditionally raise the CallDeferred exception. Before raising the exception, the tool function would typically schedule some background task and pass along the RunContext.tool_call_id so that the result can be matched to the deferred tool call later. If a tool is always executed externally and its definition is provided to your code along with a JSON schema for its arguments, you can use an [`ExternalToolset`](../toolsets/#external-toolset). If the external tools are known up front and you don't have the arguments JSON schema handy, you can also define a tool function with the appropriate signature that does nothing but raise the CallDeferred exception. When the model calls an external tool, the agent run will end with a DeferredToolRequests output object with a `calls` list holding ToolCallParts containing the tool name, validated arguments, and a unique tool call ID. Once the tool call results are ready, you can build a DeferredToolResults object with a `calls` dictionary that maps each tool call ID to an arbitrary value to be returned to the model, a [`ToolReturn`](../tools-advanced/#advanced-tool-returns) object, or a ModelRetry exception in case the tool call failed and the model should [try again](../tools-advanced/#tool-retries). This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](../message-history/). Here's an example that shows how to move a task that takes a while to complete to the background and return the result to the model once the task is complete: external_tool.py ```python import asyncio from dataclasses import dataclass from typing import Any from pydantic_ai import ( Agent, CallDeferred, DeferredToolRequests, DeferredToolResults, ModelRetry, RunContext, ) @dataclass class TaskResult: tool_call_id: str result: Any async def calculate_answer_task(tool_call_id: str, question: str) -> TaskResult: await asyncio.sleep(1) return TaskResult(tool_call_id=tool_call_id, result=42) agent = Agent('openai:gpt-5', output_type=[str, DeferredToolRequests]) tasks: list[asyncio.Task[TaskResult]] = [] @agent.tool async def calculate_answer(ctx: RunContext, question: str) -> str: assert ctx.tool_call_id is not None task = asyncio.create_task(calculate_answer_task(ctx.tool_call_id, question)) # (1)! tasks.append(task) raise CallDeferred async def main(): result = await agent.run('Calculate the answer to the ultimate question of life, the universe, and everything') messages = result.all_messages() assert isinstance(result.output, DeferredToolRequests) requests = result.output print(requests) """ DeferredToolRequests( calls=[ ToolCallPart( tool_name='calculate_answer', args={ 'question': 'the ultimate question of life, the universe, and everything' }, tool_call_id='pyd_ai_tool_call_id', ) ], approvals=[], ) """ done, _ = await asyncio.wait(tasks) # (2)! 
task_results = [task.result() for task in done] task_results_by_tool_call_id = {result.tool_call_id: result.result for result in task_results} results = DeferredToolResults() for call in requests.calls: try: result = task_results_by_tool_call_id[call.tool_call_id] except KeyError: result = ModelRetry('No result for this tool call was found.') results.calls[call.tool_call_id] = result result = await agent.run(message_history=messages, deferred_tool_results=results) print(result.output) #> The answer to the ultimate question of life, the universe, and everything is 42. print(result.all_messages()) """ [ ModelRequest( parts=[ UserPromptPart( content='Calculate the answer to the ultimate question of life, the universe, and everything', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ ToolCallPart( tool_name='calculate_answer', args={ 'question': 'the ultimate question of life, the universe, and everything' }, tool_call_id='pyd_ai_tool_call_id', ) ], usage=RequestUsage(input_tokens=63, output_tokens=13), model_name='gpt-5', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ ToolReturnPart( tool_name='calculate_answer', content=42, tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ TextPart( content='The answer to the ultimate question of life, the universe, and everything is 42.' ) ], usage=RequestUsage(input_tokens=64, output_tokens=28), model_name='gpt-5', timestamp=datetime.datetime(...), ), ] """ ``` 1. In reality, you'd likely use Celery or a similar task queue to run the task in the background. 1. In reality, this would typically happen in a separate process that polls for the task status or is notified when all pending tasks are complete. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## See Also - [Function Tools](../tools/) - Basic tool concepts and registration - [Advanced Tool Features](../tools-advanced/) - Custom schemas, dynamic tools, and execution details - [Toolsets](../toolsets/) - Managing collections of tools, including `ExternalToolset` for external tools - [Message History](../message-history/) - Understanding how to work with message history for deferred tools # Dependencies Pydantic AI uses a dependency injection system to provide data and services to your agent's [system prompts](../agents/#system-prompts), [tools](../tools/) and [output validators](../output/#output-validator-functions). Matching Pydantic AI's design philosophy, our dependency system tries to use existing best practice in Python development rather than inventing esoteric "magic", this should make dependencies type-safe, understandable easier to test and ultimately easier to deploy in production. ## Defining Dependencies Dependencies can be any python type. While in simple cases you might be able to pass a single object as a dependency (e.g. an HTTP connection), dataclasses are generally a convenient container when your dependencies included multiple objects. Here's an example of defining an agent that requires dependencies. (**Note:** dependencies aren't actually used in this example, see [Accessing Dependencies](#accessing-dependencies) below) unused_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent @dataclass class MyDeps: # (1)! api_key: str http_client: httpx.AsyncClient agent = Agent( 'openai:gpt-4o', deps_type=MyDeps, # (2)! 
) async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run( 'Tell me a joke.', deps=deps, # (3)! ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. Define a dataclass to hold dependencies. 1. Pass the dataclass type to the `deps_type` argument of the Agent constructor. **Note**: we're passing the type here, NOT an instance; this parameter is not actually used at runtime, it's here so we can get full type checking of the agent. 1. When running the agent, pass an instance of the dataclass to the `deps` parameter. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Accessing Dependencies Dependencies are accessed through the RunContext type, which should be the first parameter of system prompt functions, etc. system_prompt_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient agent = Agent( 'openai:gpt-4o', deps_type=MyDeps, ) @agent.system_prompt # (1)! async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)! response = await ctx.deps.http_client.get( # (3)! 'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, # (4)! ) response.raise_for_status() return f'Prompt: {response.text}' async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. RunContext may optionally be passed to a system_prompt function as the only argument. 1. RunContext is parameterized with the type of the dependencies; if this type is incorrect, static type checkers will raise an error. 1. Access dependencies through the .deps attribute. 1. Access dependencies through the .deps attribute. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### Asynchronous vs. Synchronous dependencies [System prompt functions](../agents/#system-prompts), [function tools](../tools/) and [output validators](../output/#output-validator-functions) are all run in the async context of an agent run. If these functions are not coroutines (i.e. not defined with `async def`), they are called with run_in_executor in a thread pool, so it's marginally preferable to use `async` methods where dependencies perform IO, although synchronous dependencies should work fine too. `run` vs. `run_sync` and Asynchronous vs. Synchronous dependencies Whether you use synchronous or asynchronous dependencies is completely independent of whether you use `run` or `run_sync` — `run_sync` is just a wrapper around `run` and agents are always run in an async context. Here's the same example as above, but with a synchronous dependency: sync_dependencies.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.Client # (1)! agent = Agent( 'openai:gpt-4o', deps_type=MyDeps, ) @agent.system_prompt def get_system_prompt(ctx: RunContext[MyDeps]) -> str: # (2)!
response = ctx.deps.http_client.get( 'https://example.com', headers={'Authorization': f'Bearer {ctx.deps.api_key}'} ) response.raise_for_status() return f'Prompt: {response.text}' async def main(): deps = MyDeps('foobar', httpx.Client()) result = await agent.run( 'Tell me a joke.', deps=deps, ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. Here we use a synchronous `httpx.Client` instead of an asynchronous `httpx.AsyncClient`. 1. To match the synchronous dependency, the system prompt function is now a plain function, not a coroutine. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Full Example As well as system prompts, dependencies can be used in [tools](../tools/) and [output validators](../output/#output-validator-functions). full_example.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, ModelRetry, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient agent = Agent( 'openai:gpt-4o', deps_type=MyDeps, ) @agent.system_prompt async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: response = await ctx.deps.http_client.get('https://example.com') response.raise_for_status() return f'Prompt: {response.text}' @agent.tool # (1)! async def get_joke_material(ctx: RunContext[MyDeps], subject: str) -> str: response = await ctx.deps.http_client.get( 'https://example.com#jokes', params={'subject': subject}, headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, ) response.raise_for_status() return response.text @agent.output_validator # (2)! async def validate_output(ctx: RunContext[MyDeps], output: str) -> str: response = await ctx.deps.http_client.post( 'https://example.com#validate', headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, params={'query': output}, ) if response.status_code == 400: raise ModelRetry(f'invalid response: {response.text}') response.raise_for_status() return output async def main(): async with httpx.AsyncClient() as client: deps = MyDeps('foobar', client) result = await agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. ``` 1. To pass `RunContext` to a tool, use the tool decorator. 1. `RunContext` may optionally be passed to a output_validator function as the first argument. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Overriding Dependencies When testing agents, it's useful to be able to customise dependencies. While this can sometimes be done by calling the agent directly within unit tests, we can also override dependencies while calling application code which in turn calls the agent. This is done via the override method on the agent. joke_app.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class MyDeps: api_key: str http_client: httpx.AsyncClient async def system_prompt_factory(self) -> str: # (1)! response = await self.http_client.get('https://example.com') response.raise_for_status() return f'Prompt: {response.text}' joke_agent = Agent('openai:gpt-4o', deps_type=MyDeps) @joke_agent.system_prompt async def get_system_prompt(ctx: RunContext[MyDeps]) -> str: return await ctx.deps.system_prompt_factory() # (2)! async def application_code(prompt: str) -> str: # (3)! ... ... 
# now deep within application code we call our agent async with httpx.AsyncClient() as client: app_deps = MyDeps('foobar', client) result = await joke_agent.run(prompt, deps=app_deps) # (4)! return result.output ``` 1. Define a method on the dependency to make the system prompt easier to customise. 1. Call the system prompt factory from within the system prompt function. 1. Application code that calls the agent, in a real application this might be an API endpoint. 1. Call the agent from within the application code, in a real application this call might be deep within a call stack. Note `app_deps` here will NOT be used when deps are overridden. *(This example is complete, it can be run "as is")* test_joke_app.py ```python from joke_app import MyDeps, application_code, joke_agent class TestMyDeps(MyDeps): # (1)! async def system_prompt_factory(self) -> str: return 'test prompt' async def test_application_code(): test_deps = TestMyDeps('test_key', None) # (2)! with joke_agent.override(deps=test_deps): # (3)! joke = await application_code('Tell me a joke.') # (4)! assert joke.startswith('Did you hear about the toothpaste scandal?') ``` 1. Define a subclass of `MyDeps` in tests to customise the system prompt factory. 1. Create an instance of the test dependency, we don't need to pass an `http_client` here as it's not used. 1. Override the dependencies of the agent for the duration of the `with` block, `test_deps` will be used when the agent is run. 1. Now we can safely call our application code, the agent will use the overridden dependencies. ## Examples The following examples demonstrate how to use dependencies in Pydantic AI: - [Weather Agent](../examples/weather-agent/) - [SQL Generation](../examples/sql-gen/) - [RAG](../examples/rag/) # Direct Model Requests The `direct` module provides low-level methods for making imperative requests to LLMs where the only abstraction is input and output schema translation, enabling you to use all models with the same API. These methods are thin wrappers around the Model implementations, offering a simpler interface when you don't need the full functionality of an Agent. The following functions are available: - model_request: Make a non-streamed async request to a model - model_request_sync: Make a non-streamed synchronous request to a model - model_request_stream: Make a streamed async request to a model - model_request_stream_sync: Make a streamed sync request to a model ## Basic Example Here's a simple example demonstrating how to use the direct API to make a basic request: direct_basic.py ```python from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync # Make a synchronous request to the model model_response = model_request_sync( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')] ) print(model_response.parts[0].content) #> The capital of France is Paris. print(model_response.usage) #> RequestUsage(input_tokens=56, output_tokens=7) ``` *(This example is complete, it can be run "as is")* ## Advanced Example with Tool Calling You can also use the direct API to work with function/tool calling. 
Even here we can use Pydantic to generate the JSON schema for the tool: ```python from typing import Literal from pydantic import BaseModel from pydantic_ai import ModelRequest, ToolDefinition from pydantic_ai.direct import model_request from pydantic_ai.models import ModelRequestParameters class Divide(BaseModel): """Divide two numbers.""" numerator: float denominator: float on_inf: Literal['error', 'infinity'] = 'infinity' async def main(): # Make a request to the model with tool access model_response = await model_request( 'openai:gpt-4.1-nano', [ModelRequest.user_text_prompt('What is 123 / 456?')], model_request_parameters=ModelRequestParameters( function_tools=[ ToolDefinition( name=Divide.__name__.lower(), description=Divide.__doc__, parameters_json_schema=Divide.model_json_schema(), ) ], allow_text_output=True, # Allow model to either use tools or respond directly ), ) print(model_response) """ ModelResponse( parts=[ ToolCallPart( tool_name='divide', args={'numerator': '123', 'denominator': '456'}, tool_call_id='pyd_ai_2e0e396768a14fe482df90a29a78dc7b', ) ], usage=RequestUsage(input_tokens=55, output_tokens=7), model_name='gpt-4.1-nano', timestamp=datetime.datetime(...), ) """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## When to Use the direct API vs Agent The direct API is ideal when: 1. You need more direct control over model interactions 1. You want to implement custom behavior around model requests 1. You're building your own abstractions on top of model interactions For most application use cases, the higher-level Agent API provides a more convenient interface with additional features such as built-in tool execution, retrying, structured output parsing, and more. ## OpenTelemetry or Logfire Instrumentation As with agents, you can enable OpenTelemetry/Logfire instrumentation with just a few extra lines direct_instrumented.py ```python import logfire from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync logfire.configure() logfire.instrument_pydantic_ai() # Make a synchronous request to the model model_response = model_request_sync( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')], ) print(model_response.parts[0].content) #> The capital of France is Paris. ``` *(This example is complete, it can be run "as is")* You can also enable OpenTelemetry on a per call basis: direct_instrumented.py ```python import logfire from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync logfire.configure() # Make a synchronous request to the model model_response = model_request_sync( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')], instrument=True ) print(model_response.parts[0].content) #> The capital of France is Paris. ``` See [Debugging and Monitoring](../logfire/) for more details, including how to instrument with plain OpenTelemetry without Logfire. # Image, Audio, Video & Document Input Some LLMs are now capable of understanding audio, video, image and document content. ## Image Input Info Some models do not support image input. Please check the model's documentation to confirm whether it supports image input. 
If you have a direct URL for the image, you can use ImageUrl: image_input.py ```py from pydantic_ai import Agent, ImageUrl agent = Agent(model='openai:gpt-4o') result = agent.run_sync( [ 'What company is this logo from?', ImageUrl(url='https://iili.io/3Hs4FMg.png'), ] ) print(result.output) # > This is the logo for Pydantic, a data validation and settings management library in Python. ``` If you have the image locally, you can also use BinaryContent: local_image_input.py ```py import httpx from pydantic_ai import Agent, BinaryContent image_response = httpx.get('https://iili.io/3Hs4FMg.png') # Pydantic logo agent = Agent(model='openai:gpt-4o') result = agent.run_sync( [ 'What company is this logo from?', BinaryContent(data=image_response.content, media_type='image/png'), # (1)! ] ) print(result.output) # > This is the logo for Pydantic, a data validation and settings management library in Python. ``` 1. To ensure the example is runnable we download this image from the web, but you can also use `Path().read_bytes()` to read a local file's contents. ## Audio Input Info Some models do not support audio input. Please check the model's documentation to confirm whether it supports audio input. You can provide audio input using either AudioUrl or BinaryContent. The process is analogous to the examples above. ## Video Input Info Some models do not support video input. Please check the model's documentation to confirm whether it supports video input. You can provide video input using either VideoUrl or BinaryContent. The process is analogous to the examples above. ## Document Input Info Some models do not support document input. Please check the model's documentation to confirm whether it supports document input. You can provide document input using either DocumentUrl or BinaryContent. The process is similar to the examples above. If you have a direct URL for the document, you can use DocumentUrl: document_input.py ```py from pydantic_ai import Agent, DocumentUrl agent = Agent(model='anthropic:claude-3-sonnet') result = agent.run_sync( [ 'What is the main content of this document?', DocumentUrl(url='https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf'), ] ) print(result.output) # > This document is the technical report introducing Gemini 1.5, Google's latest large language model... ``` The supported document formats vary by model. You can also use BinaryContent to pass document data directly: binary_content_input.py ```py from pathlib import Path from pydantic_ai import Agent, BinaryContent pdf_path = Path('document.pdf') agent = Agent(model='anthropic:claude-3-sonnet') result = agent.run_sync( [ 'What is the main content of this document?', BinaryContent(data=pdf_path.read_bytes(), media_type='application/pdf'), ] ) print(result.output) # > The document discusses... ``` ## User-side download vs. direct file URL As a general rule, when you provide a URL using any of `ImageUrl`, `AudioUrl`, `VideoUrl` or `DocumentUrl`, Pydantic AI downloads the file content and then sends it as part of the API request. The situation is different for certain models: - AnthropicModel: if you provide a PDF document via `DocumentUrl`, the URL is sent directly in the API request, so no download happens on the user side. - GoogleModel on Vertex AI: any URL provided using `ImageUrl`, `AudioUrl`, `VideoUrl`, or `DocumentUrl` is sent as-is in the API request and no data is downloaded beforehand. 
See the [Gemini API docs for Vertex AI](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/inference#filedata) to learn more about supported URLs, formats and limitations: - Cloud Storage bucket URIs (with protocol `gs://`) - Public HTTP(S) URLs - Public YouTube video URL (maximum one URL per request) However, because of crawling restrictions, it may happen that Gemini can't access certain URLs. In that case, you can instruct Pydantic AI to download the file content and send that instead of the URL by setting the boolean flag `force_download` to `True`. This attribute is available on all objects that inherit from FileUrl. - GoogleModel on GLA: YouTube video URLs are sent directly in the request to the model. # Messages and chat history Pydantic AI provides access to messages exchanged during an agent run. These messages can be used both to continue a coherent conversation, and to understand how an agent performed. ### Accessing Messages from Results After running an agent, you can access the messages exchanged during that run from the `result` object. Both RunResult (returned by Agent.run, Agent.run_sync) and StreamedRunResult (returned by Agent.run_stream) have the following methods: - all_messages(): returns all messages, including messages from prior runs. There's also a variant that returns JSON bytes, all_messages_json(). - new_messages(): returns only the messages from the current run. There's also a variant that returns JSON bytes, new_messages_json(). StreamedRunResult and complete messages On StreamedRunResult, the messages returned from these methods will only include the final result message once the stream has finished. E.g. you've awaited one of the following coroutines: - StreamedRunResult.stream_output() - StreamedRunResult.stream_text() - StreamedRunResult.stream_responses() - StreamedRunResult.get_output() **Note:** The final result message will NOT be added to result messages if you use .stream_text(delta=True) since in this case the result content is never built as one string. Example of accessing methods on a RunResult : run_result_messages.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.') result = agent.run_sync('Tell me a joke.') print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. # all messages from the run print(result.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content='Be a helpful assistant.', timestamp=datetime.datetime(...), ), UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' 
) ], usage=RequestUsage(input_tokens=60, output_tokens=12), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ] """ ``` *(This example is complete, it can be run "as is")* Example of accessing methods on a StreamedRunResult : streamed_run_result_messages.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.') async def main(): async with agent.run_stream('Tell me a joke.') as result: # incomplete messages before the stream finishes print(result.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content='Be a helpful assistant.', timestamp=datetime.datetime(...), ), UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ), ] ) ] """ async for text in result.stream_text(): print(text) #> Did you hear #> Did you hear about the toothpaste #> Did you hear about the toothpaste scandal? They called #> Did you hear about the toothpaste scandal? They called it Colgate. # complete messages once the stream finishes print(result.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content='Be a helpful assistant.', timestamp=datetime.datetime(...), ), UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=50, output_tokens=12), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ] """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### Using Messages as Input for Further Agent Runs The primary use of message histories in Pydantic AI is to maintain context across multiple agent runs. To use existing messages in a run, pass them to the `message_history` parameter of Agent.run, Agent.run_sync or Agent.run_stream. If `message_history` is set and not empty, a new system prompt is not generated — we assume the existing message history includes a system prompt. Reusing messages in a conversation ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. result2 = agent.run_sync('Explain?', message_history=result1.new_messages()) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. print(result2.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content='Be a helpful assistant.', timestamp=datetime.datetime(...), ), UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=60, output_tokens=12), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ UserPromptPart( content='Explain?', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ TextPart( content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.' 
) ], usage=RequestUsage(input_tokens=61, output_tokens=26), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ] """ ``` *(This example is complete, it can be run "as is")* ## Storing and loading messages (to JSON) While maintaining conversation state in memory is enough for many applications, you may often want to store the message history of an agent run on disk or in a database. This might be for evals, for sharing data between Python and JavaScript/TypeScript, or any number of other use cases. The intended way to do this is using a `TypeAdapter`. We export ModelMessagesTypeAdapter that can be used for this, or you can create your own. Here's an example showing how: serialize messages to json ```python from pydantic_core import to_jsonable_python from pydantic_ai import ( Agent, ModelMessagesTypeAdapter, # (1)! ) agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') history_step_1 = result1.all_messages() as_python_objects = to_jsonable_python(history_step_1) # (2)! same_history_as_step_1 = ModelMessagesTypeAdapter.validate_python(as_python_objects) result2 = agent.run_sync( # (3)! 'Tell me a different joke.', message_history=same_history_as_step_1 ) ``` 1. Alternatively, you can create a `TypeAdapter` from scratch: ```python from pydantic import TypeAdapter from pydantic_ai import ModelMessage ModelMessagesTypeAdapter = TypeAdapter(list[ModelMessage]) ``` 1. Alternatively you can serialize to/from JSON directly: ```python from pydantic_core import to_json ... as_json_objects = to_json(history_step_1) same_history_as_step_1 = ModelMessagesTypeAdapter.validate_json(as_json_objects) ``` 1. You can now continue the conversation with history `same_history_as_step_1` despite creating a new agent run. *(This example is complete, it can be run "as is")* ## Other ways of using messages Since messages are defined by simple dataclasses, you can manually create and manipulate them, e.g. for testing. The message format is independent of the model used, so you can use messages in different agents, or the same agent with different models. In the example below, we reuse the messages from the first agent run, which uses the `openai:gpt-4o` model, in a second agent run using the `google-gla:gemini-1.5-pro` model. Reusing messages with a different model ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', system_prompt='Be a helpful assistant.') result1 = agent.run_sync('Tell me a joke.') print(result1.output) #> Did you hear about the toothpaste scandal? They called it Colgate. result2 = agent.run_sync( 'Explain?', model='google-gla:gemini-1.5-pro', message_history=result1.new_messages(), ) print(result2.output) #> This is an excellent joke invented by Samuel Colvin, it needs no explanation. print(result2.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content='Be a helpful assistant.', timestamp=datetime.datetime(...), ), UserPromptPart( content='Tell me a joke.', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ TextPart( content='Did you hear about the toothpaste scandal? They called it Colgate.' ) ], usage=RequestUsage(input_tokens=60, output_tokens=12), model_name='gpt-4o', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ UserPromptPart( content='Explain?', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ TextPart( content='This is an excellent joke invented by Samuel Colvin, it needs no explanation.'
) ], usage=RequestUsage(input_tokens=61, output_tokens=26), model_name='gemini-1.5-pro', timestamp=datetime.datetime(...), ), ] """ ``` ## Processing Message History Sometimes you may want to modify the message history before it's sent to the model. This could be for privacy reasons (filtering out sensitive information), to save costs on tokens, to give less context to the LLM, or to apply custom processing logic. Pydantic AI provides a `history_processors` parameter on `Agent` that allows you to intercept and modify the message history before each model request. History processors replace the message history History processors replace the message history in the state with the processed messages, including the new user prompt part. This means that if you want to keep the original message history, you need to make a copy of it. ### Usage The `history_processors` parameter is a list of callables that take a list of ModelMessage and return a modified list of the same type. Each processor is applied in sequence, and processors can be either synchronous or asynchronous. simple_history_processor.py ```python from pydantic_ai import ( Agent, ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: """Remove all ModelResponse messages, keeping only ModelRequest messages.""" return [msg for msg in messages if isinstance(msg, ModelRequest)] # Create agent with history processor agent = Agent('openai:gpt-4o', history_processors=[filter_responses]) # Example: Create some conversation history message_history = [ ModelRequest(parts=[UserPromptPart(content='What is 2+2?')]), ModelResponse(parts=[TextPart(content='2+2 equals 4')]), # This will be filtered out ] # When you run the agent, the history processor will filter out ModelResponse messages # result = agent.run_sync('What about 3+3?', message_history=message_history) ``` #### Keep Only Recent Messages You can use a history processor to keep only the most recent messages: keep_recent_messages.py ```python from pydantic_ai import Agent, ModelMessage async def keep_recent_messages(messages: list[ModelMessage]) -> list[ModelMessage]: """Keep only the last 5 messages to manage token usage.""" return messages[-5:] if len(messages) > 5 else messages agent = Agent('openai:gpt-4o', history_processors=[keep_recent_messages]) # Example: Even with a long conversation history, only the last 5 messages are sent to the model long_conversation_history: list[ModelMessage] = [] # Your long conversation history here # result = agent.run_sync('What did we discuss?', message_history=long_conversation_history) ``` Be careful when slicing the message history When slicing the message history, you need to make sure that tool calls and returns are paired; otherwise the LLM may return an error. For more details, refer to [this GitHub issue](https://github.com/pydantic/pydantic-ai/issues/2050#issuecomment-3019976269).
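One way to respect that pairing is sketched below. This is a minimal, illustrative sketch rather than a library recipe: the `keep_recent_paired_messages` name and the window size of 5 are arbitrary, and it assumes `ToolReturnPart` is exported from the package top level as in recent versions (otherwise import it from `pydantic_ai.messages`). It truncates the history, then drops a leading request whose parts are all tool returns, since the matching tool calls would have been cut off.

keep_recent_paired_messages.py

```python
from pydantic_ai import Agent, ModelMessage, ModelRequest, ToolReturnPart


def keep_recent_paired_messages(messages: list[ModelMessage]) -> list[ModelMessage]:
    """Keep roughly the last 5 messages, dropping any leading orphaned tool returns."""
    recent = list(messages[-5:])
    while (
        recent
        and isinstance(recent[0], ModelRequest)
        and recent[0].parts
        and all(isinstance(part, ToolReturnPart) for part in recent[0].parts)
    ):
        # The matching tool call was truncated away, so drop the orphaned returns too
        recent.pop(0)
    return recent


agent = Agent('openai:gpt-4o', history_processors=[keep_recent_paired_messages])
```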
#### `RunContext` parameter History processors can optionally accept a RunContext parameter to access additional information about the current run, such as dependencies, model information, and usage statistics: context_aware_processor.py ```python from pydantic_ai import Agent, ModelMessage, RunContext def context_aware_processor( ctx: RunContext[None], messages: list[ModelMessage], ) -> list[ModelMessage]: # Access current usage current_tokens = ctx.usage.total_tokens # Filter messages based on context if current_tokens > 1000: return messages[-3:] # Keep only recent messages when token usage is high return messages agent = Agent('openai:gpt-4o', history_processors=[context_aware_processor]) ``` This allows for more sophisticated message processing based on the current state of the agent run. #### Summarize Old Messages Use an LLM to summarize older messages to preserve context while reducing tokens. summarize_old_messages.py ```python from pydantic_ai import Agent, ModelMessage # Use a cheaper model to summarize old messages. summarize_agent = Agent( 'openai:gpt-4o-mini', instructions=""" Summarize this conversation, omitting small talk and unrelated topics. Focus on the technical discussion and next steps. """, ) async def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]: # Summarize the oldest 10 messages if len(messages) > 10: oldest_messages = messages[:10] summary = await summarize_agent.run(message_history=oldest_messages) # Return the last message and the summary return summary.new_messages() + messages[-1:] return messages agent = Agent('openai:gpt-4o', history_processors=[summarize_old_messages]) ``` Be careful when summarizing the message history When summarizing the message history, you need to make sure that tool calls and returns are paired, otherwise the LLM may return an error. For more details, refer to [this GitHub issue](https://github.com/pydantic/pydantic-ai/issues/2050#issuecomment-3019976269), where you can find examples of summarizing the message history. 
### Testing History Processors You can test what messages are actually sent to the model provider using FunctionModel: test_history_processor.py ```python import pytest from pydantic_ai import ( Agent, ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) from pydantic_ai.models.function import AgentInfo, FunctionModel @pytest.fixture def received_messages() -> list[ModelMessage]: return [] @pytest.fixture def function_model(received_messages: list[ModelMessage]) -> FunctionModel: def capture_model_function(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse: # Capture the messages that the provider actually receives received_messages.clear() received_messages.extend(messages) return ModelResponse(parts=[TextPart(content='Provider response')]) return FunctionModel(capture_model_function) def test_history_processor(function_model: FunctionModel, received_messages: list[ModelMessage]): def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: return [msg for msg in messages if isinstance(msg, ModelRequest)] agent = Agent(function_model, history_processors=[filter_responses]) message_history = [ ModelRequest(parts=[UserPromptPart(content='Question 1')]), ModelResponse(parts=[TextPart(content='Answer 1')]), ] agent.run_sync('Question 2', message_history=message_history) assert received_messages == [ ModelRequest(parts=[UserPromptPart(content='Question 1')]), ModelRequest(parts=[UserPromptPart(content='Question 2')]), ] ``` ### Multiple Processors You can also use multiple processors: multiple_history_processors.py ```python from pydantic_ai import Agent, ModelMessage, ModelRequest def filter_responses(messages: list[ModelMessage]) -> list[ModelMessage]: return [msg for msg in messages if isinstance(msg, ModelRequest)] def summarize_old_messages(messages: list[ModelMessage]) -> list[ModelMessage]: return messages[-5:] agent = Agent('openai:gpt-4o', history_processors=[filter_responses, summarize_old_messages]) ``` In this case, the `filter_responses` processor will be applied first, and the `summarize_old_messages` processor will be applied second. ## Examples For a more complete example of using messages in conversations, see the [chat app](../examples/chat-app/) example. # Multi-agent Applications There are roughly four levels of complexity when building applications with Pydantic AI: 1. Single agent workflows — what most of the `pydantic_ai` documentation covers 1. [Agent delegation](#agent-delegation) — agents using another agent via tools 1. [Programmatic agent hand-off](#programmatic-agent-hand-off) — one agent runs, then application code calls another agent 1. [Graph based control flow](../graph/) — for the most complex cases, a graph-based state machine can be used to control the execution of multiple agents Of course, you can combine multiple strategies in a single application. ## Agent delegation "Agent delegation" refers to the scenario where an agent delegates work to another agent, then takes back control when the delegate agent (the agent called from within a tool) finishes. If you want to hand off control to another agent completely, without coming back to the first agent, you can use an [output function](../output/#output-functions). Since agents are stateless and designed to be global, you do not need to include the agent itself in agent [dependencies](../dependencies/). 
You'll generally want to pass ctx.usage to the usage keyword argument of the delegate agent run so usage within that run counts towards the total usage of the parent agent run. Multiple models Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final result.usage() of the run will not be possible, but you can still use UsageLimits — including `request_limit`, `total_tokens_limit`, and `tool_calls_limit` — to avoid unexpected costs or runaway tool loops. agent_delegation_simple.py ```python from pydantic_ai import Agent, RunContext, UsageLimits joke_selection_agent = Agent( # (1)! 'openai:gpt-4o', system_prompt=( 'Use the `joke_factory` to generate some jokes, then choose the best. ' 'You must return just a single joke.' ), ) joke_generation_agent = Agent( # (2)! 'google-gla:gemini-1.5-flash', output_type=list[str] ) @joke_selection_agent.tool async def joke_factory(ctx: RunContext[None], count: int) -> list[str]: r = await joke_generation_agent.run( # (3)! f'Please generate {count} jokes.', usage=ctx.usage, # (4)! ) return r.output # (5)! result = joke_selection_agent.run_sync( 'Tell me a joke.', usage_limits=UsageLimits(request_limit=5, total_tokens_limit=500), ) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. print(result.usage()) #> RunUsage(input_tokens=204, output_tokens=24, requests=3, tool_calls=1) ``` 1. The "parent" or controlling agent. 1. The "delegate" agent, which is called from within a tool of the parent agent. 1. Call the delegate agent from within a tool of the parent agent. 1. Pass the usage from the parent agent to the delegate agent so the final result.usage() includes the usage from both agents. 1. Since the function returns `list[str]`, and the `output_type` of `joke_generation_agent` is also `list[str]`, we can simply return `r.output` from the tool. *(This example is complete, it can be run "as is")* The control flow for this example is pretty simple and can be summarised as follows: ``` graph TD START --> joke_selection_agent joke_selection_agent --> joke_factory["joke_factory (tool)"] joke_factory --> joke_generation_agent joke_generation_agent --> joke_factory joke_factory --> joke_selection_agent joke_selection_agent --> END ``` ### Agent delegation and dependencies Generally the delegate agent needs to either have the same [dependencies](../dependencies/) as the calling agent, or dependencies which are a subset of the calling agent's dependencies. Initializing dependencies We say "generally" above since there's nothing to stop you initializing dependencies within a tool call and therefore using interdependencies in a delegate agent that are not available on the parent, this should often be avoided since it can be significantly slower than reusing connections etc. from the parent agent. agent_delegation_deps.py ```python from dataclasses import dataclass import httpx from pydantic_ai import Agent, RunContext @dataclass class ClientAndKey: # (1)! http_client: httpx.AsyncClient api_key: str joke_selection_agent = Agent( 'openai:gpt-4o', deps_type=ClientAndKey, # (2)! system_prompt=( 'Use the `joke_factory` tool to generate some jokes on the given subject, ' 'then choose the best. You must return just a single joke.' ), ) joke_generation_agent = Agent( 'google-gla:gemini-1.5-flash', deps_type=ClientAndKey, # (4)! 
output_type=list[str], system_prompt=( 'Use the "get_jokes" tool to get some jokes on the given subject, ' 'then extract each joke into a list.' ), ) @joke_selection_agent.tool async def joke_factory(ctx: RunContext[ClientAndKey], count: int) -> list[str]: r = await joke_generation_agent.run( f'Please generate {count} jokes.', deps=ctx.deps, # (3)! usage=ctx.usage, ) return r.output @joke_generation_agent.tool # (5)! async def get_jokes(ctx: RunContext[ClientAndKey], count: int) -> str: response = await ctx.deps.http_client.get( 'https://example.com', params={'count': count}, headers={'Authorization': f'Bearer {ctx.deps.api_key}'}, ) response.raise_for_status() return response.text async def main(): async with httpx.AsyncClient() as client: deps = ClientAndKey(client, 'foobar') result = await joke_selection_agent.run('Tell me a joke.', deps=deps) print(result.output) #> Did you hear about the toothpaste scandal? They called it Colgate. print(result.usage()) # (6)! #> RunUsage(input_tokens=309, output_tokens=32, requests=4, tool_calls=2) ``` 1. Define a dataclass to hold the client and API key dependencies. 1. Set the `deps_type` of the calling agent — `joke_selection_agent` here. 1. Pass the dependencies to the delegate agent's run method within the tool call. 1. Also set the `deps_type` of the delegate agent — `joke_generation_agent` here. 1. Define a tool on the delegate agent that uses the dependencies to make an HTTP request. 1. Usage now includes 4 requests — 2 from the calling agent and 2 from the delegate agent. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* This example shows how even a fairly simple agent delegation can lead to a complex control flow: ``` graph TD START --> joke_selection_agent joke_selection_agent --> joke_factory["joke_factory (tool)"] joke_factory --> joke_generation_agent joke_generation_agent --> get_jokes["get_jokes (tool)"] get_jokes --> http_request["HTTP request"] http_request --> get_jokes get_jokes --> joke_generation_agent joke_generation_agent --> joke_factory joke_factory --> joke_selection_agent joke_selection_agent --> END ``` ## Programmatic agent hand-off "Programmatic agent hand-off" refers to the scenario where multiple agents are called in succession, with application code and/or a human in the loop responsible for deciding which agent to call next. Here agents don't need to use the same deps. Here we show two agents used in succession, the first to find a flight and the second to extract the user's seat preference. programmatic_handoff.py ```python from typing import Literal from pydantic import BaseModel, Field from rich.prompt import Prompt from pydantic_ai import Agent, ModelMessage, RunContext, RunUsage, UsageLimits class FlightDetails(BaseModel): flight_number: str class Failed(BaseModel): """Unable to find a satisfactory choice.""" flight_search_agent = Agent[None, FlightDetails | Failed]( # (1)! 'openai:gpt-4o', output_type=FlightDetails | Failed, # type: ignore system_prompt=( 'Use the "flight_search" tool to find a flight ' 'from the given origin to the given destination.' ), ) @flight_search_agent.tool # (2)! async def flight_search( ctx: RunContext[None], origin: str, destination: str ) -> FlightDetails | None: # in reality, this would call a flight search API or # use a browser to scrape a flight search website return FlightDetails(flight_number='AK456') usage_limits = UsageLimits(request_limit=15) # (3)! 
async def find_flight(usage: RunUsage) -> FlightDetails | None: # (4)! message_history: list[ModelMessage] | None = None for _ in range(3): prompt = Prompt.ask( 'Where would you like to fly from and to?', ) result = await flight_search_agent.run( prompt, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, FlightDetails): return result.output else: message_history = result.all_messages( output_tool_return_content='Please try again.' ) class SeatPreference(BaseModel): row: int = Field(ge=1, le=30) seat: Literal['A', 'B', 'C', 'D', 'E', 'F'] # This agent is responsible for extracting the user's seat selection seat_preference_agent = Agent[None, SeatPreference | Failed]( # (5)! 'openai:gpt-4o', output_type=SeatPreference | Failed, # type: ignore system_prompt=( "Extract the user's seat preference. " 'Seats A and F are window seats. ' 'Row 1 is the front row and has extra leg room. ' 'Rows 14, and 20 also have extra leg room. ' ), ) async def find_seat(usage: RunUsage) -> SeatPreference: # (6)! message_history: list[ModelMessage] | None = None while True: answer = Prompt.ask('What seat would you like?') result = await seat_preference_agent.run( answer, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, SeatPreference): return result.output else: print('Could not understand seat preference. Please try again.') message_history = result.all_messages() async def main(): # (7)! usage: RunUsage = RunUsage() opt_flight_details = await find_flight(usage) if opt_flight_details is not None: print(f'Flight found: {opt_flight_details.flight_number}') #> Flight found: AK456 seat_preference = await find_seat(usage) print(f'Seat preference: {seat_preference}') #> Seat preference: row=1 seat='A' ``` 1. Define the first agent, which finds a flight. We use an explicit type annotation until [PEP-747](https://peps.python.org/pep-0747/) lands, see [structured output](../output/#structured-output). We use a union as the output type so the model can communicate if it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool. 1. Define a tool on the agent to find a flight. In this simple case we could dispense with the tool and just define the agent to return structured data, then search for a flight, but in more complex scenarios the tool would be necessary. 1. Define usage limits for the entire app. 1. Define a function to find a flight, which asks the user for their preferences and then calls the agent to find a flight. 1. As with `flight_search_agent` above, we use an explicit type annotation to define the agent. 1. Define a function to find the user's seat preference, which asks the user for their seat preference and then calls the agent to extract the seat preference. 1. Now that we've put our logic for running each agent into separate functions, our main app becomes very simple. 
*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* The control flow for this example can be summarised as follows: ``` graph TB START --> ask_user_flight["ask user for flight"] subgraph find_flight flight_search_agent --> ask_user_flight ask_user_flight --> flight_search_agent end flight_search_agent --> ask_user_seat["ask user for seat"] flight_search_agent --> END subgraph find_seat seat_preference_agent --> ask_user_seat ask_user_seat --> seat_preference_agent end seat_preference_agent --> END ``` ## Pydantic Graphs See the [graph](../graph/) documentation on when and how to use graphs. ## Examples The following examples demonstrate how to use multiple agents in Pydantic AI: - [Flight booking](../examples/flight-booking/) # Output "Output" refers to the final value returned from [running an agent](../agents/#running-agents). This can be either plain text, [structured data](#structured-output), an [image](#image-output), or the result of a [function](#output-functions) called with arguments provided by the model. The output is wrapped in AgentRunResult or StreamedRunResult so that you can access other data, like usage of the run and [message history](../message-history/#accessing-messages-from-results). Both `AgentRunResult` and `StreamedRunResult` are generic in the data they wrap, so typing information about the data returned by the agent is preserved. A run ends when the model responds with one of the output types, or, if no output type is specified or `str` is one of the allowed options, when a plain text response is received. A run can also be cancelled if usage limits are exceeded; see [Usage Limits](../agents/#usage-limits). Here's an example using a Pydantic model as the `output_type`, forcing the model to respond with data matching our specification: olympics.py ```python from pydantic import BaseModel from pydantic_ai import Agent class CityLocation(BaseModel): city: str country: str agent = Agent('google-gla:gemini-1.5-flash', output_type=CityLocation) result = agent.run_sync('Where were the olympics held in 2012?') print(result.output) #> city='London' country='United Kingdom' print(result.usage()) #> RunUsage(input_tokens=57, output_tokens=8, requests=1) ``` *(This example is complete, it can be run "as is")* ## Structured output data The Agent class constructor takes an `output_type` argument that accepts one or more types or [output functions](#output-functions). It supports simple scalar types, list and dict types (including `TypedDict`s and [`StructuredDict`s](#structured-dict)), dataclasses and Pydantic models, as well as type unions -- generally everything supported as type hints in a Pydantic model. You can also pass a list of multiple choices. By default, Pydantic AI leverages the model's tool calling capability to make it return structured data. When multiple output types are specified (in a union or list), each member is registered with the model as a separate output tool in order to reduce the complexity of the schema and maximise the chances a model will respond correctly. This has been shown to work well across a wide range of models. If you'd like to change the names of the output tools, use a model's native structured output feature, or pass the output schema to the model in its [instructions](../agents/#instructions), you can use an [output mode](#output-modes) marker class.
When no output type is specified, or when `str` is among the output types, any plain text response from the model will be used as the output data. If `str` is not among the output types, the model is forced to return structured data or call an output function. If the output type schema is not of type `"object"` (e.g. it's `int` or `list[int]`), the output type is wrapped in a single element object, so the schema of all tools registered with the model are object schemas. Structured outputs (like tools) use Pydantic to build the JSON schema used for the tool, and to validate the data returned by the model. Type checking considerations The Agent class is generic in its output type, and this type is carried through to `AgentRunResult.output` and `StreamedRunResult.output` so that your IDE or static type checker can warn you when your code doesn't properly take into account all the possible values those outputs could have. Static type checkers like pyright and mypy will do their best to infer the agent's output type from the `output_type` you've specified, but they're not always able to do so correctly when you provide functions or multiple types in a union or list, even though Pydantic AI will behave correctly. When this happens, your type checker will complain even when you're confident you've passed a valid `output_type`, and you'll need to help the type checker by explicitly specifying the generic parameters on the `Agent` constructor. This is shown in the second example below and the output functions example further down. Specifically, there are three valid uses of `output_type` where you'll need to do this: 1. When using a union of types, e.g. `output_type=Foo | Bar`. Until [PEP-747](https://peps.python.org/pep-0747/) "Annotating Type Forms" lands in Python 3.15, type checkers do not consider these a valid value for `output_type`. In addition to the generic parameters on the `Agent` constructor, you'll need to add `# type: ignore` to the line that passes the union to `output_type`. Alternatively, you can use a list: `output_type=[Foo, Bar]`. 1. With mypy: When using a list, as a functionally equivalent alternative to a union, or because you're passing in [output functions](#output-functions). Pyright does handle this correctly, and we've filed [an issue](https://github.com/python/mypy/issues/19142) with mypy to try and get this fixed. 1. With mypy: when using an async output function. Pyright does handle this correctly, and we've filed [an issue](https://github.com/python/mypy/issues/19143) with mypy to try and get this fixed. Here's an example of returning either text or structured data: box_or_error.py ```python from pydantic import BaseModel from pydantic_ai import Agent class Box(BaseModel): width: int height: int depth: int units: str agent = Agent( 'openai:gpt-4o-mini', output_type=[Box, str], # (1)! system_prompt=( "Extract me the dimensions of a box, " "if you can't extract all data, ask the user to try again." ), ) result = agent.run_sync('The box is 10x20x30') print(result.output) #> Please provide the units for the dimensions (e.g., cm, in, m). result = agent.run_sync('The box is 10x20x30 cm') print(result.output) #> width=10 height=20 depth=30 units='cm' ``` 1. This could also have been a union: `output_type=Box | str`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. 
*(This example is complete, it can be run "as is")* Here's an example of using a union return type, which will register multiple output tools and wrap non-object schemas in an object: colors_or_sizes.py ```python from pydantic_ai import Agent agent = Agent[None, list[str] | list[int]]( 'openai:gpt-4o-mini', output_type=list[str] | list[int], # type: ignore # (1)! system_prompt='Extract either colors or sizes from the shapes provided.', ) result = agent.run_sync('red square, blue circle, green triangle') print(result.output) #> ['red', 'blue', 'green'] result = agent.run_sync('square size 10, circle size 20, triangle size 30') print(result.output) #> [10, 20, 30] ``` 1. As explained in the "Type checking considerations" section above, using a union rather than a list requires explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. *(This example is complete, it can be run "as is")* ### Output functions Instead of plain text or structured data, you may want the output of your agent run to be the result of a function called with arguments provided by the model, for example to further process or validate the data provided through the arguments (with the option to tell the model to try again), or to hand off to another agent. Output functions are similar to [function tools](../tools/), but the model is forced to call one of them, the call ends the agent run, and the result is not passed back to the model. As with tool functions, output function arguments provided by the model are validated using Pydantic, they can optionally take RunContext as the first argument, and they can raise ModelRetry to ask the model to try again with modified arguments (or with a different output type). To specify output functions, you set the agent's `output_type` to either a single function (or bound instance method), or a list of functions. The list can also contain other output types like simple scalars or entire Pydantic models. You typically do not want to also register your output function as a tool (using the `@agent.tool` decorator or `tools` argument), as this could confuse the model about which it should be calling. Here's an example of all of these features in action: output_functions.py ```python import re from pydantic import BaseModel from pydantic_ai import Agent, ModelRetry, RunContext, UnexpectedModelBehavior class Row(BaseModel): name: str country: str tables = { 'capital_cities': [ Row(name='Amsterdam', country='Netherlands'), Row(name='Mexico City', country='Mexico'), ] } class SQLFailure(BaseModel): """An unrecoverable failure. Only use this when you can't change the query to make it work.""" explanation: str def run_sql_query(query: str) -> list[Row]: """Run a SQL query on the database.""" select_table = re.match(r'SELECT (.+) FROM (\w+)', query) if select_table: column_names = select_table.group(1) if column_names != '*': raise ModelRetry("Only 'SELECT *' is supported, you'll have to do column filtering manually.") table_name = select_table.group(2) if table_name not in tables: raise ModelRetry( f"Unknown table '{table_name}' in query '{query}'. Available tables: {', '.join(tables.keys())}." 
) return tables[table_name] raise ModelRetry(f"Unsupported query: '{query}'.") sql_agent = Agent[None, list[Row] | SQLFailure]( 'openai:gpt-4o', output_type=[run_sql_query, SQLFailure], instructions='You are a SQL agent that can run SQL queries on a database.', ) async def hand_off_to_sql_agent(ctx: RunContext, query: str) -> list[Row]: """I take natural language queries, turn them into SQL, and run them on a database.""" # Drop the final message with the output tool call, as it shouldn't be passed on to the SQL agent messages = ctx.messages[:-1] try: result = await sql_agent.run(query, message_history=messages) output = result.output if isinstance(output, SQLFailure): raise ModelRetry(f'SQL agent failed: {output.explanation}') return output except UnexpectedModelBehavior as e: # Bubble up potentially retryable errors to the router agent if (cause := e.__cause__) and isinstance(cause, ModelRetry): raise ModelRetry(f'SQL agent failed: {cause.message}') from e else: raise class RouterFailure(BaseModel): """Use me when no appropriate agent is found or the used agent failed.""" explanation: str router_agent = Agent[None, list[Row] | RouterFailure]( 'openai:gpt-4o', output_type=[hand_off_to_sql_agent, RouterFailure], instructions='You are a router to other agents. Never try to solve a problem yourself, just pass it on.', ) result = router_agent.run_sync('Select the names and countries of all capitals') print(result.output) """ [ Row(name='Amsterdam', country='Netherlands'), Row(name='Mexico City', country='Mexico'), ] """ result = router_agent.run_sync('Select all pets') print(repr(result.output)) """ RouterFailure(explanation="The requested table 'pets' does not exist in the database. The only available table is 'capital_cities', which does not contain data about pets.") """ result = router_agent.run_sync('How do I fly from Amsterdam to Mexico City?') print(repr(result.output)) """ RouterFailure(explanation='I am not equipped to provide travel information, such as flights from Amsterdam to Mexico City.') """ ``` #### Text output If you provide an output function that takes a string, Pydantic AI will by default create an output tool like for any other output function. If instead you'd like the model to provide the string using plain text output, you can wrap the function in the TextOutput marker class. If desired, this marker class can be used alongside one or more [`ToolOutput`](#tool-output) marker classes (or unmarked types or functions) in a list provided to `output_type`. text_output_function.py ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'openai:gpt-4o', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` *(This example is complete, it can be run "as is")* ### Output modes Pydantic AI implements three different methods to get a model to output structured data: 1. [Tool Output](#tool-output), where tool calls are used to produce the output. 1. [Native Output](#native-output), where the model is required to produce text content compliant with a provided JSON schema. 1. [Prompted Output](#prompted-output), where a prompt is injected into the model instructions including the desired JSON schema, and we attempt to parse the model's plain-text response as appropriate. 
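Before diving into each mode, here's a rough sketch of how the three marker classes are selected; the `Fruit` model and model string are placeholders, and the details (custom names, descriptions, and per-model restrictions) are covered in the subsections below.

output_modes_overview.py

```python
from pydantic import BaseModel

from pydantic_ai import Agent, NativeOutput, PromptedOutput, ToolOutput


class Fruit(BaseModel):
    name: str
    color: str


# The same output type requested via each of the three modes
tool_agent = Agent('openai:gpt-4o', output_type=ToolOutput(Fruit))  # the default, made explicit
native_agent = Agent('openai:gpt-4o', output_type=NativeOutput(Fruit))
prompted_agent = Agent('openai:gpt-4o', output_type=PromptedOutput(Fruit))
```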
#### Tool Output In the default Tool Output mode, the output JSON schema of each output type (or function) is provided to the model as the parameters schema of a special output tool. This is the default as it's supported by virtually all models and has been shown to work very well. If you'd like to change the name of the output tool, pass a custom description to aid the model, or turn on or off strict mode, you can wrap the type(s) in the ToolOutput marker class and provide the appropriate arguments. Note that by default, the description is taken from the docstring specified on a Pydantic model or output function, so specifying it using the marker class is typically not necessary. To dynamically modify or filter the available output tools during an agent run, you can define an agent-wide `prepare_output_tools` function that will be called ahead of each step of a run. This function should be of type ToolsPrepareFunc, which takes the RunContext and a list of ToolDefinition, and returns a new list of tool definitions (or `None` to disable all tools for that step). This is analogous to the [`prepare_tools` function](../tools-advanced/#prepare-tools) for non-output tools. tool_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'openai:gpt-4o', output_type=[ # (1)! ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` 1. If we were passing just `Fruit` and `Vehicle` without custom tool names, we could have used a union: `output_type=Fruit | Vehicle`. However, as `ToolOutput` is an object rather than a type, we have to use a list. *(This example is complete, it can be run "as is")* #### Native Output Native Output mode uses a model's native "Structured Outputs" feature (aka "JSON Schema response format"), where the model is forced to only output text matching the provided JSON schema. Note that this is not supported by all models, and sometimes comes with restrictions. For example, Anthropic does not support it at all, and Gemini cannot use tools at the same time as structured output; attempting to do so will result in an error. To use this mode, you can wrap the output type(s) in the NativeOutput marker class, which also lets you specify a `name` and `description` if the name and docstring of the type or function are not sufficient. native_output.py ```python from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'openai:gpt-4o', output_type=NativeOutput( [Fruit, Vehicle], # (1)! name='Fruit_or_vehicle', description='Return a fruit or vehicle.' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` 1. This could also have been a union: `output_type=Fruit | Vehicle`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly.
*(This example is complete, it can be run "as is")* #### Prompted Output In this mode, the model is prompted to output text matching the provided JSON schema through its [instructions](../agents/#instructions) and it's up to the model to interpret those instructions correctly. This is usable with all models, but is often the least reliable approach as the model is not forced to match the schema. While we would generally suggest starting with tool or native output, in some cases this mode may result in higher quality outputs, and for models without native tool calling or structured output support it is the only option for producing structured outputs. If the model API supports the "JSON Mode" feature (aka "JSON Object response format") to force the model to output valid JSON, this is enabled, but it's still up to the model to abide by the schema. Pydantic AI will validate the returned structured data and tell the model to try again if validation fails, but if the model is not intelligent enough this may not be sufficient. To use this mode, you can wrap the output type(s) in the PromptedOutput marker class, which also lets you specify a `name` and `description` if the name and docstring of the type or function are not sufficient. Additionally, it supports a `template` argument that lets you specify a custom instructions template to be used instead of the default. prompted_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'openai:gpt-4o', output_type=PromptedOutput( [Vehicle, Device], # (1)! name='Vehicle or device', description='Return a vehicle or device.' ), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'openai:gpt-4o', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` 1. This could also have been a union: `output_type=Vehicle | Device`. However, as explained in the "Type checking considerations" section above, that would've required explicitly specifying the generic parameters on the `Agent` constructor and adding `# type: ignore` to this line in order to be type checked correctly. *(This example is complete, it can be run "as is")* ### Custom JSON schema If it's not feasible to define your desired structured output object using a Pydantic `BaseModel`, dataclass, or `TypedDict`, for example when you get a JSON schema from an external source or generate it dynamically, you can use the StructuredDict() helper function to generate a `dict[str, Any]` subclass with a JSON schema attached that Pydantic AI will pass to the model. Note that Pydantic AI will not perform any validation of the received JSON object and it's up to the model to correctly interpret the schema and any constraints expressed in it, like required fields or integer value ranges. The output type will be a `dict[str, Any]` and it's up to your code to defensively read from it in case the model made a mistake. You can use an [output validator](#output-validator-functions) to reflect validation errors back to the model and get it to try again.
Along with the JSON schema, you can optionally pass `name` and `description` arguments to provide additional context to the model: ```python from pydantic_ai import Agent, StructuredDict HumanDict = StructuredDict( { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] }, name='Human', description='A human with a name and age', ) agent = Agent('openai:gpt-4o', output_type=HumanDict) result = agent.run_sync('Create a person') print(result.output) #> {'name': 'John Doe', 'age': 30} ``` ### Output validators Some validation is inconvenient or impossible to do in Pydantic validators, in particular when the validation requires IO and is asynchronous. Pydantic AI provides a way to add validation functions via the agent.output_validator decorator. If you want to implement separate validation logic for different output types, it's recommended to use [output functions](#output-functions) instead, to save you from having to do `isinstance` checks inside the output validator. If you want the model to output plain text, do your own processing or validation, and then have the agent's final output be the result of your function, it's recommended to use an [output function](#output-functions) with the [`TextOutput` marker class](#text-output). Here's a simplified variant of the [SQL Generation example](../examples/sql-gen/): sql_gen.py ```python from fake_database import DatabaseConn, QueryError from pydantic import BaseModel from pydantic_ai import Agent, RunContext, ModelRetry class Success(BaseModel): sql_query: str class InvalidRequest(BaseModel): error_message: str Output = Success | InvalidRequest agent = Agent[DatabaseConn, Output]( 'google-gla:gemini-1.5-flash', output_type=Output, # type: ignore deps_type=DatabaseConn, system_prompt='Generate PostgreSQL flavored SQL queries based on user input.', ) @agent.output_validator async def validate_sql(ctx: RunContext[DatabaseConn], output: Output) -> Output: if isinstance(output, InvalidRequest): return output try: await ctx.deps.execute(f'EXPLAIN {output.sql_query}') except QueryError as e: raise ModelRetry(f'Invalid query: {e}') from e else: return output result = agent.run_sync( 'get me users who were last active yesterday.', deps=DatabaseConn() ) print(result.output) #> sql_query='SELECT * FROM users WHERE last_active::date = today() - interval 1 day' ``` *(This example is complete, it can be run "as is")* ## Image output Some models can generate images as part of their response, for example those that support the [Image Generation built-in tool](../builtin-tools/#image-generation-tool) and OpenAI models using the [Code Execution built-in tool](../builtin-tools/#code-execution-tool) when told to generate a chart. To use the generated image as the output of the agent run, you can set `output_type` to BinaryImage. If no image-generating built-in tool is explicitly specified, the ImageGenerationTool will be enabled automatically. image_output.py ```py from pydantic_ai import Agent, BinaryImage agent = Agent('openai-responses:gpt-5', output_type=BinaryImage) result = agent.run_sync('Generate an image of an axolotl.') assert isinstance(result.output, BinaryImage) ``` *(This example is complete, it can be run "as is")* If an agent does not need to always generate an image, you can use a union of `BinaryImage` and `str`.
If the model generates both, the image will take precedence as output and the text will be available on ModelResponse.text: image_output_union.py ```py from pydantic_ai import Agent, BinaryImage agent = Agent('openai-responses:gpt-5', output_type=BinaryImage | str) result = agent.run_sync('Tell me a two-sentence story about an axolotl, no image please.') print(result.output) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ result = agent.run_sync('Tell me a two-sentence story about an axolotl with an illustration.') assert isinstance(result.output, BinaryImage) print(result.response.text) """ Once upon a time, in a hidden underwater cave, lived a curious axolotl named Pip who loved to explore. One day, while venturing further than usual, Pip discovered a shimmering, ancient coin that granted wishes! """ ``` ## Streamed Results There are two main challenges with streamed results: 1. Validating structured responses before they're complete; this is achieved by "partial validation", which was recently added to Pydantic in [pydantic/pydantic#10748](https://github.com/pydantic/pydantic/pull/10748). 1. When receiving a response, we don't know if it's the final response without starting to stream it and peeking at the content. Pydantic AI streams just enough of the response to sniff out if it's a tool call or an output, then streams the whole thing and calls tools, or returns the stream as a StreamedRunResult. Note As the `run_stream()` method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream all events from the model's streaming response and the agent's execution of tools, use agent.run_stream_events() ([docs](../agents/#streaming-all-events)) or agent.iter() ([docs](../agents/#streaming-all-events-and-output)) instead. ### Streaming Text Example of streamed text output: streamed_hello_world.py ```python from pydantic_ai import Agent agent = Agent('google-gla:gemini-1.5-flash') # (1)! async def main(): async with agent.run_stream('Where does "hello world" come from?') as result: # (2)! async for message in result.stream_text(): # (3)! print(message) #> The first known #> The first known use of "hello, #> The first known use of "hello, world" was in #> The first known use of "hello, world" was in a 1974 textbook #> The first known use of "hello, world" was in a 1974 textbook about the C #> The first known use of "hello, world" was in a 1974 textbook about the C programming language. ``` 1. Streaming works with the standard Agent class, and doesn't require any special setup, just a model that supports streaming (currently all models support streaming). 1. The Agent.run_stream() method is used to start a streamed run; it returns a context manager so the connection can be closed when the stream completes. 1. Each item yielded by StreamedRunResult.stream_text() is the complete text response, extended as new data is received.
*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* We can also stream text as deltas rather than the entire text in each item: streamed_delta_hello_world.py ```python from pydantic_ai import Agent agent = Agent('google-gla:gemini-1.5-flash') async def main(): async with agent.run_stream('Where does "hello world" come from?') as result: async for message in result.stream_text(delta=True): # (1)! print(message) #> The first known #> use of "hello, #> world" was in #> a 1974 textbook #> about the C #> programming language. ``` 1. stream_text will error if the response is not text. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* Output message not included in `messages` The final output message will **NOT** be added to result messages if you use `.stream_text(delta=True)`, see [Messages and chat history](../message-history/) for more information. ### Streaming Structured Output Here's an example of streaming a user profile as it's built: streamed_user_profile.py ```python from datetime import date from typing_extensions import NotRequired, TypedDict from pydantic_ai import Agent class UserProfile(TypedDict): name: str dob: NotRequired[date] bio: NotRequired[str] agent = Agent( 'openai:gpt-4o', output_type=UserProfile, system_prompt='Extract a user profile from the input', ) async def main(): user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.' async with agent.run_stream(user_input) as result: async for profile in result.stream_output(): print(profile) #> {'name': 'Ben'} #> {'name': 'Ben'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* As setting an `output_type` uses the [Tool Output](#tool-output) mode by default, this will only work if the model supports streaming tool arguments. For models that don't, like Gemini, try [Native Output](#native-output) or [Prompted Output](#prompted-output) instead. ### Streaming Model Responses If you want fine-grained control of validation, you can use the following pattern to get the entire partial ModelResponse: streamed_user_profile.py ```python from datetime import date from pydantic import ValidationError from typing_extensions import TypedDict from pydantic_ai import Agent class UserProfile(TypedDict, total=False): name: str dob: date bio: str agent = Agent('openai:gpt-4o', output_type=UserProfile) async def main(): user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.' async with agent.run_stream(user_input) as result: async for message, last in result.stream_responses(debounce_by=0.01): # (1)! try: profile = await result.validate_response_output( # (2)! 
message, allow_partial=not last, ) except ValidationError: continue print(profile) #> {'name': 'Ben'} #> {'name': 'Ben'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the '} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyr'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} #> {'name': 'Ben', 'dob': date(1990, 1, 28), 'bio': 'Likes the chain the dog and the pyramid'} ``` 1. stream_responses streams the data as ModelResponse objects, thus iteration can't fail with a `ValidationError`. 1. validate_response_output validates the data, `allow_partial=True` enables pydantic's experimental_allow_partial flag on TypeAdapter. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Examples The following examples demonstrate how to use streamed responses in Pydantic AI: - [Stream markdown](../examples/stream-markdown/) - [Stream Whales](../examples/stream-whales/) # HTTP Request Retries Pydantic AI provides retry functionality for HTTP requests made by model providers through custom HTTP transports. This is particularly useful for handling transient failures like rate limits, network timeouts, or temporary server errors. ## Overview The retry functionality is built on top of the [tenacity](https://github.com/jd/tenacity) library and integrates seamlessly with httpx clients. You can configure retry behavior for any provider that accepts a custom HTTP client. ## Installation To use the retry transports, you need to install `tenacity`, which you can do via the `retries` dependency group: ```bash pip install 'pydantic-ai-slim[retries]' ``` ```bash uv add 'pydantic-ai-slim[retries]' ``` ## Usage Example Here's an example of adding retry functionality with smart retry handling: smart_retry_example.py ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after def create_retrying_client(): """Create a client with smart retry handling for multiple error types.""" def should_retry_status(response): """Raise exceptions for retryable HTTP status codes.""" if response.status_code in (429, 502, 503, 504): response.raise_for_status() # This will raise HTTPStatusError transport = AsyncTenacityTransport( config=RetryConfig( # Retry on HTTP errors and connection issues retry=retry_if_exception_type((HTTPStatusError, ConnectionError)), # Smart waiting: respects Retry-After headers, falls back to exponential backoff wait=wait_retry_after( fallback_strategy=wait_exponential(multiplier=1, max=60), max_wait=300 ), # Stop after 5 attempts stop=stop_after_attempt(5), # Re-raise the last exception if all retries fail reraise=True ), validate_response=should_retry_status ) return AsyncClient(transport=transport) # Use the retrying client with a model client = create_retrying_client() model = OpenAIChatModel('gpt-4o', provider=OpenAIProvider(http_client=client)) agent = Agent(model) ``` ## Wait Strategies ### wait_retry_after The `wait_retry_after` function is a smart wait strategy that automatically respects HTTP `Retry-After` headers: wait_strategy_example.py ```python from tenacity import 
wait_exponential from pydantic_ai.retries import wait_retry_after # Basic usage - respects Retry-After headers, falls back to exponential backoff wait_strategy_1 = wait_retry_after() # Custom configuration wait_strategy_2 = wait_retry_after( fallback_strategy=wait_exponential(multiplier=2, max=120), max_wait=600 # Never wait more than 10 minutes ) ``` This wait strategy: - Automatically parses `Retry-After` headers from HTTP 429 responses - Supports both seconds format (`"30"`) and HTTP date format (`"Wed, 21 Oct 2015 07:28:00 GMT"`) - Falls back to your chosen strategy when no header is present - Respects the `max_wait` limit to prevent excessive delays ## Transport Classes ### AsyncTenacityTransport For asynchronous HTTP clients (recommended for most use cases): async_transport_example.py ```python from httpx import AsyncClient from tenacity import stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig def validator(response): """Treat responses with HTTP status 4xx/5xx as failures that need to be retried. Without a response validator, only network errors and timeouts will result in a retry. """ response.raise_for_status() # Create the transport transport = AsyncTenacityTransport( config=RetryConfig(stop=stop_after_attempt(3), reraise=True), validate_response=validator ) # Create a client using the transport: client = AsyncClient(transport=transport) ``` ### TenacityTransport For synchronous HTTP clients: sync_transport_example.py ```python from httpx import Client from tenacity import stop_after_attempt from pydantic_ai.retries import RetryConfig, TenacityTransport def validator(response): """Treat responses with HTTP status 4xx/5xx as failures that need to be retried. Without a response validator, only network errors and timeouts will result in a retry. """ response.raise_for_status() # Create the transport transport = TenacityTransport( config=RetryConfig(stop=stop_after_attempt(3), reraise=True), validate_response=validator ) # Create a client using the transport client = Client(transport=transport) ``` ## Common Retry Patterns ### Rate Limit Handling with Retry-After Support rate_limit_handling.py ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after def create_rate_limit_client(): """Create a client that respects Retry-After headers from rate limiting responses.""" transport = AsyncTenacityTransport( config=RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after( fallback_strategy=wait_exponential(multiplier=1, max=60), max_wait=300 # Don't wait more than 5 minutes ), stop=stop_after_attempt(10), reraise=True ), validate_response=lambda r: r.raise_for_status() # Raises HTTPStatusError for 4xx/5xx ) return AsyncClient(transport=transport) # Example usage client = create_rate_limit_client() # Client is now ready to use with any HTTP requests and will respect Retry-After headers ``` The `wait_retry_after` function automatically detects `Retry-After` headers in 429 (rate limit) responses and waits for the specified time. If no header is present, it falls back to exponential backoff. 
### Network Error Handling network_error_handling.py ```python import httpx from tenacity import retry_if_exception_type, stop_after_attempt, wait_exponential from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig def create_network_resilient_client(): """Create a client that handles network errors with retries.""" transport = AsyncTenacityTransport( config=RetryConfig( retry=retry_if_exception_type(( httpx.TimeoutException, httpx.ConnectError, httpx.ReadError )), wait=wait_exponential(multiplier=1, max=10), stop=stop_after_attempt(3), reraise=True ) ) return httpx.AsyncClient(transport=transport) # Example usage client = create_network_resilient_client() # Client will now retry on timeout, connection, and read errors ``` ### Custom Retry Logic custom_retry_logic.py ```python import httpx from tenacity import stop_after_attempt, wait_exponential from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after def create_custom_retry_client(): """Create a client with custom retry logic.""" def custom_retry_condition(exception): """Custom logic to determine if we should retry.""" if isinstance(exception, httpx.HTTPStatusError): # Retry on server errors but not client errors return 500 <= exception.response.status_code < 600 return isinstance(exception, httpx.TimeoutException | httpx.ConnectError) transport = AsyncTenacityTransport( config=RetryConfig( retry=custom_retry_condition, # Use wait_retry_after for smart waiting on rate limits, # with custom exponential backoff as fallback wait=wait_retry_after( fallback_strategy=wait_exponential(multiplier=2, max=30), max_wait=120 ), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) return httpx.AsyncClient(transport=transport) client = create_custom_retry_client() # Client will retry server errors (5xx) and network errors, but not client errors (4xx) ``` ## Using with Different Providers The retry transports work with any provider that accepts a custom HTTP client: ### OpenAI openai_with_retries.py ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider from smart_retry_example import create_retrying_client client = create_retrying_client() model = OpenAIChatModel('gpt-4o', provider=OpenAIProvider(http_client=client)) agent = Agent(model) ``` ### Anthropic anthropic_with_retries.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider from smart_retry_example import create_retrying_client client = create_retrying_client() model = AnthropicModel('claude-3-5-sonnet-20241022', provider=AnthropicProvider(http_client=client)) agent = Agent(model) ``` ### Any OpenAI-Compatible Provider openai_compatible_with_retries.py ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider from smart_retry_example import create_retrying_client client = create_retrying_client() model = OpenAIChatModel( 'your-model-name', # Replace with actual model name provider=OpenAIProvider( base_url='https://api.example.com/v1', # Replace with actual API URL api_key='your-api-key', # Replace with actual API key http_client=client ) ) agent = Agent(model) ``` ## Best Practices 1. **Start Conservative**: Begin with a small number of retries (3-5) and reasonable wait times. 1. 
**Use Exponential Backoff**: This helps avoid overwhelming servers during outages. 1. **Set Maximum Wait Times**: Prevent indefinite delays with reasonable maximum wait times. 1. **Handle Rate Limits Properly**: Respect `Retry-After` headers when possible. 1. **Log Retry Attempts**: Add logging to monitor retry behavior in production. (This will be picked up by Logfire automatically if you instrument httpx.) 1. **Consider Circuit Breakers**: For high-traffic applications, consider implementing circuit breaker patterns. ## Error Handling The retry transports will re-raise the last exception if all retry attempts fail. Make sure to handle these exceptions appropriately in your application: error_handling_example.py ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider from smart_retry_example import create_retrying_client client = create_retrying_client() model = OpenAIChatModel('gpt-4o', provider=OpenAIProvider(http_client=client)) agent = Agent(model) # If all retry attempts fail, the last exception is re-raised and will surface from the agent run, so catch it in your application code. ``` ## Performance Considerations - Retries add latency to requests, especially with exponential backoff - Consider the total timeout for your application when configuring retry behavior - Monitor retry rates to detect systemic issues - Use async transports for better concurrency when handling multiple requests For more advanced retry configurations, refer to the [tenacity documentation](https://tenacity.readthedocs.io/). # Thinking Thinking (or reasoning) is the process by which a model works through a problem step-by-step before providing its final answer. This capability is typically disabled by default and depends on the specific model being used. See the sections below for how to enable thinking for each provider. ## OpenAI When using the OpenAIChatModel, text output inside `<think>` tags is converted to ThinkingPart objects. You can customize the tags using the thinking_tags field on the [model profile](../models/openai/#model-profile). ### OpenAI Responses The OpenAIResponsesModel can generate native thinking parts. To enable this functionality, you need to set the OpenAIResponsesModelSettings.openai_reasoning_effort and OpenAIResponsesModelSettings.openai_reasoning_summary [model settings](../agents/#model-run-settings). By default, the unique IDs of reasoning, text, and function call parts from the message history are sent to the model, which can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](../message-history/#processing-message-history). To disable this behavior, turn off the OpenAIResponsesModelSettings.openai_send_reasoning_ids [model setting](../agents/#model-run-settings). openai_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings model = OpenAIResponsesModel('gpt-5') settings = OpenAIResponsesModelSettings( openai_reasoning_effort='low', openai_reasoning_summary='detailed', ) agent = Agent(model, model_settings=settings) ... ``` ## Anthropic To enable thinking, use the AnthropicModelSettings.anthropic_thinking [model setting](../agents/#model-run-settings).
anthropic_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel, AnthropicModelSettings model = AnthropicModel('claude-sonnet-4-0') settings = AnthropicModelSettings( anthropic_thinking={'type': 'enabled', 'budget_tokens': 1024}, ) agent = Agent(model, model_settings=settings) ... ``` ## Google To enable thinking, use the GoogleModelSettings.google_thinking_config [model setting](../agents/#model-run-settings). google_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings model = GoogleModel('gemini-2.5-pro') settings = GoogleModelSettings(google_thinking_config={'include_thoughts': True}) agent = Agent(model, model_settings=settings) ... ``` ## Bedrock ## Groq Groq supports different formats to receive thinking parts: - `"raw"`: The thinking part is included in the text content inside `<think>` tags, which are automatically converted to ThinkingPart objects. - `"hidden"`: The thinking part is not included in the text content. - `"parsed"`: The thinking part has its own structured part in the response which is converted into a ThinkingPart object. To enable thinking, use the GroqModelSettings.groq_reasoning_format [model setting](../agents/#model-run-settings): groq_thinking_part.py ```python from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel, GroqModelSettings model = GroqModel('qwen-qwq-32b') settings = GroqModelSettings(groq_reasoning_format='parsed') agent = Agent(model, model_settings=settings) ... ``` ## Mistral Thinking is supported by the `magistral` family of models. It does not need to be specifically enabled. ## Cohere Thinking is supported by the `command-a-reasoning-08-2025` model. It does not need to be specifically enabled. ## Hugging Face Text output inside `<think>` tags is automatically converted to ThinkingPart objects. You can customize the tags using the thinking_tags field on the [model profile](../models/openai/#model-profile). # Third-Party Tools Pydantic AI supports integration with various third-party tool libraries, allowing you to leverage existing tool ecosystems in your agents. ## MCP Tools See the [MCP Client](../mcp/client/) documentation for how to use MCP servers with Pydantic AI as [toolsets](../toolsets/). ## LangChain Tools If you'd like to use a tool from LangChain's [community tool library](https://python.langchain.com/docs/integrations/tools/) with Pydantic AI, you can use the tool_from_langchain convenience method. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the LangChain tool, and up to the LangChain tool to raise an error if the arguments are invalid. You will need to install the `langchain-community` package and any others required by the tool in question. Here is how you can use the LangChain `DuckDuckGoSearchRun` tool, which requires the `ddgs` package: ```python from langchain_community.tools import DuckDuckGoSearchRun from pydantic_ai import Agent from pydantic_ai.ext.langchain import tool_from_langchain search = DuckDuckGoSearchRun() search_tool = tool_from_langchain(search) agent = Agent( 'google-gla:gemini-2.0-flash', tools=[search_tool], ) result = agent.run_sync('What is the release date of Elden Ring Nightreign?') # (1)! print(result.output) #> Elden Ring Nightreign is planned to be released on May 30, 2025. ``` 1.
The release date of this game is the 30th of May 2025, which is after the knowledge cutoff for Gemini 2.0 (August 2024). If you'd like to use multiple LangChain tools or a LangChain [toolkit](https://python.langchain.com/docs/concepts/tools/#toolkits), you can use the LangChainToolset [toolset](../toolsets/) which takes a list of LangChain tools: ```python from langchain_community.agent_toolkits import SlackToolkit from pydantic_ai import Agent from pydantic_ai.ext.langchain import LangChainToolset toolkit = SlackToolkit() toolset = LangChainToolset(toolkit.get_tools()) agent = Agent('openai:gpt-4o', toolsets=[toolset]) # ... ``` ## ACI.dev Tools If you'd like to use a tool from the [ACI.dev tool library](https://www.aci.dev/tools) with Pydantic AI, you can use the tool_from_aci convenience method. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the ACI tool, and up to the ACI tool to raise an error if the arguments are invalid. You will need to install the `aci-sdk` package, set your ACI API key in the `ACI_API_KEY` environment variable, and pass your ACI "linked account owner ID" to the function. Here is how you can use the ACI.dev `TAVILY__SEARCH` tool: ```python import os from pydantic_ai import Agent from pydantic_ai.ext.aci import tool_from_aci tavily_search = tool_from_aci( 'TAVILY__SEARCH', linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'), ) agent = Agent( 'google-gla:gemini-2.0-flash', tools=[tavily_search], ) result = agent.run_sync('What is the release date of Elden Ring Nightreign?') # (1)! print(result.output) #> Elden Ring Nightreign is planned to be released on May 30, 2025. ``` 1. The release date of this game is the 30th of May 2025, which is after the knowledge cutoff for Gemini 2.0 (August 2024). If you'd like to use multiple ACI.dev tools, you can use the ACIToolset [toolset](../toolsets/) which takes a list of ACI tool names as well as the `linked_account_owner_id`: ```python import os from pydantic_ai import Agent from pydantic_ai.ext.aci import ACIToolset toolset = ACIToolset( [ 'OPEN_WEATHER_MAP__CURRENT_WEATHER', 'OPEN_WEATHER_MAP__FORECAST', ], linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'), ) agent = Agent('openai:gpt-4o', toolsets=[toolset]) ``` ## See Also - [Function Tools](../tools/) - Basic tool concepts and registration - [Toolsets](../toolsets/) - Managing collections of tools - [MCP Client](../mcp/client/) - Using MCP servers with Pydantic AI - [LangChain Toolsets](../toolsets/#langchain-tools) - Using LangChain toolsets - [ACI.dev Toolsets](../toolsets/#aci-tools) - Using ACI.dev toolsets # Advanced Tool Features This page covers advanced features for function tools in Pydantic AI. For basic tool usage, see the [Function Tools](../tools/) documentation. 
## Tool Output Tools can return anything that Pydantic can serialize to JSON, as well as audio, video, image or document content depending on the types of [multi-modal input](../input/) the model supports: function_tool_output.py ```python from datetime import datetime from pydantic import BaseModel from pydantic_ai import Agent, DocumentUrl, ImageUrl from pydantic_ai.models.openai import OpenAIResponsesModel class User(BaseModel): name: str age: int agent = Agent(model=OpenAIResponsesModel('gpt-4o')) @agent.tool_plain def get_current_time() -> datetime: return datetime.now() @agent.tool_plain def get_user() -> User: return User(name='John', age=30) @agent.tool_plain def get_company_logo() -> ImageUrl: return ImageUrl(url='https://iili.io/3Hs4FMg.png') @agent.tool_plain def get_document() -> DocumentUrl: return DocumentUrl(url='https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf') result = agent.run_sync('What time is it?') print(result.output) #> The current time is 10:45 PM on April 17, 2025. result = agent.run_sync('What is the user name?') print(result.output) #> The user's name is John. result = agent.run_sync('What is the company name in the logo?') print(result.output) #> The company name in the logo is "Pydantic." result = agent.run_sync('What is the main content of the document?') print(result.output) #> The document contains just the text "Dummy PDF file." ``` *(This example is complete, it can be run "as is")* Some models (e.g. Gemini) natively support semi-structured return values, while some expect text (OpenAI) but seem to be just as good at extracting meaning from the data. If a Python object is returned and the model expects a string, the value will be serialized to JSON. ### Advanced Tool Returns For scenarios where you need more control over both the tool's return value and the content sent to the model, you can use ToolReturn. This is particularly useful when you want to: - Provide rich multi-modal content (images, documents, etc.) to the model as context - Separate the programmatic return value from the model's context - Include additional metadata that shouldn't be sent to the LLM Here's an example of a computer automation tool that captures screenshots and provides visual feedback: advanced_tool_return.py ```python import time from pydantic_ai import Agent from pydantic_ai import ToolReturn, BinaryContent agent = Agent('openai:gpt-4o') @agent.tool_plain def click_and_capture(x: int, y: int) -> ToolReturn: """Click at coordinates and show before/after screenshots.""" # Take screenshot before action before_screenshot = capture_screen() # Perform click operation perform_click(x, y) time.sleep(0.5) # Wait for UI to update # Take screenshot after action after_screenshot = capture_screen() return ToolReturn( return_value=f"Successfully clicked at ({x}, {y})", content=[ f"Clicked at coordinates ({x}, {y}). Here's the comparison:", "Before:", BinaryContent(data=before_screenshot, media_type="image/png"), "After:", BinaryContent(data=after_screenshot, media_type="image/png"), "Please analyze the changes and suggest next steps." 
], metadata={ "coordinates": {"x": x, "y": y}, "action_type": "click_and_capture", "timestamp": time.time() } ) # The model receives the rich visual content for analysis # while your application can access the structured return_value and metadata result = agent.run_sync("Click on the submit button and tell me what happened") print(result.output) # The model can analyze the screenshots and provide detailed feedback ``` - **`return_value`**: The actual return value used in the tool response. This is what gets serialized and sent back to the model as the tool's result. - **`content`**: A sequence of content (text, images, documents, etc.) that provides additional context to the model. This appears as a separate user message. - **`metadata`**: Optional metadata that your application can access but is not sent to the LLM. Useful for logging, debugging, or additional processing. Some other AI frameworks call this feature "artifacts". This separation allows you to provide rich context to the model while maintaining clean, structured return values for your application logic. ## Custom Tool Schema If you have a function that lacks appropriate documentation (i.e. poorly named, no type information, poor docstring, use of \*args or \*\*kwargs and suchlike) then you can still turn it into a tool that can be effectively used by the agent with the Tool.from_schema function. With this you directly provide the name, description, JSON schema, and whether the function takes a `RunContext`: ```python from pydantic_ai import Agent, Tool from pydantic_ai.models.test import TestModel def foobar(**kwargs) -> str: return kwargs['a'] + kwargs['b'] tool = Tool.from_schema( function=foobar, name='sum', description='Sum two numbers.', json_schema={ 'additionalProperties': False, 'properties': { 'a': {'description': 'the first number', 'type': 'integer'}, 'b': {'description': 'the second number', 'type': 'integer'}, }, 'required': ['a', 'b'], 'type': 'object', }, takes_ctx=False, ) test_model = TestModel() agent = Agent(test_model, tools=[tool]) result = agent.run_sync('testing...') print(result.output) #> {"sum":0} ``` Please note that validation of the tool arguments will not be performed, and this will pass all arguments as keyword arguments. ## Dynamic Tools Tools can optionally be defined with another function: `prepare`, which is called at each step of a run to customize the definition of the tool passed to the model, or omit the tool completely from that step. A `prepare` method can be registered via the `prepare` kwarg to any of the tool registration mechanisms: - @agent.tool decorator - @agent.tool_plain decorator - Tool dataclass The `prepare` method should be of type ToolPrepareFunc, a function which takes RunContext and a pre-built ToolDefinition, and should either return that `ToolDefinition` with or without modifying it, return a new `ToolDefinition`, or return `None` to indicate this tool should not be registered for that step. Here's a simple `prepare` method that only includes the tool if the value of the dependency is `42`. As with the previous example, we use TestModel to demonstrate the behavior without calling a real model.
tool_only_if_42.py ```python from pydantic_ai import Agent, RunContext, ToolDefinition agent = Agent('test') async def only_if_42( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: if ctx.deps == 42: return tool_def @agent.tool(prepare=only_if_42) def hitchhiker(ctx: RunContext[int], answer: str) -> str: return f'{ctx.deps} {answer}' result = agent.run_sync('testing...', deps=41) print(result.output) #> success (no tool calls) result = agent.run_sync('testing...', deps=42) print(result.output) #> {"hitchhiker":"42 a"} ``` *(This example is complete, it can be run "as is")* Here's a more complex example where we change the description of the `name` parameter based on the value of `deps`. For the sake of variation, we create this tool using the Tool dataclass. customize_name.py ```python from __future__ import annotations from typing import Literal from pydantic_ai import Agent, RunContext, Tool, ToolDefinition from pydantic_ai.models.test import TestModel def greet(name: str) -> str: return f'hello {name}' async def prepare_greet( ctx: RunContext[Literal['human', 'machine']], tool_def: ToolDefinition ) -> ToolDefinition | None: d = f'Name of the {ctx.deps} to greet.' tool_def.parameters_json_schema['properties']['name']['description'] = d return tool_def greet_tool = Tool(greet, prepare=prepare_greet) test_model = TestModel() agent = Agent(test_model, tools=[greet_tool], deps_type=Literal['human', 'machine']) result = agent.run_sync('testing...', deps='human') print(result.output) #> {"greet":"hello a"} print(test_model.last_model_request_parameters.function_tools) """ [ ToolDefinition( name='greet', parameters_json_schema={ 'additionalProperties': False, 'properties': { 'name': {'type': 'string', 'description': 'Name of the human to greet.'} }, 'required': ['name'], 'type': 'object', }, ) ] """ ``` *(This example is complete, it can be run "as is")* ### Agent-wide Dynamic Tools In addition to per-tool `prepare` methods, you can also define an agent-wide `prepare_tools` function. This function is called at each step of a run and allows you to filter or modify the list of all tool definitions available to the agent for that step. This is especially useful if you want to enable or disable multiple tools at once, or apply global logic based on the current context. The `prepare_tools` function should be of type ToolsPrepareFunc, which takes the RunContext and a list of ToolDefinition, and returns a new list of tool definitions (or `None` to disable all tools for that step). Note The list of tool definitions passed to `prepare_tools` includes both regular function tools and tools from any [toolsets](../toolsets/) registered on the agent, but not [output tools](../output/#tool-output). To modify output tools, you can set a `prepare_output_tools` function instead.
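For example, here's a minimal sketch of a `prepare_output_tools` function (the `Fruit` model is illustrative, and we're assuming `prepare_output_tools` is passed to the `Agent` constructor just like `prepare_tools`; check the API reference for the exact signature): prepare_output_tools_sketch.py
```python
from dataclasses import replace

from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ToolDefinition


class Fruit(BaseModel):
    name: str
    color: str


async def annotate_output_tools(
    ctx: RunContext[None], tool_defs: list[ToolDefinition]
) -> list[ToolDefinition] | None:
    # Tweak each output tool definition for this step; returning None instead
    # would disable all output tools for the step.
    return [
        replace(tool_def, description=f'{tool_def.description or ""} Use lowercase names only.'.strip())
        for tool_def in tool_defs
    ]


agent = Agent(
    'openai:gpt-4o',
    output_type=Fruit,
    prepare_output_tools=annotate_output_tools,  # assumed to mirror the prepare_tools kwarg
)
```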
Here's an example that makes all tools strict if the model is an OpenAI model: agent_prepare_tools_customize.py ```python from dataclasses import replace from pydantic_ai import Agent, RunContext, ToolDefinition from pydantic_ai.models.test import TestModel async def turn_on_strict_if_openai( ctx: RunContext[None], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: if ctx.model.system == 'openai': return [replace(tool_def, strict=True) for tool_def in tool_defs] return tool_defs test_model = TestModel() agent = Agent(test_model, prepare_tools=turn_on_strict_if_openai) @agent.tool_plain def echo(message: str) -> str: return message agent.run_sync('testing...') assert test_model.last_model_request_parameters.function_tools[0].strict is None # Set the system attribute of the test_model to 'openai' test_model._system = 'openai' agent.run_sync('testing with openai...') assert test_model.last_model_request_parameters.function_tools[0].strict ``` *(This example is complete, it can be run "as is")* Here's another example that conditionally filters out the tools by name if the dependency (`ctx.deps`) is `True`: agent_prepare_tools_filter_out.py ```python from pydantic_ai import Agent, RunContext, Tool, ToolDefinition def launch_potato(target: str) -> str: return f'Potato launched at {target}!' async def filter_out_tools_by_name( ctx: RunContext[bool], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: if ctx.deps: return [tool_def for tool_def in tool_defs if tool_def.name != 'launch_potato'] return tool_defs agent = Agent( 'test', tools=[Tool(launch_potato)], prepare_tools=filter_out_tools_by_name, deps_type=bool, ) result = agent.run_sync('testing...', deps=False) print(result.output) #> {"launch_potato":"Potato launched at a!"} result = agent.run_sync('testing...', deps=True) print(result.output) #> success (no tool calls) ``` *(This example is complete, it can be run "as is")* You can use `prepare_tools` to: - Dynamically enable or disable tools based on the current model, dependencies, or other context - Modify tool definitions globally (e.g., set all tools to strict mode, change descriptions, etc.) If both per-tool `prepare` and agent-wide `prepare_tools` are used, the per-tool `prepare` is applied first to each tool, and then `prepare_tools` is called with the resulting list of tool definitions. ## Tool Execution and Retries When a tool is executed, its arguments (provided by the LLM) are first validated against the function's signature using Pydantic. If validation fails (e.g., due to incorrect types or missing required arguments), a `ValidationError` is raised, and the framework automatically generates a RetryPromptPart containing the validation details. This prompt is sent back to the LLM, informing it of the error and allowing it to correct the parameters and retry the tool call. Beyond automatic validation errors, the tool's own internal logic can also explicitly request a retry by raising the ModelRetry exception. This is useful for situations where the parameters were technically valid, but an issue occurred during execution (like a transient network error, or the tool determining the initial attempt needs modification). ```python from pydantic_ai import ModelRetry def my_flaky_tool(query: str) -> str: if query == 'bad': # Tell the LLM the query was bad and it should try again raise ModelRetry("The query 'bad' is not allowed. Please provide a different query.") # ... process query ... return 'Success!' 
``` Raising `ModelRetry` also generates a `RetryPromptPart` containing the exception message, which is sent back to the LLM to guide its next attempt. Both `ValidationError` and `ModelRetry` respect the `retries` setting configured on the `Tool` or `Agent`. ### Parallel tool calls & concurrency When a model returns multiple tool calls in one response, Pydantic AI schedules them concurrently using `asyncio.create_task`. If a tool requires sequential/serial execution, you can pass the sequential flag when registering the tool, or wrap the agent run in the agent.sequential_tool_calls() context manager. Async functions are run on the event loop, while sync functions are offloaded to threads. To get the best performance, *always* use an async function *unless* you're doing blocking I/O (and there's no way to use a non-blocking library instead) or CPU-bound work (like `numpy` or `scikit-learn` operations), so that simple functions are not offloaded to threads unnecessarily. Limiting tool executions You can cap tool executions within a run using [`UsageLimits(tool_calls_limit=...)`](../agents/#usage-limits). The counter increments only after a successful tool invocation. Output tools (used for [structured output](../output/)) are not counted in the `tool_calls` metric. ## See Also - [Function Tools](../tools/) - Basic tool concepts and registration - [Toolsets](../toolsets/) - Managing collections of tools - [Deferred Tools](../deferred-tools/) - Tools requiring approval or external execution - [Third-Party Tools](../third-party-tools/) - Integrations with external tool libraries # Function Tools Function tools provide a mechanism for models to perform actions and retrieve extra information to help them generate a response. They're useful when you want to enable the model to take some action and use the result, when it is impractical or impossible to put all the context an agent might need into the instructions, or when you want to make agents' behavior more deterministic or reliable by deferring some of the logic required to generate a response to another (not necessarily AI-powered) tool. If you want a model to be able to call a function as its final action, without the result being sent back to the model, you can use an [output function](../output/#output-functions) instead. There are a number of ways to register tools with an agent: - via the @agent.tool decorator — for tools that need access to the agent context - via the @agent.tool_plain decorator — for tools that do not need access to the agent context - via the tools keyword argument to `Agent` which can take either plain functions, or instances of Tool For more advanced use cases, the [toolsets](../toolsets/) feature lets you manage collections of tools (built by you or provided by an [MCP server](../mcp/client/) or other [third party](../third-party-tools/#third-party-tools)) and register them with an agent in one go via the toolsets keyword argument to `Agent`. Internally, all `tools` and `toolsets` are gathered into a single [combined toolset](../toolsets/#combining-toolsets) that's made available to the model. Function tools vs. RAG Function tools are basically the "R" of RAG (Retrieval-Augmented Generation) — they augment what the model can do by letting it request extra information. The main semantic difference between Pydantic AI Tools and RAG is that RAG is synonymous with vector search, while Pydantic AI tools are more general-purpose.
(Note: we may add support for vector search functionality in the future, particularly an API for generating embeddings. See [#58](https://github.com/pydantic/pydantic-ai/issues/58)) Function Tools vs. Structured Outputs As the name suggests, function tools use the model's "tools" or "functions" API to let the model know what is available to call. Tools or functions are also used to define the schema(s) for [structured output](../output/) when using the default [tool output mode](../output/#tool-output), thus a model might have access to many tools, some of which call function tools while others end the run and produce a final output. ## Registering via Decorator `@agent.tool` is considered the default decorator since in the majority of cases tools will need access to the agent context. Here's an example using both: dice_game.py ```python import random from pydantic_ai import Agent, RunContext agent = Agent( 'google-gla:gemini-1.5-flash', # (1)! deps_type=str, # (2)! system_prompt=( "You're a dice game, you should roll the die and see if the number " "you get back matches the user's guess. If so, tell them they're a winner. " "Use the player's name in the response." ), ) @agent.tool_plain # (3)! def roll_dice() -> str: """Roll a six-sided die and return the result.""" return str(random.randint(1, 6)) @agent.tool # (4)! def get_player_name(ctx: RunContext[str]) -> str: """Get the player's name.""" return ctx.deps dice_result = agent.run_sync('My guess is 4', deps='Anne') # (5)! print(dice_result.output) #> Congratulations Anne, you guessed correctly! You're a winner! ``` 1. This is a pretty simple task, so we can use the fast and cheap Gemini flash model. 1. We pass the user's name as the dependency, to keep things simple we use just the name as a string as the dependency. 1. This tool doesn't need any context, it just returns a random number. You could probably use dynamic instructions in this case. 1. This tool needs the player's name, so it uses `RunContext` to access dependencies which are just the player's name in this case. 1. Run the agent, passing the player's name as the dependency. *(This example is complete, it can be run "as is")* Let's print the messages from that game to see what happened: dice_game_messages.py ```python from dice_game import dice_result print(dice_result.all_messages()) """ [ ModelRequest( parts=[ SystemPromptPart( content="You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. 
Use the player's name in the response.", timestamp=datetime.datetime(...), ), UserPromptPart( content='My guess is 4', timestamp=datetime.datetime(...), ), ] ), ModelResponse( parts=[ ToolCallPart( tool_name='roll_dice', args={}, tool_call_id='pyd_ai_tool_call_id' ) ], usage=RequestUsage(input_tokens=90, output_tokens=2), model_name='gemini-1.5-flash', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ ToolReturnPart( tool_name='roll_dice', content='4', tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ ToolCallPart( tool_name='get_player_name', args={}, tool_call_id='pyd_ai_tool_call_id' ) ], usage=RequestUsage(input_tokens=91, output_tokens=4), model_name='gemini-1.5-flash', timestamp=datetime.datetime(...), ), ModelRequest( parts=[ ToolReturnPart( tool_name='get_player_name', content='Anne', tool_call_id='pyd_ai_tool_call_id', timestamp=datetime.datetime(...), ) ] ), ModelResponse( parts=[ TextPart( content="Congratulations Anne, you guessed correctly! You're a winner!" ) ], usage=RequestUsage(input_tokens=92, output_tokens=12), model_name='gemini-1.5-flash', timestamp=datetime.datetime(...), ), ] """ ``` We can represent this with a diagram: ``` sequenceDiagram participant Agent participant LLM Note over Agent: Send prompts Agent ->> LLM: System: "You're a dice game..."
User: "My guess is 4" activate LLM Note over LLM: LLM decides to use
a tool LLM ->> Agent: Call tool
roll_dice() deactivate LLM activate Agent Note over Agent: Rolls a six-sided die Agent -->> LLM: ToolReturn
"4" deactivate Agent activate LLM Note over LLM: LLM decides to use
another tool LLM ->> Agent: Call tool
get_player_name() deactivate LLM activate Agent Note over Agent: Retrieves player name Agent -->> LLM: ToolReturn
"Anne" deactivate Agent activate LLM Note over LLM: LLM constructs final response LLM ->> Agent: ModelResponse
"Congratulations Anne, ..." deactivate LLM Note over Agent: Game session complete ``` ## Registering via Agent Argument As well as using the decorators, we can register tools via the `tools` argument to the Agent constructor. This is useful when you want to reuse tools, and can also give more fine-grained control over the tools. dice_game_tool_kwarg.py ```python import random from pydantic_ai import Agent, RunContext, Tool system_prompt = """\ You're a dice game, you should roll the die and see if the number you get back matches the user's guess. If so, tell them they're a winner. Use the player's name in the response. """ def roll_dice() -> str: """Roll a six-sided die and return the result.""" return str(random.randint(1, 6)) def get_player_name(ctx: RunContext[str]) -> str: """Get the player's name.""" return ctx.deps agent_a = Agent( 'google-gla:gemini-1.5-flash', deps_type=str, tools=[roll_dice, get_player_name], # (1)! system_prompt=system_prompt, ) agent_b = Agent( 'google-gla:gemini-1.5-flash', deps_type=str, tools=[ # (2)! Tool(roll_dice, takes_ctx=False), Tool(get_player_name, takes_ctx=True), ], system_prompt=system_prompt, ) dice_result = {} dice_result['a'] = agent_a.run_sync('My guess is 6', deps='Yashar') dice_result['b'] = agent_b.run_sync('My guess is 4', deps='Anne') print(dice_result['a'].output) #> Tough luck, Yashar, you rolled a 4. Better luck next time. print(dice_result['b'].output) #> Congratulations Anne, you guessed correctly! You're a winner! ``` 1. The simplest way to register tools via the `Agent` constructor is to pass a list of functions, the function signature is inspected to determine if the tool takes RunContext. 1. `agent_a` and `agent_b` are identical — but we can use Tool to reuse tool definitions and give more fine-grained control over how tools are defined, e.g. setting their name or description, or using a custom [`prepare`](../tools-advanced/#tool-prepare) method. *(This example is complete, it can be run "as is")* ## Tool Output Tools can return anything that Pydantic can serialize to JSON. For advanced output options including multi-modal content and metadata, see [Advanced Tool Features](../tools-advanced/#function-tool-output). ## Tool Schema Function parameters are extracted from the function signature, and all parameters except `RunContext` are used to build the schema for that tool call. Even better, Pydantic AI extracts the docstring from functions and (thanks to [griffe](https://mkdocstrings.github.io/griffe/)) extracts parameter descriptions from the docstring and adds them to the schema. [Griffe supports](https://mkdocstrings.github.io/griffe/reference/docstrings/#docstrings) extracting parameter descriptions from `google`, `numpy`, and `sphinx` style docstrings. Pydantic AI will infer the format to use based on the docstring, but you can explicitly set it using docstring_format. You can also enforce parameter requirements by setting `require_parameter_descriptions=True`. This will raise a UserError if a parameter description is missing. To demonstrate a tool's schema, here we use FunctionModel to print the schema a model would receive: tool_schema.py ```python from pydantic_ai import Agent, ModelMessage, ModelResponse, TextPart from pydantic_ai.models.function import AgentInfo, FunctionModel agent = Agent() @agent.tool_plain(docstring_format='google', require_parameter_descriptions=True) def foobar(a: int, b: str, c: dict[str, list[float]]) -> str: """Get me foobar. 
Args: a: apple pie b: banana cake c: carrot smoothie """ return f'{a} {b} {c}' def print_schema(messages: list[ModelMessage], info: AgentInfo) -> ModelResponse: tool = info.function_tools[0] print(tool.description) #> Get me foobar. print(tool.parameters_json_schema) """ { 'additionalProperties': False, 'properties': { 'a': {'description': 'apple pie', 'type': 'integer'}, 'b': {'description': 'banana cake', 'type': 'string'}, 'c': { 'additionalProperties': {'items': {'type': 'number'}, 'type': 'array'}, 'description': 'carrot smoothie', 'type': 'object', }, }, 'required': ['a', 'b', 'c'], 'type': 'object', } """ return ModelResponse(parts=[TextPart('foobar')]) agent.run_sync('hello', model=FunctionModel(print_schema)) ``` *(This example is complete, it can be run "as is")* If a tool has a single parameter that can be represented as an object in JSON schema (e.g. dataclass, TypedDict, pydantic model), the schema for the tool is simplified to be just that object. Here's an example where we use TestModel.last_model_request_parameters to inspect the tool schema that would be passed to the model. single_parameter_tool.py ```python from pydantic import BaseModel from pydantic_ai import Agent from pydantic_ai.models.test import TestModel agent = Agent() class Foobar(BaseModel): """This is a Foobar""" x: int y: str z: float = 3.14 @agent.tool_plain def foobar(f: Foobar) -> str: return str(f) test_model = TestModel() result = agent.run_sync('hello', model=test_model) print(result.output) #> {"foobar":"x=0 y='a' z=3.14"} print(test_model.last_model_request_parameters.function_tools) """ [ ToolDefinition( name='foobar', parameters_json_schema={ 'properties': { 'x': {'type': 'integer'}, 'y': {'type': 'string'}, 'z': {'default': 3.14, 'type': 'number'}, }, 'required': ['x', 'y'], 'title': 'Foobar', 'type': 'object', }, description='This is a Foobar', ) ] """ ``` *(This example is complete, it can be run "as is")* ## See Also For more tool features and integrations, see: - [Advanced Tool Features](../tools-advanced/) - Custom schemas, dynamic tools, tool execution and retries - [Toolsets](../toolsets/) - Managing collections of tools - [Builtin Tools](../builtin-tools/) - Native tools provided by LLM providers - [Common Tools](../common-tools/) - Ready-to-use tool implementations - [Third-Party Tools](../third-party-tools/) - Integrations with MCP, LangChain, ACI.dev and other tool libraries - [Deferred Tools](../deferred-tools/) - Tools requiring approval or external execution # Toolsets A toolset represents a collection of [tools](../tools/) that can be registered with an agent in one go. They can be reused by different agents, swapped out at runtime or during testing, and composed in order to dynamically filter which tools are available, modify tool definitions, or change tool execution behavior. A toolset can contain locally defined functions, depend on an external service to provide them, or implement custom logic to list available tools and handle them being called. Toolsets are used (among many other things) to define [MCP servers](../mcp/client/) available to an agent. Pydantic AI includes many kinds of toolsets which are described below, and you can define a [custom toolset](#building-a-custom-toolset) by inheriting from the AbstractToolset class. 
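As a minimal sketch of the basic pattern before we get into the registration options below (the `greet` tool and the model name are purely illustrative), a toolset is built once and handed to an agent in one go:

```python
from pydantic_ai import Agent, FunctionToolset


def greet(name: str) -> str:
    """Greet a person by name."""
    return f'Hello, {name}!'


# A reusable collection of tools that can be shared between agents
toolset = FunctionToolset(tools=[greet])

agent = Agent('openai:gpt-5', toolsets=[toolset])
```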
The toolsets that will be available during an agent run can be specified in four different ways: - at agent construction time, via the toolsets keyword argument to `Agent`, which takes toolset instances as well as functions that generate toolsets [dynamically](#dynamically-building-a-toolset) based on the agent run context - at agent run time, via the `toolsets` keyword argument to agent.run(), agent.run_sync(), agent.run_stream(), or agent.iter(). These toolsets will be additional to those registered on the `Agent` - [dynamically](#dynamically-building-a-toolset), via the @agent.toolset decorator which lets you build a toolset based on the agent run context - as a contextual override, via the `toolsets` keyword argument to the agent.override() context manager. These toolsets will replace those provided at agent construction or run time during the life of the context manager toolsets.py ```python from pydantic_ai import Agent, FunctionToolset from pydantic_ai.models.test import TestModel def agent_tool(): return "I'm registered directly on the agent" def extra_tool(): return "I'm passed as an extra tool for a specific run" def override_tool(): return 'I override all other tools' agent_toolset = FunctionToolset(tools=[agent_tool]) # (1)! extra_toolset = FunctionToolset(tools=[extra_tool]) override_toolset = FunctionToolset(tools=[override_tool]) test_model = TestModel() # (2)! agent = Agent(test_model, toolsets=[agent_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['agent_tool'] result = agent.run_sync('What tools are available?', toolsets=[extra_toolset]) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['agent_tool', 'extra_tool'] with agent.override(toolsets=[override_toolset]): result = agent.run_sync('What tools are available?', toolsets=[extra_toolset]) # (3)! print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['override_tool'] ``` 1. The FunctionToolset will be explained in detail in the next section. 1. We're using TestModel here because it makes it easy to see which tools were available on each run. 1. This `extra_toolset` will be ignored because we're inside an override context. *(This example is complete, it can be run "as is")* ## Function Toolset As the name suggests, a FunctionToolset makes locally defined functions available as tools. Functions can be added as tools in three different ways: - via the @toolset.tool decorator - via the tools keyword argument to the constructor which can take either plain functions, or instances of Tool - via the toolset.add_function() and toolset.add_tool() methods which can take a plain function or an instance of Tool respectively Functions registered in any of these ways can define an initial `ctx: RunContext` argument in order to receive the agent run context. The `add_function()` and `add_tool()` methods can also be used from a tool function to dynamically register new tools during a run to be available in future run steps. 
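For instance, here is a minimal sketch of that mid-run registration pattern (the `unlock_secret` and `get_secret_number` names are hypothetical); the full example below demonstrates the more common registration paths:

```python
from pydantic_ai import FunctionToolset, RunContext

toolset = FunctionToolset()


def get_secret_number() -> int:
    return 42


@toolset.tool
def unlock_secret(ctx: RunContext) -> str:
    """Unlock the secret tool so the model can call it in a later run step."""
    # Registered mid-run; the new tool becomes available on subsequent run steps.
    toolset.add_function(get_secret_number)
    return 'The secret tool is now unlocked.'
```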
function_toolset.py ```python from datetime import datetime from pydantic_ai import Agent, FunctionToolset, RunContext from pydantic_ai.models.test import TestModel def temperature_celsius(city: str) -> float: return 21.0 def temperature_fahrenheit(city: str) -> float: return 69.8 weather_toolset = FunctionToolset(tools=[temperature_celsius, temperature_fahrenheit]) @weather_toolset.tool def conditions(ctx: RunContext, city: str) -> str: if ctx.run_step % 2 == 0: return "It's sunny" else: return "It's raining" datetime_toolset = FunctionToolset() datetime_toolset.add_function(lambda: datetime.now(), name='now') test_model = TestModel() # (1)! agent = Agent(test_model) result = agent.run_sync('What tools are available?', toolsets=[weather_toolset]) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['temperature_celsius', 'temperature_fahrenheit', 'conditions'] result = agent.run_sync('What tools are available?', toolsets=[datetime_toolset]) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['now'] ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. *(This example is complete, it can be run "as is")* ## Toolset Composition Toolsets can be composed to dynamically filter which tools are available, modify tool definitions, or change tool execution behavior. Multiple toolsets can also be combined into one. ### Combining Toolsets CombinedToolset takes a list of toolsets and lets them be used as one. combined_toolset.py ```python from pydantic_ai import Agent, CombinedToolset from pydantic_ai.models.test import TestModel from function_toolset import datetime_toolset, weather_toolset combined_toolset = CombinedToolset([weather_toolset, datetime_toolset]) test_model = TestModel() # (1)! agent = Agent(test_model, toolsets=[combined_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['temperature_celsius', 'temperature_fahrenheit', 'conditions', 'now'] ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. *(This example is complete, it can be run "as is")* ### Filtering Tools FilteredToolset wraps a toolset and filters available tools ahead of each step of the run based on a user-defined function that is passed the agent run context and each tool's ToolDefinition and returns a boolean to indicate whether or not a given tool should be available. To easily chain different modifications, you can also call filtered() on any toolset instead of directly constructing a `FilteredToolset`. filtered_toolset.py ```python from pydantic_ai import Agent from pydantic_ai.models.test import TestModel from combined_toolset import combined_toolset filtered_toolset = combined_toolset.filtered(lambda ctx, tool_def: 'fahrenheit' not in tool_def.name) test_model = TestModel() # (1)! agent = Agent(test_model, toolsets=[filtered_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['weather_temperature_celsius', 'weather_conditions', 'datetime_now'] ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. *(This example is complete, it can be run "as is")* ### Prefixing Tool Names PrefixedToolset wraps a toolset and adds a prefix to each tool name to prevent tool name conflicts between different toolsets. 
To easily chain different modifications, you can also call prefixed() on any toolset instead of directly constructing a `PrefixedToolset`. combined_toolset.py ```python from pydantic_ai import Agent, CombinedToolset from pydantic_ai.models.test import TestModel from function_toolset import datetime_toolset, weather_toolset combined_toolset = CombinedToolset( [ weather_toolset.prefixed('weather'), datetime_toolset.prefixed('datetime') ] ) test_model = TestModel() # (1)! agent = Agent(test_model, toolsets=[combined_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) """ [ 'weather_temperature_celsius', 'weather_temperature_fahrenheit', 'weather_conditions', 'datetime_now', ] """ ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. *(This example is complete, it can be run "as is")* ### Renaming Tools RenamedToolset wraps a toolset and lets you rename tools using a dictionary mapping new names to original names. This is useful when the names provided by a toolset are ambiguous or would conflict with tools defined by other toolsets, but [prefixing them](#prefixing-tool-names) creates a name that is unnecessarily long or could be confusing to the model. To easily chain different modifications, you can also call renamed() on any toolset instead of directly constructing a `RenamedToolset`. renamed_toolset.py ```python from pydantic_ai import Agent from pydantic_ai.models.test import TestModel from combined_toolset import combined_toolset renamed_toolset = combined_toolset.renamed( { 'current_time': 'datetime_now', 'temperature_celsius': 'weather_temperature_celsius', 'temperature_fahrenheit': 'weather_temperature_fahrenheit' } ) test_model = TestModel() # (1)! agent = Agent(test_model, toolsets=[renamed_toolset]) result = agent.run_sync('What tools are available?') print([t.name for t in test_model.last_model_request_parameters.function_tools]) """ ['temperature_celsius', 'temperature_fahrenheit', 'weather_conditions', 'current_time'] """ ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. *(This example is complete, it can be run "as is")* ### Dynamic Tool Definitions PreparedToolset lets you modify the entire list of available tools ahead of each step of the agent run using a user-defined function that takes the agent run context and a list of ToolDefinitions and returns a list of modified `ToolDefinition`s. This is the toolset-specific equivalent of the [`prepare_tools`](../tools-advanced/#prepare-tools) argument to `Agent` that prepares all tool definitions registered on an agent across toolsets. Note that it is not possible to add or rename tools using `PreparedToolset`. Instead, you can use [`FunctionToolset.add_function()`](#function-toolset) or [`RenamedToolset`](#renaming-tools). To easily chain different modifications, you can also call prepared() on any toolset instead of directly constructing a `PreparedToolset`. 
prepared_toolset.py ```python from dataclasses import replace from pydantic_ai import Agent, RunContext, ToolDefinition from pydantic_ai.models.test import TestModel from renamed_toolset import renamed_toolset descriptions = { 'temperature_celsius': 'Get the temperature in degrees Celsius', 'temperature_fahrenheit': 'Get the temperature in degrees Fahrenheit', 'weather_conditions': 'Get the current weather conditions', 'current_time': 'Get the current time', } async def add_descriptions(ctx: RunContext, tool_defs: list[ToolDefinition]) -> list[ToolDefinition] | None: return [ replace(tool_def, description=description) if (description := descriptions.get(tool_def.name, None)) else tool_def for tool_def in tool_defs ] prepared_toolset = renamed_toolset.prepared(add_descriptions) test_model = TestModel() # (1)! agent = Agent(test_model, toolsets=[prepared_toolset]) result = agent.run_sync('What tools are available?') print(test_model.last_model_request_parameters.function_tools) """ [ ToolDefinition( name='temperature_celsius', parameters_json_schema={ 'additionalProperties': False, 'properties': {'city': {'type': 'string'}}, 'required': ['city'], 'type': 'object', }, description='Get the temperature in degrees Celsius', ), ToolDefinition( name='temperature_fahrenheit', parameters_json_schema={ 'additionalProperties': False, 'properties': {'city': {'type': 'string'}}, 'required': ['city'], 'type': 'object', }, description='Get the temperature in degrees Fahrenheit', ), ToolDefinition( name='weather_conditions', parameters_json_schema={ 'additionalProperties': False, 'properties': {'city': {'type': 'string'}}, 'required': ['city'], 'type': 'object', }, description='Get the current weather conditions', ), ToolDefinition( name='current_time', parameters_json_schema={ 'additionalProperties': False, 'properties': {}, 'type': 'object', }, description='Get the current time', ), ] """ ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. ### Requiring Tool Approval ApprovalRequiredToolset wraps a toolset and lets you dynamically [require approval](../deferred-tools/#human-in-the-loop-tool-approval) for a given tool call based on a user-defined function that is passed the agent run context, the tool's ToolDefinition, and the validated tool call arguments. If no function is provided, all tool calls will require approval. To easily chain different modifications, you can also call approval_required() on any toolset instead of directly constructing a `ApprovalRequiredToolset`. See the [Human-in-the-Loop Tool Approval](../deferred-tools/#human-in-the-loop-tool-approval) documentation for more information on how to handle agent runs that call tools that require approval and how to pass in the results. approval_required_toolset.py ```python from pydantic_ai import Agent, DeferredToolRequests, DeferredToolResults from pydantic_ai.models.test import TestModel from prepared_toolset import prepared_toolset approval_required_toolset = prepared_toolset.approval_required(lambda ctx, tool_def, tool_args: tool_def.name.startswith('temperature')) test_model = TestModel(call_tools=['temperature_celsius', 'temperature_fahrenheit']) # (1)! 
agent = Agent( test_model, toolsets=[approval_required_toolset], output_type=[str, DeferredToolRequests], ) result = agent.run_sync('Call the temperature tools') messages = result.all_messages() print(result.output) """ DeferredToolRequests( calls=[], approvals=[ ToolCallPart( tool_name='temperature_celsius', args={'city': 'a'}, tool_call_id='pyd_ai_tool_call_id__temperature_celsius', ), ToolCallPart( tool_name='temperature_fahrenheit', args={'city': 'a'}, tool_call_id='pyd_ai_tool_call_id__temperature_fahrenheit', ), ], ) """ result = agent.run_sync( message_history=messages, deferred_tool_results=DeferredToolResults( approvals={ 'pyd_ai_tool_call_id__temperature_celsius': True, 'pyd_ai_tool_call_id__temperature_fahrenheit': False, } ) ) print(result.output) #> {"temperature_celsius":21.0,"temperature_fahrenheit":"The tool call was denied."} ``` 1. We're using TestModel here because it makes it easy to specify which tools to call. *(This example is complete, it can be run "as is")* ### Changing Tool Execution WrapperToolset wraps another toolset and delegates all responsibility to it. It is a no-op by default, but you can subclass `WrapperToolset` to change the wrapped toolset's tool execution behavior by overriding the call_tool() method. logging_toolset.py ```python import asyncio from typing_extensions import Any from pydantic_ai import Agent, RunContext, ToolsetTool, WrapperToolset from pydantic_ai.models.test import TestModel from prepared_toolset import prepared_toolset LOG = [] class LoggingToolset(WrapperToolset): async def call_tool(self, name: str, tool_args: dict[str, Any], ctx: RunContext, tool: ToolsetTool) -> Any: LOG.append(f'Calling tool {name!r} with args: {tool_args!r}') try: await asyncio.sleep(0.1 * len(LOG)) # (1)! result = await super().call_tool(name, tool_args, ctx, tool) LOG.append(f'Finished calling tool {name!r} with result: {result!r}') except Exception as e: LOG.append(f'Error calling tool {name!r}: {e}') raise e else: return result logging_toolset = LoggingToolset(prepared_toolset) agent = Agent(TestModel(), toolsets=[logging_toolset]) # (2)! result = agent.run_sync('Call all the tools') print(LOG) """ [ "Calling tool 'temperature_celsius' with args: {'city': 'a'}", "Calling tool 'temperature_fahrenheit' with args: {'city': 'a'}", "Calling tool 'weather_conditions' with args: {'city': 'a'}", "Calling tool 'current_time' with args: {}", "Finished calling tool 'temperature_celsius' with result: 21.0", "Finished calling tool 'temperature_fahrenheit' with result: 69.8", 'Finished calling tool \'weather_conditions\' with result: "It\'s raining"', "Finished calling tool 'current_time' with result: datetime.datetime(...)", ] """ ``` 1. All docs examples are tested in CI and their output is verified, so we need `LOG` to always have the same order whenever this code is run. Since the tools could finish in any order, we sleep for an increasing amount of time based on how many tool calls have come before, to ensure that they finish (and log) in the same order they were called in. 1. We use TestModel here as it will automatically call each tool. *(This example is complete, it can be run "as is")* ## External Toolset If your agent needs to be able to call [external tools](../deferred-tools/#external-tool-execution) that are provided and executed by an upstream service or frontend, you can build an ExternalToolset from a list of ToolDefinitions containing the tool names, arguments JSON schemas, and descriptions.
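For example, a minimal sketch of constructing such a toolset (with a single hypothetical `open_url` tool executed by the frontend) might look like this; the complete flow is shown in the examples further below:

```python
from pydantic_ai import ExternalToolset, ToolDefinition

# Tools that the upstream service or frontend will execute on the agent's behalf
frontend_toolset = ExternalToolset(
    [
        ToolDefinition(
            name='open_url',
            parameters_json_schema={'type': 'object', 'properties': {'url': {'type': 'string'}}},
            description="Open a URL in the user's browser",
        )
    ]
)
```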
When the model calls an external tool, the call is considered to be ["deferred"](../deferred-tools/#deferred-tools), and the agent run will end with a DeferredToolRequests output object with a `calls` list holding ToolCallParts containing the tool name, validated arguments, and a unique tool call ID, which are expected to be passed to the upstream service or frontend that will produce the results. When the tool call results are received from the upstream service or frontend, you can build a DeferredToolResults object with a `calls` dictionary that maps each tool call ID to an arbitrary value to be returned to the model, a [`ToolReturn`](../tools-advanced/#advanced-tool-returns) object, or a ModelRetry exception in case the tool call failed and the model should [try again](../tools-advanced/#tool-retries). This `DeferredToolResults` object can then be provided to one of the agent run methods as `deferred_tool_results`, alongside the original run's [message history](../message-history/). Note that you need to add `DeferredToolRequests` to the `Agent`'s or `agent.run()`'s [`output_type`](../output/#structured-output) so that the possible types of the agent run output are correctly inferred. For more information, see the [Deferred Tools](../deferred-tools/#deferred-tools) documentation. To demonstrate, let us first define a simple agent *without* deferred tools: deferred_toolset_agent.py ```python from pydantic import BaseModel from pydantic_ai import Agent, FunctionToolset toolset = FunctionToolset() @toolset.tool def get_default_language(): return 'en-US' @toolset.tool def get_user_name(): return 'David' class PersonalizedGreeting(BaseModel): greeting: str language_code: str agent = Agent('openai:gpt-4o', toolsets=[toolset], output_type=PersonalizedGreeting) result = agent.run_sync('Greet the user in a personalized way') print(repr(result.output)) #> PersonalizedGreeting(greeting='Hello, David!', language_code='en-US') ``` Next, let's define a function that represents a hypothetical "run agent" API endpoint that can be called by the frontend and takes a list of messages to send to the model, a list of frontend tool definitions, and optional deferred tool results. This is where `ExternalToolset`, `DeferredToolRequests`, and `DeferredToolResults` come in: deferred_toolset_api.py ```python from pydantic_ai import ( DeferredToolRequests, DeferredToolResults, ExternalToolset, ModelMessage, ToolDefinition, ) from deferred_toolset_agent import PersonalizedGreeting, agent def run_agent( messages: list[ModelMessage] = [], frontend_tools: list[ToolDefinition] = [], deferred_tool_results: DeferredToolResults | None = None, ) -> tuple[PersonalizedGreeting | DeferredToolRequests, list[ModelMessage]]: deferred_toolset = ExternalToolset(frontend_tools) result = agent.run_sync( toolsets=[deferred_toolset], # (1)! output_type=[agent.output_type, DeferredToolRequests], # (2)! message_history=messages, # (3)! deferred_tool_results=deferred_tool_results, ) return result.output, result.new_messages() ``` 1. As mentioned in the [Deferred Tools](../deferred-tools/#deferred-tools) documentation, these `toolsets` are additional to those provided to the `Agent` constructor 1. As mentioned in the [Deferred Tools](../deferred-tools/#deferred-tools) documentation, this `output_type` overrides the one provided to the `Agent` constructor, so we have to make sure to not lose it 1.
We don't include an `user_prompt` keyword argument as we expect the frontend to provide it via `messages` Now, imagine that the code below is implemented on the frontend, and `run_agent` stands in for an API call to the backend that runs the agent. This is where we actually execute the deferred tool calls and start a new run with the new result included: deferred_tools.py ```python from pydantic_ai import ( DeferredToolRequests, DeferredToolResults, ModelMessage, ModelRequest, ModelRetry, ToolDefinition, UserPromptPart, ) from deferred_toolset_api import run_agent frontend_tool_definitions = [ ToolDefinition( name='get_preferred_language', parameters_json_schema={'type': 'object', 'properties': {'default_language': {'type': 'string'}}}, description="Get the user's preferred language from their browser", ) ] def get_preferred_language(default_language: str) -> str: return 'es-MX' # (1)! frontend_tool_functions = {'get_preferred_language': get_preferred_language} messages: list[ModelMessage] = [ ModelRequest( parts=[ UserPromptPart(content='Greet the user in a personalized way') ] ) ] deferred_tool_results: DeferredToolResults | None = None final_output = None while True: output, new_messages = run_agent(messages, frontend_tool_definitions, deferred_tool_results) messages += new_messages if not isinstance(output, DeferredToolRequests): final_output = output break print(output.calls) """ [ ToolCallPart( tool_name='get_preferred_language', args={'default_language': 'en-US'}, tool_call_id='pyd_ai_tool_call_id', ) ] """ deferred_tool_results = DeferredToolResults() for tool_call in output.calls: if function := frontend_tool_functions.get(tool_call.tool_name): result = function(**tool_call.args_as_dict()) else: result = ModelRetry(f'Unknown tool {tool_call.tool_name!r}') deferred_tool_results.calls[tool_call.tool_call_id] = result print(repr(final_output)) """ PersonalizedGreeting(greeting='Hola, David! Espero que tengas un gran día!', language_code='es-MX') """ ``` 1. Imagine that this returns the frontend [`navigator.language`](https://developer.mozilla.org/en-US/docs/Web/API/Navigator/language). *(This example is complete, it can be run "as is")* ## Dynamically Building a Toolset Toolsets can be built dynamically ahead of each agent run or run step using a function that takes the agent run context and returns a toolset or `None`. This is useful when a toolset (like an MCP server) depends on information specific to an agent run, like its [dependencies](../dependencies/). To register a dynamic toolset, you can pass a function that takes RunContext to the `toolsets` argument of the `Agent` constructor, or you can wrap a compliant function in the @agent.toolset decorator. By default, the function will be called again ahead of each agent run step. If you are using the decorator, you can optionally provide a `per_run_step=False` argument to indicate that the toolset only needs to be built once for the entire run. dynamic_toolset.py ```python from dataclasses import dataclass from typing import Literal from pydantic_ai import Agent, RunContext from pydantic_ai.models.test import TestModel from function_toolset import datetime_toolset, weather_toolset @dataclass class ToggleableDeps: active: Literal['weather', 'datetime'] def toggle(self): if self.active == 'weather': self.active = 'datetime' else: self.active = 'weather' test_model = TestModel() # (1)! agent = Agent( test_model, deps_type=ToggleableDeps # (2)! 
) @agent.toolset def toggleable_toolset(ctx: RunContext[ToggleableDeps]): if ctx.deps.active == 'weather': return weather_toolset else: return datetime_toolset @agent.tool def toggle(ctx: RunContext[ToggleableDeps]): ctx.deps.toggle() deps = ToggleableDeps('weather') result = agent.run_sync('Toggle the toolset', deps=deps) print([t.name for t in test_model.last_model_request_parameters.function_tools]) # (3)! #> ['toggle', 'now'] result = agent.run_sync('Toggle the toolset', deps=deps) print([t.name for t in test_model.last_model_request_parameters.function_tools]) #> ['toggle', 'temperature_celsius', 'temperature_fahrenheit', 'conditions'] ``` 1. We're using TestModel here because it makes it easy to see which tools were available on each run. 1. We're using the agent's dependencies to give the `toggle` tool access to the `active` via the `RunContext` argument. 1. This shows the available tools *after* the `toggle` tool was executed, as the "last model request" was the one that returned the `toggle` tool result to the model. *(This example is complete, it can be run "as is")* ## Building a Custom Toolset To define a fully custom toolset with its own logic to list available tools and handle them being called, you can subclass AbstractToolset and implement the get_tools() and call_tool() methods. If you want to reuse a network connection or session across tool listings and calls during an agent run, you can implement __aenter__() and __aexit__(). ## Third-Party Toolsets ### MCP Servers See the [MCP Client](../mcp/client/) documentation for how to use MCP servers with Pydantic AI. ### LangChain Tools If you'd like to use tools or a [toolkit](https://python.langchain.com/docs/concepts/tools/#toolkits) from LangChain's [community tool library](https://python.langchain.com/docs/integrations/tools/) with Pydantic AI, you can use the LangChainToolset which takes a list of LangChain tools. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the LangChain tool, and up to the LangChain tool to raise an error if the arguments are invalid. You will need to install the `langchain-community` package and any others required by the tools in question. ```python from langchain_community.agent_toolkits import SlackToolkit from pydantic_ai import Agent from pydantic_ai.ext.langchain import LangChainToolset toolkit = SlackToolkit() toolset = LangChainToolset(toolkit.get_tools()) agent = Agent('openai:gpt-4o', toolsets=[toolset]) # ... ``` ### ACI.dev Tools If you'd like to use tools from the [ACI.dev tool library](https://www.aci.dev/tools) with Pydantic AI, you can use the ACIToolset [toolset](./) which takes a list of ACI tool names as well as the `linked_account_owner_id`. Note that Pydantic AI will not validate the arguments in this case -- it's up to the model to provide arguments matching the schema specified by the ACI tool, and up to the ACI tool to raise an error if the arguments are invalid. You will need to install the `aci-sdk` package, set your ACI API key in the `ACI_API_KEY` environment variable, and pass your ACI "linked account owner ID" to the function. 
```python import os from pydantic_ai import Agent from pydantic_ai.ext.aci import ACIToolset toolset = ACIToolset( [ 'OPEN_WEATHER_MAP__CURRENT_WEATHER', 'OPEN_WEATHER_MAP__FORECAST', ], linked_account_owner_id=os.getenv('LINKED_ACCOUNT_OWNER_ID'), ) agent = Agent('openai:gpt-4o', toolsets=[toolset]) ``` # Models # Anthropic ## Install To use `AnthropicModel` models, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `anthropic` optional group: ```bash pip install "pydantic-ai-slim[anthropic]" ``` ```bash uv add "pydantic-ai-slim[anthropic]" ``` ## Configuration To use [Anthropic](https://anthropic.com) through their API, go to [console.anthropic.com/settings/keys](https://console.anthropic.com/settings/keys) to generate an API key. `AnthropicModelName` contains a list of available Anthropic models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export ANTHROPIC_API_KEY='your-api-key' ``` You can then use `AnthropicModel` by name: ```python from pydantic_ai import Agent agent = Agent('anthropic:claude-3-5-sonnet-latest') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel model = AnthropicModel('claude-3-5-sonnet-latest') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider model = AnthropicModel( 'claude-3-5-sonnet-latest', provider=AnthropicProvider(api_key='your-api-key') ) agent = Agent(model) ... ``` ## Custom HTTP Client You can customize the `AnthropicProvider` with a custom `httpx.AsyncClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.anthropic import AnthropicModel from pydantic_ai.providers.anthropic import AnthropicProvider custom_http_client = AsyncClient(timeout=30) model = AnthropicModel( 'claude-3-5-sonnet-latest', provider=AnthropicProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` # Bedrock ## Install To use `BedrockConverseModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `bedrock` optional group: ```bash pip install "pydantic-ai-slim[bedrock]" ``` ```bash uv add "pydantic-ai-slim[bedrock]" ``` ## Configuration To use [AWS Bedrock](https://aws.amazon.com/bedrock/), you'll need an AWS account with Bedrock enabled and appropriate credentials. You can use either AWS credentials directly or a pre-configured boto3 client. `BedrockModelName` contains a list of available Bedrock models, including models from Anthropic, Amazon, Cohere, Meta, and Mistral. ## Environment variables You can set your AWS credentials as environment variables ([among other options](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables)): ```bash export AWS_BEARER_TOKEN_BEDROCK='your-api-key' # or: export AWS_ACCESS_KEY_ID='your-access-key' export AWS_SECRET_ACCESS_KEY='your-secret-key' export AWS_DEFAULT_REGION='us-east-1' # or your preferred region ``` You can then use `BedrockConverseModel` by name: ```python from pydantic_ai import Agent agent = Agent('bedrock:anthropic.claude-3-sonnet-20240229-v1:0') ... 
``` Or initialize the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel model = BedrockConverseModel('anthropic.claude-3-sonnet-20240229-v1:0') agent = Agent(model) ... ``` ## Customizing Bedrock Runtime API You can customize the Bedrock Runtime API calls by adding additional parameters, such as [guardrail configurations](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) and [performance settings](https://docs.aws.amazon.com/bedrock/latest/userguide/latency-optimized-inference.html). For a complete list of configurable parameters, refer to the documentation for BedrockModelSettings. customize_bedrock_model_settings.py ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel, BedrockModelSettings # Define Bedrock model settings with guardrail and performance configurations bedrock_model_settings = BedrockModelSettings( bedrock_guardrail_config={ 'guardrailIdentifier': 'v1', 'guardrailVersion': 'v1', 'trace': 'enabled' }, bedrock_performance_configuration={ 'latency': 'optimized' } ) model = BedrockConverseModel(model_name='us.amazon.nova-pro-v1:0') agent = Agent(model=model, model_settings=bedrock_model_settings) ``` ## `provider` argument You can provide a custom `BedrockProvider` via the `provider` argument. This is useful when you want to specify credentials directly or use a custom boto3 client: ```python from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Using AWS credentials directly model = BedrockConverseModel( 'anthropic.claude-3-sonnet-20240229-v1:0', provider=BedrockProvider( region_name='us-east-1', aws_access_key_id='your-access-key', aws_secret_access_key='your-secret-key', ), ) agent = Agent(model) ... ``` You can also pass a pre-configured boto3 client: ```python import boto3 from pydantic_ai import Agent from pydantic_ai.models.bedrock import BedrockConverseModel from pydantic_ai.providers.bedrock import BedrockProvider # Using a pre-configured boto3 client bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1') model = BedrockConverseModel( 'anthropic.claude-3-sonnet-20240229-v1:0', provider=BedrockProvider(bedrock_client=bedrock_client), ) agent = Agent(model) ... ``` # Cohere ## Install To use `CohereModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `cohere` optional group: ```bash pip install "pydantic-ai-slim[cohere]" ``` ```bash uv add "pydantic-ai-slim[cohere]" ``` ## Configuration To use [Cohere](https://cohere.com/) through their API, go to [dashboard.cohere.com/api-keys](https://dashboard.cohere.com/api-keys) and follow your nose until you find the place to generate an API key. `CohereModelName` contains a list of the most popular Cohere models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export CO_API_KEY='your-api-key' ``` You can then use `CohereModel` by name: ```python from pydantic_ai import Agent agent = Agent('cohere:command') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel model = CohereModel('command') agent = Agent(model) ... 
``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel from pydantic_ai.providers.cohere import CohereProvider model = CohereModel('command', provider=CohereProvider(api_key='your-api-key')) agent = Agent(model) ... ``` You can also customize the `CohereProvider` with a custom `http_client`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.cohere import CohereModel from pydantic_ai.providers.cohere import CohereProvider custom_http_client = AsyncClient(timeout=30) model = CohereModel( 'command', provider=CohereProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` # Google The `GoogleModel` is a model that uses the [`google-genai`](https://pypi.org/project/google-genai/) package under the hood to access Google's Gemini models via both the Generative Language API and Vertex AI. ## Install To use `GoogleModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `google` optional group: ```bash pip install "pydantic-ai-slim[google]" ``` ```bash uv add "pydantic-ai-slim[google]" ``` ______________________________________________________________________ ## Configuration `GoogleModel` lets you use Google's Gemini models through their [Generative Language API](https://ai.google.dev/api/all-methods) (`generativelanguage.googleapis.com`) or [Vertex AI API](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models) (`*-aiplatform.googleapis.com`). ### API Key (Generative Language API) To use Gemini via the Generative Language API, go to [aistudio.google.com](https://aistudio.google.com/apikey) and create an API key. Once you have the API key, set it as an environment variable: ```bash export GOOGLE_API_KEY=your-api-key ``` You can then use `GoogleModel` by explicitly creating a provider: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(api_key='your-api-key') model = GoogleModel('gemini-1.5-flash', provider=provider) agent = Agent(model) ... ``` ### Vertex AI (Enterprise/Cloud) If you are an enterprise user, you can use the `google-vertex` provider with `GoogleModel` to access Gemini via Vertex AI. This interface has a number of advantages over the Generative Language API: 1. The VertexAI API comes with more enterprise readiness guarantees. 1. You can [purchase provisioned throughput](https://cloud.google.com/vertex-ai/generative-ai/docs/provisioned-throughput#purchase-provisioned-throughput) with VertexAI to guarantee capacity. 1. If you're running Pydantic AI inside GCP, you don't need to set up authentication, it should "just work". 1. You can decide which region to use, which might be important from a regulatory perspective, and might improve latency. The big disadvantage is that for local development you may need to create and configure a "service account", which can be challenging to get right. Whichever way you authenticate, you'll need to have VertexAI enabled in your GCP account. To use Vertex AI, you may need to set up [application default credentials](https://cloud.google.com/docs/authentication/application-default-credentials) or use a service account. You can also specify the region. 
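If your environment is already authenticated (for example via the application default credentials described next), a minimal sketch using the model-string shorthand, assuming the `google-vertex:` prefix is available in your installed version, would be:

```python
from pydantic_ai import Agent

# Selects Vertex AI rather than the Generative Language API,
# relying on ambient GCP credentials (see the sections below).
agent = Agent('google-vertex:gemini-1.5-flash')
...
```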
#### Application Default Credentials If you have the [`gcloud` CLI](https://cloud.google.com/sdk/gcloud) installed and configured, you can use: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(vertexai=True) model = GoogleModel('gemini-1.5-flash', provider=provider) agent = Agent(model) ... ``` #### Service Account To use a service account JSON file: google_model_service_account.py ```python from google.oauth2 import service_account from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider credentials = service_account.Credentials.from_service_account_file( 'path/to/service-account.json', scopes=['https://www.googleapis.com/auth/cloud-platform'], ) provider = GoogleProvider(credentials=credentials, project='your-project-id') model = GoogleModel('gemini-1.5-flash', provider=provider) agent = Agent(model) ... ``` #### Customizing Location You can specify the location when using Vertex AI: google_model_location.py ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider provider = GoogleProvider(vertexai=True, location='asia-east1') model = GoogleModel('gemini-1.5-flash', provider=provider) agent = Agent(model) ... ``` #### Customizing Model You can access models from the [Model Garden](https://cloud.google.com/model-garden?hl=en) that support the generateContent API and are available under your GCP project, including but not limited to Gemini, using one of the following `model_name` patterns: - `{model_id}` for Gemini models - `{publisher}/{model_id}` - `publishers/{publisher}/models/{model_id}` - `projects/{project}/locations/{location}/publishers/{publisher}/models/{model_id}` ```python from google.oauth2 import service_account from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider credentials = service_account.Credentials.from_service_account_file( 'path/to/service-account.json', scopes=['https://www.googleapis.com/auth/cloud-platform'], ) provider = GoogleProvider( credentials=credentials, project='your-gcp-project-id', location='us-central1', # the region where the model is available ) model = GoogleModel('meta/llama-3.3-70b-instruct-maas', provider=provider) agent = Agent(model) ... ``` ## Provider Argument You can supply a custom `GoogleProvider` instance using the `provider` argument to configure advanced client options, such as setting a custom `base_url`. This is useful if you're using a custom-compatible endpoint with the Google Generative Language API. ```python from google.genai import Client from google.genai.types import HttpOptions from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel from pydantic_ai.providers.google import GoogleProvider client = Client( api_key='gemini-custom-api-key', http_options=HttpOptions(base_url='gemini-custom-base-url'), ) provider = GoogleProvider(client=client) model = GoogleModel('gemini-1.5-flash', provider=provider) agent = Agent(model) ... 
``` ## Model Settings You can customize model behavior using GoogleModelSettings: ```python from google.genai.types import HarmBlockThreshold, HarmCategory from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings settings = GoogleModelSettings( temperature=0.2, max_tokens=1024, google_thinking_config={'thinking_budget': 2048}, google_safety_settings=[ { 'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH, 'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE, } ] ) model = GoogleModel('gemini-1.5-flash') agent = Agent(model, model_settings=settings) ... ``` See the [Gemini API docs](https://ai.google.dev/gemini-api/docs/safety-settings) for more on safety settings, and [thinking config](https://ai.google.dev/gemini-api/docs/thinking). ## Document, Image, Audio, and Video Input `GoogleModel` supports multi-modal input, including documents, images, audio, and video. See the [input documentation](../../input/) for details and examples. ## Model settings You can use the GoogleModelSettings class to customize the model request. ### Disable thinking You can disable thinking by setting the `thinking_budget` to `0` on the `google_thinking_config`: ```python from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings model_settings = GoogleModelSettings(google_thinking_config={'thinking_budget': 0}) model = GoogleModel('gemini-2.0-flash') agent = Agent(model, model_settings=model_settings) ... ``` Check out the [Gemini API docs](https://ai.google.dev/gemini-api/docs/thinking) for more on thinking. ### Safety settings You can customize the safety settings by setting the `google_safety_settings` field. ```python from google.genai.types import HarmBlockThreshold, HarmCategory from pydantic_ai import Agent from pydantic_ai.models.google import GoogleModel, GoogleModelSettings model_settings = GoogleModelSettings( google_safety_settings=[ { 'category': HarmCategory.HARM_CATEGORY_HATE_SPEECH, 'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE, } ] ) model = GoogleModel('gemini-2.0-flash') agent = Agent(model, model_settings=model_settings) ... ``` See the [Gemini API docs](https://ai.google.dev/gemini-api/docs/safety-settings) for more on safety settings. # Groq ## Install To use `GroqModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `groq` optional group: ```bash pip install "pydantic-ai-slim[groq]" ``` ```bash uv add "pydantic-ai-slim[groq]" ``` ## Configuration To use [Groq](https://groq.com/) through their API, go to [console.groq.com/keys](https://console.groq.com/keys) and follow your nose until you find the place to generate an API key. `GroqModelName` contains a list of available Groq models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export GROQ_API_KEY='your-api-key' ``` You can then use `GroqModel` by name: ```python from pydantic_ai import Agent agent = Agent('groq:llama-3.3-70b-versatile') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel model = GroqModel('llama-3.3-70b-versatile') agent = Agent(model) ... 
``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel from pydantic_ai.providers.groq import GroqProvider model = GroqModel( 'llama-3.3-70b-versatile', provider=GroqProvider(api_key='your-api-key') ) agent = Agent(model) ... ``` You can also customize the `GroqProvider` with a custom `httpx.AsyncHTTPClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.groq import GroqModel from pydantic_ai.providers.groq import GroqProvider custom_http_client = AsyncClient(timeout=30) model = GroqModel( 'llama-3.3-70b-versatile', provider=GroqProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` # Hugging Face [Hugging Face](https://huggingface.co/) is an AI platform with all major open source models, datasets, MCPs, and demos. You can use [Inference Providers](https://huggingface.co/docs/inference-providers) to run open source models like DeepSeek R1 on scalable serverless infrastructure. ## Install To use `HuggingFaceModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `huggingface` optional group: ```bash pip install "pydantic-ai-slim[huggingface]" ``` ```bash uv add "pydantic-ai-slim[huggingface]" ``` ## Configuration To use [Hugging Face](https://huggingface.co/) inference, you'll need to set up an account which will give you [free tier](https://huggingface.co/docs/inference-providers/pricing) allowance on [Inference Providers](https://huggingface.co/docs/inference-providers). To setup inference, follow these steps: 1. Go to [Hugging Face](https://huggingface.co/join) and sign up for an account. 1. Create a new access token in [Hugging Face](https://huggingface.co/settings/tokens). 1. Set the `HF_TOKEN` environment variable to the token you just created. Once you have a Hugging Face access token, you can set it as an environment variable: ```bash export HF_TOKEN='hf_token' ``` ## Usage You can then use HuggingFaceModel by name: ```python from pydantic_ai import Agent agent = Agent('huggingface:Qwen/Qwen3-235B-A22B') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel model = HuggingFaceModel('Qwen/Qwen3-235B-A22B') agent = Agent(model) ... ``` By default, the HuggingFaceModel uses the HuggingFaceProvider that will select automatically the first of the inference providers (Cerebras, Together AI, Cohere..etc) available for the model, sorted by your preferred order in https://hf.co/settings/inference-providers. ## Configure the provider If you want to pass parameters in code to the provider, you can programmatically instantiate the HuggingFaceProvider and pass it to the model: ```python from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel from pydantic_ai.providers.huggingface import HuggingFaceProvider model = HuggingFaceModel('Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(api_key='hf_token', provider_name='nebius')) agent = Agent(model) ... 
``` ## Custom Hugging Face client HuggingFaceProvider also accepts a custom [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client via the `hf_client` parameter, so you can customise the `headers`, `bill_to` (billing to an HF organization you're a member of), `base_url` etc. as defined in the [Hugging Face Hub python library docs](https://huggingface.co/docs/huggingface_hub/package_reference/inference_client). ```python from huggingface_hub import AsyncInferenceClient from pydantic_ai import Agent from pydantic_ai.models.huggingface import HuggingFaceModel from pydantic_ai.providers.huggingface import HuggingFaceProvider client = AsyncInferenceClient( bill_to='openai', api_key='hf_token', provider='fireworks-ai', ) model = HuggingFaceModel( 'Qwen/Qwen3-235B-A22B', provider=HuggingFaceProvider(hf_client=client), ) agent = Agent(model) ... ``` # Mistral ## Install To use `MistralModel`, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `mistral` optional group: ```bash pip install "pydantic-ai-slim[mistral]" ``` ```bash uv add "pydantic-ai-slim[mistral]" ``` ## Configuration To use [Mistral](https://mistral.ai) through their API, go to [console.mistral.ai/api-keys/](https://console.mistral.ai/api-keys/) and follow your nose until you find the place to generate an API key. `LatestMistralModelNames` contains a list of the most popular Mistral models. ## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export MISTRAL_API_KEY='your-api-key' ``` You can then use `MistralModel` by name: ```python from pydantic_ai import Agent agent = Agent('mistral:mistral-large-latest') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel model = MistralModel('mistral-small-latest') agent = Agent(model) ... ``` ## `provider` argument You can provide a custom `Provider` via the `provider` argument: ```python from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel from pydantic_ai.providers.mistral import MistralProvider model = MistralModel( 'mistral-large-latest', provider=MistralProvider(api_key='your-api-key', base_url='https://') ) agent = Agent(model) ... ``` You can also customize the provider with a custom `httpx.AsyncHTTPClient`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.mistral import MistralModel from pydantic_ai.providers.mistral import MistralProvider custom_http_client = AsyncClient(timeout=30) model = MistralModel( 'mistral-large-latest', provider=MistralProvider(api_key='your-api-key', http_client=custom_http_client), ) agent = Agent(model) ... ``` # OpenAI ## Install To use OpenAI models or OpenAI-compatible APIs, you need to either install `pydantic-ai`, or install `pydantic-ai-slim` with the `openai` optional group: ```bash pip install "pydantic-ai-slim[openai]" ``` ```bash uv add "pydantic-ai-slim[openai]" ``` ## Configuration To use `OpenAIChatModel` with the OpenAI API, go to [platform.openai.com](https://platform.openai.com/) and follow your nose until you find the place to generate an API key. 
## Environment variable Once you have the API key, you can set it as an environment variable: ```bash export OPENAI_API_KEY='your-api-key' ``` You can then use `OpenAIChatModel` by name: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-5') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel model = OpenAIChatModel('gpt-5') agent = Agent(model) ... ``` By default, the `OpenAIChatModel` uses the `OpenAIProvider` with the `base_url` set to `https://api.openai.com/v1`. ## Configure the provider If you want to pass parameters in code to the provider, you can programmatically instantiate the OpenAIProvider and pass it to the model: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel('gpt-5', provider=OpenAIProvider(api_key='your-api-key')) agent = Agent(model) ... ``` ## Custom OpenAI Client `OpenAIProvider` also accepts a custom `AsyncOpenAI` client via the `openai_client` parameter, so you can customise the `organization`, `project`, `base_url` etc. as defined in the [OpenAI API docs](https://platform.openai.com/docs/api-reference). custom_openai_client.py ```python from openai import AsyncOpenAI from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider client = AsyncOpenAI(max_retries=3) model = OpenAIChatModel('gpt-5', provider=OpenAIProvider(openai_client=client)) agent = Agent(model) ... ``` You could also use the [`AsyncAzureOpenAI`](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/switching-endpoints) client to use the Azure OpenAI API. Note that the `AsyncAzureOpenAI` is a subclass of `AsyncOpenAI`. ```python from openai import AsyncAzureOpenAI from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider client = AsyncAzureOpenAI( azure_endpoint='...', api_version='2024-07-01-preview', api_key='your-api-key', ) model = OpenAIChatModel( 'gpt-5', provider=OpenAIProvider(openai_client=client), ) agent = Agent(model) ... ``` ## OpenAI Responses API Pydantic AI also supports OpenAI's [Responses API](https://platform.openai.com/docs/api-reference/responses) through the OpenAIResponsesModel class. You can use OpenAIResponsesModel by name: ```python from pydantic_ai import Agent agent = Agent('openai-responses:gpt-5') ... ``` Or initialise the model directly with just the model name: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIResponsesModel model = OpenAIResponsesModel('gpt-5') agent = Agent(model) ... ``` You can learn more about the differences between the Responses API and Chat Completions API in the [OpenAI API docs](https://platform.openai.com/docs/guides/migrate-to-responses). ### Built-in tools The Responses API has built-in tools that you can use instead of building your own: - [Web search](https://platform.openai.com/docs/guides/tools-web-search): allow models to search the web for the latest information before generating a response. - [Code interpreter](https://platform.openai.com/docs/guides/tools-code-interpreter): allow models to write and run Python code in a sandboxed environment before generating a response. - [Image generation](https://platform.openai.com/docs/guides/tools-image-generation): allow models to generate images based on a text prompt.
- [File search](https://platform.openai.com/docs/guides/tools-file-search): allow models to search your files for relevant information before generating a response.
- [Computer use](https://platform.openai.com/docs/guides/tools-computer-use): allow models to use a computer to perform tasks on your behalf.

Web search, Code interpreter, and Image generation are natively supported through the [Built-in tools](../../builtin-tools/) feature.

File search and Computer use can be enabled by passing an [`openai.types.responses.FileSearchToolParam`](https://github.com/openai/openai-python/blob/main/src/openai/types/responses/file_search_tool_param.py) or [`openai.types.responses.ComputerToolParam`](https://github.com/openai/openai-python/blob/main/src/openai/types/responses/computer_tool_param.py) in the `openai_builtin_tools` setting on OpenAIResponsesModelSettings. They don't currently generate BuiltinToolCallPart or BuiltinToolReturnPart parts in the message history, or streamed events; please submit an issue if you need native support for these built-in tools.

file_search_tool.py

```python
from openai.types.responses import FileSearchToolParam

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model_settings = OpenAIResponsesModelSettings(
    openai_builtin_tools=[
        FileSearchToolParam(
            type='file_search',
            vector_store_ids=['your-history-book-vector-store-id']
        )
    ],
)
model = OpenAIResponsesModel('gpt-5')
agent = Agent(model=model, model_settings=model_settings)

result = agent.run_sync('Who was Albert Einstein?')
print(result.output)
#> Albert Einstein was a German-born theoretical physicist.
```

#### Referencing earlier responses

The Responses API supports referencing earlier model responses in a new request using a `previous_response_id` parameter, to ensure the full [conversation state](https://platform.openai.com/docs/guides/conversation-state?api-mode=responses#passing-context-from-the-previous-response) including [reasoning items](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) is kept in context. This is available through the `openai_previous_response_id` field in OpenAIResponsesModelSettings.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model=model)

result = agent.run_sync('The secret is 1234')
model_settings = OpenAIResponsesModelSettings(
    openai_previous_response_id=result.all_messages()[-1].provider_response_id
)
result = agent.run_sync('What is the secret code?', model_settings=model_settings)
print(result.output)
#> 1234
```

By passing the `provider_response_id` from an earlier run, you can allow the model to build on its own prior reasoning without needing to resend the full message history.

##### Automatically referencing earlier responses

When the `openai_previous_response_id` field is set to `'auto'`, Pydantic AI will automatically select the most recent `provider_response_id` from message history and omit messages that came before it, letting the OpenAI API leverage server-side history instead for improved efficiency.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

model = OpenAIResponsesModel('gpt-5')
agent = Agent(model=model)

result1 = agent.run_sync('Tell me a joke.')
print(result1.output)
#> Did you hear about the toothpaste scandal? They called it Colgate.
# When set to 'auto', the most recent provider_response_id
# and the messages after it are sent in the request.
model_settings = OpenAIResponsesModelSettings(openai_previous_response_id='auto')
result2 = agent.run_sync(
    'Explain?',
    message_history=result1.new_messages(),
    model_settings=model_settings
)
print(result2.output)
#> This is an excellent joke invented by Samuel Colvin, it needs no explanation.
```

## OpenAI-compatible Models

Many providers and models are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI. Before getting started, check the [installation and configuration](#install) instructions above.

To use another OpenAI-compatible API, you can make use of the `base_url` and `api_key` arguments from `OpenAIProvider`:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIChatModel(
    'model_name',
    provider=OpenAIProvider(
        base_url='https://<openai-compatible-api-endpoint>.com', api_key='your-api-key'
    ),
)
agent = Agent(model)
...
```

Various providers also have their own provider classes so that you don't need to specify the base URL yourself and you can use the standard `<PROVIDER>_API_KEY` environment variable to set the API key.

When a provider has its own provider class, you can use the `Agent("<provider>:<model>")` shorthand, e.g. `Agent("deepseek:deepseek-chat")` or `Agent("openrouter:google/gemini-2.5-pro-preview")`, instead of building the `OpenAIChatModel` explicitly. Similarly, you can pass the provider name as a string to the `provider` argument on `OpenAIChatModel` instead of instantiating the provider class explicitly (a sketch of this is shown at the end of the first DeepSeek example below).

#### Model Profile

Sometimes, the provider or model you're using will have slightly different requirements than OpenAI's API or models, like having different restrictions on JSON schemas for tool definitions, or not supporting tool definitions being marked as strict.

When using an alternative provider class provided by Pydantic AI, an appropriate model profile is typically selected automatically based on the model name. If the model you're using is not working correctly out of the box, you can tweak various aspects of how model requests are constructed by providing your own ModelProfile (for behaviors shared among all model classes) or OpenAIModelProfile (for behaviors specific to `OpenAIChatModel`):

```py
from pydantic_ai import Agent, InlineDefsJsonSchemaTransformer
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.profiles.openai import OpenAIModelProfile
from pydantic_ai.providers.openai import OpenAIProvider

model = OpenAIChatModel(
    'model_name',
    provider=OpenAIProvider(
        base_url='https://<openai-compatible-api-endpoint>.com', api_key='your-api-key'
    ),
    profile=OpenAIModelProfile(
        json_schema_transformer=InlineDefsJsonSchemaTransformer,  # Supported by any model class on a plain ModelProfile
        openai_supports_strict_tool_definition=False  # Supported by OpenAIModel only, requires OpenAIModelProfile
    )
)
agent = Agent(model)
```

### DeepSeek

To use the [DeepSeek](https://deepseek.com) provider, first create an API key by following the [Quick Start guide](https://api-docs.deepseek.com/). You can then set the `DEEPSEEK_API_KEY` environment variable and use DeepSeekProvider by name:

```python
from pydantic_ai import Agent

agent = Agent('deepseek:deepseek-chat')
...
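# As noted in the OpenAI-compatible Models section above, the provider can also
# be passed as a string name instead of a provider instance when building the
# model explicitly. A minimal sketch, assuming DEEPSEEK_API_KEY is set in the
# environment:
from pydantic_ai.models.openai import OpenAIChatModel

model = OpenAIChatModel('deepseek-chat', provider='deepseek')
agent = Agent(model)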
``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.deepseek import DeepSeekProvider model = OpenAIChatModel( 'deepseek-chat', provider=DeepSeekProvider(api_key='your-deepseek-api-key'), ) agent = Agent(model) ... ``` You can also customize any provider with a custom `http_client`: ```python from httpx import AsyncClient from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.deepseek import DeepSeekProvider custom_http_client = AsyncClient(timeout=30) model = OpenAIChatModel( 'deepseek-chat', provider=DeepSeekProvider( api_key='your-deepseek-api-key', http_client=custom_http_client ), ) agent = Agent(model) ... ``` ### Ollama Pydantic AI supports both self-hosted [Ollama](https://ollama.com/) servers (running locally or remotely) and [Ollama Cloud](https://ollama.com/cloud). For servers running locally, use the `http://localhost:11434/v1` base URL. For Ollama Cloud, use `https://ollama.com/v1` and ensure an API key is set. You can set the `OLLAMA_BASE_URL` and (optionally) `OLLAMA_API_KEY` environment variables and use OllamaProvider by name: ```python from pydantic_ai import Agent agent = Agent('ollama:gpt-oss:20b') ... ``` Or initialise the model and provider directly: ```python from pydantic import BaseModel from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.ollama import OllamaProvider class CityLocation(BaseModel): city: str country: str ollama_model = OpenAIChatModel( model_name='gpt-oss:20b', provider=OllamaProvider(base_url='http://localhost:11434/v1'), # (1)! ) agent = Agent(ollama_model, output_type=CityLocation) result = agent.run_sync('Where were the olympics held in 2012?') print(result.output) #> city='London' country='United Kingdom' print(result.usage()) #> RunUsage(input_tokens=57, output_tokens=8, requests=1) ``` 1. For Ollama Cloud, use the `base_url='https://ollama.com/v1'` and set the `OLLAMA_API_KEY` environment variable. ### Azure AI Foundry To use [Azure AI Foundry](https://ai.azure.com/) as your provider, you can set the `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION` environment variables and use AzureProvider by name: ```python from pydantic_ai import Agent agent = Agent('azure:gpt-5') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.azure import AzureProvider model = OpenAIChatModel( 'gpt-5', provider=AzureProvider( azure_endpoint='your-azure-endpoint', api_version='your-api-version', api_key='your-api-key', ), ) agent = Agent(model) ... ``` ### OpenRouter To use [OpenRouter](https://openrouter.ai), first create an API key at [openrouter.ai/keys](https://openrouter.ai/keys). You can set the `OPENROUTER_API_KEY` environment variable and use OpenRouterProvider by name: ```python from pydantic_ai import Agent agent = Agent('openrouter:anthropic/claude-3.5-sonnet') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openrouter import OpenRouterProvider model = OpenAIChatModel( 'anthropic/claude-3.5-sonnet', provider=OpenRouterProvider(api_key='your-openrouter-api-key'), ) agent = Agent(model) ... 
``` ### Vercel AI Gateway To use [Vercel's AI Gateway](https://vercel.com/docs/ai-gateway), first follow the [documentation](https://vercel.com/docs/ai-gateway) instructions on obtaining an API key or OIDC token. You can set the `VERCEL_AI_GATEWAY_API_KEY` and `VERCEL_OIDC_TOKEN` environment variables and use VercelProvider by name: ```python from pydantic_ai import Agent agent = Agent('vercel:anthropic/claude-4-sonnet') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.vercel import VercelProvider model = OpenAIChatModel( 'anthropic/claude-4-sonnet', provider=VercelProvider(api_key='your-vercel-ai-gateway-api-key'), ) agent = Agent(model) ... ``` ### Grok (xAI) Go to [xAI API Console](https://console.x.ai/) and create an API key. You can set the `GROK_API_KEY` environment variable and use GrokProvider by name: ```python from pydantic_ai import Agent agent = Agent('grok:grok-2-1212') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.grok import GrokProvider model = OpenAIChatModel( 'grok-2-1212', provider=GrokProvider(api_key='your-xai-api-key'), ) agent = Agent(model) ... ``` ### MoonshotAI Create an API key in the [Moonshot Console](https://platform.moonshot.ai/console). You can set the `MOONSHOTAI_API_KEY` environment variable and use MoonshotAIProvider by name: ```python from pydantic_ai import Agent agent = Agent('moonshotai:kimi-k2-0711-preview') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.moonshotai import MoonshotAIProvider model = OpenAIChatModel( 'kimi-k2-0711-preview', provider=MoonshotAIProvider(api_key='your-moonshot-api-key'), ) agent = Agent(model) ... ``` ### GitHub Models To use [GitHub Models](https://docs.github.com/en/github-models), you'll need a GitHub personal access token with the `models: read` permission. You can set the `GITHUB_API_KEY` environment variable and use GitHubProvider by name: ```python from pydantic_ai import Agent agent = Agent('github:xai/grok-3-mini') ... ``` Or initialise the model and provider directly: ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.github import GitHubProvider model = OpenAIChatModel( 'xai/grok-3-mini', # GitHub Models uses prefixed model names provider=GitHubProvider(api_key='your-github-token'), ) agent = Agent(model) ... ``` GitHub Models supports various model families with different prefixes. You can see the full list on the [GitHub Marketplace](https://github.com/marketplace?type=models) or the public [catalog endpoint](https://models.github.ai/catalog/models). ### Perplexity Follow the Perplexity [getting started](https://docs.perplexity.ai/guides/getting-started) guide to create an API key. ```python from pydantic_ai import Agent from pydantic_ai.models.openai import OpenAIChatModel from pydantic_ai.providers.openai import OpenAIProvider model = OpenAIChatModel( 'sonar-pro', provider=OpenAIProvider( base_url='https://api.perplexity.ai', api_key='your-perplexity-api-key', ), ) agent = Agent(model) ... ``` ### Fireworks AI Go to [Fireworks.AI](https://fireworks.ai/) and create an API key in your account settings. 
You can set the `FIREWORKS_API_KEY` environment variable and use FireworksProvider by name:

```python
from pydantic_ai import Agent

agent = Agent('fireworks:accounts/fireworks/models/qwq-32b')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.fireworks import FireworksProvider

model = OpenAIChatModel(
    'accounts/fireworks/models/qwq-32b',  # model library available at https://fireworks.ai/models
    provider=FireworksProvider(api_key='your-fireworks-api-key'),
)
agent = Agent(model)
...
```

### Together AI

Go to [Together.ai](https://www.together.ai/) and create an API key in your account settings.

You can set the `TOGETHER_API_KEY` environment variable and use TogetherProvider by name:

```python
from pydantic_ai import Agent

agent = Agent('together:meta-llama/Llama-3.3-70B-Instruct-Turbo-Free')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.together import TogetherProvider

model = OpenAIChatModel(
    'meta-llama/Llama-3.3-70B-Instruct-Turbo-Free',  # model library available at https://www.together.ai/models
    provider=TogetherProvider(api_key='your-together-api-key'),
)
agent = Agent(model)
...
```

### Heroku AI

To use [Heroku AI](https://www.heroku.com/ai), first create an API key.

You can set the `HEROKU_INFERENCE_KEY` and (optionally) `HEROKU_INFERENCE_URL` environment variables and use HerokuProvider by name:

```python
from pydantic_ai import Agent

agent = Agent('heroku:claude-3-7-sonnet')
...
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.heroku import HerokuProvider

model = OpenAIChatModel(
    'claude-3-7-sonnet',
    provider=HerokuProvider(api_key='your-heroku-inference-key'),
)
agent = Agent(model)
...
```

### Cerebras

To use [Cerebras](https://cerebras.ai/), you need to create an API key in the [Cerebras Console](https://cloud.cerebras.ai/).

You can set the `CEREBRAS_API_KEY` environment variable and use CerebrasProvider by name:

```python
from pydantic_ai import Agent

agent = Agent('cerebras:llama3.3-70b')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.cerebras import CerebrasProvider

model = OpenAIChatModel(
    'llama3.3-70b',
    provider=CerebrasProvider(api_key='your-cerebras-api-key'),
)
agent = Agent(model)

result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

### LiteLLM

To use [LiteLLM](https://www.litellm.ai/), set the configs as outlined in the [doc](https://docs.litellm.ai/docs/set_keys). In `LiteLLMProvider`, you can pass `api_base` and `api_key`. The value of these configs will depend on your setup. For example, if you are using OpenAI models, then you need to pass `https://api.openai.com/v1` as the `api_base` and your OpenAI API key as the `api_key`. If you are using a LiteLLM proxy server running on your local machine, then you need to pass `http://localhost:<port>` as the `api_base` and your LiteLLM API key (or a placeholder) as the `api_key`.

To use custom LLMs, use the `custom/` prefix in the model name, as in the sketch below.
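For example, a model served by your own backend might be referenced roughly like this (a sketch only; the `custom/my-model` name and the local endpoint are hypothetical placeholders that depend on your LiteLLM setup):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.litellm import LiteLLMProvider

# The 'custom/' prefix tells LiteLLM to treat this as a custom LLM,
# reached via the api_base configured on the provider.
model = OpenAIChatModel(
    'custom/my-model',
    provider=LiteLLMProvider(api_base='http://localhost:8000/v1', api_key='placeholder'),
)
agent = Agent(model)
...
```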
Once you have the configs, use the LiteLLMProvider as follows:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.litellm import LiteLLMProvider

model = OpenAIChatModel(
    'openai/gpt-3.5-turbo',
    provider=LiteLLMProvider(
        api_base='<api-base-url>',
        api_key='<api-key>'
    )
)
agent = Agent(model)

result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

### Nebius AI Studio

Go to [Nebius AI Studio](https://studio.nebius.com/) and create an API key.

You can set the `NEBIUS_API_KEY` environment variable and use NebiusProvider by name:

```python
from pydantic_ai import Agent

agent = Agent('nebius:Qwen/Qwen3-32B-fast')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

Or initialise the model and provider directly:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.nebius import NebiusProvider

model = OpenAIChatModel(
    'Qwen/Qwen3-32B-fast',
    provider=NebiusProvider(api_key='your-nebius-api-key'),
)
agent = Agent(model)

result = agent.run_sync('What is the capital of France?')
print(result.output)
#> The capital of France is Paris.
```

# Model Providers

Pydantic AI is model-agnostic and has built-in support for multiple model providers:

- [OpenAI](../openai/)
- [Anthropic](../anthropic/)
- [Gemini](../google/) (via two different APIs: Generative Language API and VertexAI API)
- [Groq](../groq/)
- [Mistral](../mistral/)
- [Cohere](../cohere/)
- [Bedrock](../bedrock/)
- [Hugging Face](../huggingface/)

## OpenAI-compatible Providers

In addition, many providers are compatible with the OpenAI API, and can be used with `OpenAIChatModel` in Pydantic AI:

- [DeepSeek](../openai/#deepseek)
- [Grok (xAI)](../openai/#grok-xai)
- [Ollama](../openai/#ollama)
- [OpenRouter](../openai/#openrouter)
- [Vercel AI Gateway](../openai/#vercel-ai-gateway)
- [Perplexity](../openai/#perplexity)
- [Fireworks AI](../openai/#fireworks-ai)
- [Together AI](../openai/#together-ai)
- [Azure AI Foundry](../openai/#azure-ai-foundry)
- [Heroku](../openai/#heroku-ai)
- [GitHub Models](../openai/#github-models)
- [Cerebras](../openai/#cerebras)
- [LiteLLM](../openai/#litellm)
- [Nebius AI Studio](../openai/#nebius-ai-studio)

Pydantic AI also comes with [`TestModel`](../../api/models/test/) and [`FunctionModel`](../../api/models/function/) for testing and development.

To use each model provider, you need to configure your local environment and make sure you have the right packages installed. If you try to use a model without having done so, you'll be told what to install.

## Models and Providers

Pydantic AI uses a few key terms to describe how it interacts with different LLMs:

- **Model**: This refers to the Pydantic AI class used to make requests following a specific LLM API (generally by wrapping a vendor-provided SDK, like the `openai` python SDK). These classes implement a vendor-SDK-agnostic API, ensuring a single Pydantic AI agent is portable to different LLM vendors without any other code changes just by swapping out the Model it uses. Model classes are named roughly in the format `<VendorSdk>Model`, for example, we have `OpenAIChatModel`, `AnthropicModel`, `GoogleModel`, etc. When using a Model class, you specify the actual LLM model name (e.g., `gpt-4o`, `claude-3-5-sonnet-latest`, `gemini-1.5-flash`) as a parameter.
- **Provider**: This refers to provider-specific classes which handle the authentication and connections to an LLM vendor. Passing a non-default *Provider* as a parameter to a Model is how you can ensure that your agent will make requests to a specific endpoint, or make use of a specific approach to authentication (e.g., you can use Azure auth with the `OpenAIChatModel` by way of the `AzureProvider`). In particular, this is how you can make use of an AI gateway, or an LLM vendor that offers API compatibility with the vendor SDK used by an existing Model (such as `OpenAIChatModel`).
- **Profile**: This refers to a description of how requests to a specific model or family of models need to be constructed to get the best results, independent of the model and provider classes used. For example, different models have different restrictions on the JSON schemas that can be used for tools, and the same schema transformer needs to be used for Gemini models whether you're using `GoogleModel` with model name `gemini-2.5-pro-preview`, or `OpenAIChatModel` with `OpenRouterProvider` and model name `google/gemini-2.5-pro-preview`.

When you instantiate an Agent with just a name formatted as `<provider>:<model>`, e.g. `openai:gpt-4o` or `openrouter:google/gemini-2.5-pro-preview`, Pydantic AI will automatically select the appropriate model class, provider, and profile. If you want to use a different provider or profile, you can instantiate a model class directly and pass in `provider` and/or `profile` arguments.

## Custom Models

Note

If a model API is compatible with the OpenAI API, you do not need a custom model class and can provide your own [custom provider](../openai/#openai-compatible-models) instead.

To implement support for a model API that's not already supported, you will need to subclass the Model abstract base class. For streaming, you'll also need to implement the StreamedResponse abstract base class.

The best place to start is to review the source code for existing implementations, e.g. [`OpenAIChatModel`](https://github.com/pydantic/pydantic-ai/blob/main/pydantic_ai_slim/pydantic_ai/models/openai.py).

For details on when we'll accept contributions adding new models to Pydantic AI, see the [contributing guidelines](../../contributing/#new-model-rules).

## Fallback Model

You can use FallbackModel to attempt multiple models in sequence until one successfully returns a result. Under the hood, Pydantic AI automatically switches from one model to the next if the current model returns a 4xx or 5xx status code.

In the following example, the agent first makes a request to the OpenAI model (which fails due to an invalid API key), and then falls back to the Anthropic model.
fallback_model.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel

openai_model = OpenAIChatModel('gpt-4o')
anthropic_model = AnthropicModel('claude-3-5-sonnet-latest')
fallback_model = FallbackModel(openai_model, anthropic_model)

agent = Agent(fallback_model)
response = agent.run_sync('What is the capital of France?')
print(response.output)
#> Paris

print(response.all_messages())
"""
[
    ModelRequest(
        parts=[
            UserPromptPart(
                content='What is the capital of France?',
                timestamp=datetime.datetime(...),
                part_kind='user-prompt',
            )
        ],
        kind='request',
    ),
    ModelResponse(
        parts=[TextPart(content='Paris', part_kind='text')],
        model_name='claude-3-5-sonnet-latest',
        timestamp=datetime.datetime(...),
        kind='response',
        provider_response_id=None,
    ),
]
"""
```

The `ModelResponse` message above indicates in the `model_name` field that the output was returned by the Anthropic model, which is the second model specified in the `FallbackModel`.

Note

Each model's options should be configured individually. For example, `base_url`, `api_key`, and custom clients should be set on each model itself, not on the `FallbackModel`.

### Per-Model Settings

You can configure different ModelSettings for each model in a fallback chain by passing the `settings` parameter when creating each model. This is particularly useful when different providers have different optimal configurations:

fallback_model_per_settings.py

```python
from pydantic_ai import Agent, ModelSettings
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel

# Configure each model with provider-specific optimal settings
openai_model = OpenAIChatModel(
    'gpt-4o',
    settings=ModelSettings(temperature=0.7, max_tokens=1000)  # Higher creativity for OpenAI
)
anthropic_model = AnthropicModel(
    'claude-3-5-sonnet-latest',
    settings=ModelSettings(temperature=0.2, max_tokens=1000)  # Lower temperature for consistency
)

fallback_model = FallbackModel(openai_model, anthropic_model)
agent = Agent(fallback_model)

result = agent.run_sync('Write a creative story about space exploration')
print(result.output)
"""
In the year 2157, Captain Maya Chen piloted her spacecraft through the vast expanse of the Andromeda Galaxy. As she discovered a planet with crystalline mountains that sang in harmony with the cosmic winds, she realized that space exploration was not just about finding new worlds, but about finding new ways to understand the universe and our place within it.
"""
```

In this example, if the OpenAI model fails, the agent will automatically fall back to the Anthropic model with its own configured settings. The `FallbackModel` itself doesn't have settings - it uses the individual settings of whichever model successfully handles the request.

In this next example, we demonstrate the exception-handling capabilities of `FallbackModel`. If all models fail, a FallbackExceptionGroup is raised, which contains all the exceptions encountered during the `run` execution.
fallback_model_failure.py

```python
from pydantic_ai import Agent, ModelHTTPError
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel

openai_model = OpenAIChatModel('gpt-4o')
anthropic_model = AnthropicModel('claude-3-5-sonnet-latest')
fallback_model = FallbackModel(openai_model, anthropic_model)

agent = Agent(fallback_model)
try:
    response = agent.run_sync('What is the capital of France?')
except* ModelHTTPError as exc_group:
    for exc in exc_group.exceptions:
        print(exc)
```

Since [`except*`](https://docs.python.org/3/reference/compound_stmts.html#except-star) is only supported in Python 3.11+, we use the [`exceptiongroup`](https://github.com/agronholm/exceptiongroup) backport package for earlier Python versions:

fallback_model_failure.py

```python
from exceptiongroup import catch

from pydantic_ai import Agent, ModelHTTPError
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.fallback import FallbackModel
from pydantic_ai.models.openai import OpenAIChatModel


def model_status_error_handler(exc_group: BaseExceptionGroup) -> None:
    for exc in exc_group.exceptions:
        print(exc)


openai_model = OpenAIChatModel('gpt-4o')
anthropic_model = AnthropicModel('claude-3-5-sonnet-latest')
fallback_model = FallbackModel(openai_model, anthropic_model)
agent = Agent(fallback_model)

with catch({ModelHTTPError: model_status_error_handler}):
    response = agent.run_sync('What is the capital of France?')
```

By default, the `FallbackModel` only moves on to the next model if the current model raises a ModelHTTPError. You can customize this behavior by passing a custom `fallback_on` argument to the `FallbackModel` constructor.

# Graphs

Don't use a nail gun unless you need a nail gun

If Pydantic AI [agents](../agents/) are a hammer, and [multi-agent workflows](../multi-agent-applications/) are a sledgehammer, then graphs are a nail gun:

- sure, nail guns look cooler than hammers
- but nail guns take a lot more setup than hammers
- and nail guns don't make you a better builder, they make you a builder with a nail gun
- Lastly (and at the risk of torturing this metaphor), if you're a fan of medieval tools like mallets and untyped Python, you probably won't like nail guns or our approach to graphs. (But then again, if you're not a fan of type hints in Python, you've probably already bounced off Pydantic AI to use one of the toy agent frameworks — good luck, and feel free to borrow my sledgehammer when you realize you need it)

In short, graphs are a powerful tool, but they're not the right tool for every job. Please consider other [multi-agent approaches](../multi-agent-applications/) before proceeding. If you're not confident a graph-based approach is a good idea, it might be unnecessary.

Graphs and finite state machines (FSMs) are a powerful abstraction to model, execute, control and visualize complex workflows. Alongside Pydantic AI, we've developed `pydantic-graph` — an async graph and state machine library for Python where nodes and edges are defined using type hints.

While this library is developed as part of Pydantic AI, it has no dependency on `pydantic-ai` and can be considered a pure graph-based state machine library. You may find it useful whether or not you're using Pydantic AI or even building with GenAI.

`pydantic-graph` is designed for advanced users and makes heavy use of Python generics and type hints.
It is not designed to be as beginner-friendly as Pydantic AI. ## Installation `pydantic-graph` is a required dependency of `pydantic-ai`, and an optional dependency of `pydantic-ai-slim`, see [installation instructions](../install/#slim-install) for more information. You can also install it directly: ```bash pip install pydantic-graph ``` ```bash uv add pydantic-graph ``` ## Graph Types `pydantic-graph` is made up of a few key components: ### GraphRunContext GraphRunContext — The context for the graph run, similar to Pydantic AI's RunContext. This holds the state of the graph and dependencies and is passed to nodes when they're run. `GraphRunContext` is generic in the state type of the graph it's used in, StateT. ### End End — return value to indicate the graph run should end. `End` is generic in the graph return type of the graph it's used in, RunEndT. ### Nodes Subclasses of BaseNode define nodes for execution in the graph. Nodes, which are generally dataclasses, generally consist of: - fields containing any parameters required/optional when calling the node - the business logic to execute the node, in the run method - return annotations of the run method, which are read by `pydantic-graph` to determine the outgoing edges of the node Nodes are generic in: - **state**, which must have the same type as the state of graphs they're included in, StateT has a default of `None`, so if you're not using state you can omit this generic parameter, see [stateful graphs](#stateful-graphs) for more information - **deps**, which must have the same type as the deps of the graph they're included in, DepsT has a default of `None`, so if you're not using deps you can omit this generic parameter, see [dependency injection](#dependency-injection) for more information - **graph return type** — this only applies if the node returns End. RunEndT has a default of Never so this generic parameter can be omitted if the node doesn't return `End`, but must be included if it does. Here's an example of a start or intermediate node in a graph — it can't end the run as it doesn't return End: intermediate_node.py ```py from dataclasses import dataclass from pydantic_graph import BaseNode, GraphRunContext @dataclass class MyNode(BaseNode[MyState]): # (1)! foo: int # (2)! async def run( self, ctx: GraphRunContext[MyState], # (3)! ) -> AnotherNode: # (4)! ... return AnotherNode() ``` 1. State in this example is `MyState` (not shown), hence `BaseNode` is parameterized with `MyState`. This node can't end the run, so the `RunEndT` generic parameter is omitted and defaults to `Never`. 1. `MyNode` is a dataclass and has a single field `foo`, an `int`. 1. The `run` method takes a `GraphRunContext` parameter, again parameterized with state `MyState`. 1. The return type of the `run` method is `AnotherNode` (not shown), this is used to determine the outgoing edges of the node. We could extend `MyNode` to optionally end the run if `foo` is divisible by 5: intermediate_or_end_node.py ```py from dataclasses import dataclass from pydantic_graph import BaseNode, End, GraphRunContext @dataclass class MyNode(BaseNode[MyState, None, int]): # (1)! foo: int async def run( self, ctx: GraphRunContext[MyState], ) -> AnotherNode | End[int]: # (2)! if self.foo % 5 == 0: return End(self.foo) else: return AnotherNode() ``` 1. We parameterize the node with the return type (`int` in this case) as well as state. Because generic parameters are positional-only, we have to include `None` as the second parameter representing deps. 1. 
The return type of the `run` method is now a union of `AnotherNode` and `End[int]`, this allows the node to end the run if `foo` is divisible by 5. ### Graph Graph — this is the execution graph itself, made up of a set of [node classes](#nodes) (i.e., `BaseNode` subclasses). `Graph` is generic in: - **state** the state type of the graph, StateT - **deps** the deps type of the graph, DepsT - **graph return type** the return type of the graph run, RunEndT Here's an example of a simple graph: graph_example.py ```py from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class DivisibleBy5(BaseNode[None, None, int]): # (1)! foo: int async def run( self, ctx: GraphRunContext, ) -> Increment | End[int]: if self.foo % 5 == 0: return End(self.foo) else: return Increment(self.foo) @dataclass class Increment(BaseNode): # (2)! foo: int async def run(self, ctx: GraphRunContext) -> DivisibleBy5: return DivisibleBy5(self.foo + 1) fives_graph = Graph(nodes=[DivisibleBy5, Increment]) # (3)! result = fives_graph.run_sync(DivisibleBy5(4)) # (4)! print(result.output) #> 5 ``` 1. The `DivisibleBy5` node is parameterized with `None` for the state param and `None` for the deps param as this graph doesn't use state or deps, and `int` as it can end the run. 1. The `Increment` node doesn't return `End`, so the `RunEndT` generic parameter is omitted, state can also be omitted as the graph doesn't use state. 1. The graph is created with a sequence of nodes. 1. The graph is run synchronously with run_sync. The initial node is `DivisibleBy5(4)`. Because the graph doesn't use external state or deps, we don't pass `state` or `deps`. *(This example is complete, it can be run "as is")* A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code: graph_example_diagram.py ```py from graph_example import DivisibleBy5, fives_graph fives_graph.mermaid_code(start_node=DivisibleBy5) ``` ``` --- title: fives_graph --- stateDiagram-v2 [*] --> DivisibleBy5 DivisibleBy5 --> Increment DivisibleBy5 --> [*] Increment --> DivisibleBy5 ``` In order to visualize a graph within a `jupyter-notebook`, `IPython.display` needs to be used: jupyter_display_mermaid.py ```python from graph_example import DivisibleBy5, fives_graph from IPython.display import Image, display display(Image(fives_graph.mermaid_image(start_node=DivisibleBy5))) ``` ## Stateful Graphs The "state" concept in `pydantic-graph` provides an optional way to access and mutate an object (often a `dataclass` or Pydantic model) as nodes run in a graph. If you think of Graphs as a production line, then your state is the engine being passed along the line and built up by each node as the graph is run. `pydantic-graph` provides state persistence, with the state recorded after each node is run. (See [State Persistence](#state-persistence).) Here's an example of a graph which represents a vending machine where the user may insert coins and select a product to purchase. vending_machine.py ```python from __future__ import annotations from dataclasses import dataclass from rich.prompt import Prompt from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MachineState: # (1)! user_balance: float = 0.0 product: str | None = None @dataclass class InsertCoin(BaseNode[MachineState]): # (3)! async def run(self, ctx: GraphRunContext[MachineState]) -> CoinsInserted: # (16)! return CoinsInserted(float(Prompt.ask('Insert coins'))) # (4)! 
@dataclass class CoinsInserted(BaseNode[MachineState]): amount: float # (5)! async def run( self, ctx: GraphRunContext[MachineState] ) -> SelectProduct | Purchase: # (17)! ctx.state.user_balance += self.amount # (6)! if ctx.state.product is not None: # (7)! return Purchase(ctx.state.product) else: return SelectProduct() @dataclass class SelectProduct(BaseNode[MachineState]): async def run(self, ctx: GraphRunContext[MachineState]) -> Purchase: return Purchase(Prompt.ask('Select product')) PRODUCT_PRICES = { # (2)! 'water': 1.25, 'soda': 1.50, 'crisps': 1.75, 'chocolate': 2.00, } @dataclass class Purchase(BaseNode[MachineState, None, None]): # (18)! product: str async def run( self, ctx: GraphRunContext[MachineState] ) -> End | InsertCoin | SelectProduct: if price := PRODUCT_PRICES.get(self.product): # (8)! ctx.state.product = self.product # (9)! if ctx.state.user_balance >= price: # (10)! ctx.state.user_balance -= price return End(None) else: diff = price - ctx.state.user_balance print(f'Not enough money for {self.product}, need {diff:0.2f} more') #> Not enough money for crisps, need 0.75 more return InsertCoin() # (11)! else: print(f'No such product: {self.product}, try again') return SelectProduct() # (12)! vending_machine_graph = Graph( # (13)! nodes=[InsertCoin, CoinsInserted, SelectProduct, Purchase] ) async def main(): state = MachineState() # (14)! await vending_machine_graph.run(InsertCoin(), state=state) # (15)! print(f'purchase successful item={state.product} change={state.user_balance:0.2f}') #> purchase successful item=crisps change=0.25 ``` 1. The state of the vending machine is defined as a dataclass with the user's balance and the product they've selected, if any. 1. A dictionary of products mapped to prices. 1. The `InsertCoin` node, BaseNode is parameterized with `MachineState` as that's the state used in this graph. 1. The `InsertCoin` node prompts the user to insert coins. We keep things simple by just entering a monetary amount as a float. Before you start thinking this is a toy too since it's using rich's Prompt.ask within nodes, see [below](#example-human-in-the-loop) for how control flow can be managed when nodes require external input. 1. The `CoinsInserted` node; again this is a dataclass with one field `amount`. 1. Update the user's balance with the amount inserted. 1. If the user has already selected a product, go to `Purchase`, otherwise go to `SelectProduct`. 1. In the `Purchase` node, look up the price of the product if the user entered a valid product. 1. If the user did enter a valid product, set the product in the state so we don't revisit `SelectProduct`. 1. If the balance is enough to purchase the product, adjust the balance to reflect the purchase and return End to end the graph. We're not using the run return type, so we call `End` with `None`. 1. If the balance is insufficient, go to `InsertCoin` to prompt the user to insert more coins. 1. If the product is invalid, go to `SelectProduct` to prompt the user to select a product again. 1. The graph is created by passing a list of nodes to Graph. Order of nodes is not important, but it can affect how [diagrams](#mermaid-diagrams) are displayed. 1. Initialize the state. This will be passed to the graph run and mutated as the graph runs. 1. Run the graph with the initial state. Since the graph can be run from any node, we must pass the start node — in this case, `InsertCoin`. Graph.run returns a GraphRunResult that provides the final data and a history of the run. 1. 
The return type of the node's run method is important as it is used to determine the outgoing edges of the node. This information in turn is used to render [mermaid diagrams](#mermaid-diagrams) and is enforced at runtime to detect misbehavior as soon as possible. 1. The return type of `CoinsInserted`'s run method is a union, meaning multiple outgoing edges are possible. 1. Unlike other nodes, `Purchase` can end the run, so the RunEndT generic parameter must be set. In this case it's `None` since the graph run return type is `None`. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* A [mermaid diagram](#mermaid-diagrams) for this graph can be generated with the following code: vending_machine_diagram.py ```py from vending_machine import InsertCoin, vending_machine_graph vending_machine_graph.mermaid_code(start_node=InsertCoin) ``` The diagram generated by the above code is: ``` --- title: vending_machine_graph --- stateDiagram-v2 [*] --> InsertCoin InsertCoin --> CoinsInserted CoinsInserted --> SelectProduct CoinsInserted --> Purchase SelectProduct --> Purchase Purchase --> InsertCoin Purchase --> SelectProduct Purchase --> [*] ``` See [below](#mermaid-diagrams) for more information on generating diagrams. ## GenAI Example So far we haven't shown an example of a Graph that actually uses Pydantic AI or GenAI at all. In this example, one agent generates a welcome email to a user and the other agent provides feedback on the email. This graph has a very simple structure: ``` --- title: feedback_graph --- stateDiagram-v2 [*] --> WriteEmail WriteEmail --> Feedback Feedback --> WriteEmail Feedback --> [*] ``` genai_email_feedback.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass, field from pydantic import BaseModel, EmailStr from pydantic_ai import Agent, ModelMessage, format_as_xml from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class User: name: str email: EmailStr interests: list[str] @dataclass class Email: subject: str body: str @dataclass class State: user: User write_agent_messages: list[ModelMessage] = field(default_factory=list) email_writer_agent = Agent( 'google-gla:gemini-1.5-pro', output_type=Email, system_prompt='Write a welcome email to our tech blog.', ) @dataclass class WriteEmail(BaseNode[State]): email_feedback: str | None = None async def run(self, ctx: GraphRunContext[State]) -> Feedback: if self.email_feedback: prompt = ( f'Rewrite the email for the user:\n' f'{format_as_xml(ctx.state.user)}\n' f'Feedback: {self.email_feedback}' ) else: prompt = ( f'Write a welcome email for the user:\n' f'{format_as_xml(ctx.state.user)}' ) result = await email_writer_agent.run( prompt, message_history=ctx.state.write_agent_messages, ) ctx.state.write_agent_messages += result.new_messages() return Feedback(result.output) class EmailRequiresWrite(BaseModel): feedback: str class EmailOk(BaseModel): pass feedback_agent = Agent[None, EmailRequiresWrite | EmailOk]( 'openai:gpt-4o', output_type=EmailRequiresWrite | EmailOk, # type: ignore system_prompt=( 'Review the email and provide feedback, email must reference the users specific interests.' 
), ) @dataclass class Feedback(BaseNode[State, None, Email]): email: Email async def run( self, ctx: GraphRunContext[State], ) -> WriteEmail | End[Email]: prompt = format_as_xml({'user': ctx.state.user, 'email': self.email}) result = await feedback_agent.run(prompt) if isinstance(result.output, EmailRequiresWrite): return WriteEmail(email_feedback=result.output.feedback) else: return End(self.email) async def main(): user = User( name='John Doe', email='john.joe@example.com', interests=['Haskel', 'Lisp', 'Fortran'], ) state = State(user) feedback_graph = Graph(nodes=(WriteEmail, Feedback)) result = await feedback_graph.run(WriteEmail(), state=state) print(result.output) """ Email( subject='Welcome to our tech blog!', body='Hello John, Welcome to our tech blog! ...', ) """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Iterating Over a Graph ### Using `Graph.iter` for `async for` iteration Sometimes you want direct control or insight into each node as the graph executes. The easiest way to do that is with the Graph.iter method, which returns a **context manager** that yields a GraphRun object. The `GraphRun` is an async-iterable over the nodes of your graph, allowing you to record or modify them as they execute. Here's an example: count_down.py ```python from __future__ import annotations as _annotations from dataclasses import dataclass from pydantic_graph import Graph, BaseNode, End, GraphRunContext @dataclass class CountDownState: counter: int @dataclass class CountDown(BaseNode[CountDownState, None, int]): async def run(self, ctx: GraphRunContext[CountDownState]) -> CountDown | End[int]: if ctx.state.counter <= 0: return End(ctx.state.counter) ctx.state.counter -= 1 return CountDown() count_down_graph = Graph(nodes=[CountDown]) async def main(): state = CountDownState(counter=3) async with count_down_graph.iter(CountDown(), state=state) as run: # (1)! async for node in run: # (2)! print('Node:', node) #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: End(data=0) print('Final output:', run.result.output) # (3)! #> Final output: 0 ``` 1. `Graph.iter(...)` returns a GraphRun. 1. Here, we step through each node as it is executed. 1. Once the graph returns an End, the loop ends, and `run.result` becomes a GraphRunResult containing the final outcome (`0` here). ### Using `GraphRun.next(node)` manually Alternatively, you can drive iteration manually with the GraphRun.next method, which allows you to pass in whichever node you want to run next. You can modify or selectively skip nodes this way. Below is a contrived example that stops whenever the counter is at 2, ignoring any node runs beyond that: count_down_next.py ```python from pydantic_graph import End, FullStatePersistence from count_down import CountDown, CountDownState, count_down_graph async def main(): state = CountDownState(counter=5) persistence = FullStatePersistence() # (7)! async with count_down_graph.iter( CountDown(), state=state, persistence=persistence ) as run: node = run.next_node # (1)! while not isinstance(node, End): # (2)! print('Node:', node) #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() #> Node: CountDown() if state.counter == 2: break # (3)! node = await run.next(node) # (4)! print(run.result) # (5)! #> None for step in persistence.history: # (6)! 
        print('History Step:', step.state, step.state)
        #> History Step: CountDownState(counter=5) CountDownState(counter=5)
        #> History Step: CountDownState(counter=4) CountDownState(counter=4)
        #> History Step: CountDownState(counter=3) CountDownState(counter=3)
        #> History Step: CountDownState(counter=2) CountDownState(counter=2)
```

1. We start by grabbing the first node that will be run in the graph.
1. The graph run is finished once an `End` node has been produced; instances of `End` cannot be passed to `next`.
1. If the user decides to stop early, we break out of the loop. The graph run won't have a real final result in that case (`run.result` remains `None`).
1. At each step, we call `await run.next(node)` to run it and get the next node (or an `End`).
1. Because we did not continue the run until it finished, the `result` is not set.
1. The run's history is still populated with the steps we executed so far.
1. Use FullStatePersistence so we can show the history of the run, see [State Persistence](#state-persistence) below for more information.

## State Persistence

One of the biggest benefits of finite state machine (FSM) graphs is how they simplify the handling of interrupted execution. This might happen for a variety of reasons:

- the state machine logic might fundamentally need to be paused — e.g. the returns workflow for an e-commerce order needs to wait for the item to be posted to the returns center, or execution of the next node needs input from a user and so must wait for a new HTTP request,
- the execution takes so long that the entire graph can't reliably be executed in a single continuous run — e.g. a deep research agent that might take hours to run,
- you want to run multiple graph nodes in parallel in different processes / hardware instances (note: parallel node execution is not yet supported in `pydantic-graph`, see [#704](https://github.com/pydantic/pydantic-ai/issues/704)).

Trying to make a conventional control flow (i.e., boolean logic and nested function calls) implementation compatible with these usage scenarios generally results in brittle and over-complicated spaghetti code, with the logic required to interrupt and resume execution dominating the implementation.

To allow graph runs to be interrupted and resumed, `pydantic-graph` provides state persistence — a system for snapshotting the state of a graph run before and after each node is run, allowing a graph run to be resumed from any point in the graph.

`pydantic-graph` includes three state persistence implementations:

- SimpleStatePersistence — Simple in-memory state persistence that just holds the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default.
- FullStatePersistence — In-memory state persistence that holds a list of snapshots.
- FileStatePersistence — File-based state persistence that saves snapshots to a JSON file.

In production applications, developers should implement their own state persistence by subclassing the BaseStatePersistence abstract base class, which might persist runs in a relational database like PostgreSQL.

At a high level, the role of `StatePersistence` implementations is to store and retrieve NodeSnapshot and EndSnapshot objects.

graph.iter_from_persistence() may be used to run the graph based on the state stored in persistence.

We can run the `count_down_graph` from [above](#iterating-over-a-graph), using graph.iter_from_persistence() and FileStatePersistence.
As you can see in this code, `run_node` requires no external application state (apart from state persistence) to be run, meaning graphs can easily be executed by distributed execution and queueing systems.

count_down_from_persistence.py

```python
from pathlib import Path

from pydantic_graph import End
from pydantic_graph.persistence.file import FileStatePersistence

from count_down import CountDown, CountDownState, count_down_graph


async def main():
    run_id = 'run_abc123'
    persistence = FileStatePersistence(Path(f'count_down_{run_id}.json'))  # (1)!
    state = CountDownState(counter=5)
    await count_down_graph.initialize(  # (2)!
        CountDown(), state=state, persistence=persistence
    )

    done = False
    while not done:
        done = await run_node(run_id)


async def run_node(run_id: str) -> bool:  # (3)!
    persistence = FileStatePersistence(Path(f'count_down_{run_id}.json'))
    async with count_down_graph.iter_from_persistence(persistence) as run:  # (4)!
        node_or_end = await run.next()  # (5)!
        print('Node:', node_or_end)
        #> Node: CountDown()
        #> Node: CountDown()
        #> Node: CountDown()
        #> Node: CountDown()
        #> Node: CountDown()
        #> Node: End(data=0)
        return isinstance(node_or_end, End)  # (6)!
```

1. Create a FileStatePersistence to use to start the graph.
1. Call graph.initialize() to set the initial graph state in the persistence object.
1. `run_node` is a pure function that doesn't need access to any other process state to run the next node of the graph, except the ID of the run.
1. Call graph.iter_from_persistence() to create a GraphRun object that will run the next node of the graph from the state stored in persistence.
1. run.next() will return either a node or an `End` object.
1. Check if the node is an `End` object; if it is, the graph run is complete.

*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)*

### Example: Human in the loop

As noted above, state persistence allows graphs to be interrupted and resumed. One use case of this is to allow a graph run to wait for user input before continuing.

In this example, an AI asks the user a question, the user provides an answer, the AI evaluates the answer, and either ends if the user got it right or asks another question if they got it wrong.

Instead of running the entire graph in a single process invocation, we run the graph by running the process repeatedly, optionally providing an answer to the question as a command line argument.
`ai_q_and_a_graph.py` — `question_graph` definition ai_q_and_a_graph.py ```python from __future__ import annotations as _annotations from typing import Annotated from pydantic_graph import Edge from dataclasses import dataclass, field from pydantic import BaseModel from pydantic_graph import ( BaseNode, End, Graph, GraphRunContext, ) from pydantic_ai import Agent, format_as_xml from pydantic_ai import ModelMessage ask_agent = Agent('openai:gpt-4o', output_type=str, instrument=True) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): """Generate question using GPT-4o.""" docstring_notes = True async def run( self, ctx: GraphRunContext[QuestionState] ) -> Annotated[Answer, Edge(label='Ask the question')]: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.new_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationResult(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-4o', output_type=EvaluationResult, system_prompt='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> Annotated[End[str], Edge(label='success')] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.new_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) ``` *(This example is complete, it can be run "as is")* ai_q_and_a_run.py ```python import sys from pathlib import Path from pydantic_graph import End from pydantic_graph.persistence.file import FileStatePersistence from pydantic_ai import ModelMessage # noqa: F401 from ai_q_and_a_graph import Ask, question_graph, Evaluate, QuestionState, Answer async def main(): answer: str | None = sys.argv[1] if len(sys.argv) > 1 else None # (1)! persistence = FileStatePersistence(Path('question_graph.json')) # (2)! persistence.set_graph_types(question_graph) # (3)! if snapshot := await persistence.load_next(): # (4)! state = snapshot.state assert answer is not None node = Evaluate(answer) else: state = QuestionState() node = Ask() # (5)! async with question_graph.iter(node, state=state, persistence=persistence) as run: while True: node = await run.next() # (6)! if isinstance(node, End): # (7)! print('END:', node.data) history = await persistence.load_all() # (8)! 
print([e.node for e in history]) break elif isinstance(node, Answer): # (9)! print(node.question) #> What is the capital of France? break # otherwise just continue ``` 1. Get the user's answer from the command line, if provided. See [question graph example](../examples/question-graph/) for a complete example. 1. Create a state persistence instance; the `'question_graph.json'` file may or may not already exist. 1. Since we're using the persistence interface outside a graph run, we need to call set_graph_types to set the graph generic types `StateT` and `RunEndT` for the persistence instance. This is necessary to allow the persistence instance to know how to serialize and deserialize graph nodes. 1. If we've run the graph before, load_next will return a snapshot of the next node to run; here we use `state` from that snapshot and create a new `Evaluate` node with the answer provided on the command line. 1. If the graph hasn't been run before, we create a new `QuestionState` and start with the `Ask` node. 1. Call GraphRun.next() to run the node. This will return either a node or an `End` object. 1. If the node is an `End` object, the graph run is complete. The `data` field of the `End` object contains the comment returned by the `evaluate_agent` about the correct answer. 1. To demonstrate the state persistence, we call load_all to get all the snapshots from the persistence instance. This will return a list of Snapshot objects. 1. If the node is an `Answer` object, we print the question and break out of the loop to end the process and wait for user input. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* For a complete example of this graph, see the [question graph example](../examples/question-graph/). ## Dependency Injection As with Pydantic AI, `pydantic-graph` supports dependency injection via a generic parameter on Graph and BaseNode, and the GraphRunContext.deps field.
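Before the fuller `ProcessPoolExecutor` example below, here is a minimal sketch of where the dependency type appears in the generic parameters and how a node reads it from `ctx.deps`. The `Settings` and `Greet` names are illustrative only and not part of the library or its documentation:

```python
from dataclasses import dataclass

from pydantic_graph import BaseNode, End, Graph, GraphRunContext


@dataclass
class Settings:
    """Illustrative dependency object, shared by every node in a run."""

    greeting: str


@dataclass
class Greet(BaseNode[None, Settings, str]):  # BaseNode[StateT, DepsT, RunEndT]
    name: str

    async def run(self, ctx: GraphRunContext[None, Settings]) -> End[str]:
        # Dependencies are available as `ctx.deps` inside the node.
        return End(f'{ctx.deps.greeting}, {self.name}!')


greet_graph = Graph(nodes=[Greet])


async def main():
    # The deps object is passed once, when the graph is run.
    result = await greet_graph.run(Greet('world'), deps=Settings(greeting='Hello'))
    print(result.output)
    #> Hello, world!
```

*(A sketch only; as with the other examples, you'd need to add `asyncio.run(main())` to run `main`.)*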
As an example of dependency injection, let's modify the `DivisibleBy5` example [above](#graph) to use a ProcessPoolExecutor to run the compute load in a separate process (this is a contrived example; `ProcessPoolExecutor` wouldn't actually improve performance here): deps_example.py ```py from __future__ import annotations import asyncio from concurrent.futures import ProcessPoolExecutor from dataclasses import dataclass from pydantic_graph import BaseNode, End, FullStatePersistence, Graph, GraphRunContext @dataclass class GraphDeps: executor: ProcessPoolExecutor @dataclass class DivisibleBy5(BaseNode[None, GraphDeps, int]): foo: int async def run( self, ctx: GraphRunContext[None, GraphDeps], ) -> Increment | End[int]: if self.foo % 5 == 0: return End(self.foo) else: return Increment(self.foo) @dataclass class Increment(BaseNode[None, GraphDeps]): foo: int async def run(self, ctx: GraphRunContext[None, GraphDeps]) -> DivisibleBy5: loop = asyncio.get_running_loop() compute_result = await loop.run_in_executor( ctx.deps.executor, self.compute, ) return DivisibleBy5(compute_result) def compute(self) -> int: return self.foo + 1 fives_graph = Graph(nodes=[DivisibleBy5, Increment]) async def main(): with ProcessPoolExecutor() as executor: deps = GraphDeps(executor) result = await fives_graph.run(DivisibleBy5(3), deps=deps, persistence=FullStatePersistence()) print(result.output) #> 5 # the full history is quite verbose, so we'll just print the summary print([item.node for item in result.persistence.history]) """ [ DivisibleBy5(foo=3), Increment(foo=3), DivisibleBy5(foo=4), Increment(foo=4), DivisibleBy5(foo=5), End(data=5), ] """ ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Mermaid Diagrams Pydantic Graph can generate [mermaid](https://mermaid.js.org/) [`stateDiagram-v2`](https://mermaid.js.org/syntax/stateDiagram.html) diagrams for graphs, as shown above.
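As a quick illustration before the full list of methods below, here is a minimal sketch that prints the mermaid code for the graph defined in the dependency-injection example, assuming `deps_example.py` from the previous section is on the import path (the printed mermaid text is not shown here):

```py
from deps_example import DivisibleBy5, fives_graph

# Generate the mermaid source for the graph, starting from DivisibleBy5.
print(fives_graph.mermaid_code(start_node=DivisibleBy5))
```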
These diagrams can be generated with: - Graph.mermaid_code to generate the mermaid code for a graph - Graph.mermaid_image to generate an image of the graph using [mermaid.ink](https://mermaid.ink/) - Graph.mermaid_save to generate an image of the graph using [mermaid.ink](https://mermaid.ink/) and save it to a file Beyond the diagrams shown above, you can also customize mermaid diagrams with the following options: - Edge allows you to apply a label to an edge - BaseNode.docstring_notes and BaseNode.get_note allow you to add notes to nodes - The highlighted_nodes parameter allows you to highlight specific node(s) in the diagram Putting that together, we can edit the earlier [`ai_q_and_a_graph.py`](#example-human-in-the-loop) example to: - add labels to some edges - add a note to the `Ask` node - highlight the `Answer` node - save the diagram as a `PNG` image to a file ai_q_and_a_graph_extra.py ```python from typing import Annotated from pydantic_graph import BaseNode, End, Graph, GraphRunContext, Edge ask_agent = Agent('openai:gpt-4o', output_type=str, instrument=True) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): """Generate question using GPT-4o.""" docstring_notes = True async def run( self, ctx: GraphRunContext[QuestionState] ) -> Annotated[Answer, Edge(label='Ask the question')]: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.new_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationResult(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-4o', output_type=EvaluationResult, system_prompt='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> Annotated[End[str], Edge(label='success')] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.new_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) question_graph.mermaid_save('question_graph.png', highlighted_nodes=[Answer]) ``` *(This example is not complete and cannot be run directly)* This would generate an image that looks like this: ``` --- title: question_graph --- stateDiagram-v2 Ask --> Answer: Ask the question note right of Ask Judge the answer. Decide on next step.
end note Answer --> Evaluate Evaluate --> Reprimand Evaluate --> [*]: success Reprimand --> Ask classDef highlighted fill:#fdff32 class Answer highlighted ``` ### Setting Direction of the State Diagram You can specify the direction of the state diagram using one of the following values: - `'TB'`: Top to bottom, the diagram flows vertically from top to bottom. - `'LR'`: Left to right, the diagram flows horizontally from left to right. - `'RL'`: Right to left, the diagram flows horizontally from right to left. - `'BT'`: Bottom to top, the diagram flows vertically from bottom to top. Here is an example of how to do this using 'Left to Right' (LR) instead of the default 'Top to Bottom' (TB): vending_machine_diagram.py ```py from vending_machine import InsertCoin, vending_machine_graph vending_machine_graph.mermaid_code(start_node=InsertCoin, direction='LR') ``` ``` --- title: vending_machine_graph --- stateDiagram-v2 direction LR [*] --> InsertCoin InsertCoin --> CoinsInserted CoinsInserted --> SelectProduct CoinsInserted --> Purchase SelectProduct --> Purchase Purchase --> InsertCoin Purchase --> SelectProduct Purchase --> [*] ``` # API Reference # `pydantic_ai.ag_ui` Provides an AG-UI protocol adapter for the Pydantic AI agent. This package provides seamless integration between pydantic-ai agents and ag-ui for building interactive AI applications with streaming event-based communication. ### SSE_CONTENT_TYPE ```python SSE_CONTENT_TYPE: Final[str] = 'text/event-stream' ``` Content type header value for Server-Sent Events (SSE). ### OnCompleteFunc ```python OnCompleteFunc: TypeAlias = ( Callable[[AgentRunResult[Any]], None] | Callable[[AgentRunResult[Any]], Awaitable[None]] ) ``` Callback function type that receives the `AgentRunResult` of the completed run. Can be sync or async. ### AGUIApp Bases: `Generic[AgentDepsT, OutputDataT]`, `Starlette` ASGI application for running Pydantic AI agents with AG-UI protocol support. Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python class AGUIApp(Generic[AgentDepsT, OutputDataT], Starlette): """ASGI application for running Pydantic AI agents with AG-UI protocol support.""" def __init__( self, agent: AbstractAgent[AgentDepsT, OutputDataT], *, # Agent.iter parameters. output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, # Starlette parameters. debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> None: """An ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ag_ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`][pydantic_ai.ag_ui.run_ag_ui] or [`pydantic_ai.ag_ui.handle_ag_ui_request`][pydantic_ai.ag_ui.handle_ag_ui_request] instead. 
Args: agent: The agent to run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. 
""" super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) async def endpoint(request: Request) -> Response: """Endpoint to run the agent with the provided input data.""" return await handle_ag_ui_request( agent, request, output_type=output_type, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, ) self.router.add_route('/', endpoint, methods=['POST'], name='run_agent') ``` #### __init__ ```python __init__( agent: AbstractAgent[AgentDepsT, OutputDataT], *, output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: ( Mapping[Any, ExceptionHandler] | None ) = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: ( Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None ) = None ) -> None ``` An ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the StateHandler protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use pydantic_ai.ag_ui.run_ag_ui or pydantic_ai.ag_ui.handle_ag_ui_request instead. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `agent` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to run. | *required* | | `output_type` | `OutputSpec[Any] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `debug` | `bool` | Boolean indicating if debug tracebacks should be returned on errors. | `False` | | `routes` | `Sequence[BaseRoute] | None` | A list of routes to serve incoming HTTP and WebSocket requests. | `None` | | `middleware` | `Sequence[Middleware] | None` | A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. 
ServerErrorMiddleware is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. ExceptionMiddleware is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. | `None` | | `exception_handlers` | `Mapping[Any, ExceptionHandler] | None` | A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form handler(request, exc) -> response and may be either standard functions, or async functions. | `None` | | `on_startup` | `Sequence[Callable[[], Any]] | None` | A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. | `None` | | `on_shutdown` | `Sequence[Callable[[], Any]] | None` | A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. | `None` | | `lifespan` | `Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None` | A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the on_startup and on_shutdown handlers. Use one or the other, not both. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python def __init__( self, agent: AbstractAgent[AgentDepsT, OutputDataT], *, # Agent.iter parameters. output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, # Starlette parameters. debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> None: """An ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ag_ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`][pydantic_ai.ag_ui.run_ag_ui] or [`pydantic_ai.ag_ui.handle_ag_ui_request`][pydantic_ai.ag_ui.handle_ag_ui_request] instead. Args: agent: The agent to run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. 
debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. """ super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) async def endpoint(request: Request) -> Response: """Endpoint to run the agent with the provided input data.""" return await handle_ag_ui_request( agent, request, output_type=output_type, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, ) self.router.add_route('/', endpoint, methods=['POST'], name='run_agent') ``` ### handle_ag_ui_request ```python handle_ag_ui_request( agent: AbstractAgent[AgentDepsT, Any], request: Request, *, output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, on_complete: OnCompleteFunc | None = None ) -> Response ``` Handle an AG-UI request by running the agent and returning a streaming response. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `agent` | `AbstractAgent[AgentDepsT, Any]` | The agent to run. | *required* | | `request` | `Request` | The Starlette request (e.g. from FastAPI) containing the AG-UI run input. | *required* | | `output_type` | `OutputSpec[Any] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. 
| `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `on_complete` | `OnCompleteFunc | None` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can access all_messages() and other result data. | `None` | Returns: | Type | Description | | --- | --- | | `Response` | A streaming Starlette response with AG-UI protocol events. | Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python async def handle_ag_ui_request( agent: AbstractAgent[AgentDepsT, Any], request: Request, *, output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, on_complete: OnCompleteFunc | None = None, ) -> Response: """Handle an AG-UI request by running the agent and returning a streaming response. Args: agent: The agent to run. request: The Starlette request (e.g. from FastAPI) containing the AG-UI run input. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can access `all_messages()` and other result data. Returns: A streaming Starlette response with AG-UI protocol events. 
""" accept = request.headers.get('accept', SSE_CONTENT_TYPE) try: input_data = RunAgentInput.model_validate(await request.json()) except ValidationError as e: # pragma: no cover return Response( content=json.dumps(e.json()), media_type='application/json', status_code=HTTPStatus.UNPROCESSABLE_ENTITY, ) return StreamingResponse( run_ag_ui( agent, input_data, accept, output_type=output_type, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, on_complete=on_complete, ), media_type=accept, ) ``` ### run_ag_ui ```python run_ag_ui( agent: AbstractAgent[AgentDepsT, Any], run_input: RunAgentInput, accept: str = SSE_CONTENT_TYPE, *, output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, on_complete: OnCompleteFunc | None = None ) -> AsyncIterator[str] ``` Run the agent with the AG-UI run input and stream AG-UI protocol events. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `agent` | `AbstractAgent[AgentDepsT, Any]` | The agent to run. | *required* | | `run_input` | `RunAgentInput` | The AG-UI run input containing thread_id, run_id, messages, etc. | *required* | | `accept` | `str` | The accept header value for the run. | `SSE_CONTENT_TYPE` | | `output_type` | `OutputSpec[Any] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `on_complete` | `OnCompleteFunc | None` | Optional callback function called when the agent run completes successfully. The callback receives the completed AgentRunResult and can access all_messages() and other result data. | `None` | Yields: | Type | Description | | --- | --- | | `AsyncIterator[str]` | Streaming event chunks encoded as strings according to the accept header value. 
| Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python async def run_ag_ui( agent: AbstractAgent[AgentDepsT, Any], run_input: RunAgentInput, accept: str = SSE_CONTENT_TYPE, *, output_type: OutputSpec[Any] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, on_complete: OnCompleteFunc | None = None, ) -> AsyncIterator[str]: """Run the agent with the AG-UI run input and stream AG-UI protocol events. Args: agent: The agent to run. run_input: The AG-UI run input containing thread_id, run_id, messages, etc. accept: The accept header value for the run. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. on_complete: Optional callback function called when the agent run completes successfully. The callback receives the completed [`AgentRunResult`][pydantic_ai.agent.AgentRunResult] and can access `all_messages()` and other result data. Yields: Streaming event chunks encoded as strings according to the accept header value. """ encoder = EventEncoder(accept=accept) if run_input.tools: # AG-UI tools can't be prefixed as that would result in a mismatch between the tool names in the # Pydantic AI events and actual AG-UI tool names, preventing the tool from being called. If any # conflicts arise, the AG-UI tool should be renamed or a `PrefixedToolset` used for local toolsets. toolset = _AGUIFrontendToolset[AgentDepsT](run_input.tools) toolsets = [*toolsets, toolset] if toolsets else [toolset] try: yield encoder.encode( RunStartedEvent( thread_id=run_input.thread_id, run_id=run_input.run_id, ), ) if not run_input.messages: raise _NoMessagesError raw_state: dict[str, Any] = run_input.state or {} if isinstance(deps, StateHandler): if isinstance(deps.state, BaseModel): try: state = type(deps.state).model_validate(raw_state) except ValidationError as e: # pragma: no cover raise _InvalidStateError from e else: state = raw_state deps = replace(deps, state=state) elif raw_state: raise UserError( f'AG-UI state is provided but `deps` of type `{type(deps).__name__}` does not implement the `StateHandler` protocol: it needs to be a dataclass with a non-optional `state` field.' ) else: # `deps` not being a `StateHandler` is OK if there is no state. 
pass messages = _messages_from_ag_ui(run_input.messages) async with agent.iter( user_prompt=None, output_type=[output_type or agent.output_type, DeferredToolRequests], message_history=messages, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, ) as run: async for event in _agent_stream(run): yield encoder.encode(event) if on_complete is not None and run.result is not None: if _utils.is_async_callable(on_complete): await on_complete(run.result) else: await _utils.run_in_executor(on_complete, run.result) except _RunError as e: yield encoder.encode( RunErrorEvent(message=e.message, code=e.code), ) except Exception as e: yield encoder.encode( RunErrorEvent(message=str(e)), ) raise e else: yield encoder.encode( RunFinishedEvent( thread_id=run_input.thread_id, run_id=run_input.run_id, ), ) ``` ### StateHandler Bases: `Protocol` Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field. Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python @runtime_checkable class StateHandler(Protocol): """Protocol for state handlers in agent runs. Requires the class to be a dataclass with a `state` field.""" # Has to be a dataclass so we can use `replace` to update the state. # From https://github.com/python/typeshed/blob/9ab7fde0a0cd24ed7a72837fcb21093b811b80d8/stdlib/_typeshed/__init__.pyi#L352 __dataclass_fields__: ClassVar[dict[str, Field[Any]]] @property def state(self) -> State: """Get the current state of the agent run.""" ... @state.setter def state(self, state: State) -> None: """Set the state of the agent run. This method is called to update the state of the agent run with the provided state. Args: state: The run state. Raises: InvalidStateError: If `state` does not match the expected model. """ ... ``` #### state ```python state: State ``` Get the current state of the agent run. ### StateDeps Bases: `Generic[StateT]` Provides AG-UI state management. This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`. The state is set using the `state` setter by the `Adapter` when the run starts. Implements the `StateHandler` protocol. Source code in `pydantic_ai_slim/pydantic_ai/ag_ui.py` ```python @dataclass class StateDeps(Generic[StateT]): """Provides AG-UI state management. This class is used to manage the state of an agent run. It allows setting the state of the agent run with a specific type of state model, which must be a subclass of `BaseModel`. The state is set using the `state` setter by the `Adapter` when the run starts. Implements the `StateHandler` protocol. """ state: StateT ``` # `pydantic_ai.agent` ### Agent Bases: `AbstractAgent[AgentDepsT, OutputDataT]` Class for defining "agents" - a way to have a specific type of "conversation" with an LLM. Agents are generic in the dependency type they take AgentDepsT and the output type they return, OutputDataT. By default, if neither generic parameter is customised, agents have type `Agent[None, str]`. Minimal usage example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. 
``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python @dataclasses.dataclass(init=False) class Agent(AbstractAgent[AgentDepsT, OutputDataT]): """Class for defining "agents" - a way to have a specific type of "conversation" with an LLM. Agents are generic in the dependency type they take [`AgentDepsT`][pydantic_ai.tools.AgentDepsT] and the output type they return, [`OutputDataT`][pydantic_ai.output.OutputDataT]. By default, if neither generic parameter is customised, agents have type `Agent[None, str]`. Minimal usage example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` """ _model: models.Model | models.KnownModelName | str | None _name: str | None end_strategy: EndStrategy """Strategy for handling tool calls when a final result is found.""" model_settings: ModelSettings | None """Optional model request settings to use for this agents's runs, by default. Note, if `model_settings` is provided by `run`, `run_sync`, or `run_stream`, those settings will be merged with this value, with the runtime argument taking priority. """ _output_type: OutputSpec[OutputDataT] instrument: InstrumentationSettings | bool | None """Options to automatically instrument with OpenTelemetry.""" _instrument_default: ClassVar[InstrumentationSettings | bool] = False _deps_type: type[AgentDepsT] = dataclasses.field(repr=False) _output_schema: _output.BaseOutputSchema[OutputDataT] = dataclasses.field(repr=False) _output_validators: list[_output.OutputValidator[AgentDepsT, OutputDataT]] = dataclasses.field(repr=False) _instructions: list[str | _system_prompt.SystemPromptFunc[AgentDepsT]] = dataclasses.field(repr=False) _system_prompts: tuple[str, ...] 
= dataclasses.field(repr=False) _system_prompt_functions: list[_system_prompt.SystemPromptRunner[AgentDepsT]] = dataclasses.field(repr=False) _system_prompt_dynamic_functions: dict[str, _system_prompt.SystemPromptRunner[AgentDepsT]] = dataclasses.field( repr=False ) _function_toolset: FunctionToolset[AgentDepsT] = dataclasses.field(repr=False) _output_toolset: OutputToolset[AgentDepsT] | None = dataclasses.field(repr=False) _user_toolsets: list[AbstractToolset[AgentDepsT]] = dataclasses.field(repr=False) _prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = dataclasses.field(repr=False) _prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = dataclasses.field(repr=False) _max_result_retries: int = dataclasses.field(repr=False) _max_tool_retries: int = dataclasses.field(repr=False) _event_stream_handler: EventStreamHandler[AgentDepsT] | None = dataclasses.field(repr=False) _enter_lock: Lock = dataclasses.field(repr=False) _entered_count: int = dataclasses.field(repr=False) _exit_stack: AsyncExitStack | None = dataclasses.field(repr=False) @overload def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> None: ... @overload @deprecated('`mcp_servers` is deprecated, use `toolsets` instead.') def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, mcp_servers: Sequence[MCPServer] = (), defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> None: ... 
def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Any, ): """Create an agent. Args: model: The default model to use for this agent, if not provided, you must provide the model when calling it. We allow `str` here since the actual list of allowed models changes frequently. output_type: The type of the output data, used to validate the data returned by the model, defaults to `str`. instructions: Instructions to use for this agent, you can also register instructions via a function with [`instructions`][pydantic_ai.Agent.instructions]. system_prompt: Static system prompts to use for this agent, you can also register system prompts via a function with [`system_prompt`][pydantic_ai.Agent.system_prompt]. deps_type: The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright or add a type hint `: Agent[None, ]`. name: The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. model_settings: Optional model request settings to use for this agent's runs, by default. retries: The default number of retries to allow for tool calls and output validation, before raising an error. For model request retries, see the [HTTP Request Retries](../retries.md) documentation. output_retries: The maximum number of retries to allow for output validation, defaults to `retries`. tools: Tools to register with the agent, you can also register tools via the decorators [`@agent.tool`][pydantic_ai.Agent.tool] and [`@agent.tool_plain`][pydantic_ai.Agent.tool_plain]. builtin_tools: The builtin tools that the agent will use. This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised. prepare_tools: Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] prepare_output_tools: Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. 
See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] toolsets: Toolsets to register with the agent, including MCP servers and functions which take a run context and return a toolset. See [`ToolsetFunc`][pydantic_ai.toolsets.ToolsetFunc] for more information. defer_model_check: by default, if you provide a [named][pydantic_ai.models.KnownModelName] model, it's evaluated to create a [`Model`][pydantic_ai.models.Model] instance immediately, which checks for the necessary environment variables. Set this to `false` to defer the evaluation until the first run. Useful if you want to [override the model][pydantic_ai.Agent.override] for testing. end_strategy: Strategy for handling tool calls that are requested alongside a final result. See [`EndStrategy`][pydantic_ai.agent.EndStrategy] for more information. instrument: Set to True to automatically instrument with OpenTelemetry, which will use Logfire if it's configured. Set to an instance of [`InstrumentationSettings`][pydantic_ai.agent.InstrumentationSettings] to customize. If this isn't set, then the last value set by [`Agent.instrument_all()`][pydantic_ai.Agent.instrument_all] will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. history_processors: Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools. """ if model is None or defer_model_check: self._model = model else: self._model = models.infer_model(model) self._name = name self.end_strategy = end_strategy self.model_settings = model_settings self._output_type = output_type self.instrument = instrument self._deps_type = deps_type if mcp_servers := _deprecated_kwargs.pop('mcp_servers', None): if toolsets is not None: # pragma: no cover raise TypeError('`mcp_servers` and `toolsets` cannot be set at the same time.') warnings.warn('`mcp_servers` is deprecated, use `toolsets` instead', DeprecationWarning) toolsets = mcp_servers _utils.validate_empty_kwargs(_deprecated_kwargs) default_output_mode = ( self.model.profile.default_structured_output_mode if isinstance(self.model, models.Model) else None ) self._output_schema = _output.OutputSchema[OutputDataT].build(output_type, default_mode=default_output_mode) self._output_validators = [] self._instructions = self._normalize_instructions(instructions) self._system_prompts = (system_prompt,) if isinstance(system_prompt, str) else tuple(system_prompt) self._system_prompt_functions = [] self._system_prompt_dynamic_functions = {} self._max_result_retries = output_retries if output_retries is not None else retries self._max_tool_retries = retries self._builtin_tools = builtin_tools self._prepare_tools = prepare_tools self._prepare_output_tools = prepare_output_tools self._output_toolset = self._output_schema.toolset if self._output_toolset: self._output_toolset.max_retries = self._max_result_retries self._function_toolset = _AgentFunctionToolset( tools, max_retries=self._max_tool_retries, output_schema=self._output_schema ) self._dynamic_toolsets = [ DynamicToolset[AgentDepsT](toolset_func=toolset) for toolset in toolsets or [] if not isinstance(toolset, AbstractToolset) ] self._user_toolsets = [toolset for toolset in toolsets or [] if isinstance(toolset, AbstractToolset)] 
self.history_processors = history_processors or [] self._event_stream_handler = event_stream_handler self._override_name: ContextVar[_utils.Option[str]] = ContextVar('_override_name', default=None) self._override_deps: ContextVar[_utils.Option[AgentDepsT]] = ContextVar('_override_deps', default=None) self._override_model: ContextVar[_utils.Option[models.Model]] = ContextVar('_override_model', default=None) self._override_toolsets: ContextVar[_utils.Option[Sequence[AbstractToolset[AgentDepsT]]]] = ContextVar( '_override_toolsets', default=None ) self._override_tools: ContextVar[ _utils.Option[Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]]] ] = ContextVar('_override_tools', default=None) self._override_instructions: ContextVar[ _utils.Option[list[str | _system_prompt.SystemPromptFunc[AgentDepsT]]] ] = ContextVar('_override_instructions', default=None) self._enter_lock = Lock() self._entered_count = 0 self._exit_stack = None @staticmethod def instrument_all(instrument: InstrumentationSettings | bool = True) -> None: """Set the instrumentation options for all agents where `instrument` is not set.""" Agent._instrument_default = instrument @property def model(self) -> models.Model | models.KnownModelName | str | None: """The default model configured for this agent.""" return self._model @model.setter def model(self, value: models.Model | models.KnownModelName | str | None) -> None: """Set the default model configured for this agent. We allow `str` here since the actual list of allowed models changes frequently. """ self._model = value @property def name(self) -> str | None: """The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. """ name_ = self._override_name.get() return name_.value if name_ else self._name @name.setter def name(self, value: str | None) -> None: """Set the name of the agent, used for logging.""" self._name = value @property def deps_type(self) -> type: """The type of dependencies used by the agent.""" return self._deps_type @property def output_type(self) -> OutputSpec[OutputDataT]: """The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`.""" return self._output_type @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: """Optional handler for events from the model's streaming response and the agent's execution of tools.""" return self._event_stream_handler def __repr__(self) -> str: return f'{type(self).__name__}(model={self.model!r}, name={self.name!r}, end_strategy={self.end_strategy!r}, model_settings={self.model_settings!r}, output_type={self.output_type!r}, instrument={self.instrument!r})' @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... 
@overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. 
model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) model_used = self._get_model(model) del model deps = self._get_deps(deps) output_schema = self._prepare_output_schema(output_type, model_used.profile) output_type_ = output_type or self.output_type # We consider it a user error if a user tries to restrict the result type while having an output validator that # may change the result type from the restricted type to something else. Therefore, we consider the following # typecast reasonable, even though it is possible to violate it with otherwise-type-checked code. output_validators = cast(list[_output.OutputValidator[AgentDepsT, RunOutputDataT]], self._output_validators) output_toolset = self._output_toolset if output_schema != self._output_schema or output_validators: output_toolset = cast(OutputToolset[AgentDepsT], output_schema.toolset) if output_toolset: output_toolset.max_retries = self._max_result_retries output_toolset.output_validators = output_validators toolset = self._get_toolset(output_toolset=output_toolset, additional_toolsets=toolsets) tool_manager = ToolManager[AgentDepsT](toolset) # Build the graph graph: Graph[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any], FinalResult[Any]] = ( _agent_graph.build_agent_graph(self.name, self._deps_type, output_type_) ) # Build the initial state usage = usage or _usage.RunUsage() state = _agent_graph.GraphAgentState( message_history=list(message_history) if message_history else [], usage=usage, retries=0, run_step=0, ) # Merge model settings in order of precedence: run > agent > model merged_settings = merge_model_settings(model_used.settings, self.model_settings) model_settings = merge_model_settings(merged_settings, model_settings) usage_limits = usage_limits or _usage.UsageLimits() instructions_literal, instructions_functions = self._get_instructions() async def get_instructions(run_context: RunContext[AgentDepsT]) -> str | None: parts = [ instructions_literal, *[await func.run(run_context) for func in instructions_functions], ] model_profile = model_used.profile if isinstance(output_schema, _output.PromptedOutputSchema): instructions = output_schema.instructions(model_profile.prompted_output_template) parts.append(instructions) parts = [p for p in parts if p] if not parts: return None return '\n\n'.join(parts).strip() if isinstance(model_used, InstrumentedModel): instrumentation_settings = model_used.instrumentation_settings tracer = model_used.instrumentation_settings.tracer else: instrumentation_settings = None tracer = NoOpTracer() if builtin_tools: # Deduplicate builtin tools passed to the agent and the run based on type builtin_tools = list( { **({type(tool): tool for tool in self._builtin_tools or []}), **({type(tool): tool for tool in builtin_tools}), }.values() ) else: builtin_tools = list(self._builtin_tools) graph_deps = _agent_graph.GraphAgentDeps[AgentDepsT, 
RunOutputDataT]( user_deps=deps, prompt=user_prompt, new_message_index=len(message_history) if message_history else 0, model=model_used, model_settings=model_settings, usage_limits=usage_limits, max_result_retries=self._max_result_retries, end_strategy=self.end_strategy, output_schema=output_schema, output_validators=output_validators, history_processors=self.history_processors, builtin_tools=builtin_tools, tool_manager=tool_manager, tracer=tracer, get_instructions=get_instructions, instrumentation_settings=instrumentation_settings, ) start_node = _agent_graph.UserPromptNode[AgentDepsT]( user_prompt=user_prompt, deferred_tool_results=deferred_tool_results, instructions=instructions_literal, instructions_functions=instructions_functions, system_prompts=self._system_prompts, system_prompt_functions=self._system_prompt_functions, system_prompt_dynamic_functions=self._system_prompt_dynamic_functions, ) agent_name = self.name or 'agent' instrumentation_names = InstrumentationNames.for_version( instrumentation_settings.version if instrumentation_settings else DEFAULT_INSTRUMENTATION_VERSION ) run_span = tracer.start_span( instrumentation_names.get_agent_run_span_name(agent_name), attributes={ 'model_name': model_used.model_name if model_used else 'no-model', 'agent_name': agent_name, 'gen_ai.agent.name': agent_name, 'logfire.msg': f'{agent_name} run', }, ) try: async with toolset: async with graph.iter( start_node, state=state, deps=graph_deps, span=use_span(run_span) if run_span.is_recording() else None, infer_name=False, ) as graph_run: agent_run = AgentRun(graph_run) yield agent_run if (final_result := agent_run.result) is not None and run_span.is_recording(): if instrumentation_settings and instrumentation_settings.include_content: run_span.set_attribute( 'final_result', ( final_result.output if isinstance(final_result.output, str) else json.dumps(InstrumentedModel.serialize_any(final_result.output)) ), ) finally: try: if instrumentation_settings and run_span.is_recording(): run_span.set_attributes( self._run_span_end_attributes( instrumentation_settings, usage, state.message_history, graph_deps.new_message_index ) ) finally: run_span.end() def _run_span_end_attributes( self, settings: InstrumentationSettings, usage: _usage.RunUsage, message_history: list[_messages.ModelMessage], new_message_index: int, ): if settings.version == 1: attrs = { 'all_messages_events': json.dumps( [InstrumentedModel.event_to_dict(e) for e in settings.messages_to_otel_events(message_history)] ) } else: # Store the last instructions here for convenience last_instructions = InstrumentedModel._get_instructions(message_history) # pyright: ignore[reportPrivateUsage] attrs: dict[str, Any] = { 'pydantic_ai.all_messages': json.dumps(settings.messages_to_otel_messages(list(message_history))), **settings.system_instructions_attributes(last_instructions), } # If this agent run was provided with existing history, store an attribute indicating the point at which the # new messages begin. if new_message_index > 0: attrs['pydantic_ai.new_message_index'] = new_message_index # If the instructions for this agent run were not always the same, store an attribute that indicates that. # This can signal to an observability UI that different steps in the agent run had different instructions. # Note: We purposely only look at "new" messages because they are the only ones produced by this agent run. 
if any( ( isinstance(m, _messages.ModelRequest) and m.instructions is not None and m.instructions != last_instructions ) for m in message_history[new_message_index:] ): attrs['pydantic_ai.variable_instructions'] = True return { **usage.opentelemetry_attributes(), **attrs, 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { **{k: {'type': 'array'} if isinstance(v, str) else {} for k, v in attrs.items()}, 'final_result': {'type': 'object'}, }, } ), } @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if _utils.is_set(name): name_token = self._override_name.set(_utils.Some(name)) else: name_token = None if _utils.is_set(deps): deps_token = self._override_deps.set(_utils.Some(deps)) else: deps_token = None if _utils.is_set(model): model_token = self._override_model.set(_utils.Some(models.infer_model(model))) else: model_token = None if _utils.is_set(toolsets): toolsets_token = self._override_toolsets.set(_utils.Some(toolsets)) else: toolsets_token = None if _utils.is_set(tools): tools_token = self._override_tools.set(_utils.Some(tools)) else: tools_token = None if _utils.is_set(instructions): normalized_instructions = self._normalize_instructions(instructions) instructions_token = self._override_instructions.set(_utils.Some(normalized_instructions)) else: instructions_token = None try: yield finally: if name_token is not None: self._override_name.reset(name_token) if deps_token is not None: self._override_deps.reset(deps_token) if model_token is not None: self._override_model.reset(model_token) if toolsets_token is not None: self._override_toolsets.reset(toolsets_token) if tools_token is not None: self._override_tools.reset(tools_token) if instructions_token is not None: self._override_instructions.reset(instructions_token) @overload def instructions( self, func: Callable[[RunContext[AgentDepsT]], str], / ) -> Callable[[RunContext[AgentDepsT]], str]: ... @overload def instructions( self, func: Callable[[RunContext[AgentDepsT]], Awaitable[str]], / ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str]]: ... @overload def instructions(self, func: Callable[[], str], /) -> Callable[[], str]: ... @overload def instructions(self, func: Callable[[], Awaitable[str]], /) -> Callable[[], Awaitable[str]]: ... 
@overload def instructions( self, / ) -> Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]]: ... def instructions( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register an instructions function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.instructions`). Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.instructions def simple_instructions() -> str: return 'foobar' @agent.instructions async def async_instructions(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: self._instructions.append(func_) return func_ return decorator else: self._instructions.append(func) return func @overload def system_prompt( self, func: Callable[[RunContext[AgentDepsT]], str], / ) -> Callable[[RunContext[AgentDepsT]], str]: ... @overload def system_prompt( self, func: Callable[[RunContext[AgentDepsT]], Awaitable[str]], / ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str]]: ... @overload def system_prompt(self, func: Callable[[], str], /) -> Callable[[], str]: ... @overload def system_prompt(self, func: Callable[[], Awaitable[str]], /) -> Callable[[], Awaitable[str]]: ... @overload def system_prompt( self, /, *, dynamic: bool = False ) -> Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]]: ... def system_prompt( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, *, dynamic: bool = False, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register a system prompt function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`), see the examples below. Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. 
Args: func: The function to decorate dynamic: If True, the system prompt will be reevaluated even when `messages_history` is provided, see [`SystemPromptPart.dynamic_ref`][pydantic_ai.messages.SystemPromptPart.dynamic_ref] Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.system_prompt def simple_system_prompt() -> str: return 'foobar' @agent.system_prompt(dynamic=True) async def async_system_prompt(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: runner = _system_prompt.SystemPromptRunner[AgentDepsT](func_, dynamic=dynamic) self._system_prompt_functions.append(runner) if dynamic: # pragma: lax no cover self._system_prompt_dynamic_functions[func_.__qualname__] = runner return func_ return decorator else: assert not dynamic, "dynamic can't be True in this case" self._system_prompt_functions.append(_system_prompt.SystemPromptRunner[AgentDepsT](func, dynamic=dynamic)) return func @overload def output_validator( self, func: Callable[[RunContext[AgentDepsT], OutputDataT], OutputDataT], / ) -> Callable[[RunContext[AgentDepsT], OutputDataT], OutputDataT]: ... @overload def output_validator( self, func: Callable[[RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT]], / ) -> Callable[[RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT]]: ... @overload def output_validator( self, func: Callable[[OutputDataT], OutputDataT], / ) -> Callable[[OutputDataT], OutputDataT]: ... @overload def output_validator( self, func: Callable[[OutputDataT], Awaitable[OutputDataT]], / ) -> Callable[[OutputDataT], Awaitable[OutputDataT]]: ... def output_validator( self, func: _output.OutputValidatorFunc[AgentDepsT, OutputDataT], / ) -> _output.OutputValidatorFunc[AgentDepsT, OutputDataT]: """Decorator to register an output validator function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Example: ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('test', deps_type=str) @agent.output_validator def output_validator_simple(data: str) -> str: if 'wrong' in data: raise ModelRetry('wrong response') return data @agent.output_validator async def output_validator_deps(ctx: RunContext[str], data: str) -> str: if ctx.deps in data: raise ModelRetry('wrong response') return data result = agent.run_sync('foobar', deps='spam') print(result.output) #> success (no tool calls) ``` """ self._output_validators.append(_output.OutputValidator[AgentDepsT, Any](func)) return func @overload def tool(self, func: ToolFuncContext[AgentDepsT, ToolParams], /) -> ToolFuncContext[AgentDepsT, ToolParams]: ... @overload def tool( self, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, ) -> Callable[[ToolFuncContext[AgentDepsT, ToolParams]], ToolFuncContext[AgentDepsT, ToolParams]]: ... 
def tool( self, func: ToolFuncContext[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=int) @agent.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @agent.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. 
""" def tool_decorator( func_: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, takes_ctx=True, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, ) return func_ return tool_decorator if func is None else tool_decorator(func) @overload def tool_plain(self, func: ToolFuncPlain[ToolParams], /) -> ToolFuncPlain[ToolParams]: ... @overload def tool_plain( self, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, ) -> Callable[[ToolFuncPlain[ToolParams]], ToolFuncPlain[ToolParams]]: ... def tool_plain( self, func: ToolFuncPlain[ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, ) -> Any: """Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test') @agent.tool def foobar(ctx: RunContext[int]) -> int: return 123 @agent.tool(retries=2) async def spam(ctx: RunContext[str]) -> float: return 3.14 result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":123,"spam":3.14} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. 
strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. """ def tool_decorator(func_: ToolFuncPlain[ToolParams]) -> ToolFuncPlain[ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, takes_ctx=False, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, ) return func_ return tool_decorator if func is None else tool_decorator(func) @overload def toolset(self, func: ToolsetFunc[AgentDepsT], /) -> ToolsetFunc[AgentDepsT]: ... @overload def toolset( self, /, *, per_run_step: bool = True, ) -> Callable[[ToolsetFunc[AgentDepsT]], ToolsetFunc[AgentDepsT]]: ... def toolset( self, func: ToolsetFunc[AgentDepsT] | None = None, /, *, per_run_step: bool = True, ) -> Any: """Decorator to register a toolset function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.toolset`). Example: ```python from pydantic_ai import AbstractToolset, Agent, FunctionToolset, RunContext agent = Agent('test', deps_type=str) @agent.toolset async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]: return FunctionToolset() ``` Args: func: The toolset function to register. per_run_step: Whether to re-evaluate the toolset for each run step. Defaults to True. """ def toolset_decorator(func_: ToolsetFunc[AgentDepsT]) -> ToolsetFunc[AgentDepsT]: self._dynamic_toolsets.append(DynamicToolset(func_, per_run_step=per_run_step)) return func_ return toolset_decorator if func is None else toolset_decorator(func) def _get_model(self, model: models.Model | models.KnownModelName | str | None) -> models.Model: """Create a model configured for this agent. Args: model: model to use for this run, required if `model` was not set when creating the agent. Returns: The model used """ model_: models.Model if some_model := self._override_model.get(): # we don't want `override()` to cover up errors from the model not being defined, hence this check if model is None and self.model is None: raise exceptions.UserError( '`model` must either be set on the agent or included when calling it. ' '(Even when `override(model=...)` is customizing the model that will actually be called)' ) model_ = some_model.value elif model is not None: model_ = models.infer_model(model) elif self.model is not None: # noinspection PyTypeChecker model_ = self.model = models.infer_model(self.model) else: raise exceptions.UserError('`model` must either be set on the agent or included when calling it.') instrument = self.instrument if instrument is None: instrument = self._instrument_default return instrument_model(model_, instrument) def _get_deps(self: Agent[T, OutputDataT], deps: T) -> T: """Get deps for a run. 
If we've overridden deps via `_override_deps`, use that, otherwise use the deps passed to the call. We could do runtime type checking of deps against `self._deps_type`, but that's a slippery slope. """ if some_deps := self._override_deps.get(): return some_deps.value else: return deps def _normalize_instructions( self, instructions: Instructions[AgentDepsT], ) -> list[str | _system_prompt.SystemPromptFunc[AgentDepsT]]: if instructions is None: return [] if isinstance(instructions, str) or callable(instructions): return [instructions] return list(instructions) def _get_instructions( self, ) -> tuple[str | None, list[_system_prompt.SystemPromptRunner[AgentDepsT]]]: override_instructions = self._override_instructions.get() instructions = override_instructions.value if override_instructions else self._instructions literal_parts: list[str] = [] functions: list[_system_prompt.SystemPromptRunner[AgentDepsT]] = [] for instruction in instructions: if isinstance(instruction, str): literal_parts.append(instruction) else: functions.append(_system_prompt.SystemPromptRunner[AgentDepsT](instruction)) literal = '\n'.join(literal_parts).strip() or None return literal, functions def _get_toolset( self, output_toolset: AbstractToolset[AgentDepsT] | None | _utils.Unset = _utils.UNSET, additional_toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, ) -> AbstractToolset[AgentDepsT]: """Get the complete toolset. Args: output_toolset: The output toolset to use instead of the one built at agent construction time. additional_toolsets: Additional toolsets to add, unless toolsets have been overridden. """ toolsets = self.toolsets # Don't add additional toolsets if the toolsets have been overridden if additional_toolsets and self._override_toolsets.get() is None: toolsets = [*toolsets, *additional_toolsets] toolset = CombinedToolset(toolsets) # Copy the dynamic toolsets to ensure each run has its own instances def copy_dynamic_toolsets(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: if isinstance(toolset, DynamicToolset): return dataclasses.replace(toolset) else: return toolset toolset = toolset.visit_and_replace(copy_dynamic_toolsets) if self._prepare_tools: toolset = PreparedToolset(toolset, self._prepare_tools) output_toolset = output_toolset if _utils.is_set(output_toolset) else self._output_toolset if output_toolset is not None: if self._prepare_output_tools: output_toolset = PreparedToolset(output_toolset, self._prepare_output_tools) toolset = CombinedToolset([output_toolset, toolset]) return toolset @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: """All toolsets registered on the agent, including a function toolset holding tools that were registered on the agent directly. Output tools are not included. 
""" toolsets: list[AbstractToolset[AgentDepsT]] = [] if some_tools := self._override_tools.get(): function_toolset = _AgentFunctionToolset( some_tools.value, max_retries=self._max_tool_retries, output_schema=self._output_schema ) else: function_toolset = self._function_toolset toolsets.append(function_toolset) if some_user_toolsets := self._override_toolsets.get(): user_toolsets = some_user_toolsets.value else: user_toolsets = [*self._user_toolsets, *self._dynamic_toolsets] toolsets.extend(user_toolsets) return toolsets def _prepare_output_schema( self, output_type: OutputSpec[RunOutputDataT] | None, model_profile: ModelProfile ) -> _output.OutputSchema[RunOutputDataT]: if output_type is not None: if self._output_validators: raise exceptions.UserError('Cannot set a custom run `output_type` when the agent has output validators') schema = _output.OutputSchema[RunOutputDataT].build( output_type, default_mode=model_profile.default_structured_output_mode ) else: schema = self._output_schema.with_default_mode(model_profile.default_structured_output_mode) schema.raise_if_unsupported(model_profile) return schema # pyright: ignore[reportReturnType] async def __aenter__(self) -> Self: """Enter the agent context. This will start all [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] registered as `toolsets` so they are ready to be used. This is a no-op if the agent has already been entered. """ async with self._enter_lock: if self._entered_count == 0: async with AsyncExitStack() as exit_stack: toolset = self._get_toolset() await exit_stack.enter_async_context(toolset) self._exit_stack = exit_stack.pop_all() self._entered_count += 1 return self async def __aexit__(self, *args: Any) -> bool | None: async with self._enter_lock: self._entered_count -= 1 if self._entered_count == 0 and self._exit_stack is not None: await self._exit_stack.aclose() self._exit_stack = None def set_mcp_sampling_model(self, model: models.Model | models.KnownModelName | str | None = None) -> None: """Set the sampling model on all MCP servers registered with the agent. If no sampling model is provided, the agent's model will be used. """ try: sampling_model = models.infer_model(model) if model else self._get_model(None) except exceptions.UserError as e: raise exceptions.UserError('No sampling model provided and no model set on the agent.') from e from ..mcp import MCPServer def _set_sampling_model(toolset: AbstractToolset[AgentDepsT]) -> None: if isinstance(toolset, MCPServer): toolset.sampling_model = sampling_model self._get_toolset().apply(_set_sampling_model) @asynccontextmanager @deprecated( '`run_mcp_servers` is deprecated, use `async with agent:` instead. If you need to set a sampling model on all MCP servers, use `agent.set_mcp_sampling_model()`.' ) async def run_mcp_servers( self, model: models.Model | models.KnownModelName | str | None = None ) -> AsyncIterator[None]: """Run [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] so they can be used by the agent. Deprecated: use [`async with agent`][pydantic_ai.agent.Agent.__aenter__] instead. If you need to set a sampling model on all MCP servers, use [`agent.set_mcp_sampling_model()`][pydantic_ai.agent.Agent.set_mcp_sampling_model]. Returns: a context manager to start and shutdown the servers. 
""" try: self.set_mcp_sampling_model(model) except exceptions.UserError: if model is not None: raise async with self: yield ```` #### __init__ ```python __init__( model: Model | KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, prepare_output_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, toolsets: ( Sequence[ AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT] ] | None ) = None, defer_model_check: bool = False, end_strategy: EndStrategy = "early", instrument: ( InstrumentationSettings | bool | None ) = None, history_processors: ( Sequence[HistoryProcessor[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> None ``` ```python __init__( model: Model | KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, prepare_output_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, mcp_servers: Sequence[MCPServer] = (), defer_model_check: bool = False, end_strategy: EndStrategy = "early", instrument: ( InstrumentationSettings | bool | None ) = None, history_processors: ( Sequence[HistoryProcessor[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> None ``` ```python __init__( model: Model | KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, prepare_output_tools: ( ToolsPrepareFunc[AgentDepsT] | None ) = None, toolsets: ( Sequence[ AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT] ] | None ) = None, defer_model_check: bool = False, end_strategy: EndStrategy = "early", instrument: ( InstrumentationSettings | bool | None ) = None, history_processors: ( Sequence[HistoryProcessor[AgentDepsT]] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Any ) ``` Create an agent. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model` | `Model | KnownModelName | str | None` | The default model to use for this agent, if not provided, you must provide the model when calling it. We allow str here since the actual list of allowed models changes frequently. 
| `None` | | `output_type` | `OutputSpec[OutputDataT]` | The type of the output data, used to validate the data returned by the model, defaults to str. | `str` | | `instructions` | `Instructions[AgentDepsT]` | Instructions to use for this agent, you can also register instructions via a function with instructions. | `None` | | `system_prompt` | `str | Sequence[str]` | Static system prompts to use for this agent, you can also register system prompts via a function with system_prompt. | `()` | | `deps_type` | `type[AgentDepsT]` | The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set deps=None to satisfy Pyright or add a type hint : Agent\[None, \]. | `NoneType` | | `name` | `str | None` | The name of the agent, used for logging. If None, we try to infer the agent name from the call frame when the agent is first run. | `None` | | `model_settings` | `ModelSettings | None` | Optional model request settings to use for this agent's runs, by default. | `None` | | `retries` | `int` | The default number of retries to allow for tool calls and output validation, before raising an error. For model request retries, see the HTTP Request Retries documentation. | `1` | | `output_retries` | `int | None` | The maximum number of retries to allow for output validation, defaults to retries. | `None` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]]` | Tools to register with the agent, you can also register tools via the decorators @agent.tool and @agent.tool_plain. | `()` | | `builtin_tools` | `Sequence[AbstractBuiltinTool]` | The builtin tools that the agent will use. This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised. | `()` | | `prepare_tools` | `ToolsPrepareFunc[AgentDepsT] | None` | Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See ToolsPrepareFunc | `None` | | `prepare_output_tools` | `ToolsPrepareFunc[AgentDepsT] | None` | Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. See ToolsPrepareFunc | `None` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None` | Toolsets to register with the agent, including MCP servers and functions which take a run context and return a toolset. See ToolsetFunc for more information. | `None` | | `defer_model_check` | `bool` | by default, if you provide a named model, it's evaluated to create a Model instance immediately, which checks for the necessary environment variables. Set this to false to defer the evaluation until the first run. Useful if you want to override the model for testing. | `False` | | `end_strategy` | `EndStrategy` | Strategy for handling tool calls that are requested alongside a final result. See EndStrategy for more information. | `'early'` | | `instrument` | `InstrumentationSettings | bool | None` | Set to True to automatically instrument with OpenTelemetry, which will use Logfire if it's configured. Set to an instance of InstrumentationSettings to customize. 
If this isn't set, then the last value set by Agent.instrument_all() will be used, which defaults to False. See the Debugging and Monitoring guide for more info. | `None` | | `history_processors` | `Sequence[HistoryProcessor[AgentDepsT]] | None` | Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional handler for events from the model's streaming response and the agent's execution of tools. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python def __init__( self, model: models.Model | models.KnownModelName | str | None = None, *, output_type: OutputSpec[OutputDataT] = str, instructions: Instructions[AgentDepsT] = None, system_prompt: str | Sequence[str] = (), deps_type: type[AgentDepsT] = NoneType, name: str | None = None, model_settings: ModelSettings | None = None, retries: int = 1, output_retries: int | None = None, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = (), builtin_tools: Sequence[AbstractBuiltinTool] = (), prepare_tools: ToolsPrepareFunc[AgentDepsT] | None = None, prepare_output_tools: ToolsPrepareFunc[AgentDepsT] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT] | ToolsetFunc[AgentDepsT]] | None = None, defer_model_check: bool = False, end_strategy: EndStrategy = 'early', instrument: InstrumentationSettings | bool | None = None, history_processors: Sequence[HistoryProcessor[AgentDepsT]] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Any, ): """Create an agent. Args: model: The default model to use for this agent, if not provided, you must provide the model when calling it. We allow `str` here since the actual list of allowed models changes frequently. output_type: The type of the output data, used to validate the data returned by the model, defaults to `str`. instructions: Instructions to use for this agent, you can also register instructions via a function with [`instructions`][pydantic_ai.Agent.instructions]. system_prompt: Static system prompts to use for this agent, you can also register system prompts via a function with [`system_prompt`][pydantic_ai.Agent.system_prompt]. deps_type: The type used for dependency injection, this parameter exists solely to allow you to fully parameterize the agent, and therefore get the best out of static type checking. If you're not using deps, but want type checking to pass, you can set `deps=None` to satisfy Pyright or add a type hint `: Agent[None, ]`. name: The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. model_settings: Optional model request settings to use for this agent's runs, by default. retries: The default number of retries to allow for tool calls and output validation, before raising an error. For model request retries, see the [HTTP Request Retries](../retries.md) documentation. output_retries: The maximum number of retries to allow for output validation, defaults to `retries`. tools: Tools to register with the agent, you can also register tools via the decorators [`@agent.tool`][pydantic_ai.Agent.tool] and [`@agent.tool_plain`][pydantic_ai.Agent.tool_plain]. builtin_tools: The builtin tools that the agent will use. 
This depends on the model, as some models may not support certain tools. If the model doesn't support the builtin tools, an error will be raised. prepare_tools: Custom function to prepare the tool definition of all tools for each step, except output tools. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] prepare_output_tools: Custom function to prepare the tool definition of all output tools for each step. This is useful if you want to customize the definition of multiple output tools or you want to register a subset of output tools for a given step. See [`ToolsPrepareFunc`][pydantic_ai.tools.ToolsPrepareFunc] toolsets: Toolsets to register with the agent, including MCP servers and functions which take a run context and return a toolset. See [`ToolsetFunc`][pydantic_ai.toolsets.ToolsetFunc] for more information. defer_model_check: by default, if you provide a [named][pydantic_ai.models.KnownModelName] model, it's evaluated to create a [`Model`][pydantic_ai.models.Model] instance immediately, which checks for the necessary environment variables. Set this to `false` to defer the evaluation until the first run. Useful if you want to [override the model][pydantic_ai.Agent.override] for testing. end_strategy: Strategy for handling tool calls that are requested alongside a final result. See [`EndStrategy`][pydantic_ai.agent.EndStrategy] for more information. instrument: Set to True to automatically instrument with OpenTelemetry, which will use Logfire if it's configured. Set to an instance of [`InstrumentationSettings`][pydantic_ai.agent.InstrumentationSettings] to customize. If this isn't set, then the last value set by [`Agent.instrument_all()`][pydantic_ai.Agent.instrument_all] will be used, which defaults to False. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. history_processors: Optional list of callables to process the message history before sending it to the model. Each processor takes a list of messages and returns a modified list of messages. Processors can be sync or async and are applied in sequence. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools. 
""" if model is None or defer_model_check: self._model = model else: self._model = models.infer_model(model) self._name = name self.end_strategy = end_strategy self.model_settings = model_settings self._output_type = output_type self.instrument = instrument self._deps_type = deps_type if mcp_servers := _deprecated_kwargs.pop('mcp_servers', None): if toolsets is not None: # pragma: no cover raise TypeError('`mcp_servers` and `toolsets` cannot be set at the same time.') warnings.warn('`mcp_servers` is deprecated, use `toolsets` instead', DeprecationWarning) toolsets = mcp_servers _utils.validate_empty_kwargs(_deprecated_kwargs) default_output_mode = ( self.model.profile.default_structured_output_mode if isinstance(self.model, models.Model) else None ) self._output_schema = _output.OutputSchema[OutputDataT].build(output_type, default_mode=default_output_mode) self._output_validators = [] self._instructions = self._normalize_instructions(instructions) self._system_prompts = (system_prompt,) if isinstance(system_prompt, str) else tuple(system_prompt) self._system_prompt_functions = [] self._system_prompt_dynamic_functions = {} self._max_result_retries = output_retries if output_retries is not None else retries self._max_tool_retries = retries self._builtin_tools = builtin_tools self._prepare_tools = prepare_tools self._prepare_output_tools = prepare_output_tools self._output_toolset = self._output_schema.toolset if self._output_toolset: self._output_toolset.max_retries = self._max_result_retries self._function_toolset = _AgentFunctionToolset( tools, max_retries=self._max_tool_retries, output_schema=self._output_schema ) self._dynamic_toolsets = [ DynamicToolset[AgentDepsT](toolset_func=toolset) for toolset in toolsets or [] if not isinstance(toolset, AbstractToolset) ] self._user_toolsets = [toolset for toolset in toolsets or [] if isinstance(toolset, AbstractToolset)] self.history_processors = history_processors or [] self._event_stream_handler = event_stream_handler self._override_name: ContextVar[_utils.Option[str]] = ContextVar('_override_name', default=None) self._override_deps: ContextVar[_utils.Option[AgentDepsT]] = ContextVar('_override_deps', default=None) self._override_model: ContextVar[_utils.Option[models.Model]] = ContextVar('_override_model', default=None) self._override_toolsets: ContextVar[_utils.Option[Sequence[AbstractToolset[AgentDepsT]]]] = ContextVar( '_override_toolsets', default=None ) self._override_tools: ContextVar[ _utils.Option[Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]]] ] = ContextVar('_override_tools', default=None) self._override_instructions: ContextVar[ _utils.Option[list[str | _system_prompt.SystemPromptFunc[AgentDepsT]]] ] = ContextVar('_override_instructions', default=None) self._enter_lock = Lock() self._entered_count = 0 self._exit_stack = None ``` #### end_strategy ```python end_strategy: EndStrategy = end_strategy ``` Strategy for handling tool calls when a final result is found. #### model_settings ```python model_settings: ModelSettings | None = model_settings ``` Optional model request settings to use for this agents's runs, by default. Note, if `model_settings` is provided by `run`, `run_sync`, or `run_stream`, those settings will be merged with this value, with the runtime argument taking priority. #### instrument ```python instrument: InstrumentationSettings | bool | None = ( instrument ) ``` Options to automatically instrument with OpenTelemetry. 
#### instrument_all ```python instrument_all( instrument: InstrumentationSettings | bool = True, ) -> None ``` Set the instrumentation options for all agents where `instrument` is not set. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python @staticmethod def instrument_all(instrument: InstrumentationSettings | bool = True) -> None: """Set the instrumentation options for all agents where `instrument` is not set.""" Agent._instrument_default = instrument ``` #### model ```python model: Model | KnownModelName | str | None ``` The default model configured for this agent. #### name ```python name: str | None ``` The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. #### deps_type ```python deps_type: type ``` The type of dependencies used by the agent. #### output_type ```python output_type: OutputSpec[OutputDataT] ``` The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`. #### event_stream_handler ```python event_stream_handler: EventStreamHandler[AgentDepsT] | None ``` Optional handler for events from the model's streaming response and the agent's execution of tools. #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. 
This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. 
| Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" if infer_name and self.name is None: self._infer_name(inspect.currentframe()) model_used = self._get_model(model) del model deps = self._get_deps(deps) output_schema = self._prepare_output_schema(output_type, model_used.profile) output_type_ = output_type or self.output_type # We consider it a user error if a user tries to restrict the result type while having an output validator that # may change the result type from the restricted type to something else. Therefore, we consider the following # typecast reasonable, even though it is possible to violate it with otherwise-type-checked code. output_validators = cast(list[_output.OutputValidator[AgentDepsT, RunOutputDataT]], self._output_validators) output_toolset = self._output_toolset if output_schema != self._output_schema or output_validators: output_toolset = cast(OutputToolset[AgentDepsT], output_schema.toolset) if output_toolset: output_toolset.max_retries = self._max_result_retries output_toolset.output_validators = output_validators toolset = self._get_toolset(output_toolset=output_toolset, additional_toolsets=toolsets) tool_manager = ToolManager[AgentDepsT](toolset) # Build the graph graph: Graph[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any], FinalResult[Any]] = ( _agent_graph.build_agent_graph(self.name, self._deps_type, output_type_) ) # Build the initial state usage = usage or _usage.RunUsage() state = _agent_graph.GraphAgentState( message_history=list(message_history) if message_history else [], usage=usage, retries=0, run_step=0, ) # Merge model settings in order of precedence: run > agent > model merged_settings = merge_model_settings(model_used.settings, self.model_settings) model_settings = merge_model_settings(merged_settings, model_settings) usage_limits = usage_limits or _usage.UsageLimits() instructions_literal, instructions_functions = self._get_instructions() async def get_instructions(run_context: RunContext[AgentDepsT]) -> str | None: parts = [ instructions_literal, *[await func.run(run_context) for func in instructions_functions], ] model_profile = model_used.profile if isinstance(output_schema, _output.PromptedOutputSchema): instructions = output_schema.instructions(model_profile.prompted_output_template) parts.append(instructions) parts = [p for p in parts if p] if not parts: return None return '\n\n'.join(parts).strip() if isinstance(model_used, InstrumentedModel): instrumentation_settings = model_used.instrumentation_settings tracer = model_used.instrumentation_settings.tracer else: instrumentation_settings = None tracer = NoOpTracer() if builtin_tools: # Deduplicate builtin tools passed to the agent and the run based on type builtin_tools = list( { **({type(tool): tool for tool in self._builtin_tools or []}), **({type(tool): tool for tool in builtin_tools}), }.values() ) else: builtin_tools = list(self._builtin_tools) graph_deps = _agent_graph.GraphAgentDeps[AgentDepsT, RunOutputDataT]( user_deps=deps, prompt=user_prompt, new_message_index=len(message_history) if message_history else 0, model=model_used, model_settings=model_settings, usage_limits=usage_limits, max_result_retries=self._max_result_retries, end_strategy=self.end_strategy, output_schema=output_schema, output_validators=output_validators, history_processors=self.history_processors, builtin_tools=builtin_tools, tool_manager=tool_manager, tracer=tracer, get_instructions=get_instructions, instrumentation_settings=instrumentation_settings, ) start_node = _agent_graph.UserPromptNode[AgentDepsT]( user_prompt=user_prompt, 
deferred_tool_results=deferred_tool_results, instructions=instructions_literal, instructions_functions=instructions_functions, system_prompts=self._system_prompts, system_prompt_functions=self._system_prompt_functions, system_prompt_dynamic_functions=self._system_prompt_dynamic_functions, ) agent_name = self.name or 'agent' instrumentation_names = InstrumentationNames.for_version( instrumentation_settings.version if instrumentation_settings else DEFAULT_INSTRUMENTATION_VERSION ) run_span = tracer.start_span( instrumentation_names.get_agent_run_span_name(agent_name), attributes={ 'model_name': model_used.model_name if model_used else 'no-model', 'agent_name': agent_name, 'gen_ai.agent.name': agent_name, 'logfire.msg': f'{agent_name} run', }, ) try: async with toolset: async with graph.iter( start_node, state=state, deps=graph_deps, span=use_span(run_span) if run_span.is_recording() else None, infer_name=False, ) as graph_run: agent_run = AgentRun(graph_run) yield agent_run if (final_result := agent_run.result) is not None and run_span.is_recording(): if instrumentation_settings and instrumentation_settings.include_content: run_span.set_attribute( 'final_result', ( final_result.output if isinstance(final_result.output, str) else json.dumps(InstrumentedModel.serialize_any(final_result.output)) ), ) finally: try: if instrumentation_settings and run_span.is_recording(): run_span.set_attributes( self._run_span_end_attributes( instrumentation_settings, usage, state.message_history, graph_deps.new_message_index ) ) finally: run_span.end() ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | Unset` | The name to use instead of the name passed to the agent constructor and agent run. | `UNSET` | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. | `UNSET` | | `instructions` | `Instructions[AgentDepsT] | Unset` | The instructions to use instead of the instructions registered with the agent. 
| `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if _utils.is_set(name): name_token = self._override_name.set(_utils.Some(name)) else: name_token = None if _utils.is_set(deps): deps_token = self._override_deps.set(_utils.Some(deps)) else: deps_token = None if _utils.is_set(model): model_token = self._override_model.set(_utils.Some(models.infer_model(model))) else: model_token = None if _utils.is_set(toolsets): toolsets_token = self._override_toolsets.set(_utils.Some(toolsets)) else: toolsets_token = None if _utils.is_set(tools): tools_token = self._override_tools.set(_utils.Some(tools)) else: tools_token = None if _utils.is_set(instructions): normalized_instructions = self._normalize_instructions(instructions) instructions_token = self._override_instructions.set(_utils.Some(normalized_instructions)) else: instructions_token = None try: yield finally: if name_token is not None: self._override_name.reset(name_token) if deps_token is not None: self._override_deps.reset(deps_token) if model_token is not None: self._override_model.reset(model_token) if toolsets_token is not None: self._override_toolsets.reset(toolsets_token) if tools_token is not None: self._override_tools.reset(tools_token) if instructions_token is not None: self._override_instructions.reset(instructions_token) ``` #### instructions ```python instructions( func: Callable[[RunContext[AgentDepsT]], str], ) -> Callable[[RunContext[AgentDepsT]], str] ``` ```python instructions( func: Callable[ [RunContext[AgentDepsT]], Awaitable[str] ], ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str]] ``` ```python instructions(func: Callable[[], str]) -> Callable[[], str] ``` ```python instructions( func: Callable[[], Awaitable[str]], ) -> Callable[[], Awaitable[str]] ``` ```python instructions() -> Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] ``` ```python instructions( func: SystemPromptFunc[AgentDepsT] | None = None, ) -> ( Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] | SystemPromptFunc[AgentDepsT] ) ``` Decorator to register an instructions function. Optionally takes RunContext as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.instructions`). 
Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.instructions def simple_instructions() -> str: return 'foobar' @agent.instructions async def async_instructions(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def instructions( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register an instructions function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.instructions`). Overloads for every possible signature of `instructions` are included so the decorator doesn't obscure the type of the function. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.instructions def simple_instructions() -> str: return 'foobar' @agent.instructions async def async_instructions(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: self._instructions.append(func_) return func_ return decorator else: self._instructions.append(func) return func ```` #### system_prompt ```python system_prompt( func: Callable[[RunContext[AgentDepsT]], str], ) -> Callable[[RunContext[AgentDepsT]], str] ``` ```python system_prompt( func: Callable[ [RunContext[AgentDepsT]], Awaitable[str] ], ) -> Callable[[RunContext[AgentDepsT]], Awaitable[str]] ``` ```python system_prompt(func: Callable[[], str]) -> Callable[[], str] ``` ```python system_prompt( func: Callable[[], Awaitable[str]], ) -> Callable[[], Awaitable[str]] ``` ```python system_prompt(*, dynamic: bool = False) -> Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] ``` ```python system_prompt( func: SystemPromptFunc[AgentDepsT] | None = None, /, *, dynamic: bool = False, ) -> ( Callable[ [SystemPromptFunc[AgentDepsT]], SystemPromptFunc[AgentDepsT], ] | SystemPromptFunc[AgentDepsT] ) ``` Decorator to register a system prompt function. Optionally takes RunContext as its only argument. Can decorate a sync or async functions. The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`), see the examples below. Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. 
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `SystemPromptFunc[AgentDepsT] | None` | The function to decorate | `None` | | `dynamic` | `bool` | If True, the system prompt will be reevaluated even when messages_history is provided, see SystemPromptPart.dynamic_ref | `False` | Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.system_prompt def simple_system_prompt() -> str: return 'foobar' @agent.system_prompt(dynamic=True) async def async_system_prompt(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def system_prompt( self, func: _system_prompt.SystemPromptFunc[AgentDepsT] | None = None, /, *, dynamic: bool = False, ) -> ( Callable[[_system_prompt.SystemPromptFunc[AgentDepsT]], _system_prompt.SystemPromptFunc[AgentDepsT]] | _system_prompt.SystemPromptFunc[AgentDepsT] ): """Decorator to register a system prompt function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used either bare (`agent.system_prompt`) or as a function call (`agent.system_prompt(...)`), see the examples below. Overloads for every possible signature of `system_prompt` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Args: func: The function to decorate dynamic: If True, the system prompt will be reevaluated even when `messages_history` is provided, see [`SystemPromptPart.dynamic_ref`][pydantic_ai.messages.SystemPromptPart.dynamic_ref] Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=str) @agent.system_prompt def simple_system_prompt() -> str: return 'foobar' @agent.system_prompt(dynamic=True) async def async_system_prompt(ctx: RunContext[str]) -> str: return f'{ctx.deps} is the best' ``` """ if func is None: def decorator( func_: _system_prompt.SystemPromptFunc[AgentDepsT], ) -> _system_prompt.SystemPromptFunc[AgentDepsT]: runner = _system_prompt.SystemPromptRunner[AgentDepsT](func_, dynamic=dynamic) self._system_prompt_functions.append(runner) if dynamic: # pragma: lax no cover self._system_prompt_dynamic_functions[func_.__qualname__] = runner return func_ return decorator else: assert not dynamic, "dynamic can't be True in this case" self._system_prompt_functions.append(_system_prompt.SystemPromptRunner[AgentDepsT](func, dynamic=dynamic)) return func ```` #### output_validator ```python output_validator( func: Callable[ [RunContext[AgentDepsT], OutputDataT], OutputDataT ], ) -> Callable[ [RunContext[AgentDepsT], OutputDataT], OutputDataT ] ``` ```python output_validator( func: Callable[ [RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT], ], ) -> Callable[ [RunContext[AgentDepsT], OutputDataT], Awaitable[OutputDataT], ] ``` ```python output_validator( func: Callable[[OutputDataT], OutputDataT], ) -> Callable[[OutputDataT], OutputDataT] ``` ```python output_validator( func: Callable[[OutputDataT], Awaitable[OutputDataT]], ) -> Callable[[OutputDataT], Awaitable[OutputDataT]] ``` ```python output_validator( func: OutputValidatorFunc[AgentDepsT, OutputDataT], ) -> OutputValidatorFunc[AgentDepsT, OutputDataT] ``` Decorator to register an output validator function. Optionally takes RunContext as its first argument. Can decorate a sync or async functions. 
Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Example: ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('test', deps_type=str) @agent.output_validator def output_validator_simple(data: str) -> str: if 'wrong' in data: raise ModelRetry('wrong response') return data @agent.output_validator async def output_validator_deps(ctx: RunContext[str], data: str) -> str: if ctx.deps in data: raise ModelRetry('wrong response') return data result = agent.run_sync('foobar', deps='spam') print(result.output) #> success (no tool calls) ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def output_validator( self, func: _output.OutputValidatorFunc[AgentDepsT, OutputDataT], / ) -> _output.OutputValidatorFunc[AgentDepsT, OutputDataT]: """Decorator to register an output validator function. Optionally takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. Overloads for every possible signature of `output_validator` are included so the decorator doesn't obscure the type of the function, see `tests/typed_agent.py` for tests. Example: ```python from pydantic_ai import Agent, ModelRetry, RunContext agent = Agent('test', deps_type=str) @agent.output_validator def output_validator_simple(data: str) -> str: if 'wrong' in data: raise ModelRetry('wrong response') return data @agent.output_validator async def output_validator_deps(ctx: RunContext[str], data: str) -> str: if ctx.deps in data: raise ModelRetry('wrong response') return data result = agent.run_sync('foobar', deps='spam') print(result.output) #> success (no tool calls) ``` """ self._output_validators.append(_output.OutputValidator[AgentDepsT, Any](func)) return func ```` #### tool ```python tool( func: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams] ``` ```python tool( *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None ) -> Callable[ [ToolFuncContext[AgentDepsT, ToolParams]], ToolFuncContext[AgentDepsT, ToolParams], ] ``` ```python tool( func: ( ToolFuncContext[AgentDepsT, ToolParams] | None ) = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, ) -> Any ``` Decorator to register a tool function which takes RunContext as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../../tools/#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. 
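In addition to the example that follows, parameter descriptions can come straight from the tool function's docstring; combined with `require_parameter_descriptions=True`, a missing description raises an error. A minimal sketch (the `weather` tool and its `str` deps type are illustrative, not taken from the API above):

```python
from pydantic_ai import Agent, RunContext

agent = Agent('test', deps_type=str)


@agent.tool(require_parameter_descriptions=True)
def weather(ctx: RunContext[str], city: str, unit: str = 'celsius') -> str:
    """Look up the current weather for a city.

    Args:
        city: Name of the city to look up.
        unit: Temperature unit, either 'celsius' or 'fahrenheit'.
    """
    return f'{ctx.deps}: weather for {city} in {unit}'
```

With the default `docstring_format='auto'`, the `Args:` descriptions are extracted into the tool's JSON schema, so the model sees a description for each parameter.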
Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=int) @agent.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @agent.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `ToolFuncContext[AgentDepsT, ToolParams] | None` | The tool function to register. | `None` | | `name` | `str | None` | The name of the tool, defaults to the function name. | `None` | | `description` | `str | None` | The description of the tool, defaults to the function docstring. | `None` | | `retries` | `int | None` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. | `None` | | `prepare` | `ToolPrepareFunc[AgentDepsT] | None` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | `None` | | `docstring_format` | `DocstringFormat` | The format of the docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use for this tool. Defaults to GenerateToolJsonSchema. | `GenerateToolJsonSchema` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | `None` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. | `False` | | `metadata` | `dict[str, Any] | None` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def tool( self, func: ToolFuncContext[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. 
Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test', deps_type=int) @agent.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @agent.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. 
""" def tool_decorator( func_: ToolFuncContext[AgentDepsT, ToolParams], ) -> ToolFuncContext[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, takes_ctx=True, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, ) return func_ return tool_decorator if func is None else tool_decorator(func) ```` #### tool_plain ```python tool_plain( func: ToolFuncPlain[ToolParams], ) -> ToolFuncPlain[ToolParams] ``` ```python tool_plain( *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None ) -> Callable[ [ToolFuncPlain[ToolParams]], ToolFuncPlain[ToolParams] ] ``` ```python tool_plain( func: ToolFuncPlain[ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, ) -> Any ``` Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../../tools/#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test') @agent.tool def foobar(ctx: RunContext[int]) -> int: return 123 @agent.tool(retries=2) async def spam(ctx: RunContext[str]) -> float: return 3.14 result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":123,"spam":3.14} ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `ToolFuncPlain[ToolParams] | None` | The tool function to register. | `None` | | `name` | `str | None` | The name of the tool, defaults to the function name. | `None` | | `description` | `str | None` | The description of the tool, defaults to the function docstring. | `None` | | `retries` | `int | None` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. | `None` | | `prepare` | `ToolPrepareFunc[AgentDepsT] | None` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | `None` | | `docstring_format` | `DocstringFormat` | The format of the docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. 
| `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use for this tool. Defaults to GenerateToolJsonSchema. | `GenerateToolJsonSchema` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | `None` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. | `False` | | `metadata` | `dict[str, Any] | None` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def tool_plain( self, func: ToolFuncPlain[ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, ) -> Any: """Decorator to register a tool function which DOES NOT take `RunContext` as an argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@agent.tool` is obscured. Example: ```python from pydantic_ai import Agent, RunContext agent = Agent('test') @agent.tool def foobar(ctx: RunContext[int]) -> int: return 123 @agent.tool(retries=2) async def spam(ctx: RunContext[str]) -> float: return 3.14 result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":123,"spam":3.14} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. 
requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. """ def tool_decorator(func_: ToolFuncPlain[ToolParams]) -> ToolFuncPlain[ToolParams]: # noinspection PyTypeChecker self._function_toolset.add_function( func_, takes_ctx=False, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, ) return func_ return tool_decorator if func is None else tool_decorator(func) ```` #### toolset ```python toolset( func: ToolsetFunc[AgentDepsT], ) -> ToolsetFunc[AgentDepsT] ``` ```python toolset( *, per_run_step: bool = True ) -> Callable[ [ToolsetFunc[AgentDepsT]], ToolsetFunc[AgentDepsT] ] ``` ```python toolset( func: ToolsetFunc[AgentDepsT] | None = None, /, *, per_run_step: bool = True, ) -> Any ``` Decorator to register a toolset function which takes RunContext as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.toolset`). Example: ```python from pydantic_ai import AbstractToolset, Agent, FunctionToolset, RunContext agent = Agent('test', deps_type=str) @agent.toolset async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]: return FunctionToolset() ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `ToolsetFunc[AgentDepsT] | None` | The toolset function to register. | `None` | | `per_run_step` | `bool` | Whether to re-evaluate the toolset for each run step. Defaults to True. | `True` | Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ````python def toolset( self, func: ToolsetFunc[AgentDepsT] | None = None, /, *, per_run_step: bool = True, ) -> Any: """Decorator to register a toolset function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its only argument. Can decorate a sync or async functions. The decorator can be used bare (`agent.toolset`). Example: ```python from pydantic_ai import AbstractToolset, Agent, FunctionToolset, RunContext agent = Agent('test', deps_type=str) @agent.toolset async def simple_toolset(ctx: RunContext[str]) -> AbstractToolset[str]: return FunctionToolset() ``` Args: func: The toolset function to register. per_run_step: Whether to re-evaluate the toolset for each run step. Defaults to True. """ def toolset_decorator(func_: ToolsetFunc[AgentDepsT]) -> ToolsetFunc[AgentDepsT]: self._dynamic_toolsets.append(DynamicToolset(func_, per_run_step=per_run_step)) return func_ return toolset_decorator if func is None else toolset_decorator(func) ```` #### toolsets ```python toolsets: Sequence[AbstractToolset[AgentDepsT]] ``` All toolsets registered on the agent, including a function toolset holding tools that were registered on the agent directly. Output tools are not included. #### __aenter__ ```python __aenter__() -> Self ``` Enter the agent context. This will start all MCPServerStdios registered as `toolsets` so they are ready to be used. This is a no-op if the agent has already been entered. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python async def __aenter__(self) -> Self: """Enter the agent context. 
This will start all [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] registered as `toolsets` so they are ready to be used. This is a no-op if the agent has already been entered. """ async with self._enter_lock: if self._entered_count == 0: async with AsyncExitStack() as exit_stack: toolset = self._get_toolset() await exit_stack.enter_async_context(toolset) self._exit_stack = exit_stack.pop_all() self._entered_count += 1 return self ``` #### set_mcp_sampling_model ```python set_mcp_sampling_model( model: Model | KnownModelName | str | None = None, ) -> None ``` Set the sampling model on all MCP servers registered with the agent. If no sampling model is provided, the agent's model will be used. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python def set_mcp_sampling_model(self, model: models.Model | models.KnownModelName | str | None = None) -> None: """Set the sampling model on all MCP servers registered with the agent. If no sampling model is provided, the agent's model will be used. """ try: sampling_model = models.infer_model(model) if model else self._get_model(None) except exceptions.UserError as e: raise exceptions.UserError('No sampling model provided and no model set on the agent.') from e from ..mcp import MCPServer def _set_sampling_model(toolset: AbstractToolset[AgentDepsT]) -> None: if isinstance(toolset, MCPServer): toolset.sampling_model = sampling_model self._get_toolset().apply(_set_sampling_model) ``` #### run_mcp_servers ```python run_mcp_servers( model: Model | KnownModelName | str | None = None, ) -> AsyncIterator[None] ``` Deprecated `run_mcp_servers` is deprecated, use `async with agent:` instead. If you need to set a sampling model on all MCP servers, use `agent.set_mcp_sampling_model()`. Run MCPServerStdios so they can be used by the agent. Deprecated: use async with agent instead. If you need to set a sampling model on all MCP servers, use agent.set_mcp_sampling_model(). Returns: a context manager to start and shutdown the servers. Source code in `pydantic_ai_slim/pydantic_ai/agent/__init__.py` ```python @asynccontextmanager @deprecated( '`run_mcp_servers` is deprecated, use `async with agent:` instead. If you need to set a sampling model on all MCP servers, use `agent.set_mcp_sampling_model()`.' ) async def run_mcp_servers( self, model: models.Model | models.KnownModelName | str | None = None ) -> AsyncIterator[None]: """Run [`MCPServerStdio`s][pydantic_ai.mcp.MCPServerStdio] so they can be used by the agent. Deprecated: use [`async with agent`][pydantic_ai.agent.Agent.__aenter__] instead. If you need to set a sampling model on all MCP servers, use [`agent.set_mcp_sampling_model()`][pydantic_ai.agent.Agent.set_mcp_sampling_model]. Returns: a context manager to start and shutdown the servers. """ try: self.set_mcp_sampling_model(model) except exceptions.UserError: if model is not None: raise async with self: yield ``` ### AbstractAgent Bases: `Generic[AgentDepsT, OutputDataT]`, `ABC` Abstract superclass for Agent, WrapperAgent, and your own custom agent implementations. 
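Since `Agent`, `WrapperAgent`, and any custom implementations share this interface, application code can be written against `AbstractAgent` and stay agnostic about the concrete agent it receives. A minimal sketch (the `answer` helper is illustrative, not part of the API documented here):

```python
from pydantic_ai import Agent
from pydantic_ai.agent import AbstractAgent


async def answer(agent: AbstractAgent[None, str], question: str) -> str:
    # Accepts Agent, WrapperAgent subclasses, or any other AbstractAgent
    # implementation, since `run()` is part of the abstract interface below.
    result = await agent.run(question)
    return result.output


agent = Agent('openai:gpt-4o', instructions='Be concise.')
# From async code: output = await answer(agent, 'What is the capital of France?')
```

Typing helpers against `AbstractAgent[None, str]` (dependencies type `None`, output type `str`) rather than `Agent` keeps them usable with wrapper-based agents as well.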
Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python class AbstractAgent(Generic[AgentDepsT, OutputDataT], ABC): """Abstract superclass for [`Agent`][pydantic_ai.agent.Agent], [`WrapperAgent`][pydantic_ai.agent.WrapperAgent], and your own custom agent implementations.""" @property @abstractmethod def model(self) -> models.Model | models.KnownModelName | str | None: """The default model configured for this agent.""" raise NotImplementedError @property @abstractmethod def name(self) -> str | None: """The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. """ raise NotImplementedError @name.setter @abstractmethod def name(self, value: str | None) -> None: """Set the name of the agent, used for logging.""" raise NotImplementedError @property @abstractmethod def deps_type(self) -> type: """The type of dependencies used by the agent.""" raise NotImplementedError @property @abstractmethod def output_type(self) -> OutputSpec[OutputDataT]: """The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`.""" raise NotImplementedError @property @abstractmethod def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: """Optional handler for events from the model's streaming response and the agent's execution of tools.""" raise NotImplementedError @property @abstractmethod def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: """All toolsets registered on the agent. Output tools are not included. """ raise NotImplementedError @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... 
async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" if infer_name and self.name is None: self._infer_name(inspect.currentframe()) event_stream_handler = event_stream_handler or self.event_stream_handler async with self.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, toolsets=toolsets, builtin_tools=builtin_tools, ) as agent_run: async for node in agent_run: if event_stream_handler is not None and ( self.is_model_request_node(node) or self.is_call_tools_node(node) ): async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) assert agent_run.result is not None, 'The graph run did not finish properly' return agent_run.result @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. 
``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) return get_event_loop().run_until_complete( self.run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=False, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, ) ) @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[result.StreamedRunResult[AgentDepsT, OutputDataT]]: ... @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[result.StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... 
@asynccontextmanager async def run_stream( # noqa C901 self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AsyncIterator[result.StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async streaming mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`][pydantic_ai.agent.AbstractAgent.run] with an `event_stream_handler` or [`agent.iter()`][pydantic_ai.agent.AbstractAgent.iter] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found. Returns: The result of the run. 
""" if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) event_stream_handler = event_stream_handler or self.event_stream_handler yielded = False async with self.iter( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=False, toolsets=toolsets, builtin_tools=builtin_tools, ) as agent_run: first_node = agent_run.next_node # start with the first node assert isinstance(first_node, _agent_graph.UserPromptNode) # the first node should be a user prompt node node = first_node while True: graph_ctx = agent_run.ctx if self.is_model_request_node(node): async with node.stream(graph_ctx) as stream: final_result_event = None async def stream_to_final( stream: AgentStream, ) -> AsyncIterator[_messages.ModelResponseStreamEvent]: nonlocal final_result_event async for event in stream: yield event if isinstance(event, _messages.FinalResultEvent): final_result_event = event break if event_stream_handler is not None: await event_stream_handler( _agent_graph.build_run_context(graph_ctx), stream_to_final(stream) ) else: async for _ in stream_to_final(stream): pass if final_result_event is not None: final_result = FinalResult( None, final_result_event.tool_name, final_result_event.tool_call_id ) if yielded: raise exceptions.AgentRunError('Agent run produced final results') # pragma: no cover yielded = True messages = graph_ctx.state.message_history.copy() async def on_complete() -> None: """Called when the stream has completed. The model response will have been added to messages by now by `StreamedRunResult._marked_completed`. """ nonlocal final_result final_result = FinalResult( await stream.get_output(), final_result.tool_name, final_result.tool_call_id ) parts: list[_messages.ModelRequestPart] = [] async for _event in _agent_graph.process_tool_calls( tool_manager=graph_ctx.deps.tool_manager, tool_calls=stream.response.tool_calls, tool_call_results=None, final_result=final_result, ctx=graph_ctx, output_parts=parts, ): pass if parts: messages.append(_messages.ModelRequest(parts)) yield StreamedRunResult( messages, graph_ctx.deps.new_message_index, stream, on_complete, ) break elif self.is_call_tools_node(node) and event_stream_handler is not None: async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) next_node = await agent_run.next(node) if isinstance(next_node, End) and agent_run.result is not None: # A final output could have been produced by the CallToolsNode rather than the ModelRequestNode, # if a tool function raised CallDeferred or ApprovalRequired. # In this case there's no response to stream, but we still let the user access the output etc as normal. 
yield StreamedRunResult( graph_ctx.state.message_history, graph_ctx.deps.new_message_index, run_result=agent_run.result, ) yielded = True break if not isinstance(next_node, _agent_graph.AgentNode): raise exceptions.AgentRunError( # pragma: no cover 'Should have produced a StreamedRunResult before getting here' ) node = cast(_agent_graph.AgentNode[Any, Any], next_node) if not yielded: raise exceptions.AgentRunError('Agent run finished without producing a final result') # pragma: no cover @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]]: ... @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]]: ... def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. 
') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ # unfortunately this hack of returning a generator rather than defining it right here is # required to allow overloads of this method to work in python's typing system, or at least with pyright # or at least I couldn't make it work without return self._run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) async def _run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: send_stream, receive_stream = anyio.create_memory_object_stream[ _messages.AgentStreamEvent | AgentRunResultEvent[Any] ]() async def event_stream_handler( _: RunContext[AgentDepsT], events: AsyncIterable[_messages.AgentStreamEvent] ) -> None: async for event in events: await send_stream.send(event) async def run_agent() -> AgentRunResult[Any]: async with send_stream: return await self.run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, ) task = asyncio.create_task(run_agent()) async with receive_stream: async for message in receive_stream: yield message result = await task yield AgentRunResultEvent(result) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, 
output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager @abstractmethod async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ raise NotImplementedError yield @contextmanager @abstractmethod def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ raise NotImplementedError yield def _infer_name(self, function_frame: FrameType | None) -> None: """Infer the agent name from the call frame. RunUsage should be `self._infer_name(inspect.currentframe())`. 
""" assert self.name is None, 'Name already set' if function_frame is not None: # pragma: no branch if parent_frame := function_frame.f_back: # pragma: no branch for name, item in parent_frame.f_locals.items(): if item is self: self.name = name return if parent_frame.f_locals != parent_frame.f_globals: # pragma: no branch # if we couldn't find the agent in locals and globals are a different dict, try globals for name, item in parent_frame.f_globals.items(): if item is self: self.name = name return @staticmethod @contextmanager def sequential_tool_calls() -> Iterator[None]: """Run tool calls sequentially during the context.""" with ToolManager.sequential_tool_calls(): yield @staticmethod def is_model_request_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.ModelRequestNode[T, S]]: """Check if the node is a `ModelRequestNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.ModelRequestNode) @staticmethod def is_call_tools_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.CallToolsNode[T, S]]: """Check if the node is a `CallToolsNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.CallToolsNode) @staticmethod def is_user_prompt_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.UserPromptNode[T, S]]: """Check if the node is a `UserPromptNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.UserPromptNode) @staticmethod def is_end_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[End[result.FinalResult[S]]]: """Check if the node is a `End`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, End) @abstractmethod async def __aenter__(self) -> AbstractAgent[AgentDepsT, OutputDataT]: raise NotImplementedError @abstractmethod async def __aexit__(self, *args: Any) -> bool | None: raise NotImplementedError def to_ag_ui( self, *, # Agent.iter parameters output_type: OutputSpec[OutputDataT] | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, # Starlette debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> AGUIApp[AgentDepsT, OutputDataT]: """Returns an ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ag_ui.StateHandler] protocol. 
To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`][pydantic_ai.ag_ui.run_ag_ui] or [`pydantic_ai.ag_ui.handle_ag_ui_request`][pydantic_ai.ag_ui.handle_ag_ui_request] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_ag_ui() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` See [AG-UI docs](../ag-ui.md) for more information. Args: output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. Returns: An ASGI application for running Pydantic AI agents with AG-UI protocol support. 
""" from ..ag_ui import AGUIApp return AGUIApp( agent=self, # Agent.iter parameters output_type=output_type, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, # Starlette debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) def to_a2a( self, *, storage: Storage | None = None, broker: Broker | None = None, # Agent card name: str | None = None, url: str = 'http://localhost:8000', version: str = '1.0.0', description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, # Starlette debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: dict[Any, ExceptionHandler] | None = None, lifespan: Lifespan[FastA2A] | None = None, ) -> FastA2A: """Convert the agent to a FastA2A application. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_a2a() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` """ from .._a2a import agent_to_a2a return agent_to_a2a( self, storage=storage, broker=broker, name=name, url=url, version=version, description=description, provider=provider, skills=skills, debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, lifespan=lifespan, ) async def to_cli( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: Sequence[_messages.ModelMessage] | None = None, ) -> None: """Run the agent in a CLI chat interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. Example: ```python {title="agent_to_cli.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` """ from rich.console import Console from pydantic_ai._cli import run_chat await run_chat( stream=True, agent=self, deps=deps, console=Console(), code_theme='monokai', prog_name=prog_name, message_history=message_history, ) def to_cli_sync( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: Sequence[_messages.ModelMessage] | None = None, ) -> None: """Run the agent in a CLI chat interface with the non-async interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. ```python {title="agent_to_cli_sync.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') agent.to_cli_sync() agent.to_cli_sync(prog_name='assistant') ``` """ return get_event_loop().run_until_complete( self.to_cli(deps=deps, prog_name=prog_name, message_history=message_history) ) ```` #### model ```python model: Model | KnownModelName | str | None ``` The default model configured for this agent. #### name ```python name: str | None ``` The name of the agent, used for logging. If `None`, we try to infer the agent name from the call frame when the agent is first run. 
#### deps_type ```python deps_type: type ``` The type of dependencies used by the agent. #### output_type ```python output_type: OutputSpec[OutputDataT] ``` The type of data output by agent runs, used to validate the data returned by the model, defaults to `str`. #### event_stream_handler ```python event_stream_handler: EventStreamHandler[AgentDepsT] | None ``` Optional handler for events from the model's streaming response and the agent's execution of tools. #### toolsets ```python toolsets: Sequence[AbstractToolset[AgentDepsT]] ``` All toolsets registered on the agent. Output tools are not included. #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[Any] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. 
| `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. 
model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) event_stream_handler = event_stream_handler or self.event_stream_handler async with self.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, toolsets=toolsets, builtin_tools=builtin_tools, ) as agent_run: async for node in agent_run: if event_stream_handler is not None and ( self.is_model_request_node(node) or self.is_call_tools_node(node) ): async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) assert agent_run.result is not None, 'The graph run did not finish properly' return agent_run.result ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[Any] ``` 
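The three overloads above differ only in `output_type`: when it's omitted the result is typed with the agent's own `OutputDataT`, and when a per-run `output_type` is passed the result is typed accordingly. A minimal sketch of the per-run override, using `TestModel` (which generates data matching the output schema, so no API key is needed) and an illustrative `CityInfo` model that is not part of Pydantic AI:

```python
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel


class CityInfo(BaseModel):
    """Illustrative output model, not part of Pydantic AI."""

    city: str
    country: str


agent = Agent(TestModel())  # output_type defaults to str

# Default output type: the output is plain text.
result = agent.run_sync('What is the capital of Italy?')
print(type(result.output))
#> <class 'str'>

# Per-run override: the output is validated into a CityInfo instance.
result = agent.run_sync('What is the capital of Italy?', output_type=CityInfo)
print(isinstance(result.output, CityInfo))
#> True
```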
Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) return get_event_loop().run_until_complete( self.run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=False, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, ) ) ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( 
DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async streaming mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use agent.run() with an `event_stream_handler` or agent.iter() instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. 
It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does not receive any events after the final result is found. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python @asynccontextmanager async def run_stream( # noqa C901 self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AsyncIterator[result.StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async streaming mode. This method builds an internal agent graph (using system prompts, tools and output schemas) and then runs the graph until the model produces output matching the `output_type`, for example text or structured data. At this point, a streaming run result object is yielded from which you can stream the output as it comes in, and -- once this output has completed streaming -- get the complete output, message history, and usage. As this method will consider the first output matching the `output_type` to be the final output, it will stop running the agent graph and will not execute any tool calls made by the model after this "final" output. If you want to always run the agent graph to completion and stream events and output at the same time, use [`agent.run()`][pydantic_ai.agent.AbstractAgent.run] with an `event_stream_handler` or [`agent.iter()`][pydantic_ai.agent.AbstractAgent.iter] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. 
event_stream_handler: Optional handler for events from the model's streaming response and the agent's execution of tools to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Note that it does _not_ receive any events after the final result is found. Returns: The result of the run. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) event_stream_handler = event_stream_handler or self.event_stream_handler yielded = False async with self.iter( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=False, toolsets=toolsets, builtin_tools=builtin_tools, ) as agent_run: first_node = agent_run.next_node # start with the first node assert isinstance(first_node, _agent_graph.UserPromptNode) # the first node should be a user prompt node node = first_node while True: graph_ctx = agent_run.ctx if self.is_model_request_node(node): async with node.stream(graph_ctx) as stream: final_result_event = None async def stream_to_final( stream: AgentStream, ) -> AsyncIterator[_messages.ModelResponseStreamEvent]: nonlocal final_result_event async for event in stream: yield event if isinstance(event, _messages.FinalResultEvent): final_result_event = event break if event_stream_handler is not None: await event_stream_handler( _agent_graph.build_run_context(graph_ctx), stream_to_final(stream) ) else: async for _ in stream_to_final(stream): pass if final_result_event is not None: final_result = FinalResult( None, final_result_event.tool_name, final_result_event.tool_call_id ) if yielded: raise exceptions.AgentRunError('Agent run produced final results') # pragma: no cover yielded = True messages = graph_ctx.state.message_history.copy() async def on_complete() -> None: """Called when the stream has completed. The model response will have been added to messages by now by `StreamedRunResult._marked_completed`. """ nonlocal final_result final_result = FinalResult( await stream.get_output(), final_result.tool_name, final_result.tool_call_id ) parts: list[_messages.ModelRequestPart] = [] async for _event in _agent_graph.process_tool_calls( tool_manager=graph_ctx.deps.tool_manager, tool_calls=stream.response.tool_calls, tool_call_results=None, final_result=final_result, ctx=graph_ctx, output_parts=parts, ): pass if parts: messages.append(_messages.ModelRequest(parts)) yield StreamedRunResult( messages, graph_ctx.deps.new_message_index, stream, on_complete, ) break elif self.is_call_tools_node(node) and event_stream_handler is not None: async with node.stream(agent_run.ctx) as stream: await event_stream_handler(_agent_graph.build_run_context(agent_run.ctx), stream) next_node = await agent_run.next(node) if isinstance(next_node, End) and agent_run.result is not None: # A final output could have been produced by the CallToolsNode rather than the ModelRequestNode, # if a tool function raised CallDeferred or ApprovalRequired. # In this case there's no response to stream, but we still let the user access the output etc as normal. 
yield StreamedRunResult( graph_ctx.state.message_history, graph_ctx.deps.new_message_index, run_result=agent_run.result, ) yielded = True break if not isinstance(next_node, _agent_graph.AgentNode): raise exceptions.AgentRunError( # pragma: no cover 'Should have produced a StreamedRunResult before getting here' ) node = cast(_agent_graph.AgentNode[Any, Any], next_node) if not yielded: raise exceptions.AgentRunError('Agent run finished without producing a final result') # pragma: no cover ```` #### run_stream_events ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[OutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[RunOutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[Any] ] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps self.run and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for self.run, except that `event_stream_handler` is not allowed, as it's used internally by this method.
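If you want the same events but with your own handling logic, roughly the same behaviour can be reproduced by passing an `event_stream_handler` to `run()` directly, which is essentially what this convenience method does under the hood. A minimal sketch, collecting events into a list instead of yielding them; `TestModel` and the `collect_events` helper are illustrative, not part of this API:

```python
from collections.abc import AsyncIterable

from pydantic_ai import Agent, AgentStreamEvent, RunContext
from pydantic_ai.models.test import TestModel

agent = Agent(TestModel())

collected: list[AgentStreamEvent] = []


async def collect_events(ctx: RunContext[None], events: AsyncIterable[AgentStreamEvent]) -> None:
    # Receives events from the model's streaming response and from the execution of tools.
    async for event in events:
        collected.append(event)


async def main():
    result = await agent.run(
        'What is the capital of France?', event_stream_handler=collect_events
    )
    print(result.output)
    print(len(collected) > 0)
    #> True
```

Compared to this sketch, `run_stream_events` additionally yields a final `AgentRunResultEvent` carrying the same `AgentRunResult` that `run()` returns here.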
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentStreamEvent | AgentRunResultEvent[Any]]` | An async iterable of stream events AgentStreamEvent and finally an AgentRunResultEvent with the final run result. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. 
')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ # unfortunately this hack of returning a generator rather than defining it right here is # required to allow overloads of this method to work in python's typing system, or at least with pyright # or at least I couldn't make it work without return self._run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = 
None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. 
| `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python @asynccontextmanager @abstractmethod async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. 
infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ raise NotImplementedError yield ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | Unset` | The name to use instead of the name passed to the agent constructor and agent run. | `UNSET` | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. | `UNSET` | | `instructions` | `Instructions[AgentDepsT] | Unset` | The instructions to use instead of the instructions registered with the agent. | `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @contextmanager @abstractmethod def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ raise NotImplementedError yield ``` #### sequential_tool_calls ```python sequential_tool_calls() -> Iterator[None] ``` Run tool calls sequentially during the context. 
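For illustration, here is a minimal sketch of using this context manager around a run (the model name and prompt are placeholders, not part of this API): any tool calls triggered while the context is active are executed one at a time instead of concurrently.

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')


async def main():
    # While this context is active, tool calls made during the run are
    # executed sequentially rather than concurrently.
    with agent.sequential_tool_calls():
        result = await agent.run('What is the weather in Paris and in London?')
        print(result.output)
```

This can be useful when tools share state that is not safe to mutate concurrently.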
Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod @contextmanager def sequential_tool_calls() -> Iterator[None]: """Run tool calls sequentially during the context.""" with ToolManager.sequential_tool_calls(): yield ``` #### is_model_request_node ```python is_model_request_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[ModelRequestNode[T, S]] ``` Check if the node is a `ModelRequestNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_model_request_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.ModelRequestNode[T, S]]: """Check if the node is a `ModelRequestNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.ModelRequestNode) ``` #### is_call_tools_node ```python is_call_tools_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[CallToolsNode[T, S]] ``` Check if the node is a `CallToolsNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_call_tools_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.CallToolsNode[T, S]]: """Check if the node is a `CallToolsNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.CallToolsNode) ``` #### is_user_prompt_node ```python is_user_prompt_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[UserPromptNode[T, S]] ``` Check if the node is a `UserPromptNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_user_prompt_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[_agent_graph.UserPromptNode[T, S]]: """Check if the node is a `UserPromptNode`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. """ return isinstance(node, _agent_graph.UserPromptNode) ``` #### is_end_node ```python is_end_node( node: AgentNode[T, S] | End[FinalResult[S]], ) -> TypeIs[End[FinalResult[S]]] ``` Check if the node is a `End`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ```python @staticmethod def is_end_node( node: _agent_graph.AgentNode[T, S] | End[result.FinalResult[S]], ) -> TypeIs[End[result.FinalResult[S]]]: """Check if the node is a `End`, narrowing the type if it is. This method preserves the generic parameters while narrowing the type, unlike a direct call to `isinstance`. 
""" return isinstance(node, End) ``` #### to_ag_ui ```python to_ag_ui( *, output_type: OutputSpec[OutputDataT] | None = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: ( Mapping[Any, ExceptionHandler] | None ) = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: ( Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None ) = None ) -> AGUIApp[AgentDepsT, OutputDataT] ``` Returns an ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the StateHandler protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use pydantic_ai.ag_ui.run_ag_ui or pydantic_ai.ag_ui.handle_ag_ui_request instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_ag_ui() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` See [AG-UI docs](../../ag-ui/) for more information. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_type` | `OutputSpec[OutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `debug` | `bool` | Boolean indicating if debug tracebacks should be returned on errors. | `False` | | `routes` | `Sequence[BaseRoute] | None` | A list of routes to serve incoming HTTP and WebSocket requests. | `None` | | `middleware` | `Sequence[Middleware] | None` | A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. ServerErrorMiddleware is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. ExceptionMiddleware is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. 
| `None` | | `exception_handlers` | `Mapping[Any, ExceptionHandler] | None` | A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form handler(request, exc) -> response and may be either standard functions, or async functions. | `None` | | `on_startup` | `Sequence[Callable[[], Any]] | None` | A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. | `None` | | `on_shutdown` | `Sequence[Callable[[], Any]] | None` | A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. | `None` | | `lifespan` | `Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None` | A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the on_startup and on_shutdown handlers. Use one or the other, not both. | `None` | Returns: | Type | Description | | --- | --- | | `AGUIApp[AgentDepsT, OutputDataT]` | An ASGI application for running Pydantic AI agents with AG-UI protocol support. | Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def to_ag_ui( self, *, # Agent.iter parameters output_type: OutputSpec[OutputDataT] | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, # Starlette debug: bool = False, routes: Sequence[BaseRoute] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: Mapping[Any, ExceptionHandler] | None = None, on_startup: Sequence[Callable[[], Any]] | None = None, on_shutdown: Sequence[Callable[[], Any]] | None = None, lifespan: Lifespan[AGUIApp[AgentDepsT, OutputDataT]] | None = None, ) -> AGUIApp[AgentDepsT, OutputDataT]: """Returns an ASGI application that handles every AG-UI request by running the agent. Note that the `deps` will be the same for each request, with the exception of the AG-UI state that's injected into the `state` field of a `deps` object that implements the [`StateHandler`][pydantic_ai.ag_ui.StateHandler] protocol. To provide different `deps` for each request (e.g. based on the authenticated user), use [`pydantic_ai.ag_ui.run_ag_ui`][pydantic_ai.ag_ui.run_ag_ui] or [`pydantic_ai.ag_ui.handle_ag_ui_request`][pydantic_ai.ag_ui.handle_ag_ui_request] instead. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_ag_ui() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` See [AG-UI docs](../ag-ui.md) for more information. Args: output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. 
usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. debug: Boolean indicating if debug tracebacks should be returned on errors. routes: A list of routes to serve incoming HTTP and WebSocket requests. middleware: A list of middleware to run for every request. A starlette application will always automatically include two middleware classes. `ServerErrorMiddleware` is added as the very outermost middleware, to handle any uncaught errors occurring anywhere in the entire stack. `ExceptionMiddleware` is added as the very innermost middleware, to deal with handled exception cases occurring in the routing or endpoints. exception_handlers: A mapping of either integer status codes, or exception class types onto callables which handle the exceptions. Exception handler callables should be of the form `handler(request, exc) -> response` and may be either standard functions, or async functions. on_startup: A list of callables to run on application startup. Startup handler callables do not take any arguments, and may be either standard functions, or async functions. on_shutdown: A list of callables to run on application shutdown. Shutdown handler callables do not take any arguments, and may be either standard functions, or async functions. lifespan: A lifespan context function, which can be used to perform startup and shutdown tasks. This is a newer style that replaces the `on_startup` and `on_shutdown` handlers. Use one or the other, not both. Returns: An ASGI application for running Pydantic AI agents with AG-UI protocol support. """ from ..ag_ui import AGUIApp return AGUIApp( agent=self, # Agent.iter parameters output_type=output_type, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, # Starlette debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, ) ```` #### to_a2a ```python to_a2a( *, storage: Storage | None = None, broker: Broker | None = None, name: str | None = None, url: str = "http://localhost:8000", version: str = "1.0.0", description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: ( dict[Any, ExceptionHandler] | None ) = None, lifespan: Lifespan[FastA2A] | None = None ) -> FastA2A ``` Convert the agent to a FastA2A application. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_a2a() ``` The `app` is an ASGI application that can be used with any ASGI server. 
To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def to_a2a( self, *, storage: Storage | None = None, broker: Broker | None = None, # Agent card name: str | None = None, url: str = 'http://localhost:8000', version: str = '1.0.0', description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, # Starlette debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: dict[Any, ExceptionHandler] | None = None, lifespan: Lifespan[FastA2A] | None = None, ) -> FastA2A: """Convert the agent to a FastA2A application. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') app = agent.to_a2a() ``` The `app` is an ASGI application that can be used with any ASGI server. To run the application, you can use the following command: ```bash uvicorn app:app --host 0.0.0.0 --port 8000 ``` """ from .._a2a import agent_to_a2a return agent_to_a2a( self, storage=storage, broker=broker, name=name, url=url, version=version, description=description, provider=provider, skills=skills, debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, lifespan=lifespan, ) ```` #### to_cli ```python to_cli( deps: AgentDepsT = None, prog_name: str = "pydantic-ai", message_history: Sequence[ModelMessage] | None = None, ) -> None ``` Run the agent in a CLI chat interface. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `deps` | `AgentDepsT` | The dependencies to pass to the agent. | `None` | | `prog_name` | `str` | The name of the program to use for the CLI. Defaults to 'pydantic-ai'. | `'pydantic-ai'` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | Example: agent_to_cli.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python async def to_cli( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: Sequence[_messages.ModelMessage] | None = None, ) -> None: """Run the agent in a CLI chat interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. Example: ```python {title="agent_to_cli.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` """ from rich.console import Console from pydantic_ai._cli import run_chat await run_chat( stream=True, agent=self, deps=deps, console=Console(), code_theme='monokai', prog_name=prog_name, message_history=message_history, ) ```` #### to_cli_sync ```python to_cli_sync( deps: AgentDepsT = None, prog_name: str = "pydantic-ai", message_history: Sequence[ModelMessage] | None = None, ) -> None ``` Run the agent in a CLI chat interface with the non-async interface. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `deps` | `AgentDepsT` | The dependencies to pass to the agent. | `None` | | `prog_name` | `str` | The name of the program to use for the CLI. Defaults to 'pydantic-ai'. 
| `'pydantic-ai'` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | agent_to_cli_sync.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') agent.to_cli_sync() agent.to_cli_sync(prog_name='assistant') ``` Source code in `pydantic_ai_slim/pydantic_ai/agent/abstract.py` ````python def to_cli_sync( self: Self, deps: AgentDepsT = None, prog_name: str = 'pydantic-ai', message_history: Sequence[_messages.ModelMessage] | None = None, ) -> None: """Run the agent in a CLI chat interface with the non-async interface. Args: deps: The dependencies to pass to the agent. prog_name: The name of the program to use for the CLI. Defaults to 'pydantic-ai'. message_history: History of the conversation so far. ```python {title="agent_to_cli_sync.py" test="skip"} from pydantic_ai import Agent agent = Agent('openai:gpt-4o', instructions='You always respond in Italian.') agent.to_cli_sync() agent.to_cli_sync(prog_name='assistant') ``` """ return get_event_loop().run_until_complete( self.to_cli(deps=deps, prog_name=prog_name, message_history=message_history) ) ```` ### WrapperAgent Bases: `AbstractAgent[AgentDepsT, OutputDataT]` Agent which wraps another agent. Does nothing on its own, used as a base class. Source code in `pydantic_ai_slim/pydantic_ai/agent/wrapper.py` ````python class WrapperAgent(AbstractAgent[AgentDepsT, OutputDataT]): """Agent which wraps another agent. Does nothing on its own, used as a base class. """ def __init__(self, wrapped: AbstractAgent[AgentDepsT, OutputDataT]): self.wrapped = wrapped @property def model(self) -> models.Model | models.KnownModelName | str | None: return self.wrapped.model @property def name(self) -> str | None: return self.wrapped.name @name.setter def name(self, value: str | None) -> None: self.wrapped.name = value @property def deps_type(self) -> type: return self.wrapped.deps_type @property def output_type(self) -> OutputSpec[OutputDataT]: return self.wrapped.output_type @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: return self.wrapped.event_stream_handler @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: return self.wrapped.toolsets async def __aenter__(self) -> AbstractAgent[AgentDepsT, OutputDataT]: return await self.wrapped.__aenter__() async def __aexit__(self, *args: Any) -> bool | None: return await self.wrapped.__aexit__(*args) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... 
@overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. 
model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ async with self.wrapped.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) as run: yield run @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. 
""" with self.wrapped.override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/agent/wrapper.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. 
This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ async with self.wrapped.iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) as run: yield run ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). 
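For illustration, a minimal sketch of the typical testing pattern (the agent, prompt, and assertion are placeholders): the configured model is swapped for `TestModel` from `pydantic_ai.models.test`, so no real provider is called. The same call works on a `WrapperAgent`, where the override is delegated to the wrapped agent.

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-4o', instructions='Be concise.')


def test_agent_with_overridden_model():
    # Temporarily replace the model for the duration of the block.
    with agent.override(model=TestModel()):
        result = agent.run_sync('What is the capital of France?')
    assert result.output is not None
```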
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | Unset` | The name to use instead of the name passed to the agent constructor and agent run. | `UNSET` | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. | `UNSET` | | `instructions` | `Instructions[AgentDepsT] | Unset` | The instructions to use instead of the instructions registered with the agent. | `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/agent/wrapper.py` ```python @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ with self.wrapped.override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ``` ### AgentRun Bases: `Generic[AgentDepsT, OutputDataT]` A stateful, async-iterable run of an Agent. You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an End is reached, the run finishes and result becomes available. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the next method for more granular control. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ````python @dataclasses.dataclass(repr=False) class AgentRun(Generic[AgentDepsT, OutputDataT]): """A stateful, async-iterable run of an [`Agent`][pydantic_ai.agent.Agent]. You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an [`End`][pydantic_graph.nodes.End] is reached, the run finishes and [`result`][pydantic_ai.agent.AgentRun.result] becomes available. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the [`next`][pydantic_ai.agent.AgentRun.next] method for more granular control. """ _graph_run: GraphRun[ _agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any], FinalResult[OutputDataT] ] @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... 
def _traceparent(self, *, required: bool = True) -> str | None: traceparent = self._graph_run._traceparent(required=False) # type: ignore[reportPrivateUsage] if traceparent is None and required: # pragma: no cover raise AttributeError('No span was created for this agent run') return traceparent @property def ctx(self) -> GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]]: """The current context of the agent run.""" return GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]]( state=self._graph_run.state, deps=self._graph_run.deps ) @property def next_node( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. """ next_node = self._graph_run.next_node if isinstance(next_node, End): return next_node if _agent_graph.is_agent_node(next_node): return next_node raise exceptions.AgentRunError(f'Unexpected node type: {type(next_node)}') # pragma: no cover @property def result(self) -> AgentRunResult[OutputDataT] | None: """The final result of the run if it has ended, otherwise `None`. Once the run returns an [`End`][pydantic_graph.nodes.End] node, `result` is populated with an [`AgentRunResult`][pydantic_ai.agent.AgentRunResult]. """ graph_run_result = self._graph_run.result if graph_run_result is None: return None return AgentRunResult( graph_run_result.output.output, graph_run_result.output.tool_name, graph_run_result.state, self._graph_run.deps.new_message_index, self._traceparent(required=False), ) def __aiter__( self, ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]: """Provide async-iteration over the nodes in the agent run.""" return self async def __anext__( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Advance to the next node automatically based on the last returned node.""" next_node = await self._graph_run.__anext__() if _agent_graph.is_agent_node(node=next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node async def next( self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`][pydantic_graph.nodes.End] node. 
Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Args: node: The node to run next in the graph. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ # Note: It might be nice to expose a synchronous interface for iteration, but we shouldn't do it # on this class, or else IDEs won't warn you if you accidentally use `for` instead of `async for` to iterate. next_node = await self._graph_run.next(node) if _agent_graph.is_agent_node(next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node # TODO (v2): Make this a property def usage(self) -> _usage.RunUsage: """Get usage statistics for the run so far, including token usage, model requests, and so on.""" return self._graph_run.state.usage def __repr__(self) -> str: # pragma: no cover result = self._graph_run.result result_repr = '' if result is None else repr(result.output) return f'<{type(self).__name__} result={result_repr} usage={self.usage()}>' ```` #### ctx ```python ctx: GraphRunContext[ GraphAgentState, GraphAgentDeps[AgentDepsT, Any] ] ``` The current context of the agent run. #### next_node ```python next_node: ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. #### result ```python result: AgentRunResult[OutputDataT] | None ``` The final result of the run if it has ended, otherwise `None`. Once the run returns an End node, `result` is populated with an AgentRunResult. #### __aiter__ ```python __aiter__() -> ( AsyncIterator[ AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ] ) ``` Provide async-iteration over the nodes in the agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def __aiter__( self, ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]: """Provide async-iteration over the nodes in the agent run.""" return self ``` #### __anext__ ```python __anext__() -> ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` Advance to the next node automatically based on the last returned node. 
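For illustration, a minimal sketch (the model name and prompt are placeholders) combining async iteration, which calls `__anext__` for each step, with the `is_*_node` static helpers documented above to narrow the type of each yielded node:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')


async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:  # each step goes through __anext__
            if Agent.is_user_prompt_node(node):
                print('-> user prompt node')
            elif Agent.is_model_request_node(node):
                print('-> sending a request to the model')
            elif Agent.is_call_tools_node(node):
                print('-> handling the model response and any tool calls')
            elif Agent.is_end_node(node):
                print('-> run finished:', node.data.output)
```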
Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python async def __anext__( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Advance to the next node automatically based on the last returned node.""" next_node = await self._graph_run.__anext__() if _agent_graph.is_agent_node(node=next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node ``` #### next ```python next( node: AgentNode[AgentDepsT, OutputDataT], ) -> ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an End node. Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `node` | `AgentNode[AgentDepsT, OutputDataT]` | The node to run next in the graph. | *required* | Returns: | Type | Description | | --- | --- | | `AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]` | The next node returned by the graph logic, or an End node if | | `AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]` | the run has completed. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ````python async def next( self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`][pydantic_graph.nodes.End] node. 
Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Args: node: The node to run next in the graph. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ # Note: It might be nice to expose a synchronous interface for iteration, but we shouldn't do it # on this class, or else IDEs won't warn you if you accidentally use `for` instead of `async for` to iterate. next_node = await self._graph_run.next(node) if _agent_graph.is_agent_node(next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node ```` #### usage ```python usage() -> RunUsage ``` Get usage statistics for the run so far, including token usage, model requests, and so on. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def usage(self) -> _usage.RunUsage: """Get usage statistics for the run so far, including token usage, model requests, and so on.""" return self._graph_run.state.usage ``` ### AgentRunResult Bases: `Generic[OutputDataT]` The final result of an agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python @dataclasses.dataclass class AgentRunResult(Generic[OutputDataT]): """The final result of an agent run.""" output: OutputDataT """The output data from the agent run.""" _output_tool_name: str | None = dataclasses.field(repr=False, compare=False, default=None) _state: _agent_graph.GraphAgentState = dataclasses.field( repr=False, compare=False, default_factory=_agent_graph.GraphAgentState ) _new_message_index: int = dataclasses.field(repr=False, compare=False, default=0) _traceparent_value: str | None = dataclasses.field(repr=False, compare=False, default=None) @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: if self._traceparent_value is None and required: # pragma: no cover raise AttributeError('No span was created for this agent run') return self._traceparent_value def _set_output_tool_return(self, return_content: str) -> list[_messages.ModelMessage]: """Set return content for the output tool. Useful if you want to continue the conversation and want to set the response to the output tool call. 
""" if not self._output_tool_name: raise ValueError('Cannot set output tool return content when the return type is `str`.') messages = self._state.message_history last_message = messages[-1] for idx, part in enumerate(last_message.parts): if isinstance(part, _messages.ToolReturnPart) and part.tool_name == self._output_tool_name: # Only do deepcopy when we have to modify copied_messages = list(messages) copied_last = deepcopy(last_message) copied_last.parts[idx].content = return_content # type: ignore[misc] copied_messages[-1] = copied_last return copied_messages raise LookupError(f'No tool call found with tool name {self._output_tool_name!r}.') def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ if output_tool_return_content is not None: return self._set_output_tool_return(output_tool_return_content) else: return self._state.message_history def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) @property def response(self) -> _messages.ModelResponse: """Return the last response from the message history.""" # The response may not be the very last item if it contained an output tool call. 
See `CallToolsNode._handle_final_result`. for message in reversed(self.all_messages()): if isinstance(message, _messages.ModelResponse): return message raise ValueError('No response found in the message history') # pragma: no cover # TODO (v2): Make this a property def usage(self) -> _usage.RunUsage: """Return the usage of the whole run.""" return self._state.usage # TODO (v2): Make this a property def timestamp(self) -> datetime: """Return the timestamp of last response.""" return self.response.timestamp ``` #### output ```python output: OutputDataT ``` The output data from the agent run. #### all_messages ```python all_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return the history of \_messages. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `list[ModelMessage]` | List of messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ if output_tool_return_content is not None: return self._set_output_tool_return(output_tool_return_content) else: return self._state.message_history ``` #### all_messages_json ```python all_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return all messages from all_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `bytes` | JSON bytes representing the messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. 
""" return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) ``` #### new_messages ```python new_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `list[ModelMessage]` | List of new messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] ``` #### new_messages_json ```python new_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return new messages from new_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `bytes` | JSON bytes representing the new messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) ``` #### response ```python response: ModelResponse ``` Return the last response from the message history. #### usage ```python usage() -> RunUsage ``` Return the usage of the whole run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def usage(self) -> _usage.RunUsage: """Return the usage of the whole run.""" return self._state.usage ``` #### timestamp ```python timestamp() -> datetime ``` Return the timestamp of last response. 
Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def timestamp(self) -> datetime: """Return the timestamp of last response.""" return self.response.timestamp ``` ### EndStrategy ```python EndStrategy = Literal['early', 'exhaustive'] ``` The strategy for handling multiple tool calls when a final result is found. - `'early'`: Stop processing other tool calls once a final result is found - `'exhaustive'`: Process all tool calls even after finding a final result ### RunOutputDataT ```python RunOutputDataT = TypeVar('RunOutputDataT') ``` Type variable for the result data of a run where `output_type` was customized on the run call. ### capture_run_messages ```python capture_run_messages() -> Iterator[list[ModelMessage]] ``` Context manager to access the messages used in a run, run_sync, or run_stream call. Useful when a run may raise an exception, see [model errors](../../agents/#model-errors) for more information. Examples: ```python from pydantic_ai import Agent, capture_run_messages agent = Agent('test') with capture_run_messages() as messages: try: result = agent.run_sync('foobar') except Exception: print(messages) raise ``` Note If you call `run`, `run_sync`, or `run_stream` more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only. Source code in `pydantic_ai_slim/pydantic_ai/_agent_graph.py` ````python @contextmanager def capture_run_messages() -> Iterator[list[_messages.ModelMessage]]: """Context manager to access the messages used in a [`run`][pydantic_ai.agent.AbstractAgent.run], [`run_sync`][pydantic_ai.agent.AbstractAgent.run_sync], or [`run_stream`][pydantic_ai.agent.AbstractAgent.run_stream] call. Useful when a run may raise an exception, see [model errors](../agents.md#model-errors) for more information. Examples: ```python from pydantic_ai import Agent, capture_run_messages agent = Agent('test') with capture_run_messages() as messages: try: result = agent.run_sync('foobar') except Exception: print(messages) raise ``` !!! note If you call `run`, `run_sync`, or `run_stream` more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only. """ token = None messages: list[_messages.ModelMessage] = [] # Try to reuse existing message context if available try: messages = _messages_ctx_var.get().messages except LookupError: # No existing context, create a new one token = _messages_ctx_var.set(_RunMessages(messages)) try: yield messages finally: # Clean up context if we created it if token is not None: _messages_ctx_var.reset(token) ```` ### InstrumentationSettings Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - Agent.instrument_all() - InstrumentedModel See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python @dataclass(init=False) class InstrumentationSettings: """Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - [`Agent.instrument_all()`][pydantic_ai.agent.Agent.instrument_all] - [`InstrumentedModel`][pydantic_ai.models.instrumented.InstrumentedModel] See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. 
""" tracer: Tracer = field(repr=False) event_logger: EventLogger = field(repr=False) event_mode: Literal['attributes', 'logs'] = 'attributes' include_binary_content: bool = True include_content: bool = True version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal['attributes', 'logs'] = 'attributes', event_logger_provider: EventLoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `event_logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. event_logger_provider: The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling `logfire.configure()` sets the global event logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. 
""" from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() event_logger_provider = event_logger_provider or get_event_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.event_logger = event_logger_provider.get_event_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) self.cost_histogram = self.meter.create_histogram( 'operation.cost', unit='{USD}', description='Monetary cost', ) def messages_to_otel_events(self, messages: list[ModelMessage]) -> list[Event]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. Returns: A list of OpenTelemetry events. 
""" events: list[Event] = [] instructions = InstrumentedModel._get_instructions(messages) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( Event( 'gen_ai.system.message', body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[Event] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events def messages_to_otel_messages(self, messages: list[ModelMessage]) -> list[_otel_messages.ChatMessage]: result: list[_otel_messages.ChatMessage] = [] for message in messages: if isinstance(message, ModelRequest): for is_system, group in itertools.groupby(message.parts, key=lambda p: isinstance(p, SystemPromptPart)): message_parts: list[_otel_messages.MessagePart] = [] for part in group: if hasattr(part, 'otel_message_parts'): message_parts.extend(part.otel_message_parts(self)) result.append( _otel_messages.ChatMessage(role='system' if is_system else 'user', parts=message_parts) ) elif isinstance(message, ModelResponse): # pragma: no branch otel_message = _otel_messages.OutputMessage(role='assistant', parts=message.otel_message_parts(self)) if message.finish_reason is not None: otel_message['finish_reason'] = message.finish_reason result.append(otel_message) return result def handle_messages(self, input_messages: list[ModelMessage], response: ModelResponse, system: str, span: Span): if self.version == 1: events = self.messages_to_otel_events(input_messages) for event in self.messages_to_otel_events([response]): events.append( Event( 'gen_ai.choice', body={ 'index': 0, 'message': event.body, }, ) ) for event in events: event.attributes = { GEN_AI_SYSTEM_ATTRIBUTE: system, **(event.attributes or {}), } self._emit_events(span, events) else: output_messages = self.messages_to_otel_messages([response]) assert len(output_messages) == 1 output_message = output_messages[0] instructions = InstrumentedModel._get_instructions(input_messages) # pyright: ignore [reportPrivateUsage] system_instructions_attributes = self.system_instructions_attributes(instructions) attributes: dict[str, AttributeValue] = { 'gen_ai.input.messages': json.dumps(self.messages_to_otel_messages(input_messages)), 'gen_ai.output.messages': json.dumps([output_message]), **system_instructions_attributes, 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { 'gen_ai.input.messages': {'type': 'array'}, 'gen_ai.output.messages': {'type': 'array'}, **( {'gen_ai.system_instructions': {'type': 'array'}} if system_instructions_attributes else {} ), 'model_request_parameters': {'type': 'object'}, }, } ), } span.set_attributes(attributes) def system_instructions_attributes(self, instructions: str | None) -> dict[str, str]: if instructions and self.include_content: return { 'gen_ai.system_instructions': json.dumps([_otel_messages.TextPart(type='text', content=instructions)]), } return {} def _emit_events(self, span: Span, events: list[Event]) -> None: if self.event_mode == 'logs': for event in events: self.event_logger.emit(event) else: attr_name = 'events' 
span.set_attributes( { attr_name: json.dumps([InstrumentedModel.event_to_dict(event) for event in events]), 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { attr_name: {'type': 'array'}, 'model_request_parameters': {'type': 'object'}, }, } ), } ) def record_metrics( self, response: ModelResponse, price_calculation: PriceCalculation | None, attributes: dict[str, AttributeValue], ): for typ in ['input', 'output']: if not (tokens := getattr(response.usage, f'{typ}_tokens', 0)): # pragma: no cover continue token_attributes = {**attributes, 'gen_ai.token.type': typ} self.tokens_histogram.record(tokens, token_attributes) if price_calculation: cost = float(getattr(price_calculation, f'{typ}_price')) self.cost_histogram.record(cost, token_attributes) ``` #### __init__ ```python __init__( *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[ 1, 2, 3 ] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal[ "attributes", "logs" ] = "attributes", event_logger_provider: EventLoggerProvider | None = None ) ``` Create instrumentation options. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `tracer_provider` | `TracerProvider | None` | The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling logfire.configure() sets the global tracer provider, so most users don't need this. | `None` | | `meter_provider` | `MeterProvider | None` | The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling logfire.configure() sets the global meter provider, so most users don't need this. | `None` | | `include_binary_content` | `bool` | Whether to include binary content in the instrumentation events. | `True` | | `include_content` | `bool` | Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. | `True` | | `version` | `Literal[1, 2, 3]` | Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters event_mode and event_logger_provider are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - gen_ai.system_instructions for instructions passed to the agent. - gen_ai.input.messages and gen_ai.output.messages on model request spans. - pydantic_ai.all_messages on agent run spans. | `DEFAULT_INSTRUMENTATION_VERSION` | | `event_mode` | `Literal['attributes', 'logs']` | The mode for emitting events in version 1. If 'attributes', events are attached to the span as attributes. If 'logs', events are emitted as OpenTelemetry log-based events. | `'attributes'` | | `event_logger_provider` | `EventLoggerProvider | None` | The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling logfire.configure() sets the global event logger provider, so most users don't need this. This is only used if event_mode='logs' and version=1. 
| `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal['attributes', 'logs'] = 'attributes', event_logger_provider: EventLoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `event_logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. event_logger_provider: The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling `logfire.configure()` sets the global event logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. 
""" from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() event_logger_provider = event_logger_provider or get_event_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.event_logger = event_logger_provider.get_event_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) self.cost_histogram = self.meter.create_histogram( 'operation.cost', unit='{USD}', description='Monetary cost', ) ``` #### messages_to_otel_events ```python messages_to_otel_events( messages: list[ModelMessage], ) -> list[Event] ``` Convert a list of model messages to OpenTelemetry events. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `messages` | `list[ModelMessage]` | The messages to convert. | *required* | Returns: | Type | Description | | --- | --- | | `list[Event]` | A list of OpenTelemetry events. | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def messages_to_otel_events(self, messages: list[ModelMessage]) -> list[Event]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. Returns: A list of OpenTelemetry events. """ events: list[Event] = [] instructions = InstrumentedModel._get_instructions(messages) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( Event( 'gen_ai.system.message', body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[Event] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events ``` ### EventStreamHandler ```python EventStreamHandler: TypeAlias = Callable[ [ RunContext[AgentDepsT], AsyncIterable[AgentStreamEvent], ], Awaitable[None], ] ``` A function that receives agent RunContext and an async iterable of events from the model's streaming response and the agent's execution of tools. 
# `pydantic_ai.builtin_tools` ### AbstractBuiltinTool Bases: `ABC` A builtin tool that can be used by an agent. This class is abstract and cannot be instantiated directly. The builtin tools are passed to the model as part of the `ModelRequestParameters`. Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class AbstractBuiltinTool(ABC): """A builtin tool that can be used by an agent. This class is abstract and cannot be instantiated directly. The builtin tools are passed to the model as part of the `ModelRequestParameters`. """ kind: str = 'unknown_builtin_tool' """Built-in tool identifier, this should be available on all built-in tools as a discriminator.""" ``` #### kind ```python kind: str = 'unknown_builtin_tool' ``` Built-in tool identifier, this should be available on all built-in tools as a discriminator. ### WebSearchTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to search the web for information. The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models. Supported by: - Anthropic - OpenAI Responses - Groq - Google Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class WebSearchTool(AbstractBuiltinTool): """A builtin tool that allows your agent to search the web for information. The parameters that PydanticAI passes depend on the model, as some parameters may not be supported by certain models. Supported by: * Anthropic * OpenAI Responses * Groq * Google """ search_context_size: Literal['low', 'medium', 'high'] = 'medium' """The `search_context_size` parameter controls how much context is retrieved from the web to help the tool formulate a response. Supported by: * OpenAI Responses """ user_location: WebSearchUserLocation | None = None """The `user_location` parameter allows you to localize search results based on a user's location. Supported by: * Anthropic * OpenAI Responses """ blocked_domains: list[str] | None = None """If provided, these domains will never appear in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: * Anthropic, see * Groq, see """ allowed_domains: list[str] | None = None """If provided, only these domains will be included in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: * Anthropic, see * Groq, see """ max_uses: int | None = None """If provided, the tool will stop searching the web after the given number of uses. Supported by: * Anthropic """ kind: str = 'web_search' """The kind of tool.""" ``` #### search_context_size ```python search_context_size: Literal["low", "medium", "high"] = ( "medium" ) ``` The `search_context_size` parameter controls how much context is retrieved from the web to help the tool formulate a response. Supported by: - OpenAI Responses #### user_location ```python user_location: WebSearchUserLocation | None = None ``` The `user_location` parameter allows you to localize search results based on a user's location. Supported by: - Anthropic - OpenAI Responses #### blocked_domains ```python blocked_domains: list[str] | None = None ``` If provided, these domains will never appear in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. 
Supported by: - Anthropic, see - Groq, see #### allowed_domains ```python allowed_domains: list[str] | None = None ``` If provided, only these domains will be included in results. With Anthropic, you can only use one of `blocked_domains` or `allowed_domains`, not both. Supported by: - Anthropic, see - Groq, see #### max_uses ```python max_uses: int | None = None ``` If provided, the tool will stop searching the web after the given number of uses. Supported by: - Anthropic #### kind ```python kind: str = 'web_search' ``` The kind of tool. ### WebSearchUserLocation Bases: `TypedDict` Allows you to localize search results based on a user's location. Supported by: - Anthropic - OpenAI Responses Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python class WebSearchUserLocation(TypedDict, total=False): """Allows you to localize search results based on a user's location. Supported by: * Anthropic * OpenAI Responses """ city: str """The city where the user is located.""" country: str """The country where the user is located. For OpenAI, this must be a 2-letter country code (e.g., 'US', 'GB').""" region: str """The region or state where the user is located.""" timezone: str """The timezone of the user's location.""" ``` #### city ```python city: str ``` The city where the user is located. #### country ```python country: str ``` The country where the user is located. For OpenAI, this must be a 2-letter country code (e.g., 'US', 'GB'). #### region ```python region: str ``` The region or state where the user is located. #### timezone ```python timezone: str ``` The timezone of the user's location. ### CodeExecutionTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to execute code. Supported by: - Anthropic - OpenAI Responses - Google Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python class CodeExecutionTool(AbstractBuiltinTool): """A builtin tool that allows your agent to execute code. Supported by: * Anthropic * OpenAI Responses * Google """ kind: str = 'code_execution' """The kind of tool.""" ``` #### kind ```python kind: str = 'code_execution' ``` The kind of tool. ### UrlContextTool Bases: `AbstractBuiltinTool` Allows your agent to access contents from URLs. Supported by: - Google Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python class UrlContextTool(AbstractBuiltinTool): """Allows your agent to access contents from URLs. Supported by: * Google """ kind: str = 'url_context' """The kind of tool.""" ``` #### kind ```python kind: str = 'url_context' ``` The kind of tool. ### ImageGenerationTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to generate images. Supported by: - OpenAI Responses - Google Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python @dataclass(kw_only=True) class ImageGenerationTool(AbstractBuiltinTool): """A builtin tool that allows your agent to generate images. Supported by: * OpenAI Responses * Google """ background: Literal['transparent', 'opaque', 'auto'] = 'auto' """Background type for the generated image. Supported by: * OpenAI Responses. 'transparent' is only supported for 'png' and 'webp' output formats. """ input_fidelity: Literal['high', 'low'] | None = None """ Control how much effort the model will exert to match the style and features, especially facial features, of input images. Supported by: * OpenAI Responses. Default: 'low'. """ moderation: Literal['auto', 'low'] = 'auto' """Moderation level for the generated image. 
Supported by: * OpenAI Responses """ output_compression: int = 100 """Compression level for the output image. Supported by: * OpenAI Responses. Only supported for 'png' and 'webp' output formats. """ output_format: Literal['png', 'webp', 'jpeg'] | None = None """The output format of the generated image. Supported by: * OpenAI Responses. Default: 'png'. """ partial_images: int = 0 """ Number of partial images to generate in streaming mode. Supported by: * OpenAI Responses. Supports 0 to 3. """ quality: Literal['low', 'medium', 'high', 'auto'] = 'auto' """The quality of the generated image. Supported by: * OpenAI Responses """ size: Literal['1024x1024', '1024x1536', '1536x1024', 'auto'] = 'auto' """The size of the generated image. Supported by: * OpenAI Responses """ kind: str = 'image_generation' """The kind of tool.""" ``` #### background ```python background: Literal["transparent", "opaque", "auto"] = ( "auto" ) ``` Background type for the generated image. Supported by: - OpenAI Responses. 'transparent' is only supported for 'png' and 'webp' output formats. #### input_fidelity ```python input_fidelity: Literal['high', 'low'] | None = None ``` Control how much effort the model will exert to match the style and features, especially facial features, of input images. Supported by: - OpenAI Responses. Default: 'low'. #### moderation ```python moderation: Literal['auto', 'low'] = 'auto' ``` Moderation level for the generated image. Supported by: - OpenAI Responses #### output_compression ```python output_compression: int = 100 ``` Compression level for the output image. Supported by: - OpenAI Responses. Only supported for 'png' and 'webp' output formats. #### output_format ```python output_format: Literal['png', 'webp', 'jpeg'] | None = None ``` The output format of the generated image. Supported by: - OpenAI Responses. Default: 'png'. #### partial_images ```python partial_images: int = 0 ``` Number of partial images to generate in streaming mode. Supported by: - OpenAI Responses. Supports 0 to 3. #### quality ```python quality: Literal['low', 'medium', 'high', 'auto'] = 'auto' ``` The quality of the generated image. Supported by: - OpenAI Responses #### size ```python size: Literal[ "1024x1024", "1024x1536", "1536x1024", "auto" ] = "auto" ``` The size of the generated image. Supported by: - OpenAI Responses #### kind ```python kind: str = 'image_generation' ``` The kind of tool. ### MemoryTool Bases: `AbstractBuiltinTool` A builtin tool that allows your agent to use memory. Supported by: - Anthropic Source code in `pydantic_ai_slim/pydantic_ai/builtin_tools.py` ```python class MemoryTool(AbstractBuiltinTool): """A builtin tool that allows your agent to use memory. Supported by: * Anthropic """ kind: str = 'memory' """The kind of tool.""" ``` #### kind ```python kind: str = 'memory' ``` The kind of tool. # `pydantic_ai.common_tools` ### duckduckgo_search_tool ```python duckduckgo_search_tool( duckduckgo_client: DDGS | None = None, max_results: int | None = None, ) ``` Creates a DuckDuckGo search tool. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `duckduckgo_client` | `DDGS | None` | The DuckDuckGo search client. | `None` | | `max_results` | `int | None` | The maximum number of results. If None, returns results only from the first response. 
| `None` | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/duckduckgo.py` ```python def duckduckgo_search_tool(duckduckgo_client: DDGS | None = None, max_results: int | None = None): """Creates a DuckDuckGo search tool. Args: duckduckgo_client: The DuckDuckGo search client. max_results: The maximum number of results. If None, returns results only from the first response. """ return Tool[Any]( DuckDuckGoSearchTool(client=duckduckgo_client or DDGS(), max_results=max_results).__call__, name='duckduckgo_search', description='Searches DuckDuckGo for the given query and returns the results.', ) ``` ### tavily_search_tool ```python tavily_search_tool(api_key: str) ``` Creates a Tavily search tool. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str` | The Tavily API key. You can get one by signing up at https://app.tavily.com/home. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/common_tools/tavily.py` ```python def tavily_search_tool(api_key: str): """Creates a Tavily search tool. Args: api_key: The Tavily API key. You can get one by signing up at [https://app.tavily.com/home](https://app.tavily.com/home). """ return Tool[Any]( TavilySearchTool(client=AsyncTavilyClient(api_key)).__call__, name='tavily_search', description='Searches Tavily for the given query and returns the results.', ) ``` # `pydantic_ai.direct` Methods for making imperative requests to language models with minimal abstraction. These methods allow you to make requests to LLMs where the only abstraction is input and output schema translation so you can use all models with the same API. These methods are thin wrappers around Model implementations. ### model_request ```python model_request( model: Model | KnownModelName | str, messages: Sequence[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> ModelResponse ``` Make a non-streamed request to a model. model_request_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request async def main(): model_response = await model_request( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-3-5-haiku-latest', timestamp=datetime.datetime(...), ) ''' ``` 1. See ModelRequest.user_text_prompt for details. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model` | `Model | KnownModelName | str` | The model to make a request to. We allow str here since the actual list of allowed models changes frequently. | *required* | | `messages` | `Sequence[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | `ModelSettings | None` | optional model settings | `None` | | `model_request_parameters` | `ModelRequestParameters | None` | optional model request parameters | `None` | | `instrument` | `InstrumentationSettings | bool | None` | Whether to instrument the request with OpenTelemetry/Logfire, if None the value from logfire.instrument_pydantic_ai is used. | `None` | Returns: | Type | Description | | --- | --- | | `ModelResponse` | The model response and token usage associated with the request. 
| Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python async def model_request( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> messages.ModelResponse: """Make a non-streamed request to a model. ```py title="model_request_example.py" from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request async def main(): model_response = await model_request( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-3-5-haiku-latest', timestamp=datetime.datetime(...), ) ''' ``` 1. See [`ModelRequest.user_text_prompt`][pydantic_ai.messages.ModelRequest.user_text_prompt] for details. Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: The model response and token usage associated with the request. """ model_instance = _prepare_model(model, instrument) return await model_instance.request( list(messages), model_settings, model_request_parameters or models.ModelRequestParameters(), ) ```` ### model_request_sync ```python model_request_sync( model: Model | KnownModelName | str, messages: Sequence[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> ModelResponse ``` Make a Synchronous, non-streamed request to a model. This is a convenience method that wraps model_request with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. model_request_sync_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync model_response = model_request_sync( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-3-5-haiku-latest', timestamp=datetime.datetime(...), ) ''' ``` 1. See ModelRequest.user_text_prompt for details. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model` | `Model | KnownModelName | str` | The model to make a request to. We allow str here since the actual list of allowed models changes frequently. 
| *required* | | `messages` | `Sequence[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | `ModelSettings | None` | optional model settings | `None` | | `model_request_parameters` | `ModelRequestParameters | None` | optional model request parameters | `None` | | `instrument` | `InstrumentationSettings | bool | None` | Whether to instrument the request with OpenTelemetry/Logfire, if None the value from logfire.instrument_pydantic_ai is used. | `None` | Returns: | Type | Description | | --- | --- | | `ModelResponse` | The model response and token usage associated with the request. | Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python def model_request_sync( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> messages.ModelResponse: """Make a Synchronous, non-streamed request to a model. This is a convenience method that wraps [`model_request`][pydantic_ai.direct.model_request] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. ```py title="model_request_sync_example.py" from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_sync model_response = model_request_sync( 'anthropic:claude-3-5-haiku-latest', [ModelRequest.user_text_prompt('What is the capital of France?')] # (1)! ) print(model_response) ''' ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='claude-3-5-haiku-latest', timestamp=datetime.datetime(...), ) ''' ``` 1. See [`ModelRequest.user_text_prompt`][pydantic_ai.messages.ModelRequest.user_text_prompt] for details. Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: The model response and token usage associated with the request. """ return _get_event_loop().run_until_complete( model_request( model, list(messages), model_settings=model_settings, model_request_parameters=model_request_parameters, instrument=instrument, ) ) ```` ### model_request_stream ```python model_request_stream( model: Model | KnownModelName | str, messages: Sequence[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> AbstractAsyncContextManager[StreamedResponse] ``` Make a streamed async request to a model. model_request_stream_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_stream async def main(): messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] # (1)! 
async with model_request_stream('openai:gpt-4.1-mini', messages) as stream: chunks = [] async for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), ] ''' ``` 1. See ModelRequest.user_text_prompt for details. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model` | `Model | KnownModelName | str` | The model to make a request to. We allow str here since the actual list of allowed models changes frequently. | *required* | | `messages` | `Sequence[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | `ModelSettings | None` | optional model settings | `None` | | `model_request_parameters` | `ModelRequestParameters | None` | optional model request parameters | `None` | | `instrument` | `InstrumentationSettings | bool | None` | Whether to instrument the request with OpenTelemetry/Logfire, if None the value from logfire.instrument_pydantic_ai is used. | `None` | Returns: | Type | Description | | --- | --- | | `AbstractAsyncContextManager[StreamedResponse]` | A stream response async context manager. | Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python def model_request_stream( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> AbstractAsyncContextManager[models.StreamedResponse]: """Make a streamed async request to a model. ```py {title="model_request_stream_example.py"} from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_stream async def main(): messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] # (1)! async with model_request_stream('openai:gpt-4.1-mini', messages) as stream: chunks = [] async for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), ] ''' ``` 1. See [`ModelRequest.user_text_prompt`][pydantic_ai.messages.ModelRequest.user_text_prompt] for details. Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: A [stream response][pydantic_ai.models.StreamedResponse] async context manager. 
""" model_instance = _prepare_model(model, instrument) return model_instance.request_stream( list(messages), model_settings, model_request_parameters or models.ModelRequestParameters(), ) ```` ### model_request_stream_sync ```python model_request_stream_sync( model: Model | KnownModelName | str, messages: Sequence[ModelMessage], *, model_settings: ModelSettings | None = None, model_request_parameters: ( ModelRequestParameters | None ) = None, instrument: InstrumentationSettings | bool | None = None ) -> StreamedResponseSync ``` Make a streamed synchronous request to a model. This is the synchronous version of model_request_stream. It uses threading to run the asynchronous stream in the background while providing a synchronous iterator interface. model_request_stream_sync_example.py ```py from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_stream_sync messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] with model_request_stream_sync('openai:gpt-4.1-mini', messages) as stream: chunks = [] for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), ] ''' ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model` | `Model | KnownModelName | str` | The model to make a request to. We allow str here since the actual list of allowed models changes frequently. | *required* | | `messages` | `Sequence[ModelMessage]` | Messages to send to the model | *required* | | `model_settings` | `ModelSettings | None` | optional model settings | `None` | | `model_request_parameters` | `ModelRequestParameters | None` | optional model request parameters | `None` | | `instrument` | `InstrumentationSettings | bool | None` | Whether to instrument the request with OpenTelemetry/Logfire, if None the value from logfire.instrument_pydantic_ai is used. | `None` | Returns: | Type | Description | | --- | --- | | `StreamedResponseSync` | A sync stream response context manager. | Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ````python def model_request_stream_sync( model: models.Model | models.KnownModelName | str, messages: Sequence[messages.ModelMessage], *, model_settings: settings.ModelSettings | None = None, model_request_parameters: models.ModelRequestParameters | None = None, instrument: instrumented_models.InstrumentationSettings | bool | None = None, ) -> StreamedResponseSync: """Make a streamed synchronous request to a model. This is the synchronous version of [`model_request_stream`][pydantic_ai.direct.model_request_stream]. It uses threading to run the asynchronous stream in the background while providing a synchronous iterator interface. 
```py {title="model_request_stream_sync_example.py"} from pydantic_ai import ModelRequest from pydantic_ai.direct import model_request_stream_sync messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')] with model_request_stream_sync('openai:gpt-4.1-mini', messages) as stream: chunks = [] for chunk in stream: chunks.append(chunk) print(chunks) ''' [ PartStartEvent(index=0, part=TextPart(content='Albert Einstein was ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent( index=0, delta=TextPartDelta(content_delta='a German-born theoretical ') ), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='physicist.')), ] ''' ``` Args: model: The model to make a request to. We allow `str` here since the actual list of allowed models changes frequently. messages: Messages to send to the model model_settings: optional model settings model_request_parameters: optional model request parameters instrument: Whether to instrument the request with OpenTelemetry/Logfire, if `None` the value from [`logfire.instrument_pydantic_ai`][logfire.Logfire.instrument_pydantic_ai] is used. Returns: A [sync stream response][pydantic_ai.direct.StreamedResponseSync] context manager. """ async_stream_cm = model_request_stream( model=model, messages=list(messages), model_settings=model_settings, model_request_parameters=model_request_parameters, instrument=instrument, ) return StreamedResponseSync(async_stream_cm) ```` ### StreamedResponseSync Synchronous wrapper to async streaming responses by running the async producer in a background thread and providing a synchronous iterator. This class must be used as a context manager with the `with` statement. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python @dataclass class StreamedResponseSync: """Synchronous wrapper to async streaming responses by running the async producer in a background thread and providing a synchronous iterator. This class must be used as a context manager with the `with` statement. """ _async_stream_cm: AbstractAsyncContextManager[StreamedResponse] _queue: queue.Queue[messages.ModelResponseStreamEvent | Exception | None] = field( default_factory=queue.Queue, init=False ) _thread: threading.Thread | None = field(default=None, init=False) _stream_response: StreamedResponse | None = field(default=None, init=False) _exception: Exception | None = field(default=None, init=False) _context_entered: bool = field(default=False, init=False) _stream_ready: threading.Event = field(default_factory=threading.Event, init=False) def __enter__(self) -> StreamedResponseSync: self._context_entered = True self._start_producer() return self def __exit__( self, _exc_type: type[BaseException] | None, _exc_val: BaseException | None, _exc_tb: TracebackType | None, ) -> None: self._cleanup() def __iter__(self) -> Iterator[messages.ModelResponseStreamEvent]: """Stream the response as an iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s.""" self._check_context_manager_usage() while True: item = self._queue.get() if item is None: # End of stream break elif isinstance(item, Exception): raise item else: yield item def __repr__(self) -> str: if self._stream_response: return repr(self._stream_response) else: return f'{self.__class__.__name__}(context_entered={self._context_entered})' __str__ = __repr__ def _check_context_manager_usage(self) -> None: if not self._context_entered: raise RuntimeError( 'StreamedResponseSync must be used as a context manager. 
' 'Use: `with model_request_stream_sync(...) as stream:`' ) def _ensure_stream_ready(self) -> StreamedResponse: self._check_context_manager_usage() if self._stream_response is None: # Wait for the background thread to signal that the stream is ready if not self._stream_ready.wait(timeout=STREAM_INITIALIZATION_TIMEOUT): raise RuntimeError('Stream failed to initialize within timeout') if self._stream_response is None: # pragma: no cover raise RuntimeError('Stream failed to initialize') return self._stream_response def _start_producer(self): self._thread = threading.Thread(target=self._async_producer, daemon=True) self._thread.start() def _async_producer(self): async def _consume_async_stream(): try: async with self._async_stream_cm as stream: self._stream_response = stream # Signal that the stream is ready self._stream_ready.set() async for event in stream: self._queue.put(event) except Exception as e: # Signal ready even on error so waiting threads don't hang self._stream_ready.set() self._queue.put(e) finally: self._queue.put(None) # Signal end _get_event_loop().run_until_complete(_consume_async_stream()) def _cleanup(self): if self._thread and self._thread.is_alive(): self._thread.join() # TODO (v2): Drop in favor of `response` property def get(self) -> messages.ModelResponse: """Build a ModelResponse from the data received from the stream so far.""" return self._ensure_stream_ready().get() @property def response(self) -> messages.ModelResponse: """Get the current state of the response.""" return self.get() # TODO (v2): Make this a property def usage(self) -> RequestUsage: """Get the usage of the response so far.""" return self._ensure_stream_ready().usage() @property def model_name(self) -> str: """Get the model name of the response.""" return self._ensure_stream_ready().model_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._ensure_stream_ready().timestamp ``` #### __iter__ ```python __iter__() -> Iterator[ModelResponseStreamEvent] ``` Stream the response as an iterable of ModelResponseStreamEvents. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python def __iter__(self) -> Iterator[messages.ModelResponseStreamEvent]: """Stream the response as an iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s.""" self._check_context_manager_usage() while True: item = self._queue.get() if item is None: # End of stream break elif isinstance(item, Exception): raise item else: yield item ``` #### get ```python get() -> ModelResponse ``` Build a ModelResponse from the data received from the stream so far. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python def get(self) -> messages.ModelResponse: """Build a ModelResponse from the data received from the stream so far.""" return self._ensure_stream_ready().get() ``` #### response ```python response: ModelResponse ``` Get the current state of the response. #### usage ```python usage() -> RequestUsage ``` Get the usage of the response so far. Source code in `pydantic_ai_slim/pydantic_ai/direct.py` ```python def usage(self) -> RequestUsage: """Get the usage of the response so far.""" return self._ensure_stream_ready().usage() ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. 
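The accessors above can be combined once the stream has been entered: iterate the events synchronously, then read the aggregated state from the same `StreamedResponseSync` object. The sketch below is illustrative only (the model name and prompt are placeholders), and it uses only the calls documented in this section:

```python
from pydantic_ai import ModelRequest
from pydantic_ai.direct import model_request_stream_sync

# Placeholder prompt and model name; substitute your own.
messages = [ModelRequest.user_text_prompt('Who was Albert Einstein?')]

with model_request_stream_sync('openai:gpt-4.1-mini', messages) as stream:
    # Events are produced by a background thread and consumed synchronously here.
    for event in stream:
        print(event)

    # After iteration, the wrapper exposes the state of the response so far.
    print(stream.model_name)  # model name of the response
    print(stream.usage())     # RequestUsage with the token counts received so far
    print(stream.response)    # ModelResponse built from the streamed events
```

Note that all of these accessors require the context manager to have been entered: calling them on a `StreamedResponseSync` outside a `with` block raises a `RuntimeError`.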
# `pydantic_ai.durable_exec` ### TemporalAgent Bases: `WrapperAgent[AgentDepsT, OutputDataT]` Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python class TemporalAgent(WrapperAgent[AgentDepsT, OutputDataT]): def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, activity_config: ActivityConfig | None = None, model_activity_config: ActivityConfig | None = None, toolset_activity_config: dict[str, ActivityConfig] | None = None, tool_activity_config: dict[str, dict[str, ActivityConfig | Literal[False]]] | None = None, run_context_type: type[TemporalRunContext[AgentDepsT]] = TemporalRunContext[AgentDepsT], temporalize_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], ], AbstractToolset[AgentDepsT], ] = temporalize_toolset, ): """Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities. After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Args: wrapped: The agent to wrap. name: Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. activity_config: The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used. model_activity_config: The Temporal activity config to use for model request activities. This is merged with the base activity config. toolset_activity_config: The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. tool_activity_config: The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use IO, you can specify `False` to disable using an activity. Note that the tool is required to be defined as an `async` function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities. run_context_type: The `TemporalRunContext` subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `retries`, `tool_call_id`, `tool_name`, `retry` and `run_step` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute. temporalize_toolset_func: Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a `TemporalWrapperToolset` that moves methods that require IO to Temporal activities. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs and the run context type. 
""" super().__init__(wrapped) self._name = name self._event_stream_handler = event_stream_handler self.run_context_type = run_context_type # start_to_close_timeout is required activity_config = activity_config or ActivityConfig(start_to_close_timeout=timedelta(seconds=60)) # `pydantic_ai.exceptions.UserError` and `pydantic.errors.PydanticUserError` are not retryable retry_policy = activity_config.get('retry_policy') or RetryPolicy() retry_policy.non_retryable_error_types = [ *(retry_policy.non_retryable_error_types or []), UserError.__name__, PydanticUserError.__name__, ] activity_config['retry_policy'] = retry_policy self.activity_config = activity_config model_activity_config = model_activity_config or {} toolset_activity_config = toolset_activity_config or {} tool_activity_config = tool_activity_config or {} if self.name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with Temporal. The name will be used to identify the agent's activities within the workflow." ) activity_name_prefix = f'agent__{self.name}' activities: list[Callable[..., Any]] = [] if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with Temporal, it cannot be set at agent run time.' ) async def event_stream_handler_activity(params: _EventStreamHandlerParams, deps: AgentDepsT) -> None: # We can never get here without an `event_stream_handler`, as `TemporalAgent.run_stream` and `TemporalAgent.iter` raise an error saying to use `TemporalAgent.run` instead, # and that only ends up calling `event_stream_handler` if it is set. assert self.event_stream_handler is not None run_context = self.run_context_type.deserialize_run_context(params.serialized_run_context, deps=deps) async def streamed_response(): yield params.event await self.event_stream_handler(run_context, streamed_response()) # Set type hint explicitly so that Temporal can take care of serialization and deserialization event_stream_handler_activity.__annotations__['deps'] = self.deps_type self.event_stream_handler_activity = activity.defn(name=f'{activity_name_prefix}__event_stream_handler')( event_stream_handler_activity ) activities.append(self.event_stream_handler_activity) temporal_model = TemporalModel( wrapped.model, activity_name_prefix=activity_name_prefix, activity_config=activity_config | model_activity_config, deps_type=self.deps_type, run_context_type=self.run_context_type, event_stream_handler=self.event_stream_handler, ) activities.extend(temporal_model.temporal_activities) def temporalize_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: id = toolset.id if id is None: raise UserError( "Toolsets that are 'leaves' (i.e. those that implement their own tool listing and calling) need to have a unique `id` in order to be used with Temporal. The ID will be used to identify the toolset's activities within the workflow." 
) toolset = temporalize_toolset_func( toolset, activity_name_prefix, activity_config | toolset_activity_config.get(id, {}), tool_activity_config.get(id, {}), self.deps_type, self.run_context_type, ) if isinstance(toolset, TemporalWrapperToolset): activities.extend(toolset.temporal_activities) return toolset temporal_toolsets = [toolset.visit_and_replace(temporalize_toolset) for toolset in wrapped.toolsets] self._model = temporal_model self._toolsets = temporal_toolsets self._temporal_activities = activities self._temporal_overrides_active: ContextVar[bool] = ContextVar('_temporal_overrides_active', default=False) @property def name(self) -> str | None: return self._name or super().name @name.setter def name(self, value: str | None) -> None: # pragma: no cover raise UserError( 'The agent name cannot be changed after creation. If you need to change the name, create a new agent.' ) @property def model(self) -> Model: return self._model @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: handler = self._event_stream_handler or super().event_stream_handler if handler is None: return None elif workflow.in_workflow(): return self._call_event_stream_handler_activity else: return handler async def _call_event_stream_handler_activity( self, ctx: RunContext[AgentDepsT], stream: AsyncIterable[_messages.AgentStreamEvent] ) -> None: serialized_run_context = self.run_context_type.serialize_run_context(ctx) async for event in stream: await workflow.execute_activity( # pyright: ignore[reportUnknownMemberType] activity=self.event_stream_handler_activity, args=[ _EventStreamHandlerParams( event=event, serialized_run_context=serialized_run_context, ), ctx.deps, ], **self.activity_config, ) @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: with self._temporal_overrides(): return super().toolsets @property def temporal_activities(self) -> list[Callable[..., Any]]: return self._temporal_activities @contextmanager def _temporal_overrides(self) -> Iterator[None]: # We reset tools here as the temporalized function toolset is already in self._toolsets. with super().override(model=self._model, toolsets=self._toolsets, tools=[]): token = self._temporal_overrides_active.set(True) try: yield except PydanticSerializationError as e: raise UserError( "The `deps` object failed to be serialized. Temporal requires all objects that are passed to activities to be serializable using Pydantic's `TypeAdapter`." ) from e finally: self._temporal_overrides_active.reset(token) @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... 
@overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow() and event_stream_handler is not None: raise UserError( 'Event stream handler cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' 
) with self._temporal_overrides(): return await super().run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler or self.event_stream_handler, **_deprecated_kwargs, ) @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. 
model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_sync()` cannot be used inside a Temporal workflow. Use `await agent.run()` instead.' ) return super().run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]: ... @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... 
@asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_stream()` cannot be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as result: yield result @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]]: ... @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]]: ... def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. 
') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ if workflow.in_workflow(): raise UserError( '`agent.run_stream_events()` cannot be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) return super().run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... 
@asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow(): if not self._temporal_overrides_active.get(): raise UserError( '`agent.iter()` cannot be used inside a Temporal workflow. 
' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) if model is not None: raise UserError( 'Model cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) if toolsets is not None: raise UserError( 'Toolsets cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as run: yield run @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if workflow.in_workflow(): if _utils.is_set(model): raise UserError( 'Model cannot be contextually overridden inside a Temporal workflow, it must be set at agent creation time.' ) if _utils.is_set(toolsets): raise UserError( 'Toolsets cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' ) if _utils.is_set(tools): raise UserError( 'Tools cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' 
) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ```` #### __init__ ```python __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, activity_config: ActivityConfig | None = None, model_activity_config: ActivityConfig | None = None, toolset_activity_config: ( dict[str, ActivityConfig] | None ) = None, tool_activity_config: ( dict[ str, dict[str, ActivityConfig | Literal[False]] ] | None ) = None, run_context_type: type[ TemporalRunContext[AgentDepsT] ] = TemporalRunContext[AgentDepsT], temporalize_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], ], AbstractToolset[AgentDepsT], ] = temporalize_toolset ) ``` Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities. After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `wrapped` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to wrap. | *required* | | `name` | `str | None` | Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's name will be used. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use instead of the one set on the wrapped agent. | `None` | | `activity_config` | `ActivityConfig | None` | The base Temporal activity config to use for all activities. If no config is provided, a start_to_close_timeout of 60 seconds is used. | `None` | | `model_activity_config` | `ActivityConfig | None` | The Temporal activity config to use for model request activities. This is merged with the base activity config. | `None` | | `toolset_activity_config` | `dict[str, ActivityConfig] | None` | The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. | `None` | | `tool_activity_config` | `dict[str, dict[str, ActivityConfig | Literal[False]]] | None` | The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use IO, you can specify False to disable using an activity. Note that the tool is required to be defined as an async function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities. | `None` | | `run_context_type` | `type[TemporalRunContext[AgentDepsT]]` | The TemporalRunContext subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the deps, retries, tool_call_id, tool_name, retry and run_step attributes will be available. To make another attribute available, create a TemporalRunContext subclass with a custom serialize_run_context class method that returns a dictionary that includes the attribute. 
| `TemporalRunContext[AgentDepsT]` | | `temporalize_toolset_func` | `Callable[[AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]]], AbstractToolset[AgentDepsT]]` | Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a TemporalWrapperToolset that moves methods that require IO to Temporal activities. If not provided, only FunctionToolset and MCPServer will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs and the run context type. | `temporalize_toolset` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ```python def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, activity_config: ActivityConfig | None = None, model_activity_config: ActivityConfig | None = None, toolset_activity_config: dict[str, ActivityConfig] | None = None, tool_activity_config: dict[str, dict[str, ActivityConfig | Literal[False]]] | None = None, run_context_type: type[TemporalRunContext[AgentDepsT]] = TemporalRunContext[AgentDepsT], temporalize_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], str, ActivityConfig, dict[str, ActivityConfig | Literal[False]], type[AgentDepsT], type[TemporalRunContext[AgentDepsT]], ], AbstractToolset[AgentDepsT], ] = temporalize_toolset, ): """Wrap an agent to enable it to be used inside a Temporal workflow, by automatically offloading model requests, tool calls, and MCP server communication to Temporal activities. After wrapping, the original agent can still be used as normal outside of the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Args: wrapped: The agent to wrap. name: Optional unique agent name to use in the Temporal activities' names. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. activity_config: The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used. model_activity_config: The Temporal activity config to use for model request activities. This is merged with the base activity config. toolset_activity_config: The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. tool_activity_config: The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use IO, you can specify `False` to disable using an activity. Note that the tool is required to be defined as an `async` function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities. run_context_type: The `TemporalRunContext` subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `retries`, `tool_call_id`, `tool_name`, `retry` and `run_step` attributes will be available. 
To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute. temporalize_toolset_func: Optional function to use to prepare "leaf" toolsets (i.e. those that implement their own tool listing and calling) for Temporal by wrapping them in a `TemporalWrapperToolset` that moves methods that require IO to Temporal activities. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Temporal. The function takes the toolset, the activity name prefix, the toolset-specific activity config, the tool-specific activity configs and the run context type. """ super().__init__(wrapped) self._name = name self._event_stream_handler = event_stream_handler self.run_context_type = run_context_type # start_to_close_timeout is required activity_config = activity_config or ActivityConfig(start_to_close_timeout=timedelta(seconds=60)) # `pydantic_ai.exceptions.UserError` and `pydantic.errors.PydanticUserError` are not retryable retry_policy = activity_config.get('retry_policy') or RetryPolicy() retry_policy.non_retryable_error_types = [ *(retry_policy.non_retryable_error_types or []), UserError.__name__, PydanticUserError.__name__, ] activity_config['retry_policy'] = retry_policy self.activity_config = activity_config model_activity_config = model_activity_config or {} toolset_activity_config = toolset_activity_config or {} tool_activity_config = tool_activity_config or {} if self.name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with Temporal. The name will be used to identify the agent's activities within the workflow." ) activity_name_prefix = f'agent__{self.name}' activities: list[Callable[..., Any]] = [] if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with Temporal, it cannot be set at agent run time.' ) async def event_stream_handler_activity(params: _EventStreamHandlerParams, deps: AgentDepsT) -> None: # We can never get here without an `event_stream_handler`, as `TemporalAgent.run_stream` and `TemporalAgent.iter` raise an error saying to use `TemporalAgent.run` instead, # and that only ends up calling `event_stream_handler` if it is set. assert self.event_stream_handler is not None run_context = self.run_context_type.deserialize_run_context(params.serialized_run_context, deps=deps) async def streamed_response(): yield params.event await self.event_stream_handler(run_context, streamed_response()) # Set type hint explicitly so that Temporal can take care of serialization and deserialization event_stream_handler_activity.__annotations__['deps'] = self.deps_type self.event_stream_handler_activity = activity.defn(name=f'{activity_name_prefix}__event_stream_handler')( event_stream_handler_activity ) activities.append(self.event_stream_handler_activity) temporal_model = TemporalModel( wrapped.model, activity_name_prefix=activity_name_prefix, activity_config=activity_config | model_activity_config, deps_type=self.deps_type, run_context_type=self.run_context_type, event_stream_handler=self.event_stream_handler, ) activities.extend(temporal_model.temporal_activities) def temporalize_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: id = toolset.id if id is None: raise UserError( "Toolsets that are 'leaves' (i.e. those that implement their own tool listing and calling) need to have a unique `id` in order to be used with Temporal. 
The ID will be used to identify the toolset's activities within the workflow." ) toolset = temporalize_toolset_func( toolset, activity_name_prefix, activity_config | toolset_activity_config.get(id, {}), tool_activity_config.get(id, {}), self.deps_type, self.run_context_type, ) if isinstance(toolset, TemporalWrapperToolset): activities.extend(toolset.temporal_activities) return toolset temporal_toolsets = [toolset.visit_and_replace(temporalize_toolset) for toolset in wrapped.toolsets] self._model = temporal_model self._toolsets = temporal_toolsets self._temporal_activities = activities self._temporal_overrides_active: ContextVar[bool] = ContextVar('_temporal_overrides_active', default=False) ``` #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. 
| `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. 
model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow() and event_stream_handler is not None: raise UserError( 'Event stream handler cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) with self._temporal_overrides(): return await super().run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler or self.event_stream_handler, **_deprecated_kwargs, ) ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. 
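For a `TemporalAgent` there is an additional restriction, visible in the source below: a Temporal workflow already has an active event loop, so calling `run_sync()` from inside a workflow raises `UserError` and you should `await agent.run(...)` there instead. The following is a minimal sketch of that split, not the canonical example; the model, agent name, and prompt are illustrative, and it assumes (as the in-workflow guard in the source suggests) that outside a workflow the call proceeds like a normal agent run.

```python
from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import TemporalAgent

# A unique `name` and a concrete `model` are required at wrapping time.
agent = Agent('openai:gpt-4o', name='capital_agent')
temporal_agent = TemporalAgent(agent)

# Outside a Temporal workflow, the synchronous convenience wrapper is available:
result = temporal_agent.run_sync('What is the capital of Spain?')
print(result.output)

# Inside a `@workflow.defn` workflow, `workflow.in_workflow()` is true and this
# call raises `UserError`; use `await temporal_agent.run(...)` there instead.
```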
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. 
output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_sync()` cannot be used inside a Temporal workflow. Use `await agent.run()` instead.' ) return super().run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( 
Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run. 
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if workflow.in_workflow(): raise UserError( '`agent.run_stream()` cannot be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as result: yield result ```` #### run_stream_events ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[OutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[RunOutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[Any] ] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps self.run and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for self.run, except that `event_stream_handler` is now allowed. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. 
| `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentStreamEvent | AgentRunResultEvent[Any]]` | An async iterable of stream events AgentStreamEvent and finally a AgentRunResultEvent with the final | | `AsyncIterator[AgentStreamEvent | AgentRunResultEvent[Any]]` | run result. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. 
Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ if workflow.in_workflow(): raise UserError( '`agent.run_stream_events()` cannot be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) return super().run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, 
builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. 
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" if workflow.in_workflow(): if not self._temporal_overrides_active.get(): raise UserError( '`agent.iter()` cannot be used inside a Temporal workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) if model is not None: raise UserError( 'Model cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) if toolsets is not None: raise UserError( 'Toolsets cannot be set at agent run time inside a Temporal workflow, it must be set at agent creation time.' ) async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as run: yield run ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | Unset` | The name to use instead of the name passed to the agent constructor and agent run. | `UNSET` | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. | `UNSET` | | `instructions` | `Instructions[AgentDepsT] | Unset` | The instructions to use instead of the instructions registered with the agent. | `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_agent.py` ```python @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. 
toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if workflow.in_workflow(): if _utils.is_set(model): raise UserError( 'Model cannot be contextually overridden inside a Temporal workflow, it must be set at agent creation time.' ) if _utils.is_set(toolsets): raise UserError( 'Toolsets cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' ) if _utils.is_set(tools): raise UserError( 'Tools cannot be contextually overridden inside a Temporal workflow, they must be set at agent creation time.' ) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ``` ### LogfirePlugin Bases: `Plugin` Temporal client plugin for Logfire. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_logfire.py` ```python class LogfirePlugin(ClientPlugin): """Temporal client plugin for Logfire.""" def __init__(self, setup_logfire: Callable[[], Logfire] = _default_setup_logfire, *, metrics: bool = True): try: import logfire # noqa: F401 # pyright: ignore[reportUnusedImport] except ImportError as _import_error: raise ImportError( 'Please install the `logfire` package to use the Logfire plugin, ' 'you can use the `logfire` optional group — `pip install "pydantic-ai-slim[logfire]"`' ) from _import_error self.setup_logfire = setup_logfire self.metrics = metrics def init_client_plugin(self, next: ClientPlugin) -> None: self.next_client_plugin = next def configure_client(self, config: ClientConfig) -> ClientConfig: from opentelemetry.trace import get_tracer from temporalio.contrib.opentelemetry import TracingInterceptor interceptors = config.get('interceptors', []) config['interceptors'] = [*interceptors, TracingInterceptor(get_tracer('temporalio'))] return self.next_client_plugin.configure_client(config) async def connect_service_client(self, config: ConnectConfig) -> ServiceClient: logfire = self.setup_logfire() if self.metrics: logfire_config = logfire.config token = logfire_config.token if logfire_config.send_to_logfire and token is not None and logfire_config.metrics is not False: base_url = logfire_config.advanced.generate_base_url(token) metrics_url = base_url + '/v1/metrics' headers = {'Authorization': f'Bearer {token}'} config.runtime = Runtime( telemetry=TelemetryConfig(metrics=OpenTelemetryConfig(url=metrics_url, headers=headers)) ) return await self.next_client_plugin.connect_service_client(config) ``` ### TemporalRunContext Bases: `RunContext[AgentDepsT]` The RunContext subclass to use to serialize and deserialize the run context for use inside a Temporal activity. By default, only the `deps`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries` and `run_step` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to TemporalAgent. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py` ```python class TemporalRunContext(RunContext[AgentDepsT]): """The [`RunContext`][pydantic_ai.tools.RunContext] subclass to use to serialize and deserialize the run context for use inside a Temporal activity. 
By default, only the `deps`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries` and `run_step` attributes will be available. To make another attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to [`TemporalAgent`][pydantic_ai.durable_exec.temporal.TemporalAgent]. """ def __init__(self, deps: AgentDepsT, **kwargs: Any): self.__dict__ = {**kwargs, 'deps': deps} setattr( self, '__dataclass_fields__', {name: field for name, field in RunContext.__dataclass_fields__.items() if name in self.__dict__}, ) def __getattribute__(self, name: str) -> Any: try: return super().__getattribute__(name) except AttributeError as e: # pragma: no cover if name in RunContext.__dataclass_fields__: raise UserError( f'{self.__class__.__name__!r} object has no attribute {name!r}. ' 'To make the attribute available, create a `TemporalRunContext` subclass with a custom `serialize_run_context` class method that returns a dictionary that includes the attribute and pass it to `TemporalAgent`.' ) else: raise e @classmethod def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]: """Serialize the run context to a `dict[str, Any]`.""" return { 'retries': ctx.retries, 'tool_call_id': ctx.tool_call_id, 'tool_name': ctx.tool_name, 'tool_call_approved': ctx.tool_call_approved, 'retry': ctx.retry, 'max_retries': ctx.max_retries, 'run_step': ctx.run_step, } @classmethod def deserialize_run_context(cls, ctx: dict[str, Any], deps: AgentDepsT) -> TemporalRunContext[AgentDepsT]: """Deserialize the run context from a `dict[str, Any]`.""" return cls(**ctx, deps=deps) ``` #### serialize_run_context ```python serialize_run_context( ctx: RunContext[Any], ) -> dict[str, Any] ``` Serialize the run context to a `dict[str, Any]`. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py` ```python @classmethod def serialize_run_context(cls, ctx: RunContext[Any]) -> dict[str, Any]: """Serialize the run context to a `dict[str, Any]`.""" return { 'retries': ctx.retries, 'tool_call_id': ctx.tool_call_id, 'tool_name': ctx.tool_name, 'tool_call_approved': ctx.tool_call_approved, 'retry': ctx.retry, 'max_retries': ctx.max_retries, 'run_step': ctx.run_step, } ``` #### deserialize_run_context ```python deserialize_run_context( ctx: dict[str, Any], deps: AgentDepsT ) -> TemporalRunContext[AgentDepsT] ``` Deserialize the run context from a `dict[str, Any]`. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/_run_context.py` ```python @classmethod def deserialize_run_context(cls, ctx: dict[str, Any], deps: AgentDepsT) -> TemporalRunContext[AgentDepsT]: """Deserialize the run context from a `dict[str, Any]`.""" return cls(**ctx, deps=deps) ``` ### PydanticAIPlugin Bases: `Plugin`, `Plugin` Temporal client and worker plugin for Pydantic AI. 
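As the source below shows, `PydanticAIPlugin` subclasses both the Temporal client plugin and worker plugin base classes (rendered above simply as `Plugin`, `Plugin`), so one instance can configure the client's data converter and the worker's sandbox settings. To show roughly where it and `AgentPlugin` are wired in, here is a hedged sketch; it assumes a local Temporal server at `localhost:7233`, a `temporalio` release with plugin support, and illustrative names (the model, `CapitalWorkflow`, `capital-queue`) that are not part of this API.

```python
import asyncio

from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker

from pydantic_ai import Agent
from pydantic_ai.durable_exec.temporal import AgentPlugin, PydanticAIPlugin, TemporalAgent

# The wrapped agent needs a unique `name` and a `model` set at creation time.
agent = Agent('openai:gpt-4o', name='capital_agent', instructions='Be concise.')
temporal_agent = TemporalAgent(agent)


@workflow.defn
class CapitalWorkflow:
    @workflow.run
    async def run(self, prompt: str) -> str:
        # Inside the workflow, model requests and tool calls are offloaded to
        # the activities registered on the worker by `AgentPlugin` below.
        result = await temporal_agent.run(prompt)
        return result.output


async def main():
    # `PydanticAIPlugin` swaps in the Pydantic data converter and passes the
    # modules Pydantic AI needs through the workflow sandbox.
    client = await Client.connect('localhost:7233', plugins=[PydanticAIPlugin()])

    # `AgentPlugin` adds `temporal_agent.temporal_activities` to the worker.
    worker = Worker(
        client,
        task_queue='capital-queue',
        workflows=[CapitalWorkflow],
        plugins=[AgentPlugin(temporal_agent)],
    )
    await worker.run()


if __name__ == '__main__':
    asyncio.run(main())
```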
Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/__init__.py` ```python class PydanticAIPlugin(ClientPlugin, WorkerPlugin): """Temporal client and worker plugin for Pydantic AI.""" def init_client_plugin(self, next: ClientPlugin) -> None: self.next_client_plugin = next def init_worker_plugin(self, next: WorkerPlugin) -> None: self.next_worker_plugin = next def configure_client(self, config: ClientConfig) -> ClientConfig: config['data_converter'] = self._get_new_data_converter(config.get('data_converter')) return self.next_client_plugin.configure_client(config) def configure_worker(self, config: WorkerConfig) -> WorkerConfig: runner = config.get('workflow_runner') # pyright: ignore[reportUnknownMemberType] if isinstance(runner, SandboxedWorkflowRunner): # pragma: no branch config['workflow_runner'] = replace( runner, restrictions=runner.restrictions.with_passthrough_modules( 'pydantic_ai', 'pydantic', 'pydantic_core', 'logfire', 'rich', 'httpx', 'anyio', 'httpcore', # Imported inside `logfire._internal.json_encoder` when running `logfire.info` inside an activity with attributes to serialize 'attrs', # Imported inside `logfire._internal.json_schema` when running `logfire.info` inside an activity with attributes to serialize 'numpy', 'pandas', ), ) config['workflow_failure_exception_types'] = [ *config.get('workflow_failure_exception_types', []), # pyright: ignore[reportUnknownMemberType] UserError, PydanticUserError, ] return self.next_worker_plugin.configure_worker(config) async def connect_service_client(self, config: ConnectConfig) -> ServiceClient: return await self.next_client_plugin.connect_service_client(config) async def run_worker(self, worker: Worker) -> None: await self.next_worker_plugin.run_worker(worker) def configure_replayer(self, config: ReplayerConfig) -> ReplayerConfig: # pragma: no cover config['data_converter'] = self._get_new_data_converter(config.get('data_converter')) # pyright: ignore[reportUnknownMemberType] return self.next_worker_plugin.configure_replayer(config) def run_replayer( self, replayer: Replayer, histories: AsyncIterator[WorkflowHistory], ) -> AbstractAsyncContextManager[AsyncIterator[WorkflowReplayResult]]: # pragma: no cover return self.next_worker_plugin.run_replayer(replayer, histories) def _get_new_data_converter(self, converter: DataConverter | None) -> DataConverter: if converter and converter.payload_converter_class not in ( DefaultPayloadConverter, PydanticPayloadConverter, ): warnings.warn( # pragma: no cover 'A non-default Temporal data converter was used which has been replaced with the Pydantic data converter.' ) return pydantic_data_converter ``` ### AgentPlugin Bases: `Plugin` Temporal worker plugin for a specific Pydantic AI agent. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/temporal/__init__.py` ```python class AgentPlugin(WorkerPlugin): """Temporal worker plugin for a specific Pydantic AI agent.""" def __init__(self, agent: TemporalAgent[Any, Any]): self.agent = agent def init_worker_plugin(self, next: WorkerPlugin) -> None: self.next_worker_plugin = next def configure_worker(self, config: WorkerConfig) -> WorkerConfig: activities: Sequence[Callable[..., Any]] = config.get('activities', []) # pyright: ignore[reportUnknownMemberType] # Activities are checked for name conflicts by Temporal. 
config['activities'] = [*activities, *self.agent.temporal_activities] return self.next_worker_plugin.configure_worker(config) async def run_worker(self, worker: Worker) -> None: await self.next_worker_plugin.run_worker(worker) def configure_replayer(self, config: ReplayerConfig) -> ReplayerConfig: # pragma: no cover return self.next_worker_plugin.configure_replayer(config) def run_replayer( self, replayer: Replayer, histories: AsyncIterator[WorkflowHistory], ) -> AbstractAsyncContextManager[AsyncIterator[WorkflowReplayResult]]: # pragma: no cover return self.next_worker_plugin.run_replayer(replayer, histories) ``` ### DBOSAgent Bases: `WrapperAgent[AgentDepsT, OutputDataT]`, `DBOSConfiguredInstance` Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python @DBOS.dbos_class() class DBOSAgent(WrapperAgent[AgentDepsT, OutputDataT], DBOSConfiguredInstance): def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_step_config: StepConfig | None = None, model_step_config: StepConfig | None = None, ): """Wrap an agent to enable it with DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps. After wrapping, the original agent can still be used as normal outside of the DBOS workflow. Args: wrapped: The agent to wrap. name: Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. mcp_step_config: The base DBOS step config to use for MCP server steps. If no config is provided, use the default settings of DBOS. model_step_config: The DBOS step config to use for model request steps. If no config is provided, use the default settings of DBOS. """ super().__init__(wrapped) self._name = name or wrapped.name self._event_stream_handler = event_stream_handler if self._name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with DBOS. The name will be used to identify the agent's workflows and steps." ) # Merge the config with the default DBOS config self._mcp_step_config = mcp_step_config or {} self._model_step_config = model_step_config or {} if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with DBOS, it cannot be set at agent run time.' 
) dbos_model = DBOSModel( wrapped.model, step_name_prefix=self._name, step_config=self._model_step_config, event_stream_handler=self.event_stream_handler, ) self._model = dbos_model dbosagent_name = self._name def dbosify_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: # Replace MCPServer with DBOSMCPServer try: from pydantic_ai.mcp import MCPServer from ._mcp_server import DBOSMCPServer except ImportError: pass else: if isinstance(toolset, MCPServer): return DBOSMCPServer( wrapped=toolset, step_name_prefix=dbosagent_name, step_config=self._mcp_step_config, ) return toolset dbos_toolsets = [toolset.visit_and_replace(dbosify_toolset) for toolset in wrapped.toolsets] self._toolsets = dbos_toolsets DBOSConfiguredInstance.__init__(self, self._name) # Wrap the `run` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run') async def wrapped_run_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return await super(WrapperAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_workflow = wrapped_run_workflow # Wrap the `run_sync` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run_sync') def wrapped_run_sync_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return super(DBOSAgent, self).run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_sync_workflow = wrapped_run_sync_workflow @property def name(self) -> str | None: return self._name @name.setter def name(self, value: str | None) 
-> None: # pragma: no cover raise UserError( 'The agent name cannot be changed after creation. If you need to change the name, create a new agent.' ) @property def model(self) -> Model: return self._model @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: handler = self._event_stream_handler or super().event_stream_handler if handler is None: return None elif DBOS.workflow_id is not None and DBOS.step_id is None: # Special case if it's in a DBOS workflow but not a step, we need to iterate through all events and call the handler. return self._call_event_stream_handler_in_workflow else: return handler async def _call_event_stream_handler_in_workflow( self, ctx: RunContext[AgentDepsT], stream: AsyncIterable[_messages.AgentStreamEvent] ) -> None: handler = self._event_stream_handler or super().event_stream_handler assert handler is not None async def streamed_response(event: _messages.AgentStreamEvent): yield event async for event in stream: await handler(ctx, streamed_response(event)) @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: with self._dbos_overrides(): return super().toolsets @contextmanager def _dbos_overrides(self) -> Iterator[None]: # Override with DBOSModel and DBOSMCPServer in the toolsets. with ( super().override(model=self._model, toolsets=self._toolsets, tools=[]), self.sequential_tool_calls(), ): yield @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... 
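    # Note: the `run` implementation below does not execute the agent directly; it
    # delegates to `self.dbos_wrapped_run_workflow`, the `{name}.run` DBOS workflow
    # registered in `__init__`, so the whole run is checkpointed and recoverable.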
async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. """ return await self.dbos_wrapped_run_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... 
@overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. 
""" return self.dbos_wrapped_run_sync_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]: ... @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. 
model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if DBOS.workflow_id is not None and DBOS.step_id is None: raise UserError( '`agent.run_stream()` cannot be used inside a DBOS workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) as result: yield result @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]]: ... @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]]: ... 
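    # Note: event streaming is not supported under DBOS, so the implementation below
    # always raises `UserError`; set an `event_stream_handler` on the agent and use
    # `run()` instead to receive events inside a DBOS workflow.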
def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ raise UserError( '`agent.run_stream_events()` cannot be used with DBOS. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, **_deprecated_kwargs: Never, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ if model is not None and not isinstance(model, DBOSModel): raise UserError( 'Non-DBOS model cannot be set at agent run time inside a DBOS workflow, it must be set at agent creation time.' ) with self._dbos_overrides(): async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as run: yield run @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. 
toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if _utils.is_set(model) and not isinstance(model, (DBOSModel)): raise UserError( 'Non-DBOS model cannot be contextually overridden inside a DBOS workflow, it must be set at agent creation time.' ) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ```` #### __init__ ```python __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, mcp_step_config: StepConfig | None = None, model_step_config: StepConfig | None = None ) ``` Wrap an agent to enable it with DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps. After wrapping, the original agent can still be used as normal outside of the DBOS workflow. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `wrapped` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to wrap. | *required* | | `name` | `str | None` | Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's name will be used. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use instead of the one set on the wrapped agent. | `None` | | `mcp_step_config` | `StepConfig | None` | The base DBOS step config to use for MCP server steps. If no config is provided, use the default settings of DBOS. | `None` | | `model_step_config` | `StepConfig | None` | The DBOS step config to use for model request steps. If no config is provided, use the default settings of DBOS. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ```python def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_step_config: StepConfig | None = None, model_step_config: StepConfig | None = None, ): """Wrap an agent to enable it with DBOS durable workflows, by automatically offloading model requests, tool calls, and MCP server communication to DBOS steps. After wrapping, the original agent can still be used as normal outside of the DBOS workflow. Args: wrapped: The agent to wrap. name: Optional unique agent name to use as the DBOS configured instance name. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. mcp_step_config: The base DBOS step config to use for MCP server steps. If no config is provided, use the default settings of DBOS. model_step_config: The DBOS step config to use for model request steps. If no config is provided, use the default settings of DBOS. """ super().__init__(wrapped) self._name = name or wrapped.name self._event_stream_handler = event_stream_handler if self._name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with DBOS. The name will be used to identify the agent's workflows and steps." 
) # Merge the config with the default DBOS config self._mcp_step_config = mcp_step_config or {} self._model_step_config = model_step_config or {} if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with DBOS, it cannot be set at agent run time.' ) dbos_model = DBOSModel( wrapped.model, step_name_prefix=self._name, step_config=self._model_step_config, event_stream_handler=self.event_stream_handler, ) self._model = dbos_model dbosagent_name = self._name def dbosify_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: # Replace MCPServer with DBOSMCPServer try: from pydantic_ai.mcp import MCPServer from ._mcp_server import DBOSMCPServer except ImportError: pass else: if isinstance(toolset, MCPServer): return DBOSMCPServer( wrapped=toolset, step_name_prefix=dbosagent_name, step_config=self._mcp_step_config, ) return toolset dbos_toolsets = [toolset.visit_and_replace(dbosify_toolset) for toolset in wrapped.toolsets] self._toolsets = dbos_toolsets DBOSConfiguredInstance.__init__(self, self._name) # Wrap the `run` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run') async def wrapped_run_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return await super(WrapperAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_workflow = wrapped_run_workflow # Wrap the `run_sync` method in a DBOS workflow @DBOS.workflow(name=f'{self._name}.run_sync') def wrapped_run_sync_workflow( user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: with self._dbos_overrides(): return super(DBOSAgent, self).run_sync( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, 
infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) self.dbos_wrapped_run_sync_workflow = wrapped_run_sync_workflow ``` #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. 
| `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. 
Returns: The result of the run. """ return await self.dbos_wrapped_run_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. 
| `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. 
toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. Returns: The result of the run. """ return self.dbos_wrapped_run_sync_workflow( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. 
| `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. 
model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if DBOS.workflow_id is not None and DBOS.step_id is None: raise UserError( '`agent.run_stream()` cannot be used inside a DBOS workflow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, event_stream_handler=event_stream_handler, **_deprecated_kwargs, ) as result: yield result ```` #### run_stream_events ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[OutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[RunOutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[Any] ] ``` Run the agent with a user prompt in async mode and stream events 
from the run. This is a convenience method that wraps self.run and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for self.run, except that `event_stream_handler` is now allowed. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentStreamEvent | AgentRunResultEvent[Any]]` | An async iterable of stream events AgentStreamEvent and finally a AgentRunResultEvent with the final | | `AsyncIterator[AgentStreamEvent | AgentRunResultEvent[Any]]` | run result. 
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ raise UserError( '`agent.run_stream_events()` cannot be used with DBOS. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, **_deprecated_kwargs: Never ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, **_deprecated_kwargs: Never ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
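While `run_stream()` and `run_stream_events()` raise `UserError` under DBOS and point you at `agent.run()` with an `event_stream_handler`, `iter()` only requires that any model passed at run time be a `DBOSModel` (see the guard in the source below). Here is a minimal, illustrative sketch of driving the node iteration from inside a DBOS workflow; it assumes a `DBOSAgent` wrapper, that DBOS itself has been configured and launched elsewhere, and that your DBOS version supports async workflows. The agent and workflow names are made up:

```python
from dbos import DBOS

from pydantic_ai import Agent
from pydantic_ai.durable_exec.dbos import DBOSAgent

agent = Agent('openai:gpt-4o', name='capital_agent')
dbos_agent = DBOSAgent(agent)


@DBOS.workflow()
async def capital_workflow() -> str:
    # Model requests and tool calls made during the run execute as DBOS steps,
    # so the workflow can resume from the last completed step after a failure.
    async with dbos_agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            print(type(node).__name__)
    return agent_run.result.output
```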
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. 
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" if model is not None and not isinstance(model, DBOSModel): raise UserError( 'Non-DBOS model cannot be set at agent run time inside a DBOS workflow, it must be set at agent creation time.' ) with self._dbos_overrides(): async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as run: yield run ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | Unset` | The name to use instead of the name passed to the agent constructor and agent run. | `UNSET` | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. | `UNSET` | | `instructions` | `Instructions[AgentDepsT] | Unset` | The instructions to use instead of the instructions registered with the agent. | `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_agent.py` ```python @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent name, dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. 
""" if _utils.is_set(model) and not isinstance(model, (DBOSModel)): raise UserError( 'Non-DBOS model cannot be contextually overridden inside a DBOS workflow, it must be set at agent creation time.' ) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions, ): yield ``` ### DBOSMCPServer Bases: `WrapperToolset[AgentDepsT]`, `ABC` A wrapper for MCPServer that integrates with DBOS, turning call_tool and get_tools to DBOS steps. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_mcp_server.py` ```python class DBOSMCPServer(WrapperToolset[AgentDepsT], ABC): """A wrapper for MCPServer that integrates with DBOS, turning call_tool and get_tools to DBOS steps.""" def __init__( self, wrapped: MCPServer, *, step_name_prefix: str, step_config: StepConfig, ): super().__init__(wrapped) self._step_config = step_config or {} self._step_name_prefix = step_name_prefix id_suffix = f'__{wrapped.id}' if wrapped.id else '' self._name = f'{step_name_prefix}__mcp_server{id_suffix}' # Wrap get_tools in a DBOS step. @DBOS.step( name=f'{self._name}.get_tools', **self._step_config, ) async def wrapped_get_tools_step( ctx: RunContext[AgentDepsT], ) -> dict[str, ToolsetTool[AgentDepsT]]: return await super(DBOSMCPServer, self).get_tools(ctx) self._dbos_wrapped_get_tools_step = wrapped_get_tools_step # Wrap call_tool in a DBOS step. @DBOS.step( name=f'{self._name}.call_tool', **self._step_config, ) async def wrapped_call_tool_step( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult: return await super(DBOSMCPServer, self).call_tool(name, tool_args, ctx, tool) self._dbos_wrapped_call_tool_step = wrapped_call_tool_step @property def id(self) -> str | None: return self.wrapped.id async def __aenter__(self) -> Self: # The wrapped MCPServer enters itself around listing and calling tools # so we don't need to enter it here (nor could we because we're not inside a DBOS step). return self async def __aexit__(self, *args: Any) -> bool | None: return None def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: # DBOS-ified toolsets cannot be swapped out after the fact. return self async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return await self._dbos_wrapped_get_tools_step(ctx) async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult: return await self._dbos_wrapped_call_tool_step(name, tool_args, ctx, tool) ``` ### DBOSModel Bases: `WrapperModel` A wrapper for Model that integrates with DBOS, turning request and request_stream to DBOS steps. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_model.py` ```python class DBOSModel(WrapperModel): """A wrapper for Model that integrates with DBOS, turning request and request_stream to DBOS steps.""" def __init__( self, model: Model, *, step_name_prefix: str, step_config: StepConfig, event_stream_handler: EventStreamHandler[Any] | None = None, ): super().__init__(model) self.step_config = step_config self.event_stream_handler = event_stream_handler self._step_name_prefix = step_name_prefix # Wrap the request in a DBOS step. 
@DBOS.step( name=f'{self._step_name_prefix}__model.request', **self.step_config, ) async def wrapped_request_step( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: return await super(DBOSModel, self).request(messages, model_settings, model_request_parameters) self._dbos_wrapped_request_step = wrapped_request_step # Wrap the request_stream in a DBOS step. @DBOS.step( name=f'{self._step_name_prefix}__model.request_stream', **self.step_config, ) async def wrapped_request_stream_step( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> ModelResponse: async with super(DBOSModel, self).request_stream( messages, model_settings, model_request_parameters, run_context ) as streamed_response: if self.event_stream_handler is not None: assert run_context is not None, ( 'A DBOS model cannot be used with `pydantic_ai.direct.model_request_stream()` as it requires a `run_context`. Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) await self.event_stream_handler(run_context, streamed_response) async for _ in streamed_response: pass return streamed_response.get() self._dbos_wrapped_request_stream_step = wrapped_request_stream_step async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: return await self._dbos_wrapped_request_step(messages, model_settings, model_request_parameters) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: # If not in a workflow (could be in a step), just call the wrapped request_stream method. if DBOS.workflow_id is None or DBOS.step_id is not None: async with super().request_stream( messages, model_settings, model_request_parameters, run_context ) as streamed_response: yield streamed_response return response = await self._dbos_wrapped_request_stream_step( messages, model_settings, model_request_parameters, run_context ) yield DBOSStreamedResponse(model_request_parameters, response) ``` ### StepConfig Bases: `TypedDict` Configuration for a step in the DBOS workflow. 
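The config's fields (`retries_allowed`, `interval_seconds`, `max_attempts`, `backoff_rate`, all optional since the TypedDict is declared with `total=False`) correspond to DBOS step retry settings and are unpacked into `@DBOS.step(...)` by the wrappers above. Below is a minimal sketch of building a config and handing it to the `DBOSModel` wrapper whose constructor is shown above; in a real application the config would normally be passed through `DBOSAgent` instead, and the import path and use of `infer_model` here are assumptions for illustration:

```python
from pydantic_ai.durable_exec.dbos import DBOSModel, StepConfig  # import path assumed
from pydantic_ai.models import infer_model

# Every field is optional because StepConfig is a total=False TypedDict.
retry_config: StepConfig = {
    'retries_allowed': True,
    'max_attempts': 5,
    'interval_seconds': 1.0,
    'backoff_rate': 2.0,
}

# Keyword arguments match the DBOSModel constructor documented above;
# the step name prefix is illustrative.
durable_model = DBOSModel(
    infer_model('openai:gpt-4o'),
    step_name_prefix='support_agent',
    step_config=retry_config,
)
```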
Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/dbos/_utils.py` ```python class StepConfig(TypedDict, total=False): """Configuration for a step in the DBOS workflow.""" retries_allowed: bool interval_seconds: float max_attempts: int backoff_rate: float ``` ### PrefectAgent Bases: `WrapperAgent[AgentDepsT, OutputDataT]` Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python class PrefectAgent(WrapperAgent[AgentDepsT, OutputDataT]): def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_task_config: TaskConfig | None = None, model_task_config: TaskConfig | None = None, tool_task_config: TaskConfig | None = None, tool_task_config_by_name: dict[str, TaskConfig | None] | None = None, event_stream_handler_task_config: TaskConfig | None = None, prefectify_toolset_func: Callable[ [AbstractToolset[AgentDepsT], TaskConfig, TaskConfig, dict[str, TaskConfig | None]], AbstractToolset[AgentDepsT], ] = prefectify_toolset, ): """Wrap an agent to enable it with Prefect durable flows, by automatically offloading model requests, tool calls, and MCP server communication to Prefect tasks. After wrapping, the original agent can still be used as normal outside of the Prefect flow. Args: wrapped: The agent to wrap. name: Optional unique agent name to use as the Prefect flow name prefix. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. mcp_task_config: The base Prefect task config to use for MCP server tasks. If no config is provided, use the default settings of Prefect. model_task_config: The Prefect task config to use for model request tasks. If no config is provided, use the default settings of Prefect. tool_task_config: The default Prefect task config to use for tool calls. If no config is provided, use the default settings of Prefect. tool_task_config_by_name: Per-tool task configuration. Keys are tool names, values are TaskConfig or None (None disables task wrapping for that tool). event_stream_handler_task_config: The Prefect task config to use for the event stream handler task. If no config is provided, use the default settings of Prefect. prefectify_toolset_func: Optional function to use to prepare toolsets for Prefect by wrapping them in a `PrefectWrapperToolset` that moves methods that require IO to Prefect tasks. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Prefect. The function takes the toolset, the task config, the tool-specific task config, and the tool-specific task config by name. """ super().__init__(wrapped) self._name = name or wrapped.name self._event_stream_handler = event_stream_handler if self._name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with Prefect. The name will be used to identify the agent's flows and tasks." 
) # Merge the config with the default Prefect config self._mcp_task_config = default_task_config | (mcp_task_config or {}) self._model_task_config = default_task_config | (model_task_config or {}) self._tool_task_config = default_task_config | (tool_task_config or {}) self._tool_task_config_by_name = tool_task_config_by_name or {} self._event_stream_handler_task_config = default_task_config | (event_stream_handler_task_config or {}) if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with Prefect, it cannot be set at agent run time.' ) prefect_model = PrefectModel( wrapped.model, task_config=self._model_task_config, event_stream_handler=self.event_stream_handler, ) self._model = prefect_model def _prefectify_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: """Convert a toolset to its Prefect equivalent.""" return prefectify_toolset_func( toolset, self._mcp_task_config, self._tool_task_config, self._tool_task_config_by_name, ) prefect_toolsets = [toolset.visit_and_replace(_prefectify_toolset) for toolset in wrapped.toolsets] self._toolsets = prefect_toolsets # Context variable to track when we're inside this agent's Prefect flow self._in_prefect_agent_flow: ContextVar[bool] = ContextVar( f'_in_prefect_agent_flow_{self._name}', default=False ) @property def name(self) -> str | None: return self._name @name.setter def name(self, value: str | None) -> None: # pragma: no cover raise UserError( 'The agent name cannot be changed after creation. If you need to change the name, create a new agent.' ) @property def model(self) -> Model: return self._model @property def event_stream_handler(self) -> EventStreamHandler[AgentDepsT] | None: handler = self._event_stream_handler or super().event_stream_handler if handler is None: return None elif FlowRunContext.get() is not None: # Special case if it's in a Prefect flow, we need to iterate through all events and call the handler. return self._call_event_stream_handler_in_flow else: return handler async def _call_event_stream_handler_in_flow( self, ctx: RunContext[AgentDepsT], stream: AsyncIterable[_messages.AgentStreamEvent] ) -> None: handler = self._event_stream_handler or super().event_stream_handler assert handler is not None # Create a task to handle each event @task(name='Handle Stream Event', **self._event_stream_handler_task_config) async def event_stream_handler_task(event: _messages.AgentStreamEvent) -> None: async def streamed_response(): yield event await handler(ctx, streamed_response()) async for event in stream: await event_stream_handler_task(event) @property def toolsets(self) -> Sequence[AbstractToolset[AgentDepsT]]: with self._prefect_overrides(): return super().toolsets @contextmanager def _prefect_overrides(self) -> Iterator[None]: # Override with PrefectModel and PrefectMCPServer in the toolsets. 
with super().override(model=self._model, toolsets=self._toolsets, tools=[]): yield @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. 
infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ @flow(name=f'{self._name} Run') async def wrapped_run_flow() -> AgentRunResult[Any]: # Mark that we're inside a PrefectAgent flow token = self._in_prefect_agent_flow.set(True) try: with self._prefect_overrides(): result = await super(WrapperAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, ) return result finally: self._in_prefect_agent_flow.reset(token) return await wrapped_run_flow() @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[OutputDataT]: ... @overload def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AgentRunResult[RunOutputDataT]: ... def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ @flow(name=f'{self._name} Sync Run') def wrapped_run_sync_flow() -> AgentRunResult[Any]: # Mark that we're inside a PrefectAgent flow token = self._in_prefect_agent_flow.set(True) try: with self._prefect_overrides(): # Using `run_coro_as_sync` from Prefect with async `run` to avoid event loop conflicts. result = run_coro_as_sync( super(PrefectAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, ) ) return result finally: self._in_prefect_agent_flow.reset(token) return wrapped_run_sync_flow() @overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, OutputDataT]]: ... 
@overload def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, ) -> AbstractAsyncContextManager[StreamedRunResult[AgentDepsT, RunOutputDataT]]: ... @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if FlowRunContext.get() is not None: raise UserError( '`agent.run_stream()` cannot be used inside a Prefect flow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as result: yield result @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[OutputDataT]]: ... @overload def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[RunOutputDataT]]: ... def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. 
') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ if FlowRunContext.get() is not None: raise UserError( '`agent.run_stream_events()` cannot be used inside a Prefect flow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) return super().run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, OutputDataT]]: ... @overload def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AbstractAsyncContextManager[AgentRun[AgentDepsT, RunOutputDataT]]: ... 
@asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" if model is not None and not isinstance(model, PrefectModel): raise UserError( 'Non-Prefect model cannot be set at agent run time inside a Prefect flow, it must be set at agent creation time.' ) with self._prefect_overrides(): async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, ) as run: yield run @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. """ if _utils.is_set(model) and not isinstance(model, PrefectModel): raise UserError( 'Non-Prefect model cannot be contextually overridden inside a Prefect flow, it must be set at agent creation time.' ) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions ): yield ```` #### __init__ ```python __init__( wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, mcp_task_config: TaskConfig | None = None, model_task_config: TaskConfig | None = None, tool_task_config: TaskConfig | None = None, tool_task_config_by_name: ( dict[str, TaskConfig | None] | None ) = None, event_stream_handler_task_config: ( TaskConfig | None ) = None, prefectify_toolset_func: Callable[ [ AbstractToolset[AgentDepsT], TaskConfig, TaskConfig, dict[str, TaskConfig | None], ], AbstractToolset[AgentDepsT], ] = prefectify_toolset ) ``` Wrap an agent to enable it with Prefect durable flows, by automatically offloading model requests, tool calls, and MCP server communication to Prefect tasks. After wrapping, the original agent can still be used as normal outside of the Prefect flow. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `wrapped` | `AbstractAgent[AgentDepsT, OutputDataT]` | The agent to wrap. | *required* | | `name` | `str | None` | Optional unique agent name to use as the Prefect flow name prefix. If not provided, the agent's name will be used. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use instead of the one set on the wrapped agent. 
| `None` | | `mcp_task_config` | `TaskConfig | None` | The base Prefect task config to use for MCP server tasks. If no config is provided, use the default settings of Prefect. | `None` | | `model_task_config` | `TaskConfig | None` | The Prefect task config to use for model request tasks. If no config is provided, use the default settings of Prefect. | `None` | | `tool_task_config` | `TaskConfig | None` | The default Prefect task config to use for tool calls. If no config is provided, use the default settings of Prefect. | `None` | | `tool_task_config_by_name` | `dict[str, TaskConfig | None] | None` | Per-tool task configuration. Keys are tool names, values are TaskConfig or None (None disables task wrapping for that tool). | `None` | | `event_stream_handler_task_config` | `TaskConfig | None` | The Prefect task config to use for the event stream handler task. If no config is provided, use the default settings of Prefect. | `None` | | `prefectify_toolset_func` | `Callable[[AbstractToolset[AgentDepsT], TaskConfig, TaskConfig, dict[str, TaskConfig | None]], AbstractToolset[AgentDepsT]]` | Optional function to use to prepare toolsets for Prefect by wrapping them in a PrefectWrapperToolset that moves methods that require IO to Prefect tasks. If not provided, only FunctionToolset and MCPServer will be prepared for Prefect. The function takes the toolset, the task config, the tool-specific task config, and the tool-specific task config by name. | `prefectify_toolset` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ```python def __init__( self, wrapped: AbstractAgent[AgentDepsT, OutputDataT], *, name: str | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, mcp_task_config: TaskConfig | None = None, model_task_config: TaskConfig | None = None, tool_task_config: TaskConfig | None = None, tool_task_config_by_name: dict[str, TaskConfig | None] | None = None, event_stream_handler_task_config: TaskConfig | None = None, prefectify_toolset_func: Callable[ [AbstractToolset[AgentDepsT], TaskConfig, TaskConfig, dict[str, TaskConfig | None]], AbstractToolset[AgentDepsT], ] = prefectify_toolset, ): """Wrap an agent to enable it with Prefect durable flows, by automatically offloading model requests, tool calls, and MCP server communication to Prefect tasks. After wrapping, the original agent can still be used as normal outside of the Prefect flow. Args: wrapped: The agent to wrap. name: Optional unique agent name to use as the Prefect flow name prefix. If not provided, the agent's `name` will be used. event_stream_handler: Optional event stream handler to use instead of the one set on the wrapped agent. mcp_task_config: The base Prefect task config to use for MCP server tasks. If no config is provided, use the default settings of Prefect. model_task_config: The Prefect task config to use for model request tasks. If no config is provided, use the default settings of Prefect. tool_task_config: The default Prefect task config to use for tool calls. If no config is provided, use the default settings of Prefect. tool_task_config_by_name: Per-tool task configuration. Keys are tool names, values are TaskConfig or None (None disables task wrapping for that tool). event_stream_handler_task_config: The Prefect task config to use for the event stream handler task. If no config is provided, use the default settings of Prefect. 
prefectify_toolset_func: Optional function to use to prepare toolsets for Prefect by wrapping them in a `PrefectWrapperToolset` that moves methods that require IO to Prefect tasks. If not provided, only `FunctionToolset` and `MCPServer` will be prepared for Prefect. The function takes the toolset, the task config, the tool-specific task config, and the tool-specific task config by name. """ super().__init__(wrapped) self._name = name or wrapped.name self._event_stream_handler = event_stream_handler if self._name is None: raise UserError( "An agent needs to have a unique `name` in order to be used with Prefect. The name will be used to identify the agent's flows and tasks." ) # Merge the config with the default Prefect config self._mcp_task_config = default_task_config | (mcp_task_config or {}) self._model_task_config = default_task_config | (model_task_config or {}) self._tool_task_config = default_task_config | (tool_task_config or {}) self._tool_task_config_by_name = tool_task_config_by_name or {} self._event_stream_handler_task_config = default_task_config | (event_stream_handler_task_config or {}) if not isinstance(wrapped.model, Model): raise UserError( 'An agent needs to have a `model` in order to be used with Prefect, it cannot be set at agent run time.' ) prefect_model = PrefectModel( wrapped.model, task_config=self._model_task_config, event_stream_handler=self.event_stream_handler, ) self._model = prefect_model def _prefectify_toolset(toolset: AbstractToolset[AgentDepsT]) -> AbstractToolset[AgentDepsT]: """Convert a toolset to its Prefect equivalent.""" return prefectify_toolset_func( toolset, self._mcp_task_config, self._tool_task_config, self._tool_task_config_by_name, ) prefect_toolsets = [toolset.visit_and_replace(_prefectify_toolset) for toolset in wrapped.toolsets] self._toolsets = prefect_toolsets # Context variable to track when we're inside this agent's Prefect flow self._in_prefect_agent_flow: ContextVar[bool] = ContextVar( f'_in_prefect_agent_flow_{self._name}', default=False ) ``` #### run ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, 
message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. 
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python async def run( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Run the agent with a user prompt in async mode. This method builds an internal agent graph (using system prompts, tools and result schemas) and then runs the graph to completion. The result of the run is returned. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): agent_run = await agent.run('What is the capital of France?') print(agent_run.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" @flow(name=f'{self._name} Run') async def wrapped_run_flow() -> AgentRunResult[Any]: # Mark that we're inside a PrefectAgent flow token = self._in_prefect_agent_flow.set(True) try: with self._prefect_overrides(): result = await super(WrapperAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, ) return result finally: self._in_prefect_agent_flow.reset(token) return await wrapped_run_flow() ```` #### run_sync ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[OutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AgentRunResult[RunOutputDataT] ``` ```python run_sync( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> AgentRunResult[Any] ``` Synchronously run the agent with a user prompt. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. 
| `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AgentRunResult[Any]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python def run_sync( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AgentRunResult[Any]: """Synchronously run the agent with a user prompt. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') result_sync = agent.run_sync('What is the capital of Italy?') print(result_sync.output) #> The capital of Italy is Rome. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. 
deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. event_stream_handler: Optional event stream handler to use for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. """ @flow(name=f'{self._name} Sync Run') def wrapped_run_sync_flow() -> AgentRunResult[Any]: # Mark that we're inside a PrefectAgent flow token = self._in_prefect_agent_flow.set(True) try: with self._prefect_overrides(): # Using `run_coro_as_sync` from Prefect with async `run` to avoid event loop conflicts. result = run_coro_as_sync( super(PrefectAgent, self).run( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, ) ) return result finally: self._in_prefect_agent_flow.reset(token) return wrapped_run_sync_flow() ```` #### run_stream ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, OutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None ) -> AbstractAsyncContextManager[ StreamedRunResult[AgentDepsT, RunOutputDataT] ] ``` ```python run_stream( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None, event_stream_handler: ( EventStreamHandler[AgentDepsT] | None ) = None, **_deprecated_kwargs: Never ) -> 
AsyncIterator[StreamedRunResult[AgentDepsT, Any]] ``` Run the agent with a user prompt in async mode, returning a streamed response. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | | `event_stream_handler` | `EventStreamHandler[AgentDepsT] | None` | Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[StreamedRunResult[AgentDepsT, Any]]` | The result of the run. | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python @asynccontextmanager async def run_stream( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, event_stream_handler: EventStreamHandler[AgentDepsT] | None = None, **_deprecated_kwargs: Never, ) -> AsyncIterator[StreamedRunResult[AgentDepsT, Any]]: """Run the agent with a user prompt in async mode, returning a streamed response. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): async with agent.run_stream('What is the capital of the UK?') as response: print(await response.get_output()) #> The capital of the UK is London. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. event_stream_handler: Optional event stream handler to use for this run. It will receive all the events up until the final result is found, which you can then read or stream from inside the context manager. Returns: The result of the run. """ if FlowRunContext.get() is not None: raise UserError( '`agent.run_stream()` cannot be used inside a Prefect flow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' ) async with super().run_stream( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, event_stream_handler=event_stream_handler, builtin_tools=builtin_tools, **_deprecated_kwargs, ) as result: yield result ```` #### run_stream_events ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[OutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[RunOutputDataT] ] ``` ```python run_stream_events( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: 
OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[ AgentStreamEvent | AgentRunResultEvent[Any] ] ``` Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps self.run and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for self.run, except that `event_stream_handler` is now allowed. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentStreamEvent | AgentRunResultEvent[Any]]` | An async iterable of stream events AgentStreamEvent and finally a AgentRunResultEvent with the final | | `AsyncIterator[AgentStreamEvent | AgentRunResultEvent[Any]]` | run result. 
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python def run_stream_events( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[_messages.AgentStreamEvent | AgentRunResultEvent[Any]]: """Run the agent with a user prompt in async mode and stream events from the run. This is a convenience method that wraps [`self.run`][pydantic_ai.agent.AbstractAgent.run] and uses the `event_stream_handler` kwarg to get a stream of events from the run. Example: ```python from pydantic_ai import Agent, AgentRunResultEvent, AgentStreamEvent agent = Agent('openai:gpt-4o') async def main(): events: list[AgentStreamEvent | AgentRunResultEvent] = [] async for event in agent.run_stream_events('What is the capital of France?'): events.append(event) print(events) ''' [ PartStartEvent(index=0, part=TextPart(content='The capital of ')), FinalResultEvent(tool_name=None, tool_call_id=None), PartDeltaEvent(index=0, delta=TextPartDelta(content_delta='France is Paris. ')), AgentRunResultEvent( result=AgentRunResult(output='The capital of France is Paris. ') ), ] ''' ``` Arguments are the same as for [`self.run`][pydantic_ai.agent.AbstractAgent.run], except that `event_stream_handler` is now allowed. Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: An async iterable of stream events `AgentStreamEvent` and finally a `AgentRunResultEvent` with the final run result. """ if FlowRunContext.get() is not None: raise UserError( '`agent.run_stream_events()` cannot be used inside a Prefect flow. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) return super().run_stream_events( user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, builtin_tools=builtin_tools, ) ```` #### iter ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, OutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT], message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AbstractAsyncContextManager[ AgentRun[AgentDepsT, RunOutputDataT] ] ``` ```python iter( user_prompt: str | Sequence[UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[ModelMessage] | None = None, deferred_tool_results: ( DeferredToolResults | None ) = None, model: Model | KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: UsageLimits | None = None, usage: RunUsage | None = None, infer_name: bool = True, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | None ) = None, builtin_tools: ( Sequence[AbstractBuiltinTool] | None ) = None ) -> AsyncIterator[AgentRun[AgentDepsT, Any]] ``` A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. 
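When the agent is wrapped in `PrefectAgent`, `iter` can also be used inside an existing Prefect flow: model requests made while iterating are still offloaded to Prefect tasks, and passing a non-Prefect `model` at run time raises `UserError`. A minimal sketch, assuming `PrefectAgent` is importable from `pydantic_ai.durable_exec.prefect`:

```python
from prefect import flow

from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent

agent = Agent('openai:gpt-4o', name='capital_agent')
prefect_agent = PrefectAgent(agent)


@flow
async def capital_flow() -> str:
    async with prefect_agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            # Each model request node still runs as a Prefect task via PrefectModel.
            print(type(node).__name__)
        assert agent_run.result is not None
        return agent_run.result.output
```

The generic example below shows the nodes yielded during such an iteration.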
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `user_prompt` | `str | Sequence[UserContent] | None` | User input to start/continue the conversation. | `None` | | `output_type` | `OutputSpec[RunOutputDataT] | None` | Custom output type to use for this run, output_type may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. | `None` | | `message_history` | `Sequence[ModelMessage] | None` | History of the conversation so far. | `None` | | `deferred_tool_results` | `DeferredToolResults | None` | Optional results for deferred tool calls in the message history. | `None` | | `model` | `Model | KnownModelName | str | None` | Optional model to use for this run, required if model was not set when creating the agent. | `None` | | `deps` | `AgentDepsT` | Optional dependencies to use for this run. | `None` | | `model_settings` | `ModelSettings | None` | Optional settings to use for this model's request. | `None` | | `usage_limits` | `UsageLimits | None` | Optional limits on model request count or token usage. | `None` | | `usage` | `RunUsage | None` | Optional usage to start with, useful for resuming a conversation or agents used in tools. | `None` | | `infer_name` | `bool` | Whether to try to infer the agent name from the call frame if it's not set. | `True` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | None` | Optional additional toolsets for this run. | `None` | | `builtin_tools` | `Sequence[AbstractBuiltinTool] | None` | Optional additional builtin tools for this run. | `None` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[AgentRun[AgentDepsT, Any]]` | The result of the run. 
| Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ````python @asynccontextmanager async def iter( self, user_prompt: str | Sequence[_messages.UserContent] | None = None, *, output_type: OutputSpec[RunOutputDataT] | None = None, message_history: Sequence[_messages.ModelMessage] | None = None, deferred_tool_results: DeferredToolResults | None = None, model: models.Model | models.KnownModelName | str | None = None, deps: AgentDepsT = None, model_settings: ModelSettings | None = None, usage_limits: _usage.UsageLimits | None = None, usage: _usage.RunUsage | None = None, infer_name: bool = True, toolsets: Sequence[AbstractToolset[AgentDepsT]] | None = None, builtin_tools: Sequence[AbstractBuiltinTool] | None = None, ) -> AsyncIterator[AgentRun[AgentDepsT, Any]]: """A contextmanager which can be used to iterate over the agent graph's nodes as they are executed. This method builds an internal agent graph (using system prompts, tools and output schemas) and then returns an `AgentRun` object. The `AgentRun` can be used to async-iterate over the nodes of the graph as they are executed. This is the API to use if you want to consume the outputs coming from each LLM model response, or the stream of events coming from the execution of tools. The `AgentRun` also provides methods to access the full message history, new messages, and usage statistics, and the final result of the run once it has completed. For more details, see the documentation of `AgentRun`. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` Args: user_prompt: User input to start/continue the conversation. output_type: Custom output type to use for this run, `output_type` may only be used if the agent has no output validators since output validators would expect an argument that matches the agent's output type. message_history: History of the conversation so far. deferred_tool_results: Optional results for deferred tool calls in the message history. model: Optional model to use for this run, required if `model` was not set when creating the agent. deps: Optional dependencies to use for this run. model_settings: Optional settings to use for this model's request. usage_limits: Optional limits on model request count or token usage. usage: Optional usage to start with, useful for resuming a conversation or agents used in tools. infer_name: Whether to try to infer the agent name from the call frame if it's not set. toolsets: Optional additional toolsets for this run. builtin_tools: Optional additional builtin tools for this run. Returns: The result of the run. 
""" if model is not None and not isinstance(model, PrefectModel): raise UserError( 'Non-Prefect model cannot be set at agent run time inside a Prefect flow, it must be set at agent creation time.' ) with self._prefect_overrides(): async with super().iter( user_prompt=user_prompt, output_type=output_type, message_history=message_history, deferred_tool_results=deferred_tool_results, model=model, deps=deps, model_settings=model_settings, usage_limits=usage_limits, usage=usage, infer_name=infer_name, toolsets=toolsets, ) as run: yield run ```` #### override ```python override( *, name: str | Unset = UNSET, deps: AgentDepsT | Unset = UNSET, model: Model | KnownModelName | str | Unset = UNSET, toolsets: ( Sequence[AbstractToolset[AgentDepsT]] | Unset ) = UNSET, tools: ( Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] ] | Unset ) = UNSET, instructions: Instructions[AgentDepsT] | Unset = UNSET ) -> Iterator[None] ``` Context manager to temporarily override agent dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../../testing/#overriding-model-via-pytest-fixtures). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | Unset` | The name to use instead of the name passed to the agent constructor and agent run. | `UNSET` | | `deps` | `AgentDepsT | Unset` | The dependencies to use instead of the dependencies passed to the agent run. | `UNSET` | | `model` | `Model | KnownModelName | str | Unset` | The model to use instead of the model passed to the agent run. | `UNSET` | | `toolsets` | `Sequence[AbstractToolset[AgentDepsT]] | Unset` | The toolsets to use instead of the toolsets passed to the agent constructor and agent run. | `UNSET` | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | Unset` | The tools to use instead of the tools registered with the agent. | `UNSET` | | `instructions` | `Instructions[AgentDepsT] | Unset` | The instructions to use instead of the instructions registered with the agent. | `UNSET` | Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_agent.py` ```python @contextmanager def override( self, *, name: str | _utils.Unset = _utils.UNSET, deps: AgentDepsT | _utils.Unset = _utils.UNSET, model: models.Model | models.KnownModelName | str | _utils.Unset = _utils.UNSET, toolsets: Sequence[AbstractToolset[AgentDepsT]] | _utils.Unset = _utils.UNSET, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] | _utils.Unset = _utils.UNSET, instructions: Instructions[AgentDepsT] | _utils.Unset = _utils.UNSET, ) -> Iterator[None]: """Context manager to temporarily override agent dependencies, model, toolsets, tools, or instructions. This is particularly useful when testing. You can find an example of this [here](../testing.md#overriding-model-via-pytest-fixtures). Args: name: The name to use instead of the name passed to the agent constructor and agent run. deps: The dependencies to use instead of the dependencies passed to the agent run. model: The model to use instead of the model passed to the agent run. toolsets: The toolsets to use instead of the toolsets passed to the agent constructor and agent run. tools: The tools to use instead of the tools registered with the agent. instructions: The instructions to use instead of the instructions registered with the agent. 
""" if _utils.is_set(model) and not isinstance(model, PrefectModel): raise UserError( 'Non-Prefect model cannot be contextually overridden inside a Prefect flow, it must be set at agent creation time.' ) with super().override( name=name, deps=deps, model=model, toolsets=toolsets, tools=tools, instructions=instructions ): yield ``` ### PrefectFunctionToolset Bases: `PrefectWrapperToolset[AgentDepsT]` A wrapper for FunctionToolset that integrates with Prefect, turning tool calls into Prefect tasks. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_function_toolset.py` ```python class PrefectFunctionToolset(PrefectWrapperToolset[AgentDepsT]): """A wrapper for FunctionToolset that integrates with Prefect, turning tool calls into Prefect tasks.""" def __init__( self, wrapped: FunctionToolset[AgentDepsT], *, task_config: TaskConfig, tool_task_config: dict[str, TaskConfig | None], ): super().__init__(wrapped) self._task_config = default_task_config | (task_config or {}) self._tool_task_config = tool_task_config or {} @task async def _call_tool_task( tool_name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any: return await super(PrefectFunctionToolset, self).call_tool(tool_name, tool_args, ctx, tool) self._call_tool_task = _call_tool_task async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any: """Call a tool, wrapped as a Prefect task with a descriptive name.""" # Check if this specific tool has custom config or is disabled tool_specific_config = self._tool_task_config.get(name, default_task_config) if tool_specific_config is None: # None means this tool should not be wrapped as a task return await super().call_tool(name, tool_args, ctx, tool) # Merge tool-specific config with default config merged_config = self._task_config | tool_specific_config return await self._call_tool_task.with_options(name=f'Call Tool: {name}', **merged_config)( name, tool_args, ctx, tool ) ``` #### call_tool ```python call_tool( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any ``` Call a tool, wrapped as a Prefect task with a descriptive name. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_function_toolset.py` ```python async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any: """Call a tool, wrapped as a Prefect task with a descriptive name.""" # Check if this specific tool has custom config or is disabled tool_specific_config = self._tool_task_config.get(name, default_task_config) if tool_specific_config is None: # None means this tool should not be wrapped as a task return await super().call_tool(name, tool_args, ctx, tool) # Merge tool-specific config with default config merged_config = self._task_config | tool_specific_config return await self._call_tool_task.with_options(name=f'Call Tool: {name}', **merged_config)( name, tool_args, ctx, tool ) ``` ### PrefectMCPServer Bases: `PrefectWrapperToolset[AgentDepsT]`, `ABC` A wrapper for MCPServer that integrates with Prefect, turning call_tool and get_tools into Prefect tasks. 
Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_mcp_server.py` ```python class PrefectMCPServer(PrefectWrapperToolset[AgentDepsT], ABC): """A wrapper for MCPServer that integrates with Prefect, turning call_tool and get_tools into Prefect tasks.""" def __init__( self, wrapped: MCPServer, *, task_config: TaskConfig, ): super().__init__(wrapped) self._task_config = default_task_config | (task_config or {}) self._mcp_id = wrapped.id @task async def _call_tool_task( tool_name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult: return await super(PrefectMCPServer, self).call_tool(tool_name, tool_args, ctx, tool) self._call_tool_task = _call_tool_task async def __aenter__(self) -> Self: await self.wrapped.__aenter__() return self async def __aexit__(self, *args: Any) -> bool | None: return await self.wrapped.__aexit__(*args) async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult: """Call an MCP tool, wrapped as a Prefect task with a descriptive name.""" return await self._call_tool_task.with_options(name=f'Call MCP Tool: {name}', **self._task_config)( name, tool_args, ctx, tool ) ``` #### call_tool ```python call_tool( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult ``` Call an MCP tool, wrapped as a Prefect task with a descriptive name. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_mcp_server.py` ```python async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> ToolResult: """Call an MCP tool, wrapped as a Prefect task with a descriptive name.""" return await self._call_tool_task.with_options(name=f'Call MCP Tool: {name}', **self._task_config)( name, tool_args, ctx, tool ) ``` ### PrefectModel Bases: `WrapperModel` A wrapper for Model that integrates with Prefect, turning request and request_stream into Prefect tasks. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_model.py` ```python class PrefectModel(WrapperModel): """A wrapper for Model that integrates with Prefect, turning request and request_stream into Prefect tasks.""" def __init__( self, model: Any, *, task_config: TaskConfig, event_stream_handler: EventStreamHandler[Any] | None = None, ): super().__init__(model) self.task_config = default_task_config | (task_config or {}) self.event_stream_handler = event_stream_handler @task async def wrapped_request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: response = await super(PrefectModel, self).request(messages, model_settings, model_request_parameters) return response self._wrapped_request = wrapped_request @task async def request_stream_task( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ctx: RunContext[Any] | None, ) -> ModelResponse: async with super(PrefectModel, self).request_stream( messages, model_settings, model_request_parameters, ctx ) as streamed_response: if self.event_stream_handler is not None: assert ctx is not None, ( 'A Prefect model cannot be used with `pydantic_ai.direct.model_request_stream()` as it requires a `run_context`. ' 'Set an `event_stream_handler` on the agent and use `agent.run()` instead.' 
) await self.event_stream_handler(ctx, streamed_response) # Consume the entire stream async for _ in streamed_response: pass response = streamed_response.get() return response self._wrapped_request_stream = request_stream_task async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a model request, wrapped as a Prefect task when in a flow.""" return await self._wrapped_request.with_options( name=f'Model Request: {self.wrapped.model_name}', **self.task_config )(messages, model_settings, model_request_parameters) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming model request. When inside a Prefect flow, the stream is consumed within a task and a non-streaming response is returned. When not in a flow, behaves normally. """ # Check if we're in a flow context flow_run_context = FlowRunContext.get() # If not in a flow, just call the wrapped request_stream method if flow_run_context is None: async with super().request_stream( messages, model_settings, model_request_parameters, run_context ) as streamed_response: yield streamed_response return # If in a flow, consume the stream in a task and return the final response response = await self._wrapped_request_stream.with_options( name=f'Model Request (Streaming): {self.wrapped.model_name}', **self.task_config )(messages, model_settings, model_request_parameters, run_context) yield PrefectStreamedResponse(model_request_parameters, response) ``` #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a model request, wrapped as a Prefect task when in a flow. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_model.py` ```python async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a model request, wrapped as a Prefect task when in a flow.""" return await self._wrapped_request.with_options( name=f'Model Request: {self.wrapped.model_name}', **self.task_config )(messages, model_settings, model_request_parameters) ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming model request. When inside a Prefect flow, the stream is consumed within a task and a non-streaming response is returned. When not in a flow, behaves normally. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_model.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming model request. When inside a Prefect flow, the stream is consumed within a task and a non-streaming response is returned. When not in a flow, behaves normally. 
""" # Check if we're in a flow context flow_run_context = FlowRunContext.get() # If not in a flow, just call the wrapped request_stream method if flow_run_context is None: async with super().request_stream( messages, model_settings, model_request_parameters, run_context ) as streamed_response: yield streamed_response return # If in a flow, consume the stream in a task and return the final response response = await self._wrapped_request_stream.with_options( name=f'Model Request (Streaming): {self.wrapped.model_name}', **self.task_config )(messages, model_settings, model_request_parameters, run_context) yield PrefectStreamedResponse(model_request_parameters, response) ``` ### TaskConfig Bases: `TypedDict` Configuration for a task in Prefect. These options are passed to the `@task` decorator. Source code in `pydantic_ai_slim/pydantic_ai/durable_exec/prefect/_types.py` ```python class TaskConfig(TypedDict, total=False): """Configuration for a task in Prefect. These options are passed to the `@task` decorator. """ retries: int """Maximum number of retries for the task.""" retry_delay_seconds: float | list[float] """Delay between retries in seconds. Can be a single value or a list for custom backoff.""" timeout_seconds: float """Maximum time in seconds for the task to complete.""" cache_policy: CachePolicy """Prefect cache policy for the task.""" persist_result: bool """Whether to persist the task result.""" result_storage: ResultStorage """Prefect result storage for the task. Should be a storage block or a block slug like `s3-bucket/my-storage`.""" log_prints: bool """Whether to log print statements from the task.""" ``` #### retries ```python retries: int ``` Maximum number of retries for the task. #### retry_delay_seconds ```python retry_delay_seconds: float | list[float] ``` Delay between retries in seconds. Can be a single value or a list for custom backoff. #### timeout_seconds ```python timeout_seconds: float ``` Maximum time in seconds for the task to complete. #### cache_policy ```python cache_policy: CachePolicy ``` Prefect cache policy for the task. #### persist_result ```python persist_result: bool ``` Whether to persist the task result. #### result_storage ```python result_storage: ResultStorage ``` Prefect result storage for the task. Should be a storage block or a block slug like `s3-bucket/my-storage`. #### log_prints ```python log_prints: bool ``` Whether to log print statements from the task. # `pydantic_ai.exceptions` ### ModelRetry Bases: `Exception` Exception to raise when a tool function should be retried. The agent will return the message to the model and ask it to try calling the function/tool again. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ModelRetry(Exception): """Exception to raise when a tool function should be retried. The agent will return the message to the model and ask it to try calling the function/tool again. 
""" message: str """The message to return to the model.""" def __init__(self, message: str): self.message = message super().__init__(message) def __eq__(self, other: Any) -> bool: return isinstance(other, self.__class__) and other.message == self.message @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> core_schema.CoreSchema: """Pydantic core schema to allow `ModelRetry` to be (de)serialized.""" schema = core_schema.typed_dict_schema( { 'message': core_schema.typed_dict_field(core_schema.str_schema()), 'kind': core_schema.typed_dict_field(core_schema.literal_schema(['model-retry'])), } ) return core_schema.no_info_after_validator_function( lambda dct: ModelRetry(dct['message']), schema, serialization=core_schema.plain_serializer_function_ser_schema( lambda x: {'message': x.message, 'kind': 'model-retry'}, return_schema=schema, ), ) ``` #### message ```python message: str = message ``` The message to return to the model. #### __get_pydantic_core_schema__ ```python __get_pydantic_core_schema__(_: Any, __: Any) -> CoreSchema ``` Pydantic core schema to allow `ModelRetry` to be (de)serialized. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> core_schema.CoreSchema: """Pydantic core schema to allow `ModelRetry` to be (de)serialized.""" schema = core_schema.typed_dict_schema( { 'message': core_schema.typed_dict_field(core_schema.str_schema()), 'kind': core_schema.typed_dict_field(core_schema.literal_schema(['model-retry'])), } ) return core_schema.no_info_after_validator_function( lambda dct: ModelRetry(dct['message']), schema, serialization=core_schema.plain_serializer_function_ser_schema( lambda x: {'message': x.message, 'kind': 'model-retry'}, return_schema=schema, ), ) ``` ### CallDeferred Bases: `Exception` Exception to raise when a tool call should be deferred. See [tools docs](../../deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class CallDeferred(Exception): """Exception to raise when a tool call should be deferred. See [tools docs](../deferred-tools.md#deferred-tools) for more information. """ pass ``` ### ApprovalRequired Bases: `Exception` Exception to raise when a tool call requires human-in-the-loop approval. See [tools docs](../../deferred-tools/#human-in-the-loop-tool-approval) for more information. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ApprovalRequired(Exception): """Exception to raise when a tool call requires human-in-the-loop approval. See [tools docs](../deferred-tools.md#human-in-the-loop-tool-approval) for more information. """ pass ``` ### UserError Bases: `RuntimeError` Error caused by a usage mistake by the application developer — You! Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class UserError(RuntimeError): """Error caused by a usage mistake by the application developer — You!""" message: str """Description of the mistake.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` Description of the mistake. ### AgentRunError Bases: `RuntimeError` Base class for errors occurring during an agent run. 
Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class AgentRunError(RuntimeError): """Base class for errors occurring during an agent run.""" message: str """The error message.""" def __init__(self, message: str): self.message = message super().__init__(message) def __str__(self) -> str: return self.message ``` #### message ```python message: str = message ``` The error message. ### UsageLimitExceeded Bases: `AgentRunError` Error raised when a Model's usage exceeds the specified limits. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class UsageLimitExceeded(AgentRunError): """Error raised when a Model's usage exceeds the specified limits.""" ``` ### UnexpectedModelBehavior Bases: `AgentRunError` Error caused by unexpected Model behavior, e.g. an unexpected response code. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class UnexpectedModelBehavior(AgentRunError): """Error caused by unexpected Model behavior, e.g. an unexpected response code.""" message: str """Description of the unexpected behavior.""" body: str | None """The body of the response, if available.""" def __init__(self, message: str, body: str | None = None): self.message = message if body is None: self.body: str | None = None else: try: self.body = json.dumps(json.loads(body), indent=2) except ValueError: self.body = body super().__init__(message) def __str__(self) -> str: if self.body: return f'{self.message}, body:\n{self.body}' else: return self.message ``` #### message ```python message: str = message ``` Description of the unexpected behavior. #### body ```python body: str | None = dumps(loads(body), indent=2) ``` The body of the response, if available. ### ModelHTTPError Bases: `AgentRunError` Raised when a model provider response has a status code of 4xx or 5xx. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class ModelHTTPError(AgentRunError): """Raised when a model provider response has a status code of 4xx or 5xx.""" status_code: int """The HTTP status code returned by the API.""" model_name: str """The name of the model associated with the error.""" body: object | None """The body of the response, if available.""" message: str """The error message with the status code and response body, if available.""" def __init__(self, status_code: int, model_name: str, body: object | None = None): self.status_code = status_code self.model_name = model_name self.body = body message = f'status_code: {status_code}, model_name: {model_name}, body: {body}' super().__init__(message) ``` #### message ```python message: str ``` The error message with the status code and response body, if available. #### status_code ```python status_code: int = status_code ``` The HTTP status code returned by the API. #### model_name ```python model_name: str = model_name ``` The name of the model associated with the error. #### body ```python body: object | None = body ``` The body of the response, if available. ### FallbackExceptionGroup Bases: `ExceptionGroup` A group of exceptions that can be raised when all fallback models fail. Source code in `pydantic_ai_slim/pydantic_ai/exceptions.py` ```python class FallbackExceptionGroup(ExceptionGroup): """A group of exceptions that can be raised when all fallback models fail.""" ``` # `pydantic_ai.ext` ### tool_from_langchain ```python tool_from_langchain(langchain_tool: LangChainTool) -> Tool ``` Creates a Pydantic AI tool proxy from a LangChain tool.
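For example, wrapping an existing LangChain tool and registering it on an agent might look like the following sketch, which assumes `langchain-community` is installed and provides `DuckDuckGoSearchRun`; the model name is illustrative:

```python
from langchain_community.tools import DuckDuckGoSearchRun

from pydantic_ai import Agent
from pydantic_ai.ext.langchain import tool_from_langchain

# Wrap the LangChain tool so Pydantic AI can call it like any other tool.
search = tool_from_langchain(DuckDuckGoSearchRun())

agent = Agent('openai:gpt-5', tools=[search])
result = agent.run_sync('What is the capital of France?')
print(result.output)
```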
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `langchain_tool` | `LangChainTool` | The LangChain tool to wrap. | *required* | Returns: | Type | Description | | --- | --- | | `Tool` | A Pydantic AI tool that corresponds to the LangChain tool. | Source code in `pydantic_ai_slim/pydantic_ai/ext/langchain.py` ```python def tool_from_langchain(langchain_tool: LangChainTool) -> Tool: """Creates a Pydantic AI tool proxy from a LangChain tool. Args: langchain_tool: The LangChain tool to wrap. Returns: A Pydantic AI tool that corresponds to the LangChain tool. """ function_name = langchain_tool.name function_description = langchain_tool.description inputs = langchain_tool.args.copy() required = sorted({name for name, detail in inputs.items() if 'default' not in detail}) schema: JsonSchemaValue = langchain_tool.get_input_jsonschema() if 'additionalProperties' not in schema: schema['additionalProperties'] = False if required: schema['required'] = required defaults = {name: detail['default'] for name, detail in inputs.items() if 'default' in detail} # restructures the arguments to match langchain tool run def proxy(*args: Any, **kwargs: Any) -> str: assert not args, 'This should always be called with kwargs' kwargs = defaults | kwargs return langchain_tool.run(kwargs) return Tool.from_schema( function=proxy, name=function_name, description=function_description, json_schema=schema, ) ``` ### LangChainToolset Bases: `FunctionToolset` A toolset that wraps LangChain tools. Source code in `pydantic_ai_slim/pydantic_ai/ext/langchain.py` ```python class LangChainToolset(FunctionToolset): """A toolset that wraps LangChain tools.""" def __init__(self, tools: list[LangChainTool], *, id: str | None = None): super().__init__([tool_from_langchain(tool) for tool in tools], id=id) ``` ### tool_from_aci ```python tool_from_aci( aci_function: str, linked_account_owner_id: str ) -> Tool ``` Creates a Pydantic AI tool proxy from an ACI.dev function. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `aci_function` | `str` | The ACI.dev function to wrap. | *required* | | `linked_account_owner_id` | `str` | The ACI user ID to execute the function on behalf of. | *required* | Returns: | Type | Description | | --- | --- | | `Tool` | A Pydantic AI tool that corresponds to the ACI.dev tool. | Source code in `pydantic_ai_slim/pydantic_ai/ext/aci.py` ```python def tool_from_aci(aci_function: str, linked_account_owner_id: str) -> Tool: """Creates a Pydantic AI tool proxy from an ACI.dev function. Args: aci_function: The ACI.dev function to wrap. linked_account_owner_id: The ACI user ID to execute the function on behalf of. Returns: A Pydantic AI tool that corresponds to the ACI.dev tool. 
""" aci = ACI() function_definition = aci.functions.get_definition(aci_function) function_name = function_definition['function']['name'] function_description = function_definition['function']['description'] inputs = function_definition['function']['parameters'] json_schema = { 'additionalProperties': inputs.get('additionalProperties', False), 'properties': inputs.get('properties', {}), 'required': inputs.get('required', []), # Default to 'object' if not specified 'type': inputs.get('type', 'object'), } # Clean the schema json_schema = _clean_schema(json_schema) def implementation(*args: Any, **kwargs: Any) -> str: if args: raise TypeError('Positional arguments are not allowed') return aci.handle_function_call( function_name, kwargs, linked_account_owner_id=linked_account_owner_id, allowed_apps_only=True, ) return Tool.from_schema( function=implementation, name=function_name, description=function_description, json_schema=json_schema, ) ``` ### ACIToolset Bases: `FunctionToolset` A toolset that wraps ACI.dev tools. Source code in `pydantic_ai_slim/pydantic_ai/ext/aci.py` ```python class ACIToolset(FunctionToolset): """A toolset that wraps ACI.dev tools.""" def __init__(self, aci_functions: Sequence[str], linked_account_owner_id: str, *, id: str | None = None): super().__init__( [tool_from_aci(aci_function, linked_account_owner_id) for aci_function in aci_functions], id=id ) ``` # `fasta2a` ### FastA2A Bases: `Starlette` The main class for the FastA2A library. Source code in `.venv/lib/python3.12/site-packages/fasta2a/applications.py` ```python class FastA2A(Starlette): """The main class for the FastA2A library.""" def __init__( self, *, storage: Storage, broker: Broker, # Agent card name: str | None = None, url: str = 'http://localhost:8000', version: str = '1.0.0', description: str | None = None, provider: AgentProvider | None = None, skills: list[Skill] | None = None, # Starlette debug: bool = False, routes: Sequence[Route] | None = None, middleware: Sequence[Middleware] | None = None, exception_handlers: dict[Any, ExceptionHandler] | None = None, lifespan: Lifespan[FastA2A] | None = None, ): if lifespan is None: lifespan = _default_lifespan super().__init__( debug=debug, routes=routes, middleware=middleware, exception_handlers=exception_handlers, lifespan=lifespan, ) self.name = name or 'My Agent' self.url = url self.version = version self.description = description self.provider = provider self.skills = skills or [] # NOTE: For now, I don't think there's any reason to support any other input/output modes. 
self.default_input_modes = ['application/json'] self.default_output_modes = ['application/json'] self.task_manager = TaskManager(broker=broker, storage=storage) # Setup self._agent_card_json_schema: bytes | None = None self.router.add_route('/.well-known/agent.json', self._agent_card_endpoint, methods=['HEAD', 'GET', 'OPTIONS']) self.router.add_route('/', self._agent_run_endpoint, methods=['POST']) self.router.add_route('/docs', self._docs_endpoint, methods=['GET']) async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None: if scope['type'] == 'http' and not self.task_manager.is_running: raise RuntimeError('TaskManager was not properly initialized.') await super().__call__(scope, receive, send) async def _agent_card_endpoint(self, request: Request) -> Response: if self._agent_card_json_schema is None: agent_card = AgentCard( name=self.name, description=self.description or 'An AI agent exposed as an A2A agent.', url=self.url, version=self.version, protocol_version='0.2.5', skills=self.skills, default_input_modes=self.default_input_modes, default_output_modes=self.default_output_modes, capabilities=AgentCapabilities( streaming=False, push_notifications=False, state_transition_history=False ), ) if self.provider is not None: agent_card['provider'] = self.provider self._agent_card_json_schema = agent_card_ta.dump_json(agent_card, by_alias=True) return Response(content=self._agent_card_json_schema, media_type='application/json') async def _docs_endpoint(self, request: Request) -> Response: """Serve the documentation interface.""" docs_path = Path(__file__).parent / 'static' / 'docs.html' return FileResponse(docs_path, media_type='text/html') async def _agent_run_endpoint(self, request: Request) -> Response: """This is the main endpoint for the A2A server. Although the specification allows freedom of choice and implementation, I'm pretty sure about some decisions. 1. The server will always either send a "submitted" or a "failed" on `tasks/send`. Never a "completed" on the first message. 2. There are three possible ends for the task: 2.1. The task was "completed" successfully. 2.2. The task was "canceled". 2.3. The task "failed". 3. The server will send a "working" on the first chunk on `tasks/pushNotification/get`. """ data = await request.body() a2a_request = a2a_request_ta.validate_json(data) if a2a_request['method'] == 'message/send': jsonrpc_response = await self.task_manager.send_message(a2a_request) elif a2a_request['method'] == 'tasks/get': jsonrpc_response = await self.task_manager.get_task(a2a_request) elif a2a_request['method'] == 'tasks/cancel': jsonrpc_response = await self.task_manager.cancel_task(a2a_request) else: raise NotImplementedError(f'Method {a2a_request["method"]} not implemented.') return Response( content=a2a_response_ta.dump_json(jsonrpc_response, by_alias=True), media_type='application/json' ) ``` ### Broker Bases: `ABC` The broker class is in charge of scheduling the tasks. The HTTP server uses the broker to schedule tasks. The simple implementation is the `InMemoryBroker`, which is the broker that runs the tasks in the same process as the HTTP server. That said, this class can be extended to support remote workers. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @dataclass class Broker(ABC): """The broker class is in charge of scheduling the tasks. The HTTP server uses the broker to schedule tasks. 
The simple implementation is the `InMemoryBroker`, which is the broker that runs the tasks in the same process as the HTTP server. That said, this class can be extended to support remote workers. """ @abstractmethod async def run_task(self, params: TaskSendParams) -> None: """Send a task to be executed by the worker.""" raise NotImplementedError('send_run_task is not implemented yet.') @abstractmethod async def cancel_task(self, params: TaskIdParams) -> None: """Cancel a task.""" raise NotImplementedError('send_cancel_task is not implemented yet.') @abstractmethod async def __aenter__(self) -> Self: ... @abstractmethod async def __aexit__(self, exc_type: Any, exc_value: Any, traceback: Any): ... @abstractmethod def receive_task_operations(self) -> AsyncIterator[TaskOperation]: """Receive task operations from the broker. On a multi-worker setup, the broker will need to round-robin the task operations between the workers. """ ``` #### run_task ```python run_task(params: TaskSendParams) -> None ``` Send a task to be executed by the worker. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @abstractmethod async def run_task(self, params: TaskSendParams) -> None: """Send a task to be executed by the worker.""" raise NotImplementedError('send_run_task is not implemented yet.') ``` #### cancel_task ```python cancel_task(params: TaskIdParams) -> None ``` Cancel a task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @abstractmethod async def cancel_task(self, params: TaskIdParams) -> None: """Cancel a task.""" raise NotImplementedError('send_cancel_task is not implemented yet.') ``` #### receive_task_operations ```python receive_task_operations() -> AsyncIterator[TaskOperation] ``` Receive task operations from the broker. On a multi-worker setup, the broker will need to round-robin the task operations between the workers. Source code in `.venv/lib/python3.12/site-packages/fasta2a/broker.py` ```python @abstractmethod def receive_task_operations(self) -> AsyncIterator[TaskOperation]: """Receive task operations from the broker. On a multi-worker setup, the broker will need to round-robin the task operations between the workers. """ ``` ### Skill Bases: `TypedDict` Skills are a unit of capability that an agent can perform. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Skill(TypedDict): """Skills are a unit of capability that an agent can perform.""" id: str """A unique identifier for the skill.""" name: str """Human readable name of the skill.""" description: str """A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. """ tags: list[str] """Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". """ examples: NotRequired[list[str]] """The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") """ input_modes: list[str] """Supported mime types for input data.""" output_modes: list[str] """Supported mime types for output data.""" ``` #### id ```python id: str ``` A unique identifier for the skill. #### name ```python name: str ``` Human readable name of the skill. #### description ```python description: str ``` A human-readable description of the skill. 
It will be used by the client or a human as a hint to understand the skill. #### tags ```python tags: list[str] ``` Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". #### examples ```python examples: NotRequired[list[str]] ``` The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") #### input_modes ```python input_modes: list[str] ``` Supported mime types for input data. #### output_modes ```python output_modes: list[str] ``` Supported mime types for output data. ### Storage Bases: `ABC`, `Generic[ContextT]` A storage to retrieve and save tasks, as well as retrieve and save context. The storage serves two purposes: 1. Task storage: Stores tasks in A2A protocol format with their status, artifacts, and message history 1. Context storage: Stores conversation context in a format optimized for the specific agent implementation Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python class Storage(ABC, Generic[ContextT]): """A storage to retrieve and save tasks, as well as retrieve and save context. The storage serves two purposes: 1. Task storage: Stores tasks in A2A protocol format with their status, artifacts, and message history 2. Context storage: Stores conversation context in a format optimized for the specific agent implementation """ @abstractmethod async def load_task(self, task_id: str, history_length: int | None = None) -> Task | None: """Load a task from storage. If the task is not found, return None. """ @abstractmethod async def submit_task(self, context_id: str, message: Message) -> Task: """Submit a task to storage.""" @abstractmethod async def update_task( self, task_id: str, state: TaskState, new_artifacts: list[Artifact] | None = None, new_messages: list[Message] | None = None, ) -> Task: """Update the state of a task. Appends artifacts and messages, if specified.""" @abstractmethod async def load_context(self, context_id: str) -> ContextT | None: """Retrieve the stored context given the `context_id`.""" @abstractmethod async def update_context(self, context_id: str, context: ContextT) -> None: """Updates the context for a `context_id`. Implementing agent can decide what to store in context. """ ``` #### load_task ```python load_task( task_id: str, history_length: int | None = None ) -> Task | None ``` Load a task from storage. If the task is not found, return None. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def load_task(self, task_id: str, history_length: int | None = None) -> Task | None: """Load a task from storage. If the task is not found, return None. """ ``` #### submit_task ```python submit_task(context_id: str, message: Message) -> Task ``` Submit a task to storage. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def submit_task(self, context_id: str, message: Message) -> Task: """Submit a task to storage.""" ``` #### update_task ```python update_task( task_id: str, state: TaskState, new_artifacts: list[Artifact] | None = None, new_messages: list[Message] | None = None, ) -> Task ``` Update the state of a task. Appends artifacts and messages, if specified. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def update_task( self, task_id: str, state: TaskState, new_artifacts: list[Artifact] | None = None, new_messages: list[Message] | None = None, ) -> Task: """Update the state of a task. Appends artifacts and messages, if specified.""" ``` #### load_context ```python load_context(context_id: str) -> ContextT | None ``` Retrieve the stored context given the `context_id`. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def load_context(self, context_id: str) -> ContextT | None: """Retrieve the stored context given the `context_id`.""" ``` #### update_context ```python update_context(context_id: str, context: ContextT) -> None ``` Updates the context for a `context_id`. Implementing agent can decide what to store in context. Source code in `.venv/lib/python3.12/site-packages/fasta2a/storage.py` ```python @abstractmethod async def update_context(self, context_id: str, context: ContextT) -> None: """Updates the context for a `context_id`. Implementing agent can decide what to store in context. """ ``` ### Worker Bases: `ABC`, `Generic[ContextT]` A worker is responsible for executing tasks. Source code in `.venv/lib/python3.12/site-packages/fasta2a/worker.py` ```python @dataclass class Worker(ABC, Generic[ContextT]): """A worker is responsible for executing tasks.""" broker: Broker storage: Storage[ContextT] @asynccontextmanager async def run(self) -> AsyncIterator[None]: """Run the worker. It connects to the broker, and it makes itself available to receive commands. """ async with anyio.create_task_group() as tg: tg.start_soon(self._loop) yield tg.cancel_scope.cancel() async def _loop(self) -> None: async for task_operation in self.broker.receive_task_operations(): await self._handle_task_operation(task_operation) async def _handle_task_operation(self, task_operation: TaskOperation) -> None: try: with use_span(task_operation['_current_span']): with tracer.start_as_current_span( f'{task_operation["operation"]} task', attributes={'logfire.tags': ['fasta2a']} ): if task_operation['operation'] == 'run': await self.run_task(task_operation['params']) elif task_operation['operation'] == 'cancel': await self.cancel_task(task_operation['params']) else: assert_never(task_operation) except Exception: await self.storage.update_task(task_operation['params']['id'], state='failed') @abstractmethod async def run_task(self, params: TaskSendParams) -> None: ... @abstractmethod async def cancel_task(self, params: TaskIdParams) -> None: ... @abstractmethod def build_message_history(self, history: list[Message]) -> list[Any]: ... @abstractmethod def build_artifacts(self, result: Any) -> list[Artifact]: ... ``` #### run ```python run() -> AsyncIterator[None] ``` Run the worker. It connects to the broker, and it makes itself available to receive commands. Source code in `.venv/lib/python3.12/site-packages/fasta2a/worker.py` ```python @asynccontextmanager async def run(self) -> AsyncIterator[None]: """Run the worker. It connects to the broker, and it makes itself available to receive commands. """ async with anyio.create_task_group() as tg: tg.start_soon(self._loop) yield tg.cancel_scope.cancel() ``` This module contains the schema for the agent card. ### AgentCard Bases: `TypedDict` The card that describes an agent. 
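A minimal `AgentCard` value, shown as a sketch with illustrative field values (FastA2A normally builds this for you from its constructor arguments):

```python
from fasta2a.schema import AgentCapabilities, AgentCard

card: AgentCard = {
    'name': 'Recipe Agent',
    'description': 'Agent that helps users with recipes and cooking.',
    'url': 'http://localhost:8000',
    'version': '1.0.0',
    'protocol_version': '0.2.5',
    'capabilities': AgentCapabilities(
        streaming=False, push_notifications=False, state_transition_history=False
    ),
    'default_input_modes': ['application/json'],
    'default_output_modes': ['application/json'],
    'skills': [],
}
```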
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentCard(TypedDict): """The card that describes an agent.""" name: str """Human readable name of the agent e.g. "Recipe Agent".""" description: str """A human-readable description of the agent. Used to assist users and other agents in understanding what the agent can do. (e.g. "Agent that helps users with recipes and cooking.") """ url: str """A URL to the address the agent is hosted at.""" version: str """The version of the agent - format is up to the provider. (e.g. "1.0.0")""" protocol_version: str """The version of the A2A protocol this agent supports.""" provider: NotRequired[AgentProvider] """The service provider of the agent.""" documentation_url: NotRequired[str] """A URL to documentation for the agent.""" icon_url: NotRequired[str] """A URL to an icon for the agent.""" preferred_transport: NotRequired[str] """The transport of the preferred endpoint. If empty, defaults to JSONRPC.""" additional_interfaces: NotRequired[list[AgentInterface]] """Announcement of additional supported transports.""" capabilities: AgentCapabilities """The capabilities of the agent.""" security: NotRequired[list[dict[str, list[str]]]] """Security requirements for contacting the agent.""" security_schemes: NotRequired[dict[str, SecurityScheme]] """Security scheme definitions.""" default_input_modes: list[str] """Supported mime types for input data.""" default_output_modes: list[str] """Supported mime types for output data.""" skills: list[Skill] ``` #### name ```python name: str ``` Human readable name of the agent e.g. "Recipe Agent". #### description ```python description: str ``` A human-readable description of the agent. Used to assist users and other agents in understanding what the agent can do. (e.g. "Agent that helps users with recipes and cooking.") #### url ```python url: str ``` A URL to the address the agent is hosted at. #### version ```python version: str ``` The version of the agent - format is up to the provider. (e.g. "1.0.0") #### protocol_version ```python protocol_version: str ``` The version of the A2A protocol this agent supports. #### provider ```python provider: NotRequired[AgentProvider] ``` The service provider of the agent. #### documentation_url ```python documentation_url: NotRequired[str] ``` A URL to documentation for the agent. #### icon_url ```python icon_url: NotRequired[str] ``` A URL to an icon for the agent. #### preferred_transport ```python preferred_transport: NotRequired[str] ``` The transport of the preferred endpoint. If empty, defaults to JSONRPC. #### additional_interfaces ```python additional_interfaces: NotRequired[list[AgentInterface]] ``` Announcement of additional supported transports. #### capabilities ```python capabilities: AgentCapabilities ``` The capabilities of the agent. #### security ```python security: NotRequired[list[dict[str, list[str]]]] ``` Security requirements for contacting the agent. #### security_schemes ```python security_schemes: NotRequired[dict[str, SecurityScheme]] ``` Security scheme definitions. #### default_input_modes ```python default_input_modes: list[str] ``` Supported mime types for input data. #### default_output_modes ```python default_output_modes: list[str] ``` Supported mime types for output data. ### AgentProvider Bases: `TypedDict` The service provider of the agent. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class AgentProvider(TypedDict): """The service provider of the agent.""" organization: str url: str ``` ### AgentCapabilities Bases: `TypedDict` The capabilities of the agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentCapabilities(TypedDict): """The capabilities of the agent.""" streaming: NotRequired[bool] """Whether the agent supports streaming.""" push_notifications: NotRequired[bool] """Whether the agent can notify updates to client.""" state_transition_history: NotRequired[bool] """Whether the agent exposes status change history for tasks.""" ``` #### streaming ```python streaming: NotRequired[bool] ``` Whether the agent supports streaming. #### push_notifications ```python push_notifications: NotRequired[bool] ``` Whether the agent can notify updates to client. #### state_transition_history ```python state_transition_history: NotRequired[bool] ``` Whether the agent exposes status change history for tasks. ### HttpSecurityScheme Bases: `TypedDict` HTTP security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class HttpSecurityScheme(TypedDict): """HTTP security scheme.""" type: Literal['http'] scheme: str """The name of the HTTP Authorization scheme.""" bearer_format: NotRequired[str] """A hint to the client to identify how the bearer token is formatted.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### scheme ```python scheme: str ``` The name of the HTTP Authorization scheme. #### bearer_format ```python bearer_format: NotRequired[str] ``` A hint to the client to identify how the bearer token is formatted. #### description ```python description: NotRequired[str] ``` Description of this security scheme. ### ApiKeySecurityScheme Bases: `TypedDict` API Key security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class ApiKeySecurityScheme(TypedDict): """API Key security scheme.""" type: Literal['apiKey'] name: str """The name of the header, query or cookie parameter to be used.""" in_: Literal['query', 'header', 'cookie'] """The location of the API key.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### name ```python name: str ``` The name of the header, query or cookie parameter to be used. #### in\_ ```python in_: Literal['query', 'header', 'cookie'] ``` The location of the API key. #### description ```python description: NotRequired[str] ``` Description of this security scheme. ### OAuth2SecurityScheme Bases: `TypedDict` OAuth2 security scheme. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class OAuth2SecurityScheme(TypedDict): """OAuth2 security scheme.""" type: Literal['oauth2'] flows: dict[str, Any] """An object containing configuration information for the flow types supported.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### flows ```python flows: dict[str, Any] ``` An object containing configuration information for the flow types supported. #### description ```python description: NotRequired[str] ``` Description of this security scheme. ### OpenIdConnectSecurityScheme Bases: `TypedDict` OpenID Connect security scheme. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class OpenIdConnectSecurityScheme(TypedDict): """OpenID Connect security scheme.""" type: Literal['openIdConnect'] open_id_connect_url: str """OpenId Connect URL to discover OAuth2 configuration values.""" description: NotRequired[str] """Description of this security scheme.""" ``` #### open_id_connect_url ```python open_id_connect_url: str ``` OpenId Connect URL to discover OAuth2 configuration values. #### description ```python description: NotRequired[str] ``` Description of this security scheme. ### SecurityScheme ```python SecurityScheme = Annotated[ Union[ HttpSecurityScheme, ApiKeySecurityScheme, OAuth2SecurityScheme, OpenIdConnectSecurityScheme, ], Field(discriminator="type"), ] ``` A security scheme for authentication. ### AgentInterface Bases: `TypedDict` An interface that the agent supports. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentInterface(TypedDict): """An interface that the agent supports.""" transport: str """The transport protocol (e.g., 'jsonrpc', 'websocket').""" url: str """The URL endpoint for this transport.""" description: NotRequired[str] """Description of this interface.""" ``` #### transport ```python transport: str ``` The transport protocol (e.g., 'jsonrpc', 'websocket'). #### url ```python url: str ``` The URL endpoint for this transport. #### description ```python description: NotRequired[str] ``` Description of this interface. ### AgentExtension Bases: `TypedDict` A declaration of an extension supported by an Agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class AgentExtension(TypedDict): """A declaration of an extension supported by an Agent.""" uri: str """The URI of the extension.""" description: NotRequired[str] """A description of how this agent uses this extension.""" required: NotRequired[bool] """Whether the client must follow specific requirements of the extension.""" params: NotRequired[dict[str, Any]] """Optional configuration for the extension.""" ``` #### uri ```python uri: str ``` The URI of the extension. #### description ```python description: NotRequired[str] ``` A description of how this agent uses this extension. #### required ```python required: NotRequired[bool] ``` Whether the client must follow specific requirements of the extension. #### params ```python params: NotRequired[dict[str, Any]] ``` Optional configuration for the extension. ### Skill Bases: `TypedDict` Skills are a unit of capability that an agent can perform. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Skill(TypedDict): """Skills are a unit of capability that an agent can perform.""" id: str """A unique identifier for the skill.""" name: str """Human readable name of the skill.""" description: str """A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. """ tags: list[str] """Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". """ examples: NotRequired[list[str]] """The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. 
"I need a recipe for bread") """ input_modes: list[str] """Supported mime types for input data.""" output_modes: list[str] """Supported mime types for output data.""" ``` #### id ```python id: str ``` A unique identifier for the skill. #### name ```python name: str ``` Human readable name of the skill. #### description ```python description: str ``` A human-readable description of the skill. It will be used by the client or a human as a hint to understand the skill. #### tags ```python tags: list[str] ``` Set of tag-words describing classes of capabilities for this specific skill. Examples: "cooking", "customer support", "billing". #### examples ```python examples: NotRequired[list[str]] ``` The set of example scenarios that the skill can perform. Will be used by the client as a hint to understand how the skill can be used. (e.g. "I need a recipe for bread") #### input_modes ```python input_modes: list[str] ``` Supported mime types for input data. #### output_modes ```python output_modes: list[str] ``` Supported mime types for output data. ### Artifact Bases: `TypedDict` Agents generate Artifacts as an end result of a Task. Artifacts are immutable, can be named, and can have multiple parts. A streaming response can append parts to existing Artifacts. A single Task can generate many Artifacts. For example, "create a webpage" could create separate HTML and image Artifacts. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Artifact(TypedDict): """Agents generate Artifacts as an end result of a Task. Artifacts are immutable, can be named, and can have multiple parts. A streaming response can append parts to existing Artifacts. A single Task can generate many Artifacts. For example, "create a webpage" could create separate HTML and image Artifacts. """ artifact_id: str """Unique identifier for the artifact.""" name: NotRequired[str] """The name of the artifact.""" description: NotRequired[str] """A description of the artifact.""" parts: list[Part] """The parts that make up the artifact.""" metadata: NotRequired[dict[str, Any]] """Metadata about the artifact.""" extensions: NotRequired[list[str]] """Array of extensions.""" append: NotRequired[bool] """Whether to append this artifact to an existing one.""" last_chunk: NotRequired[bool] """Whether this is the last chunk of the artifact.""" ``` #### artifact_id ```python artifact_id: str ``` Unique identifier for the artifact. #### name ```python name: NotRequired[str] ``` The name of the artifact. #### description ```python description: NotRequired[str] ``` A description of the artifact. #### parts ```python parts: list[Part] ``` The parts that make up the artifact. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Metadata about the artifact. #### extensions ```python extensions: NotRequired[list[str]] ``` Array of extensions. #### append ```python append: NotRequired[bool] ``` Whether to append this artifact to an existing one. #### last_chunk ```python last_chunk: NotRequired[bool] ``` Whether this is the last chunk of the artifact. ### PushNotificationConfig Bases: `TypedDict` Configuration for push notifications. A2A supports a secure notification mechanism whereby an agent can notify a client of an update outside of a connected session via a PushNotificationService. 
Within and across enterprises, it is critical that the agent verifies the identity of the notification service, authenticates itself with the service, and presents an identifier that ties the notification to the executing Task. The target server of the PushNotificationService should be considered a separate service, and is not guaranteed (or even expected) to be the client directly. This PushNotificationService is responsible for authenticating and authorizing the agent and for proxying the verified notification to the appropriate endpoint (which could be anything from a pub/sub queue, to an email inbox or other service, etc). For contrived scenarios with isolated client-agent pairs (e.g. local service mesh in a contained VPC, etc.) or isolated environments without enterprise security concerns, the client may choose to simply open a port and act as its own PushNotificationService. Any enterprise implementation will likely have a centralized service that authenticates the remote agents with trusted notification credentials and can handle online/offline scenarios. (This should be thought of similarly to a mobile Push Notification Service). Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class PushNotificationConfig(TypedDict): """Configuration for push notifications. A2A supports a secure notification mechanism whereby an agent can notify a client of an update outside of a connected session via a PushNotificationService. Within and across enterprises, it is critical that the agent verifies the identity of the notification service, authenticates itself with the service, and presents an identifier that ties the notification to the executing Task. The target server of the PushNotificationService should be considered a separate service, and is not guaranteed (or even expected) to be the client directly. This PushNotificationService is responsible for authenticating and authorizing the agent and for proxying the verified notification to the appropriate endpoint (which could be anything from a pub/sub queue, to an email inbox or other service, etc). For contrived scenarios with isolated client-agent pairs (e.g. local service mesh in a contained VPC, etc.) or isolated environments without enterprise security concerns, the client may choose to simply open a port and act as its own PushNotificationService. Any enterprise implementation will likely have a centralized service that authenticates the remote agents with trusted notification credentials and can handle online/offline scenarios. (This should be thought of similarly to a mobile Push Notification Service). """ id: NotRequired[str] """Server-assigned identifier.""" url: str """The URL to send push notifications to.""" token: NotRequired[str] """Token unique to this task/session.""" authentication: NotRequired[SecurityScheme] """Authentication details for push notifications.""" ``` #### id ```python id: NotRequired[str] ``` Server-assigned identifier. #### url ```python url: str ``` The URL to send push notifications to. #### token ```python token: NotRequired[str] ``` Token unique to this task/session. #### authentication ```python authentication: NotRequired[SecurityScheme] ``` Authentication details for push notifications. ### TaskPushNotificationConfig Bases: `TypedDict` Configuration for task push notifications. 
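As a sketch, a task-scoped push notification configuration pairs a task id with a `PushNotificationConfig`; the URL, token, and task id below are illustrative:

```python
from fasta2a.schema import PushNotificationConfig, TaskPushNotificationConfig

push_config: PushNotificationConfig = {
    'url': 'https://example.com/a2a/notifications',  # where updates should be sent
    'token': 'per-task-token',
}

task_push_config: TaskPushNotificationConfig = {
    'id': 'task-123',
    'push_notification_config': push_config,
}
```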
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskPushNotificationConfig(TypedDict): """Configuration for task push notifications.""" id: str """The task id.""" push_notification_config: PushNotificationConfig """The push notification configuration.""" ``` #### id ```python id: str ``` The task id. #### push_notification_config ```python push_notification_config: PushNotificationConfig ``` The push notification configuration. ### Message Bases: `TypedDict` A Message contains any content that is not an Artifact. This can include things like agent thoughts, user context, instructions, errors, status, or metadata. All content from a client comes in the form of a Message. Agents send Messages to communicate status or to provide instructions (whereas generated results are sent as Artifacts). A Message can have multiple parts to denote different pieces of content. For example, a user request could include a textual description from a user and then multiple files used as context from the client. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Message(TypedDict): """A Message contains any content that is not an Artifact. This can include things like agent thoughts, user context, instructions, errors, status, or metadata. All content from a client comes in the form of a Message. Agents send Messages to communicate status or to provide instructions (whereas generated results are sent as Artifacts). A Message can have multiple parts to denote different pieces of content. For example, a user request could include a textual description from a user and then multiple files used as context from the client. """ role: Literal['user', 'agent'] """The role of the message.""" parts: list[Part] """The parts of the message.""" kind: Literal['message'] """Event type.""" metadata: NotRequired[dict[str, Any]] """Metadata about the message.""" # Additional fields message_id: str """Identifier created by the message creator.""" context_id: NotRequired[str] """The context the message is associated with.""" task_id: NotRequired[str] """Identifier of task the message is related to.""" reference_task_ids: NotRequired[list[str]] """Array of task IDs this message references.""" extensions: NotRequired[list[str]] """Array of extensions.""" ``` #### role ```python role: Literal['user', 'agent'] ``` The role of the message. #### parts ```python parts: list[Part] ``` The parts of the message. #### kind ```python kind: Literal['message'] ``` Event type. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Metadata about the message. #### message_id ```python message_id: str ``` Identifier created by the message creator. #### context_id ```python context_id: NotRequired[str] ``` The context the message is associated with. #### task_id ```python task_id: NotRequired[str] ``` Identifier of task the message is related to. #### reference_task_ids ```python reference_task_ids: NotRequired[list[str]] ``` Array of task IDs this message references. #### extensions ```python extensions: NotRequired[list[str]] ``` Array of extensions. ### TextPart Bases: `_BasePart` A part that contains text. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TextPart(_BasePart): """A part that contains text.""" kind: Literal['text'] """The kind of the part.""" text: str """The text of the part.""" ``` #### kind ```python kind: Literal['text'] ``` The kind of the part. #### text ```python text: str ``` The text of the part. ### FileWithBytes Bases: `TypedDict` File with base64 encoded data. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class FileWithBytes(TypedDict): """File with base64 encoded data.""" bytes: str """The base64 encoded content of the file.""" mime_type: NotRequired[str] """Optional mime type for the file.""" ``` #### bytes ```python bytes: str ``` The base64 encoded content of the file. #### mime_type ```python mime_type: NotRequired[str] ``` Optional mime type for the file. ### FileWithUri Bases: `TypedDict` File with URI reference. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class FileWithUri(TypedDict): """File with URI reference.""" uri: str """The URI of the file.""" mime_type: NotRequired[str] """The mime type of the file.""" ``` #### uri ```python uri: str ``` The URI of the file. #### mime_type ```python mime_type: NotRequired[str] ``` The mime type of the file. ### FilePart Bases: `_BasePart` A part that contains a file. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class FilePart(_BasePart): """A part that contains a file.""" kind: Literal['file'] """The kind of the part.""" file: FileWithBytes | FileWithUri """The file content - either bytes or URI.""" ``` #### kind ```python kind: Literal['file'] ``` The kind of the part. #### file ```python file: FileWithBytes | FileWithUri ``` The file content - either bytes or URI. ### DataPart Bases: `_BasePart` A part that contains structured data. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class DataPart(_BasePart): """A part that contains structured data.""" kind: Literal['data'] """The kind of the part.""" data: dict[str, Any] """The data of the part.""" ``` #### kind ```python kind: Literal['data'] ``` The kind of the part. #### data ```python data: dict[str, Any] ``` The data of the part. ### Part ```python Part = Annotated[ Union[TextPart, FilePart, DataPart], Field(discriminator="kind"), ] ``` A fully formed piece of content exchanged between a client and a remote agent as part of a Message or an Artifact. Each Part has its own content type and metadata. ### TaskState ```python TaskState: TypeAlias = Literal[ "submitted", "working", "input-required", "completed", "canceled", "failed", "rejected", "auth-required", "unknown", ] ``` The possible states of a task. ### TaskStatus Bases: `TypedDict` Status and accompanying message for a task. 
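A sketch of a `TaskStatus` value using one of the `TaskState` literals above (the timestamp is illustrative):

```python
from fasta2a.schema import TaskStatus

status: TaskStatus = {
    'state': 'working',  # must be one of the TaskState literals
    'timestamp': '2025-01-01T12:00:00Z',
}
```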
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskStatus(TypedDict): """Status and accompanying message for a task.""" state: TaskState """The current state of the task.""" message: NotRequired[Message] """Additional status updates for client.""" timestamp: NotRequired[str] """ISO datetime value of when the status was updated.""" ``` #### state ```python state: TaskState ``` The current state of the task. #### message ```python message: NotRequired[Message] ``` Additional status updates for client. #### timestamp ```python timestamp: NotRequired[str] ``` ISO datetime value of when the status was updated. ### Task Bases: `TypedDict` A Task is a stateful entity that allows Clients and Remote Agents to achieve a specific outcome. Clients and Remote Agents exchange Messages within a Task. Remote Agents generate results as Artifacts. A Task is always created by a Client and the status is always determined by the Remote Agent. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class Task(TypedDict): """A Task is a stateful entity that allows Clients and Remote Agents to achieve a specific outcome. Clients and Remote Agents exchange Messages within a Task. Remote Agents generate results as Artifacts. A Task is always created by a Client and the status is always determined by the Remote Agent. """ id: str """Unique identifier for the task.""" context_id: str """The context the task is associated with.""" kind: Literal['task'] """Event type.""" status: TaskStatus """Current status of the task.""" history: NotRequired[list[Message]] """Optional history of messages.""" artifacts: NotRequired[list[Artifact]] """Collection of artifacts created by the agent.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` Unique identifier for the task. #### context_id ```python context_id: str ``` The context the task is associated with. #### kind ```python kind: Literal['task'] ``` Event type. #### status ```python status: TaskStatus ``` Current status of the task. #### history ```python history: NotRequired[list[Message]] ``` Optional history of messages. #### artifacts ```python artifacts: NotRequired[list[Artifact]] ``` Collection of artifacts created by the agent. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskStatusUpdateEvent Bases: `TypedDict` Sent by server during message/stream requests. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskStatusUpdateEvent(TypedDict): """Sent by server during message/stream requests.""" task_id: str """The id of the task.""" context_id: str """The context the task is associated with.""" kind: Literal['status-update'] """Event type.""" status: TaskStatus """The status of the task.""" final: bool """Indicates the end of the event stream.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### task_id ```python task_id: str ``` The id of the task. #### context_id ```python context_id: str ``` The context the task is associated with. #### kind ```python kind: Literal['status-update'] ``` Event type. #### status ```python status: TaskStatus ``` The status of the task. #### final ```python final: bool ``` Indicates the end of the event stream. 
#### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskArtifactUpdateEvent Bases: `TypedDict` Sent by server during message/stream requests. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskArtifactUpdateEvent(TypedDict): """Sent by server during message/stream requests.""" task_id: str """The id of the task.""" context_id: str """The context the task is associated with.""" kind: Literal['artifact-update'] """Event type identification.""" artifact: Artifact """The artifact that was updated.""" append: NotRequired[bool] """Whether to append to existing artifact (true) or replace (false).""" last_chunk: NotRequired[bool] """Indicates this is the final chunk of the artifact.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### task_id ```python task_id: str ``` The id of the task. #### context_id ```python context_id: str ``` The context the task is associated with. #### kind ```python kind: Literal['artifact-update'] ``` Event type identification. #### artifact ```python artifact: Artifact ``` The artifact that was updated. #### append ```python append: NotRequired[bool] ``` Whether to append to existing artifact (true) or replace (false). #### last_chunk ```python last_chunk: NotRequired[bool] ``` Indicates this is the final chunk of the artifact. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskIdParams Bases: `TypedDict` Parameters for a task id. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskIdParams(TypedDict): """Parameters for a task id.""" id: str metadata: NotRequired[dict[str, Any]] ``` ### TaskQueryParams Bases: `TaskIdParams` Query parameters for a task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskQueryParams(TaskIdParams): """Query parameters for a task.""" history_length: NotRequired[int] """Number of recent messages to be retrieved.""" ``` #### history_length ```python history_length: NotRequired[int] ``` Number of recent messages to be retrieved. ### MessageSendConfiguration Bases: `TypedDict` Configuration for the send message request. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class MessageSendConfiguration(TypedDict): """Configuration for the send message request.""" accepted_output_modes: list[str] """Accepted output modalities by the client.""" blocking: NotRequired[bool] """If the server should treat the client as a blocking request.""" history_length: NotRequired[int] """Number of recent messages to be retrieved.""" push_notification_config: NotRequired[PushNotificationConfig] """Where the server should send notifications when disconnected.""" ``` #### accepted_output_modes ```python accepted_output_modes: list[str] ``` Accepted output modalities by the client. #### blocking ```python blocking: NotRequired[bool] ``` If the server should treat the client as a blocking request. #### history_length ```python history_length: NotRequired[int] ``` Number of recent messages to be retrieved. #### push_notification_config ```python push_notification_config: NotRequired[ PushNotificationConfig ] ``` Where the server should send notifications when disconnected. 
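Taken together, the message and configuration types above combine like this when a client prepares a `message/send` call (a sketch; the ids and text are illustrative):

```python
import uuid

from fasta2a.schema import Message, MessageSendConfiguration, TextPart

part: TextPart = {'kind': 'text', 'text': 'I need a recipe for bread'}

message: Message = {
    'role': 'user',
    'kind': 'message',
    'message_id': str(uuid.uuid4()),
    'parts': [part],
}

configuration: MessageSendConfiguration = {
    'accepted_output_modes': ['application/json'],
    'blocking': True,
}
```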
### MessageSendParams Bases: `TypedDict` Parameters for message/send method. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class MessageSendParams(TypedDict): """Parameters for message/send method.""" configuration: NotRequired[MessageSendConfiguration] """Send message configuration.""" message: Message """The message being sent to the server.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### configuration ```python configuration: NotRequired[MessageSendConfiguration] ``` Send message configuration. #### message ```python message: Message ``` The message being sent to the server. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### TaskSendParams Bases: `TypedDict` Internal parameters for task execution within the framework. Note: This is not part of the A2A protocol - it's used internally for broker/worker communication. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class TaskSendParams(TypedDict): """Internal parameters for task execution within the framework. Note: This is not part of the A2A protocol - it's used internally for broker/worker communication. """ id: str """The id of the task.""" context_id: str """The context id for the task.""" message: Message """The message to process.""" history_length: NotRequired[int] """Number of recent messages to be retrieved.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` The id of the task. #### context_id ```python context_id: str ``` The context id for the task. #### message ```python message: Message ``` The message to process. #### history_length ```python history_length: NotRequired[int] ``` Number of recent messages to be retrieved. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### ListTaskPushNotificationConfigParams Bases: `TypedDict` Parameters for getting list of pushNotificationConfigurations associated with a Task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class ListTaskPushNotificationConfigParams(TypedDict): """Parameters for getting list of pushNotificationConfigurations associated with a Task.""" id: str """Task id.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` Task id. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### DeleteTaskPushNotificationConfigParams Bases: `TypedDict` Parameters for removing pushNotificationConfiguration associated with a Task. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python @pydantic.with_config({'alias_generator': to_camel}) class DeleteTaskPushNotificationConfigParams(TypedDict): """Parameters for removing pushNotificationConfiguration associated with a Task.""" id: str """Task id.""" push_notification_config_id: str """The push notification config id to delete.""" metadata: NotRequired[dict[str, Any]] """Extension metadata.""" ``` #### id ```python id: str ``` Task id. #### push_notification_config_id ```python push_notification_config_id: str ``` The push notification config id to delete. #### metadata ```python metadata: NotRequired[dict[str, Any]] ``` Extension metadata. ### JSONRPCMessage Bases: `TypedDict` A JSON RPC message. 
Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCMessage(TypedDict): """A JSON RPC message.""" jsonrpc: Literal['2.0'] """The JSON RPC version.""" id: int | str | None """The request id.""" ``` #### jsonrpc ```python jsonrpc: Literal['2.0'] ``` The JSON RPC version. #### id ```python id: int | str | None ``` The request id. ### JSONRPCRequest Bases: `JSONRPCMessage`, `Generic[Method, Params]` A JSON RPC request. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCRequest(JSONRPCMessage, Generic[Method, Params]): """A JSON RPC request.""" method: Method """The method to call.""" params: Params """The parameters to pass to the method.""" ``` #### method ```python method: Method ``` The method to call. #### params ```python params: Params ``` The parameters to pass to the method. ### JSONRPCError Bases: `TypedDict`, `Generic[CodeT, MessageT]` A JSON RPC error. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCError(TypedDict, Generic[CodeT, MessageT]): """A JSON RPC error.""" code: CodeT message: MessageT data: NotRequired[Any] ``` ### JSONRPCResponse Bases: `JSONRPCMessage`, `Generic[ResultT, ErrorT]` A JSON RPC response. Source code in `.venv/lib/python3.12/site-packages/fasta2a/schema.py` ```python class JSONRPCResponse(JSONRPCMessage, Generic[ResultT, ErrorT]): """A JSON RPC response.""" result: NotRequired[ResultT] error: NotRequired[ErrorT] ``` ### JSONParseError ```python JSONParseError = JSONRPCError[ Literal[-32700], Literal["Invalid JSON payload"] ] ``` A JSON RPC error for a parse error. ### InvalidRequestError ```python InvalidRequestError = JSONRPCError[ Literal[-32600], Literal["Request payload validation error"], ] ``` A JSON RPC error for an invalid request. ### MethodNotFoundError ```python MethodNotFoundError = JSONRPCError[ Literal[-32601], Literal["Method not found"] ] ``` A JSON RPC error for a method not found. ### InvalidParamsError ```python InvalidParamsError = JSONRPCError[ Literal[-32602], Literal["Invalid parameters"] ] ``` A JSON RPC error for invalid parameters. ### InternalError ```python InternalError = JSONRPCError[ Literal[-32603], Literal["Internal error"] ] ``` A JSON RPC error for an internal error. ### TaskNotFoundError ```python TaskNotFoundError = JSONRPCError[ Literal[-32001], Literal["Task not found"] ] ``` A JSON RPC error for a task not found. ### TaskNotCancelableError ```python TaskNotCancelableError = JSONRPCError[ Literal[-32002], Literal["Task not cancelable"] ] ``` A JSON RPC error for a task not cancelable. ### PushNotificationNotSupportedError ```python PushNotificationNotSupportedError = JSONRPCError[ Literal[-32003], Literal["Push notification not supported"], ] ``` A JSON RPC error for a push notification not supported. ### UnsupportedOperationError ```python UnsupportedOperationError = JSONRPCError[ Literal[-32004], Literal["This operation is not supported"], ] ``` A JSON RPC error for an unsupported operation. ### ContentTypeNotSupportedError ```python ContentTypeNotSupportedError = JSONRPCError[ Literal[-32005], Literal["Incompatible content types"] ] ``` A JSON RPC error for incompatible content types. ### InvalidAgentResponseError ```python InvalidAgentResponseError = JSONRPCError[ Literal[-32006], Literal["Invalid agent response"] ] ``` A JSON RPC error for invalid agent response. 
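These aliases pin the `code` and `message` of each error to fixed literal values. A sketch of how one of them appears inside a response envelope (illustrative only):

```python
from fasta2a.schema import TaskNotFoundError

# The `error` member of a failed `tasks/get` response matches
# TaskNotFoundError: code -32001 with the fixed message 'Task not found'.
error: TaskNotFoundError = {'code': -32001, 'message': 'Task not found'}
response = {'jsonrpc': '2.0', 'id': 1, 'error': error}
print(response)
```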
### SendMessageRequest ```python SendMessageRequest = JSONRPCRequest[ Literal["message/send"], MessageSendParams ] ``` A JSON RPC request to send a message. ### SendMessageResponse ```python SendMessageResponse = JSONRPCResponse[ Union[Task, Message], JSONRPCError[Any, Any] ] ``` A JSON RPC response to send a message. ### StreamMessageRequest ```python StreamMessageRequest = JSONRPCRequest[ Literal["message/stream"], MessageSendParams ] ``` A JSON RPC request to stream a message. ### GetTaskRequest ```python GetTaskRequest = JSONRPCRequest[ Literal["tasks/get"], TaskQueryParams ] ``` A JSON RPC request to get a task. ### GetTaskResponse ```python GetTaskResponse = JSONRPCResponse[Task, TaskNotFoundError] ``` A JSON RPC response to get a task. ### CancelTaskRequest ```python CancelTaskRequest = JSONRPCRequest[ Literal["tasks/cancel"], TaskIdParams ] ``` A JSON RPC request to cancel a task. ### CancelTaskResponse ```python CancelTaskResponse = JSONRPCResponse[ Task, Union[TaskNotCancelableError, TaskNotFoundError] ] ``` A JSON RPC response to cancel a task. ### SetTaskPushNotificationRequest ```python SetTaskPushNotificationRequest = JSONRPCRequest[ Literal["tasks/pushNotification/set"], TaskPushNotificationConfig, ] ``` A JSON RPC request to set a task push notification. ### SetTaskPushNotificationResponse ```python SetTaskPushNotificationResponse = JSONRPCResponse[ TaskPushNotificationConfig, PushNotificationNotSupportedError, ] ``` A JSON RPC response to set a task push notification. ### GetTaskPushNotificationRequest ```python GetTaskPushNotificationRequest = JSONRPCRequest[ Literal["tasks/pushNotification/get"], TaskIdParams ] ``` A JSON RPC request to get a task push notification. ### GetTaskPushNotificationResponse ```python GetTaskPushNotificationResponse = JSONRPCResponse[ TaskPushNotificationConfig, PushNotificationNotSupportedError, ] ``` A JSON RPC response to get a task push notification. ### ResubscribeTaskRequest ```python ResubscribeTaskRequest = JSONRPCRequest[ Literal["tasks/resubscribe"], TaskIdParams ] ``` A JSON RPC request to resubscribe to a task. ### ListTaskPushNotificationConfigRequest ```python ListTaskPushNotificationConfigRequest = JSONRPCRequest[ Literal["tasks/pushNotificationConfig/list"], ListTaskPushNotificationConfigParams, ] ``` A JSON RPC request to list task push notification configs. ### DeleteTaskPushNotificationConfigRequest ```python DeleteTaskPushNotificationConfigRequest = JSONRPCRequest[ Literal["tasks/pushNotificationConfig/delete"], DeleteTaskPushNotificationConfigParams, ] ``` A JSON RPC request to delete a task push notification config. ### A2ARequest ```python A2ARequest = Annotated[ Union[ SendMessageRequest, StreamMessageRequest, GetTaskRequest, CancelTaskRequest, SetTaskPushNotificationRequest, GetTaskPushNotificationRequest, ResubscribeTaskRequest, ListTaskPushNotificationConfigRequest, DeleteTaskPushNotificationConfigRequest, ], Discriminator("method"), ] ``` A JSON RPC request to the A2A server. ### A2AResponse ```python A2AResponse: TypeAlias = Union[ SendMessageResponse, GetTaskResponse, CancelTaskResponse, SetTaskPushNotificationResponse, GetTaskPushNotificationResponse, ] ``` A JSON RPC response from the A2A server. ### A2AClient A client for the A2A protocol. 
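A minimal usage sketch, assuming an A2A server is already running on the default `http://localhost:8000`; the `Message` fields follow the `Message`/`Part` schema documented earlier in this reference, and the concrete values are illustrative:

```python
import asyncio

from fasta2a.client import A2AClient
from fasta2a.schema import Message


async def main():
    client = A2AClient(base_url='http://localhost:8000')
    message = Message(
        role='user',
        parts=[{'kind': 'text', 'text': 'Hello!'}],
        kind='message',
        message_id='msg-1',  # any unique id chosen by the client
    )
    response = await client.send_message(message=message)
    # The response is a JSON-RPC envelope containing either `result` or `error`.
    if 'error' in response:
        print('Error:', response['error'])
    else:
        print('Result:', response['result'])


asyncio.run(main())
```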
Source code in `.venv/lib/python3.12/site-packages/fasta2a/client.py` ```python class A2AClient: """A client for the A2A protocol.""" def __init__(self, base_url: str = 'http://localhost:8000', http_client: httpx.AsyncClient | None = None) -> None: if http_client is None: self.http_client = httpx.AsyncClient(base_url=base_url) else: self.http_client = http_client self.http_client.base_url = base_url async def send_message( self, message: Message, *, metadata: dict[str, Any] | None = None, configuration: MessageSendConfiguration | None = None, ) -> SendMessageResponse: """Send a message using the A2A protocol. Returns a JSON-RPC response containing either a result (Task) or an error. """ params = MessageSendParams(message=message) if metadata is not None: params['metadata'] = metadata if configuration is not None: params['configuration'] = configuration request_id = str(uuid.uuid4()) payload = SendMessageRequest(jsonrpc='2.0', id=request_id, method='message/send', params=params) content = send_message_request_ta.dump_json(payload, by_alias=True) response = await self.http_client.post('/', content=content, headers={'Content-Type': 'application/json'}) self._raise_for_status(response) return send_message_response_ta.validate_json(response.content) async def get_task(self, task_id: str) -> GetTaskResponse: payload = GetTaskRequest(jsonrpc='2.0', id=None, method='tasks/get', params={'id': task_id}) content = a2a_request_ta.dump_json(payload, by_alias=True) response = await self.http_client.post('/', content=content, headers={'Content-Type': 'application/json'}) self._raise_for_status(response) return get_task_response_ta.validate_json(response.content) def _raise_for_status(self, response: httpx.Response) -> None: if response.status_code >= 400: raise UnexpectedResponseError(response.status_code, response.text) ``` #### send_message ```python send_message( message: Message, *, metadata: dict[str, Any] | None = None, configuration: MessageSendConfiguration | None = None ) -> SendMessageResponse ``` Send a message using the A2A protocol. Returns a JSON-RPC response containing either a result (Task) or an error. Source code in `.venv/lib/python3.12/site-packages/fasta2a/client.py` ```python async def send_message( self, message: Message, *, metadata: dict[str, Any] | None = None, configuration: MessageSendConfiguration | None = None, ) -> SendMessageResponse: """Send a message using the A2A protocol. Returns a JSON-RPC response containing either a result (Task) or an error. """ params = MessageSendParams(message=message) if metadata is not None: params['metadata'] = metadata if configuration is not None: params['configuration'] = configuration request_id = str(uuid.uuid4()) payload = SendMessageRequest(jsonrpc='2.0', id=request_id, method='message/send', params=params) content = send_message_request_ta.dump_json(payload, by_alias=True) response = await self.http_client.post('/', content=content, headers={'Content-Type': 'application/json'}) self._raise_for_status(response) return send_message_response_ta.validate_json(response.content) ``` ### UnexpectedResponseError Bases: `Exception` An error raised when an unexpected response is received from the server. 
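Transport-level failures (any HTTP status of 400 or above) are raised as this exception rather than returned as a JSON-RPC error, so calls through `A2AClient` can be wrapped accordingly; a small sketch:

```python
from fasta2a.client import A2AClient, UnexpectedResponseError


async def get_task_or_none(client: A2AClient, task_id: str):
    try:
        return await client.get_task(task_id)
    except UnexpectedResponseError as exc:
        # The exception carries the raw HTTP status code and response body.
        print(f'A2A server returned {exc.status_code}: {exc.content}')
        return None
```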
Source code in `.venv/lib/python3.12/site-packages/fasta2a/client.py` ```python class UnexpectedResponseError(Exception): """An error raised when an unexpected response is received from the server.""" def __init__(self, status_code: int, content: str) -> None: self.status_code = status_code self.content = content ``` # `pydantic_ai.format_prompt` ### format_as_xml ```python format_as_xml( obj: Any, root_tag: str | None = None, item_tag: str = "item", none_str: str = "null", indent: str | None = " ", include_field_info: Literal["once"] | bool = False, ) -> str ``` Format a Python object as XML. This is useful since LLMs often find it easier to read semi-structured data (e.g. examples) as XML, rather than JSON etc. Supports: `str`, `bytes`, `bytearray`, `bool`, `int`, `float`, `date`, `datetime`, `time`, `timedelta`, `Enum`, `Mapping`, `Iterable`, `dataclass`, and `BaseModel`. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `obj` | `Any` | Python Object to serialize to XML. | *required* | | `root_tag` | `str | None` | Outer tag to wrap the XML in, use None to omit the outer tag. | `None` | | `item_tag` | `str` | Tag to use for each item in an iterable (e.g. list), this is overridden by the class name for dataclasses and Pydantic models. | `'item'` | | `none_str` | `str` | String to use for None values. | `'null'` | | `indent` | `str | None` | Indentation string to use for pretty printing. | `' '` | | `include_field_info` | `Literal['once'] | bool` | Whether to include attributes like Pydantic Field attributes and dataclasses field() metadata as XML attributes. In both cases the allowed Field attributes and field() metadata keys are title and description. If a field is repeated in the data (e.g. in a list) by setting once the attributes are included only in the first occurrence of an XML element relative to the same field. | `False` | Returns: | Type | Description | | --- | --- | | `str` | XML representation of the object. | Example: format_as_xml_example.py ```python from pydantic_ai import format_as_xml print(format_as_xml({'name': 'John', 'height': 6, 'weight': 200}, root_tag='user')) ''' <user> <name>John</name> <height>6</height> <weight>200</weight> </user> ''' ``` Source code in `pydantic_ai_slim/pydantic_ai/format_prompt.py` ````python def format_as_xml( obj: Any, root_tag: str | None = None, item_tag: str = 'item', none_str: str = 'null', indent: str | None = ' ', include_field_info: Literal['once'] | bool = False, ) -> str: """Format a Python object as XML. This is useful since LLMs often find it easier to read semi-structured data (e.g. examples) as XML, rather than JSON etc. Supports: `str`, `bytes`, `bytearray`, `bool`, `int`, `float`, `date`, `datetime`, `time`, `timedelta`, `Enum`, `Mapping`, `Iterable`, `dataclass`, and `BaseModel`. Args: obj: Python Object to serialize to XML. root_tag: Outer tag to wrap the XML in, use `None` to omit the outer tag. item_tag: Tag to use for each item in an iterable (e.g. list), this is overridden by the class name for dataclasses and Pydantic models. none_str: String to use for `None` values. indent: Indentation string to use for pretty printing. include_field_info: Whether to include attributes like Pydantic `Field` attributes and dataclasses `field()` `metadata` as XML attributes. In both cases the allowed `Field` attributes and `field()` metadata keys are `title` and `description`. If a field is repeated in the data (e.g. in a list) by setting `once` the attributes are included only in the first occurrence of an XML element relative to the same field.
Returns: XML representation of the object. Example: ```python {title="format_as_xml_example.py" lint="skip"} from pydantic_ai import format_as_xml print(format_as_xml({'name': 'John', 'height': 6, 'weight': 200}, root_tag='user')) ''' John 6 200 ''' ``` """ el = _ToXml( data=obj, item_tag=item_tag, none_str=none_str, include_field_info=include_field_info, ).to_xml(root_tag) if root_tag is None and el.text is None: join = '' if indent is None else '\n' return join.join(_rootless_xml_elements(el, indent)) else: if indent is not None: ElementTree.indent(el, space=indent) return ElementTree.tostring(el, encoding='unicode') ```` # `pydantic_ai.mcp` ### MCPServer Bases: `AbstractToolset[Any]`, `ABC` Base class for attaching agents to MCP servers. See for more information. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python class MCPServer(AbstractToolset[Any], ABC): """Base class for attaching agents to MCP servers. See for more information. """ tool_prefix: str | None """A prefix to add to all tools that are registered with the server. If not empty, will include a trailing underscore(`_`). e.g. if `tool_prefix='foo'`, then a tool named `bar` will be registered as `foo_bar` """ log_level: mcp_types.LoggingLevel | None """The log level to set when connecting to the server, if any. See for more details. If `None`, no log level will be set. """ log_handler: LoggingFnT | None """A handler for logging messages from the server.""" timeout: float """The timeout in seconds to wait for the client to initialize.""" read_timeout: float """Maximum time in seconds to wait for new messages before timing out. This timeout applies to the long-lived connection after it's established. If no new messages are received within this time, the connection will be considered stale and may be closed. Defaults to 5 minutes (300 seconds). 
""" process_tool_call: ProcessToolCallback | None """Hook to customize tool calling and optionally pass extra metadata.""" allow_sampling: bool """Whether to allow MCP sampling through this client.""" sampling_model: models.Model | None """The model to use for sampling.""" max_retries: int """The maximum number of times to retry a tool call.""" elicitation_callback: ElicitationFnT | None = None """Callback function to handle elicitation requests from the server.""" _id: str | None _enter_lock: Lock = field(compare=False) _running_count: int _exit_stack: AsyncExitStack | None _client: ClientSession _read_stream: MemoryObjectReceiveStream[SessionMessage | Exception] _write_stream: MemoryObjectSendStream[SessionMessage] _server_info: mcp_types.Implementation def __init__( self, tool_prefix: str | None = None, log_level: mcp_types.LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: models.Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, *, id: str | None = None, ): self.tool_prefix = tool_prefix self.log_level = log_level self.log_handler = log_handler self.timeout = timeout self.read_timeout = read_timeout self.process_tool_call = process_tool_call self.allow_sampling = allow_sampling self.sampling_model = sampling_model self.max_retries = max_retries self.elicitation_callback = elicitation_callback self._id = id or tool_prefix self.__post_init__() def __post_init__(self): self._enter_lock = Lock() self._running_count = 0 self._exit_stack = None @abstractmethod @asynccontextmanager async def client_streams( self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: """Create the streams for the MCP server.""" raise NotImplementedError('MCP Server subclasses must implement this method.') yield @property def id(self) -> str | None: return self._id @id.setter def id(self, value: str | None): self._id = value @property def label(self) -> str: if self.id: return super().label # pragma: no cover else: return repr(self) @property def tool_name_conflict_hint(self) -> str: return 'Set the `tool_prefix` attribute to avoid name conflicts.' @property def server_info(self) -> mcp_types.Implementation: """Access the information send by the MCP server during initialization.""" if getattr(self, '_server_info', None) is None: raise AttributeError( f'The `{self.__class__.__name__}.server_info` is only instantiated after initialization.' ) return self._server_info async def list_tools(self) -> list[mcp_types.Tool]: """Retrieve tools that are currently active on the server. Note: - We don't cache tools as they might change. - We also don't subscribe to the server to avoid complexity. """ async with self: # Ensure server is running result = await self._client.list_tools() return result.tools async def direct_call_tool( self, name: str, args: dict[str, Any], metadata: dict[str, Any] | None = None, ) -> ToolResult: """Call a tool on the server. Args: name: The name of the tool to call. args: The arguments to pass to the tool. metadata: Request-level metadata (optional) Returns: The result of the tool call. Raises: ModelRetry: If the tool call fails. 
""" async with self: # Ensure server is running try: result = await self._client.send_request( mcp_types.ClientRequest( mcp_types.CallToolRequest( method='tools/call', params=mcp_types.CallToolRequestParams( name=name, arguments=args, _meta=mcp_types.RequestParams.Meta(**metadata) if metadata else None, ), ) ), mcp_types.CallToolResult, ) except McpError as e: raise exceptions.ModelRetry(e.error.message) if result.isError: message: str | None = None if result.content: # pragma: no branch text_parts = [part.text for part in result.content if isinstance(part, mcp_types.TextContent)] message = '\n'.join(text_parts) raise exceptions.ModelRetry(message or 'MCP tool call failed') # Prefer structured content if there are only text parts, which per the docs would contain the JSON-encoded structured content for backward compatibility. # See https://github.com/modelcontextprotocol/python-sdk#structured-output if (structured := result.structuredContent) and not any( not isinstance(part, mcp_types.TextContent) for part in result.content ): # The MCP SDK wraps primitives and generic types like list in a `result` key, but we want to use the raw value returned by the tool function. # See https://github.com/modelcontextprotocol/python-sdk#structured-output if isinstance(structured, dict) and len(structured) == 1 and 'result' in structured: return structured['result'] return structured mapped = [await self._map_tool_result_part(part) for part in result.content] return mapped[0] if len(mapped) == 1 else mapped async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[Any], tool: ToolsetTool[Any], ) -> ToolResult: if self.tool_prefix: name = name.removeprefix(f'{self.tool_prefix}_') ctx = replace(ctx, tool_name=name) if self.process_tool_call is not None: return await self.process_tool_call(ctx, self.direct_call_tool, name, tool_args) else: return await self.direct_call_tool(name, tool_args) async def get_tools(self, ctx: RunContext[Any]) -> dict[str, ToolsetTool[Any]]: return { name: self.tool_for_tool_def( ToolDefinition( name=name, description=mcp_tool.description, parameters_json_schema=mcp_tool.inputSchema, metadata={ 'meta': mcp_tool.meta, 'annotations': mcp_tool.annotations.model_dump() if mcp_tool.annotations else None, 'output_schema': mcp_tool.outputSchema or None, }, ), ) for mcp_tool in await self.list_tools() if (name := f'{self.tool_prefix}_{mcp_tool.name}' if self.tool_prefix else mcp_tool.name) } def tool_for_tool_def(self, tool_def: ToolDefinition) -> ToolsetTool[Any]: return ToolsetTool( toolset=self, tool_def=tool_def, max_retries=self.max_retries, args_validator=TOOL_SCHEMA_VALIDATOR, ) async def __aenter__(self) -> Self: """Enter the MCP server context. This will initialize the connection to the server. If this server is an [`MCPServerStdio`][pydantic_ai.mcp.MCPServerStdio], the server will first be started as a subprocess. This is a no-op if the MCP server has already been entered. 
""" async with self._enter_lock: if self._running_count == 0: async with AsyncExitStack() as exit_stack: self._read_stream, self._write_stream = await exit_stack.enter_async_context(self.client_streams()) client = ClientSession( read_stream=self._read_stream, write_stream=self._write_stream, sampling_callback=self._sampling_callback if self.allow_sampling else None, elicitation_callback=self.elicitation_callback, logging_callback=self.log_handler, read_timeout_seconds=timedelta(seconds=self.read_timeout), ) self._client = await exit_stack.enter_async_context(client) with anyio.fail_after(self.timeout): result = await self._client.initialize() self._server_info = result.serverInfo if log_level := self.log_level: await self._client.set_logging_level(log_level) self._exit_stack = exit_stack.pop_all() self._running_count += 1 return self async def __aexit__(self, *args: Any) -> bool | None: if self._running_count == 0: raise ValueError('MCPServer.__aexit__ called more times than __aenter__') async with self._enter_lock: self._running_count -= 1 if self._running_count == 0 and self._exit_stack is not None: await self._exit_stack.aclose() self._exit_stack = None @property def is_running(self) -> bool: """Check if the MCP server is running.""" return bool(self._running_count) async def _sampling_callback( self, context: RequestContext[ClientSession, Any], params: mcp_types.CreateMessageRequestParams ) -> mcp_types.CreateMessageResult | mcp_types.ErrorData: """MCP sampling callback.""" if self.sampling_model is None: raise ValueError('Sampling model is not set') # pragma: no cover pai_messages = _mcp.map_from_mcp_params(params) model_settings = models.ModelSettings() if max_tokens := params.maxTokens: # pragma: no branch model_settings['max_tokens'] = max_tokens if temperature := params.temperature: # pragma: no branch model_settings['temperature'] = temperature if stop_sequences := params.stopSequences: # pragma: no branch model_settings['stop_sequences'] = stop_sequences model_response = await model_request(self.sampling_model, pai_messages, model_settings=model_settings) return mcp_types.CreateMessageResult( role='assistant', content=_mcp.map_from_model_response(model_response), model=self.sampling_model.model_name, ) async def _map_tool_result_part( self, part: mcp_types.ContentBlock ) -> str | messages.BinaryContent | dict[str, Any] | list[Any]: # See https://github.com/jlowin/fastmcp/blob/main/docs/servers/tools.mdx#return-values if isinstance(part, mcp_types.TextContent): text = part.text if text.startswith(('[', '{')): try: return pydantic_core.from_json(text) except ValueError: pass return text elif isinstance(part, mcp_types.ImageContent): return messages.BinaryContent(data=base64.b64decode(part.data), media_type=part.mimeType) elif isinstance(part, mcp_types.AudioContent): # NOTE: The FastMCP server doesn't support audio content. # See for more details. 
return messages.BinaryContent( data=base64.b64decode(part.data), media_type=part.mimeType ) # pragma: no cover elif isinstance(part, mcp_types.EmbeddedResource): resource = part.resource return self._get_content(resource) elif isinstance(part, mcp_types.ResourceLink): resource_result: mcp_types.ReadResourceResult = await self._client.read_resource(part.uri) return ( self._get_content(resource_result.contents[0]) if len(resource_result.contents) == 1 else [self._get_content(resource) for resource in resource_result.contents] ) else: assert_never(part) def _get_content( self, resource: mcp_types.TextResourceContents | mcp_types.BlobResourceContents ) -> str | messages.BinaryContent: if isinstance(resource, mcp_types.TextResourceContents): return resource.text elif isinstance(resource, mcp_types.BlobResourceContents): return messages.BinaryContent( data=base64.b64decode(resource.blob), media_type=resource.mimeType or 'application/octet-stream' ) else: assert_never(resource) def __eq__(self, value: object, /) -> bool: return isinstance(value, MCPServer) and self.id == value.id and self.tool_prefix == value.tool_prefix ``` #### tool_prefix ```python tool_prefix: str | None = tool_prefix ``` A prefix to add to all tools that are registered with the server. If not empty, will include a trailing underscore(`_`). e.g. if `tool_prefix='foo'`, then a tool named `bar` will be registered as `foo_bar` #### log_level ```python log_level: LoggingLevel | None = log_level ``` The log level to set when connecting to the server, if any. See for more details. If `None`, no log level will be set. #### log_handler ```python log_handler: LoggingFnT | None = log_handler ``` A handler for logging messages from the server. #### timeout ```python timeout: float = timeout ``` The timeout in seconds to wait for the client to initialize. #### read_timeout ```python read_timeout: float = read_timeout ``` Maximum time in seconds to wait for new messages before timing out. This timeout applies to the long-lived connection after it's established. If no new messages are received within this time, the connection will be considered stale and may be closed. Defaults to 5 minutes (300 seconds). #### process_tool_call ```python process_tool_call: ProcessToolCallback | None = ( process_tool_call ) ``` Hook to customize tool calling and optionally pass extra metadata. #### allow_sampling ```python allow_sampling: bool = allow_sampling ``` Whether to allow MCP sampling through this client. #### sampling_model ```python sampling_model: Model | None = sampling_model ``` The model to use for sampling. #### max_retries ```python max_retries: int = max_retries ``` The maximum number of times to retry a tool call. #### elicitation_callback ```python elicitation_callback: ElicitationFnT | None = ( elicitation_callback ) ``` Callback function to handle elicitation requests from the server. #### client_streams ```python client_streams() -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[ SessionMessage | Exception ], MemoryObjectSendStream[SessionMessage], ] ] ``` Create the streams for the MCP server. 
Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python @abstractmethod @asynccontextmanager async def client_streams( self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: """Create the streams for the MCP server.""" raise NotImplementedError('MCP Server subclasses must implement this method.') yield ``` #### server_info ```python server_info: Implementation ``` Access the information send by the MCP server during initialization. #### list_tools ```python list_tools() -> list[Tool] ``` Retrieve tools that are currently active on the server. Note: - We don't cache tools as they might change. - We also don't subscribe to the server to avoid complexity. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def list_tools(self) -> list[mcp_types.Tool]: """Retrieve tools that are currently active on the server. Note: - We don't cache tools as they might change. - We also don't subscribe to the server to avoid complexity. """ async with self: # Ensure server is running result = await self._client.list_tools() return result.tools ``` #### direct_call_tool ```python direct_call_tool( name: str, args: dict[str, Any], metadata: dict[str, Any] | None = None, ) -> ToolResult ``` Call a tool on the server. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str` | The name of the tool to call. | *required* | | `args` | `dict[str, Any]` | The arguments to pass to the tool. | *required* | | `metadata` | `dict[str, Any] | None` | Request-level metadata (optional) | `None` | Returns: | Type | Description | | --- | --- | | `ToolResult` | The result of the tool call. | Raises: | Type | Description | | --- | --- | | `ModelRetry` | If the tool call fails. | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def direct_call_tool( self, name: str, args: dict[str, Any], metadata: dict[str, Any] | None = None, ) -> ToolResult: """Call a tool on the server. Args: name: The name of the tool to call. args: The arguments to pass to the tool. metadata: Request-level metadata (optional) Returns: The result of the tool call. Raises: ModelRetry: If the tool call fails. """ async with self: # Ensure server is running try: result = await self._client.send_request( mcp_types.ClientRequest( mcp_types.CallToolRequest( method='tools/call', params=mcp_types.CallToolRequestParams( name=name, arguments=args, _meta=mcp_types.RequestParams.Meta(**metadata) if metadata else None, ), ) ), mcp_types.CallToolResult, ) except McpError as e: raise exceptions.ModelRetry(e.error.message) if result.isError: message: str | None = None if result.content: # pragma: no branch text_parts = [part.text for part in result.content if isinstance(part, mcp_types.TextContent)] message = '\n'.join(text_parts) raise exceptions.ModelRetry(message or 'MCP tool call failed') # Prefer structured content if there are only text parts, which per the docs would contain the JSON-encoded structured content for backward compatibility. # See https://github.com/modelcontextprotocol/python-sdk#structured-output if (structured := result.structuredContent) and not any( not isinstance(part, mcp_types.TextContent) for part in result.content ): # The MCP SDK wraps primitives and generic types like list in a `result` key, but we want to use the raw value returned by the tool function. 
# See https://github.com/modelcontextprotocol/python-sdk#structured-output if isinstance(structured, dict) and len(structured) == 1 and 'result' in structured: return structured['result'] return structured mapped = [await self._map_tool_result_part(part) for part in result.content] return mapped[0] if len(mapped) == 1 else mapped ``` #### __aenter__ ```python __aenter__() -> Self ``` Enter the MCP server context. This will initialize the connection to the server. If this server is an MCPServerStdio, the server will first be started as a subprocess. This is a no-op if the MCP server has already been entered. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python async def __aenter__(self) -> Self: """Enter the MCP server context. This will initialize the connection to the server. If this server is an [`MCPServerStdio`][pydantic_ai.mcp.MCPServerStdio], the server will first be started as a subprocess. This is a no-op if the MCP server has already been entered. """ async with self._enter_lock: if self._running_count == 0: async with AsyncExitStack() as exit_stack: self._read_stream, self._write_stream = await exit_stack.enter_async_context(self.client_streams()) client = ClientSession( read_stream=self._read_stream, write_stream=self._write_stream, sampling_callback=self._sampling_callback if self.allow_sampling else None, elicitation_callback=self.elicitation_callback, logging_callback=self.log_handler, read_timeout_seconds=timedelta(seconds=self.read_timeout), ) self._client = await exit_stack.enter_async_context(client) with anyio.fail_after(self.timeout): result = await self._client.initialize() self._server_info = result.serverInfo if log_level := self.log_level: await self._client.set_logging_level(log_level) self._exit_stack = exit_stack.pop_all() self._running_count += 1 return self ``` #### is_running ```python is_running: bool ``` Check if the MCP server is running. ### MCPServerStdio Bases: `MCPServer` Runs an MCP server in a subprocess and communicates with it over stdin/stdout. This class implements the stdio transport from the MCP specification. See for more information. Note Using this class as an async context manager will start the server as a subprocess when entering the context, and stop it when exiting the context. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( # (1)! 'uv', args=['run', 'mcp-run-python', 'stdio'], timeout=10 ) agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` 1. See [MCP Run Python](https://github.com/pydantic/mcp-run-python) for more information. 1. This will start the server as a subprocess and connect to it. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python class MCPServerStdio(MCPServer): """Runs an MCP server in a subprocess and communicates with it over stdin/stdout. This class implements the stdio transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will start the server as a subprocess when entering the context, and stop it when exiting the context. Example: ```python {py="3.10"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( # (1)! 'uv', args=['run', 'mcp-run-python', 'stdio'], timeout=10 ) agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` 1. See [MCP Run Python](https://github.com/pydantic/mcp-run-python) for more information. 2. 
This will start the server as a subprocess and connect to it. """ command: str """The command to run.""" args: Sequence[str] """The arguments to pass to the command.""" env: dict[str, str] | None """The environment variables the CLI server will have access to. By default the subprocess will not inherit any environment variables from the parent process. If you want to inherit the environment variables from the parent process, use `env=os.environ`. """ cwd: str | Path | None """The working directory to use when spawning the process.""" # last fields are re-defined from the parent class so they appear as fields tool_prefix: str | None log_level: mcp_types.LoggingLevel | None log_handler: LoggingFnT | None timeout: float read_timeout: float process_tool_call: ProcessToolCallback | None allow_sampling: bool sampling_model: models.Model | None max_retries: int elicitation_callback: ElicitationFnT | None = None def __init__( self, command: str, args: Sequence[str], *, env: dict[str, str] | None = None, cwd: str | Path | None = None, tool_prefix: str | None = None, log_level: mcp_types.LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: models.Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, id: str | None = None, ): """Build a new MCP server. Args: command: The command to run. args: The arguments to pass to the command. env: The environment variables to set in the subprocess. cwd: The working directory to use when spawning the process. tool_prefix: A prefix to add to all tools that are registered with the server. log_level: The log level to set when connecting to the server, if any. log_handler: A handler for logging messages from the server. timeout: The timeout in seconds to wait for the client to initialize. read_timeout: Maximum time in seconds to wait for new messages before timing out. process_tool_call: Hook to customize tool calling and optionally pass extra metadata. allow_sampling: Whether to allow MCP sampling through this client. sampling_model: The model to use for sampling. max_retries: The maximum number of times to retry a tool call. elicitation_callback: Callback function to handle elicitation requests from the server. id: An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow. 
""" self.command = command self.args = args self.env = env self.cwd = cwd super().__init__( tool_prefix, log_level, log_handler, timeout, read_timeout, process_tool_call, allow_sampling, sampling_model, max_retries, elicitation_callback, id=id, ) @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> CoreSchema: return core_schema.no_info_after_validator_function( lambda dct: MCPServerStdio(**dct), core_schema.typed_dict_schema( { 'command': core_schema.typed_dict_field(core_schema.str_schema()), 'args': core_schema.typed_dict_field(core_schema.list_schema(core_schema.str_schema())), 'env': core_schema.typed_dict_field( core_schema.dict_schema(core_schema.str_schema(), core_schema.str_schema()), required=False, ), } ), ) @asynccontextmanager async def client_streams( self, ) -> AsyncIterator[ tuple[ MemoryObjectReceiveStream[SessionMessage | Exception], MemoryObjectSendStream[SessionMessage], ] ]: server = StdioServerParameters(command=self.command, args=list(self.args), env=self.env, cwd=self.cwd) async with stdio_client(server=server) as (read_stream, write_stream): yield read_stream, write_stream def __repr__(self) -> str: repr_args = [ f'command={self.command!r}', f'args={self.args!r}', ] if self.id: repr_args.append(f'id={self.id!r}') # pragma: lax no cover return f'{self.__class__.__name__}({", ".join(repr_args)})' def __eq__(self, value: object, /) -> bool: return ( super().__eq__(value) and isinstance(value, MCPServerStdio) and self.command == value.command and self.args == value.args and self.env == value.env and self.cwd == value.cwd ) ```` #### __init__ ```python __init__( command: str, args: Sequence[str], *, env: dict[str, str] | None = None, cwd: str | Path | None = None, tool_prefix: str | None = None, log_level: LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, id: str | None = None ) ``` Build a new MCP server. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `command` | `str` | The command to run. | *required* | | `args` | `Sequence[str]` | The arguments to pass to the command. | *required* | | `env` | `dict[str, str] | None` | The environment variables to set in the subprocess. | `None` | | `cwd` | `str | Path | None` | The working directory to use when spawning the process. | `None` | | `tool_prefix` | `str | None` | A prefix to add to all tools that are registered with the server. | `None` | | `log_level` | `LoggingLevel | None` | The log level to set when connecting to the server, if any. | `None` | | `log_handler` | `LoggingFnT | None` | A handler for logging messages from the server. | `None` | | `timeout` | `float` | The timeout in seconds to wait for the client to initialize. | `5` | | `read_timeout` | `float` | Maximum time in seconds to wait for new messages before timing out. | `5 * 60` | | `process_tool_call` | `ProcessToolCallback | None` | Hook to customize tool calling and optionally pass extra metadata. | `None` | | `allow_sampling` | `bool` | Whether to allow MCP sampling through this client. | `True` | | `sampling_model` | `Model | None` | The model to use for sampling. | `None` | | `max_retries` | `int` | The maximum number of times to retry a tool call. 
| `1` | | `elicitation_callback` | `ElicitationFnT | None` | Callback function to handle elicitation requests from the server. | `None` | | `id` | `str | None` | An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python def __init__( self, command: str, args: Sequence[str], *, env: dict[str, str] | None = None, cwd: str | Path | None = None, tool_prefix: str | None = None, log_level: mcp_types.LoggingLevel | None = None, log_handler: LoggingFnT | None = None, timeout: float = 5, read_timeout: float = 5 * 60, process_tool_call: ProcessToolCallback | None = None, allow_sampling: bool = True, sampling_model: models.Model | None = None, max_retries: int = 1, elicitation_callback: ElicitationFnT | None = None, id: str | None = None, ): """Build a new MCP server. Args: command: The command to run. args: The arguments to pass to the command. env: The environment variables to set in the subprocess. cwd: The working directory to use when spawning the process. tool_prefix: A prefix to add to all tools that are registered with the server. log_level: The log level to set when connecting to the server, if any. log_handler: A handler for logging messages from the server. timeout: The timeout in seconds to wait for the client to initialize. read_timeout: Maximum time in seconds to wait for new messages before timing out. process_tool_call: Hook to customize tool calling and optionally pass extra metadata. allow_sampling: Whether to allow MCP sampling through this client. sampling_model: The model to use for sampling. max_retries: The maximum number of times to retry a tool call. elicitation_callback: Callback function to handle elicitation requests from the server. id: An optional unique ID for the MCP server. An MCP server needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the server's activities within the workflow. """ self.command = command self.args = args self.env = env self.cwd = cwd super().__init__( tool_prefix, log_level, log_handler, timeout, read_timeout, process_tool_call, allow_sampling, sampling_model, max_retries, elicitation_callback, id=id, ) ``` #### command ```python command: str = command ``` The command to run. #### args ```python args: Sequence[str] = args ``` The arguments to pass to the command. #### env ```python env: dict[str, str] | None = env ``` The environment variables the CLI server will have access to. By default the subprocess will not inherit any environment variables from the parent process. If you want to inherit the environment variables from the parent process, use `env=os.environ`. #### cwd ```python cwd: str | Path | None = cwd ``` The working directory to use when spawning the process. ### MCPServerSSE Bases: `_MCPServerHTTP` An MCP server that connects over streamable HTTP connections. This class implements the SSE transport from the MCP specification. See for more information. Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. 
Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (1)! ... ``` 1. This will connect to a server running on `localhost:3001`. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python class MCPServerSSE(_MCPServerHTTP): """An MCP server that connects over streamable HTTP connections. This class implements the SSE transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python {py="3.10"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (1)! ... ``` 1. This will connect to a server running on `localhost:3001`. """ @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> CoreSchema: return core_schema.no_info_after_validator_function( lambda dct: MCPServerSSE(**dct), core_schema.typed_dict_schema( { 'url': core_schema.typed_dict_field(core_schema.str_schema()), 'headers': core_schema.typed_dict_field( core_schema.dict_schema(core_schema.str_schema(), core_schema.str_schema()), required=False ), } ), ) @property def _transport_client(self): return sse_client # pragma: no cover def __eq__(self, value: object, /) -> bool: return super().__eq__(value) and isinstance(value, MCPServerSSE) and self.url == value.url ```` ### MCPServerHTTP Bases: `MCPServerSSE` Deprecated The `MCPServerHTTP` class is deprecated, use `MCPServerSSE` instead. An MCP server that connects over HTTP using the old SSE transport. This class implements the SSE transport from the MCP specification. See for more information. Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerHTTP server = MCPServerHTTP('http://localhost:3001/sse') agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` 1. This will connect to a server running on `localhost:3001`. Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python @deprecated('The `MCPServerHTTP` class is deprecated, use `MCPServerSSE` instead.') class MCPServerHTTP(MCPServerSSE): """An MCP server that connects over HTTP using the old SSE transport. This class implements the SSE transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python {py="3.10" test="skip"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerHTTP server = MCPServerHTTP('http://localhost:3001/sse') agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` 1. This will connect to a server running on `localhost:3001`. """ ```` ### MCPServerStreamableHTTP Bases: `_MCPServerHTTP` An MCP server that connects over HTTP using the Streamable HTTP transport. This class implements the Streamable HTTP transport from the MCP specification. See for more information. 
Note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') # (1)! agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ````python class MCPServerStreamableHTTP(_MCPServerHTTP): """An MCP server that connects over HTTP using the Streamable HTTP transport. This class implements the Streamable HTTP transport from the MCP specification. See for more information. !!! note Using this class as an async context manager will create a new pool of HTTP connections to connect to a server which should already be running. Example: ```python {py="3.10"} from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') # (1)! agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: # (2)! ... ``` """ @classmethod def __get_pydantic_core_schema__(cls, _: Any, __: Any) -> CoreSchema: return core_schema.no_info_after_validator_function( lambda dct: MCPServerStreamableHTTP(**dct), core_schema.typed_dict_schema( { 'url': core_schema.typed_dict_field(core_schema.str_schema()), 'headers': core_schema.typed_dict_field( core_schema.dict_schema(core_schema.str_schema(), core_schema.str_schema()), required=False ), } ), ) @property def _transport_client(self): return streamablehttp_client # pragma: no cover def __eq__(self, value: object, /) -> bool: return super().__eq__(value) and isinstance(value, MCPServerStreamableHTTP) and self.url == value.url ```` ### load_mcp_servers ```python load_mcp_servers( config_path: str | Path, ) -> list[ MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE ] ``` Load MCP servers from a configuration file. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `config_path` | `str | Path` | The path to the configuration file. | *required* | Returns: | Type | Description | | --- | --- | | `list[MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE]` | A list of MCP servers. | Raises: | Type | Description | | --- | --- | | `FileNotFoundError` | If the configuration file does not exist. | | `ValidationError` | If the configuration file does not match the schema. | Source code in `pydantic_ai_slim/pydantic_ai/mcp.py` ```python def load_mcp_servers(config_path: str | Path) -> list[MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE]: """Load MCP servers from a configuration file. Args: config_path: The path to the configuration file. Returns: A list of MCP servers. Raises: FileNotFoundError: If the configuration file does not exist. ValidationError: If the configuration file does not match the schema. 
""" config_path = Path(config_path) if not config_path.exists(): raise FileNotFoundError(f'Config file {config_path} not found') config = MCPServerConfig.model_validate_json(config_path.read_bytes()) servers: list[MCPServerStdio | MCPServerStreamableHTTP | MCPServerSSE] = [] for name, server in config.mcp_servers.items(): server.id = name server.tool_prefix = name servers.append(server) return servers ``` # `pydantic_ai.messages` The structure of ModelMessage can be shown as a graph: ``` graph RL SystemPromptPart(SystemPromptPart) --- ModelRequestPart UserPromptPart(UserPromptPart) --- ModelRequestPart ToolReturnPart(ToolReturnPart) --- ModelRequestPart RetryPromptPart(RetryPromptPart) --- ModelRequestPart TextPart(TextPart) --- ModelResponsePart ToolCallPart(ToolCallPart) --- ModelResponsePart ThinkingPart(ThinkingPart) --- ModelResponsePart ModelRequestPart("ModelRequestPart
(Union)") --- ModelRequest ModelRequest("ModelRequest(parts=list[...])") --- ModelMessage ModelResponsePart("ModelResponsePart
(Union)") --- ModelResponse ModelResponse("ModelResponse(parts=list[...])") --- ModelMessage("ModelMessage
(Union)") ``` ### FinishReason ```python FinishReason: TypeAlias = Literal[ "stop", "length", "content_filter", "tool_call", "error" ] ``` Reason the model finished generating the response, normalized to OpenTelemetry values. ### SystemPromptPart A system prompt, generally written by the application developer. This gives the model context and guidance on how to respond. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class SystemPromptPart: """A system prompt, generally written by the application developer. This gives the model context and guidance on how to respond. """ content: str """The content of the prompt.""" _: KW_ONLY timestamp: datetime = field(default_factory=_now_utc) """The timestamp of the prompt.""" dynamic_ref: str | None = None """The ref of the dynamic system prompt function that generated this part. Only set if system prompt is dynamic, see [`system_prompt`][pydantic_ai.Agent.system_prompt] for more information. """ part_kind: Literal['system-prompt'] = 'system-prompt' """Part type identifier, this is available on all parts as a discriminator.""" def otel_event(self, settings: InstrumentationSettings) -> Event: return Event( 'gen_ai.system.message', body={'role': 'system', **({'content': self.content} if settings.include_content else {})}, ) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: return [_otel_messages.TextPart(type='text', **{'content': self.content} if settings.include_content else {})] __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str ``` The content of the prompt. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp of the prompt. #### dynamic_ref ```python dynamic_ref: str | None = None ``` The ref of the dynamic system prompt function that generated this part. Only set if system prompt is dynamic, see system_prompt for more information. #### part_kind ```python part_kind: Literal['system-prompt'] = 'system-prompt' ``` Part type identifier, this is available on all parts as a discriminator. ### FileUrl Bases: `ABC` Abstract base class for any URL-based file. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class FileUrl(ABC): """Abstract base class for any URL-based file.""" url: str """The URL of the file.""" _: KW_ONLY force_download: bool = False """For OpenAI and Google APIs it: * If True, the file is downloaded and the data is sent to the model as bytes. * If False, the URL is sent directly to the model and no download is performed. """ vendor_metadata: dict[str, Any] | None = None """Vendor-specific metadata for the file. 
Supported by: - `GoogleModel`: `VideoUrl.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing - `OpenAIChatModel`, `OpenAIResponsesModel`: `ImageUrl.vendor_metadata['detail']` is used as `detail` setting for images """ _media_type: Annotated[str | None, pydantic.Field(alias='media_type', default=None, exclude=True)] = field( compare=False, default=None ) _identifier: Annotated[str | None, pydantic.Field(alias='identifier', default=None, exclude=True)] = field( compare=False, default=None ) def __init__( self, url: str, *, media_type: str | None = None, identifier: str | None = None, force_download: bool = False, vendor_metadata: dict[str, Any] | None = None, ) -> None: self.url = url self._media_type = media_type self._identifier = identifier self.force_download = force_download self.vendor_metadata = vendor_metadata @pydantic.computed_field @property def media_type(self) -> str: """Return the media type of the file, based on the URL or the provided `media_type`.""" return self._media_type or self._infer_media_type() @pydantic.computed_field @property def identifier(self) -> str: """The identifier of the file, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `FileUrl`. This identifier is only automatically passed to the model when the `FileUrl` is returned by a tool. If you're passing the `FileUrl` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `FileUrl`. It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files. """ return self._identifier or _multi_modal_content_identifier(self.url) @abstractmethod def _infer_media_type(self) -> str: """Infer the media type of the file based on the URL.""" raise NotImplementedError @property @abstractmethod def format(self) -> str: """The file format.""" raise NotImplementedError __repr__ = _utils.dataclasses_no_defaults_repr ``` #### url ```python url: str = url ``` The URL of the file. #### force_download ```python force_download: bool = force_download ``` For OpenAI and Google APIs it: - If True, the file is downloaded and the data is sent to the model as bytes. - If False, the URL is sent directly to the model and no download is performed. #### vendor_metadata ```python vendor_metadata: dict[str, Any] | None = vendor_metadata ``` Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `VideoUrl.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing - `OpenAIChatModel`, `OpenAIResponsesModel`: `ImageUrl.vendor_metadata['detail']` is used as `detail` setting for images #### media_type ```python media_type: str ``` Return the media type of the file, based on the URL or the provided `media_type`. #### identifier ```python identifier: str ``` The identifier of the file, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `FileUrl`. This identifier is only automatically passed to the model when the `FileUrl` is returned by a tool. 
If you're passing the `FileUrl` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `FileUrl`. It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files. #### format ```python format: str ``` The file format. ### VideoUrl Bases: `FileUrl` A URL to a video. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class VideoUrl(FileUrl): """A URL to a video.""" url: str """The URL of the video.""" _: KW_ONLY kind: Literal['video-url'] = 'video-url' """Type identifier, this is available on all parts as a discriminator.""" def __init__( self, url: str, *, media_type: str | None = None, identifier: str | None = None, force_download: bool = False, vendor_metadata: dict[str, Any] | None = None, kind: Literal['video-url'] = 'video-url', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, _identifier: str | None = None, ) -> None: super().__init__( url=url, force_download=force_download, vendor_metadata=vendor_metadata, media_type=media_type or _media_type, identifier=identifier or _identifier, ) self.kind = kind def _infer_media_type(self) -> VideoMediaType: """Return the media type of the video, based on the url.""" if self.url.endswith('.mkv'): return 'video/x-matroska' elif self.url.endswith('.mov'): return 'video/quicktime' elif self.url.endswith('.mp4'): return 'video/mp4' elif self.url.endswith('.webm'): return 'video/webm' elif self.url.endswith('.flv'): return 'video/x-flv' elif self.url.endswith(('.mpeg', '.mpg')): return 'video/mpeg' elif self.url.endswith('.wmv'): return 'video/x-ms-wmv' elif self.url.endswith('.three_gp'): return 'video/3gpp' # Assume that YouTube videos are mp4 because there would be no extension # to infer from. This should not be a problem, as Gemini disregards media # type for YouTube URLs. elif self.is_youtube: return 'video/mp4' else: raise ValueError( f'Could not infer media type from video URL: {self.url}. Explicitly provide a `media_type` instead.' ) @property def is_youtube(self) -> bool: """True if the URL has a YouTube domain.""" return self.url.startswith(('https://youtu.be/', 'https://youtube.com/', 'https://www.youtube.com/')) @property def format(self) -> VideoFormat: """The file format of the video. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. """ return _video_format_lookup[self.media_type] ``` #### url ```python url: str ``` The URL of the video. #### kind ```python kind: Literal['video-url'] = kind ``` Type identifier, this is available on all parts as a discriminator. #### is_youtube ```python is_youtube: bool ``` True if the URL has a YouTube domain. #### format ```python format: VideoFormat ``` The file format of the video. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. ### AudioUrl Bases: `FileUrl` A URL to an audio file. 
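As a brief usage sketch (the URLs below are hypothetical), an `AudioUrl` can be constructed directly; its `media_type` is computed on access from the file extension, and an explicit `media_type` argument covers URLs with no recognisable extension:

```python
from pydantic_ai.messages import AudioUrl

# The media type is inferred from the file extension ('.mp3' -> 'audio/mpeg').
audio = AudioUrl('https://example.com/podcast-episode.mp3')
print(audio.media_type)  # 'audio/mpeg'
print(audio.format)      # the matching audio format, e.g. 'mp3'

# For a URL without a recognisable extension, inference would raise ValueError
# when `media_type` is accessed, so pass it explicitly instead.
stream = AudioUrl('https://example.com/stream', media_type='audio/wav')
print(stream.media_type)  # 'audio/wav'
```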
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class AudioUrl(FileUrl): """A URL to an audio file.""" url: str """The URL of the audio file.""" _: KW_ONLY kind: Literal['audio-url'] = 'audio-url' """Type identifier, this is available on all parts as a discriminator.""" def __init__( self, url: str, *, media_type: str | None = None, identifier: str | None = None, force_download: bool = False, vendor_metadata: dict[str, Any] | None = None, kind: Literal['audio-url'] = 'audio-url', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, _identifier: str | None = None, ) -> None: super().__init__( url=url, force_download=force_download, vendor_metadata=vendor_metadata, media_type=media_type or _media_type, identifier=identifier or _identifier, ) self.kind = kind def _infer_media_type(self) -> AudioMediaType: """Return the media type of the audio file, based on the url. References: - Gemini: https://ai.google.dev/gemini-api/docs/audio#supported-formats """ if self.url.endswith('.mp3'): return 'audio/mpeg' if self.url.endswith('.wav'): return 'audio/wav' if self.url.endswith('.flac'): return 'audio/flac' if self.url.endswith('.oga'): return 'audio/ogg' if self.url.endswith('.aiff'): return 'audio/aiff' if self.url.endswith('.aac'): return 'audio/aac' raise ValueError( f'Could not infer media type from audio URL: {self.url}. Explicitly provide a `media_type` instead.' ) @property def format(self) -> AudioFormat: """The file format of the audio file.""" return _audio_format_lookup[self.media_type] ``` #### url ```python url: str ``` The URL of the audio file. #### kind ```python kind: Literal['audio-url'] = kind ``` Type identifier, this is available on all parts as a discriminator. #### format ```python format: AudioFormat ``` The file format of the audio file. ### ImageUrl Bases: `FileUrl` A URL to an image. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class ImageUrl(FileUrl): """A URL to an image.""" url: str """The URL of the image.""" _: KW_ONLY kind: Literal['image-url'] = 'image-url' """Type identifier, this is available on all parts as a discriminator.""" def __init__( self, url: str, *, media_type: str | None = None, identifier: str | None = None, force_download: bool = False, vendor_metadata: dict[str, Any] | None = None, kind: Literal['image-url'] = 'image-url', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, _identifier: str | None = None, ) -> None: super().__init__( url=url, force_download=force_download, vendor_metadata=vendor_metadata, media_type=media_type or _media_type, identifier=identifier or _identifier, ) self.kind = kind def _infer_media_type(self) -> ImageMediaType: """Return the media type of the image, based on the url.""" if self.url.endswith(('.jpg', '.jpeg')): return 'image/jpeg' elif self.url.endswith('.png'): return 'image/png' elif self.url.endswith('.gif'): return 'image/gif' elif self.url.endswith('.webp'): return 'image/webp' else: raise ValueError( f'Could not infer media type from image URL: {self.url}. Explicitly provide a `media_type` instead.' ) @property def format(self) -> ImageFormat: """The file format of the image. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. 
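A minimal sketch (hypothetical URL) showing media-type inference for images and the `vendor_metadata['detail']` hook noted under `FileUrl.vendor_metadata`:

```python
from pydantic_ai.messages import ImageUrl

# The media type is inferred from the extension ('.png' -> 'image/png').
image = ImageUrl('https://example.com/chart.png')
print(image.media_type)  # 'image/png'

# For OpenAIChatModel / OpenAIResponsesModel, the 'detail' key is forwarded
# as the `detail` setting for the image.
hi_res = ImageUrl('https://example.com/chart.png', vendor_metadata={'detail': 'high'})
print(hi_res.vendor_metadata)  # {'detail': 'high'}
```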
""" return _image_format_lookup[self.media_type] ``` #### url ```python url: str ``` The URL of the image. #### kind ```python kind: Literal['image-url'] = kind ``` Type identifier, this is available on all parts as a discriminator. #### format ```python format: ImageFormat ``` The file format of the image. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. ### DocumentUrl Bases: `FileUrl` The URL of the document. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class DocumentUrl(FileUrl): """The URL of the document.""" url: str """The URL of the document.""" _: KW_ONLY kind: Literal['document-url'] = 'document-url' """Type identifier, this is available on all parts as a discriminator.""" def __init__( self, url: str, *, media_type: str | None = None, identifier: str | None = None, force_download: bool = False, vendor_metadata: dict[str, Any] | None = None, kind: Literal['document-url'] = 'document-url', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _media_type: str | None = None, _identifier: str | None = None, ) -> None: super().__init__( url=url, force_download=force_download, vendor_metadata=vendor_metadata, media_type=media_type or _media_type, identifier=identifier or _identifier, ) self.kind = kind def _infer_media_type(self) -> str: """Return the media type of the document, based on the url.""" # Common document types are hardcoded here as mime-type support for these # extensions varies across operating systems. if self.url.endswith(('.md', '.mdx', '.markdown')): return 'text/markdown' elif self.url.endswith('.asciidoc'): return 'text/x-asciidoc' elif self.url.endswith('.txt'): return 'text/plain' elif self.url.endswith('.pdf'): return 'application/pdf' elif self.url.endswith('.rtf'): return 'application/rtf' elif self.url.endswith('.docx'): return 'application/vnd.openxmlformats-officedocument.wordprocessingml.document' elif self.url.endswith('.xlsx'): return 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet' type_, _ = guess_type(self.url) if type_ is None: raise ValueError( f'Could not infer media type from document URL: {self.url}. Explicitly provide a `media_type` instead.' ) return type_ @property def format(self) -> DocumentFormat: """The file format of the document. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. """ media_type = self.media_type try: return _document_format_lookup[media_type] except KeyError as e: raise ValueError(f'Unknown document media type: {media_type}') from e ``` #### url ```python url: str ``` The URL of the document. #### kind ```python kind: Literal['document-url'] = kind ``` Type identifier, this is available on all parts as a discriminator. #### format ```python format: DocumentFormat ``` The file format of the document. The choice of supported formats were based on the Bedrock Converse API. Other APIs don't require to use a format. ### BinaryContent Binary content, e.g. an audio or image file. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(init=False, repr=False) class BinaryContent: """Binary content, e.g. 
an audio or image file.""" data: bytes """The binary data.""" _: KW_ONLY media_type: AudioMediaType | ImageMediaType | DocumentMediaType | str """The media type of the binary data.""" vendor_metadata: dict[str, Any] | None = None """Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `BinaryContent.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing - `OpenAIChatModel`, `OpenAIResponsesModel`: `BinaryContent.vendor_metadata['detail']` is used as `detail` setting for images """ _identifier: Annotated[str | None, pydantic.Field(alias='identifier', default=None, exclude=True)] = field( compare=False, default=None ) kind: Literal['binary'] = 'binary' """Type identifier, this is available on all parts as a discriminator.""" def __init__( self, data: bytes, *, media_type: AudioMediaType | ImageMediaType | DocumentMediaType | str, identifier: str | None = None, vendor_metadata: dict[str, Any] | None = None, kind: Literal['binary'] = 'binary', # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. _identifier: str | None = None, ) -> None: self.data = data self.media_type = media_type self._identifier = identifier or _identifier self.vendor_metadata = vendor_metadata self.kind = kind @staticmethod def narrow_type(bc: BinaryContent) -> BinaryContent | BinaryImage: """Narrow the type of the `BinaryContent` to `BinaryImage` if it's an image.""" if bc.is_image: return BinaryImage( data=bc.data, media_type=bc.media_type, identifier=bc.identifier, vendor_metadata=bc.vendor_metadata, ) else: return bc # pragma: no cover @classmethod def from_data_uri(cls, data_uri: str) -> Self: """Create a `BinaryContent` from a data URI.""" prefix = 'data:' if not data_uri.startswith(prefix): raise ValueError('Data URI must start with "data:"') # pragma: no cover media_type, data = data_uri[len(prefix) :].split(';base64,', 1) return cls(data=base64.b64decode(data), media_type=media_type) @pydantic.computed_field @property def identifier(self) -> str: """Identifier for the binary content, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `BinaryContent`. This identifier is only automatically passed to the model when the `BinaryContent` is returned by a tool. If you're passing the `BinaryContent` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `BinaryContent`. It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files. 
""" return self._identifier or _multi_modal_content_identifier(self.data) @property def data_uri(self) -> str: """Convert the `BinaryContent` to a data URI.""" return f'data:{self.media_type};base64,{base64.b64encode(self.data).decode()}' @property def is_audio(self) -> bool: """Return `True` if the media type is an audio type.""" return self.media_type.startswith('audio/') @property def is_image(self) -> bool: """Return `True` if the media type is an image type.""" return self.media_type.startswith('image/') @property def is_video(self) -> bool: """Return `True` if the media type is a video type.""" return self.media_type.startswith('video/') @property def is_document(self) -> bool: """Return `True` if the media type is a document type.""" return self.media_type in _document_format_lookup @property def format(self) -> str: """The file format of the binary content.""" try: if self.is_audio: return _audio_format_lookup[self.media_type] elif self.is_image: return _image_format_lookup[self.media_type] elif self.is_video: return _video_format_lookup[self.media_type] else: return _document_format_lookup[self.media_type] except KeyError as e: raise ValueError(f'Unknown media type: {self.media_type}') from e __repr__ = _utils.dataclasses_no_defaults_repr ``` #### data ```python data: bytes = data ``` The binary data. #### media_type ```python media_type: ( AudioMediaType | ImageMediaType | DocumentMediaType | str ) = media_type ``` The media type of the binary data. #### vendor_metadata ```python vendor_metadata: dict[str, Any] | None = vendor_metadata ``` Vendor-specific metadata for the file. Supported by: - `GoogleModel`: `BinaryContent.vendor_metadata` is used as `video_metadata`: https://ai.google.dev/gemini-api/docs/video-understanding#customize-video-processing - `OpenAIChatModel`, `OpenAIResponsesModel`: `BinaryContent.vendor_metadata['detail']` is used as `detail` setting for images #### kind ```python kind: Literal['binary'] = kind ``` Type identifier, this is available on all parts as a discriminator. #### narrow_type ```python narrow_type( bc: BinaryContent, ) -> BinaryContent | BinaryImage ``` Narrow the type of the `BinaryContent` to `BinaryImage` if it's an image. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @staticmethod def narrow_type(bc: BinaryContent) -> BinaryContent | BinaryImage: """Narrow the type of the `BinaryContent` to `BinaryImage` if it's an image.""" if bc.is_image: return BinaryImage( data=bc.data, media_type=bc.media_type, identifier=bc.identifier, vendor_metadata=bc.vendor_metadata, ) else: return bc # pragma: no cover ``` #### from_data_uri ```python from_data_uri(data_uri: str) -> Self ``` Create a `BinaryContent` from a data URI. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @classmethod def from_data_uri(cls, data_uri: str) -> Self: """Create a `BinaryContent` from a data URI.""" prefix = 'data:' if not data_uri.startswith(prefix): raise ValueError('Data URI must start with "data:"') # pragma: no cover media_type, data = data_uri[len(prefix) :].split(';base64,', 1) return cls(data=base64.b64decode(data), media_type=media_type) ``` #### identifier ```python identifier: str ``` Identifier for the binary content, such as a unique ID. This identifier can be provided to the model in a message to allow it to refer to this file in a tool call argument, and the tool can look up the file in question by iterating over the message history and finding the matching `BinaryContent`. 
This identifier is only automatically passed to the model when the `BinaryContent` is returned by a tool. If you're passing the `BinaryContent` as a user message, it's up to you to include a separate text part with the identifier, e.g. "This is file :" preceding the `BinaryContent`. It's also included in inline-text delimiters for providers that require inlining text documents, so the model can distinguish multiple files. #### data_uri ```python data_uri: str ``` Convert the `BinaryContent` to a data URI. #### is_audio ```python is_audio: bool ``` Return `True` if the media type is an audio type. #### is_image ```python is_image: bool ``` Return `True` if the media type is an image type. #### is_video ```python is_video: bool ``` Return `True` if the media type is a video type. #### is_document ```python is_document: bool ``` Return `True` if the media type is a document type. #### format ```python format: str ``` The file format of the binary content. ### BinaryImage Bases: `BinaryContent` Binary content that's guaranteed to be an image. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python class BinaryImage(BinaryContent): """Binary content that's guaranteed to be an image.""" def __init__( self, data: bytes, *, media_type: str, identifier: str | None = None, vendor_metadata: dict[str, Any] | None = None, # Required for inline-snapshot which expects all dataclass `__init__` methods to take all field names as kwargs. kind: Literal['binary'] = 'binary', _identifier: str | None = None, ): super().__init__( data=data, media_type=media_type, identifier=identifier or _identifier, vendor_metadata=vendor_metadata ) if not self.is_image: raise ValueError('`BinaryImage` must be have a media type that starts with "image/"') # pragma: no cover ``` ### ToolReturn A structured return value for tools that need to provide both a return value and custom content to the model. This class allows tools to return complex responses that include: - A return value for actual tool return - Custom content (including multi-modal content) to be sent to the model as a UserPromptPart - Optional metadata for application use Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ToolReturn: """A structured return value for tools that need to provide both a return value and custom content to the model. This class allows tools to return complex responses that include: - A return value for actual tool return - Custom content (including multi-modal content) to be sent to the model as a UserPromptPart - Optional metadata for application use """ return_value: Any """The return value to be used in the tool response.""" _: KW_ONLY content: str | Sequence[UserContent] | None = None """The content to be sent to the model as a UserPromptPart.""" metadata: Any = None """Additional data that can be accessed programmatically by the application but is not sent to the LLM.""" kind: Literal['tool-return'] = 'tool-return' __repr__ = _utils.dataclasses_no_defaults_repr ``` #### return_value ```python return_value: Any ``` The return value to be used in the tool response. #### content ```python content: str | Sequence[UserContent] | None = None ``` The content to be sent to the model as a UserPromptPart. #### metadata ```python metadata: Any = None ``` Additional data that can be accessed programmatically by the application but is not sent to the LLM. ### UserPromptPart A user prompt, generally written by the end user. 
Content comes from the `user_prompt` parameter of Agent.run, Agent.run_sync, and Agent.run_stream. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class UserPromptPart: """A user prompt, generally written by the end user. Content comes from the `user_prompt` parameter of [`Agent.run`][pydantic_ai.agent.AbstractAgent.run], [`Agent.run_sync`][pydantic_ai.agent.AbstractAgent.run_sync], and [`Agent.run_stream`][pydantic_ai.agent.AbstractAgent.run_stream]. """ content: str | Sequence[UserContent] """The content of the prompt.""" _: KW_ONLY timestamp: datetime = field(default_factory=_now_utc) """The timestamp of the prompt.""" part_kind: Literal['user-prompt'] = 'user-prompt' """Part type identifier, this is available on all parts as a discriminator.""" def otel_event(self, settings: InstrumentationSettings) -> Event: content = [{'kind': part.pop('type'), **part} for part in self.otel_message_parts(settings)] for part in content: if part['kind'] == 'binary' and 'content' in part: part['binary_content'] = part.pop('content') content = [ part['content'] if part == {'kind': 'text', 'content': part.get('content')} else part for part in content ] if content in ([{'kind': 'text'}], [self.content]): content = content[0] return Event('gen_ai.user.message', body={'content': content, 'role': 'user'}) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: parts: list[_otel_messages.MessagePart] = [] content: Sequence[UserContent] = [self.content] if isinstance(self.content, str) else self.content for part in content: if isinstance(part, str): parts.append( _otel_messages.TextPart(type='text', **({'content': part} if settings.include_content else {})) ) elif isinstance(part, ImageUrl | AudioUrl | DocumentUrl | VideoUrl): parts.append( _otel_messages.MediaUrlPart( type=part.kind, **{'url': part.url} if settings.include_content else {}, ) ) elif isinstance(part, BinaryContent): converted_part = _otel_messages.BinaryDataPart(type='binary', media_type=part.media_type) if settings.include_content and settings.include_binary_content: converted_part['content'] = base64.b64encode(part.data).decode() parts.append(converted_part) else: parts.append({'type': part.kind}) # pragma: no cover return parts __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str | Sequence[UserContent] ``` The content of the prompt. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp of the prompt. #### part_kind ```python part_kind: Literal['user-prompt'] = 'user-prompt' ``` Part type identifier, this is available on all parts as a discriminator. ### BaseToolReturnPart Base class for tool return parts. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BaseToolReturnPart: """Base class for tool return parts.""" tool_name: str """The name of the "tool" was called.""" content: Any """The return value.""" tool_call_id: str = field(default_factory=_generate_tool_call_id) """The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. 
""" _: KW_ONLY metadata: Any = None """Additional data that can be accessed programmatically by the application but is not sent to the LLM.""" timestamp: datetime = field(default_factory=_now_utc) """The timestamp, when the tool returned.""" def model_response_str(self) -> str: """Return a string representation of the content for the model.""" if isinstance(self.content, str): return self.content else: return tool_return_ta.dump_json(self.content).decode() def model_response_object(self) -> dict[str, Any]: """Return a dictionary representation of the content, wrapping non-dict types appropriately.""" # gemini supports JSON dict return values, but no other JSON types, hence we wrap anything else in a dict if isinstance(self.content, dict): return tool_return_ta.dump_python(self.content, mode='json') # pyright: ignore[reportUnknownMemberType] else: return {'return_value': tool_return_ta.dump_python(self.content, mode='json')} def otel_event(self, settings: InstrumentationSettings) -> Event: return Event( 'gen_ai.tool.message', body={ **({'content': self.content} if settings.include_content else {}), 'role': 'tool', 'id': self.tool_call_id, 'name': self.tool_name, }, ) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: from .models.instrumented import InstrumentedModel part = _otel_messages.ToolCallResponsePart( type='tool_call_response', id=self.tool_call_id, name=self.tool_name, ) if settings.include_content and self.content is not None: part['result'] = InstrumentedModel.serialize_any(self.content) return [part] def has_content(self) -> bool: """Return `True` if the tool return has content.""" return self.content is not None # pragma: no cover __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name ```python tool_name: str ``` The name of the "tool" was called. #### content ```python content: Any ``` The return value. #### tool_call_id ```python tool_call_id: str = field( default_factory=generate_tool_call_id ) ``` The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. #### metadata ```python metadata: Any = None ``` Additional data that can be accessed programmatically by the application but is not sent to the LLM. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp, when the tool returned. #### model_response_str ```python model_response_str() -> str ``` Return a string representation of the content for the model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def model_response_str(self) -> str: """Return a string representation of the content for the model.""" if isinstance(self.content, str): return self.content else: return tool_return_ta.dump_json(self.content).decode() ``` #### model_response_object ```python model_response_object() -> dict[str, Any] ``` Return a dictionary representation of the content, wrapping non-dict types appropriately. 
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def model_response_object(self) -> dict[str, Any]: """Return a dictionary representation of the content, wrapping non-dict types appropriately.""" # gemini supports JSON dict return values, but no other JSON types, hence we wrap anything else in a dict if isinstance(self.content, dict): return tool_return_ta.dump_python(self.content, mode='json') # pyright: ignore[reportUnknownMemberType] else: return {'return_value': tool_return_ta.dump_python(self.content, mode='json')} ``` #### has_content ```python has_content() -> bool ``` Return `True` if the tool return has content. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the tool return has content.""" return self.content is not None # pragma: no cover ``` ### ToolReturnPart Bases: `BaseToolReturnPart` A tool return message, this encodes the result of running a tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ToolReturnPart(BaseToolReturnPart): """A tool return message, this encodes the result of running a tool.""" _: KW_ONLY part_kind: Literal['tool-return'] = 'tool-return' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### part_kind ```python part_kind: Literal['tool-return'] = 'tool-return' ``` Part type identifier, this is available on all parts as a discriminator. ### BuiltinToolReturnPart Bases: `BaseToolReturnPart` A tool return message from a built-in tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BuiltinToolReturnPart(BaseToolReturnPart): """A tool return message from a built-in tool.""" _: KW_ONLY provider_name: str | None = None """The name of the provider that generated the response.""" part_kind: Literal['builtin-tool-return'] = 'builtin-tool-return' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. #### part_kind ```python part_kind: Literal["builtin-tool-return"] = ( "builtin-tool-return" ) ``` Part type identifier, this is available on all parts as a discriminator. ### RetryPromptPart A message back to a model asking it to try again. This can be sent for a number of reasons: - Pydantic validation of tool arguments failed, here content is derived from a Pydantic ValidationError - a tool raised a ModelRetry exception - no tool was found for the tool name - the model returned plain text when a structured response was expected - Pydantic validation of a structured response failed, here content is derived from a Pydantic ValidationError - an output validator raised a ModelRetry exception Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class RetryPromptPart: """A message back to a model asking it to try again. 
This can be sent for a number of reasons: * Pydantic validation of tool arguments failed, here content is derived from a Pydantic [`ValidationError`][pydantic_core.ValidationError] * a tool raised a [`ModelRetry`][pydantic_ai.exceptions.ModelRetry] exception * no tool was found for the tool name * the model returned plain text when a structured response was expected * Pydantic validation of a structured response failed, here content is derived from a Pydantic [`ValidationError`][pydantic_core.ValidationError] * an output validator raised a [`ModelRetry`][pydantic_ai.exceptions.ModelRetry] exception """ content: list[pydantic_core.ErrorDetails] | str """Details of why and how the model should retry. If the retry was triggered by a [`ValidationError`][pydantic_core.ValidationError], this will be a list of error details. """ _: KW_ONLY tool_name: str | None = None """The name of the tool that was called, if any.""" tool_call_id: str = field(default_factory=_generate_tool_call_id) """The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. """ timestamp: datetime = field(default_factory=_now_utc) """The timestamp, when the retry was triggered.""" part_kind: Literal['retry-prompt'] = 'retry-prompt' """Part type identifier, this is available on all parts as a discriminator.""" def model_response(self) -> str: """Return a string message describing why the retry is requested.""" if isinstance(self.content, str): if self.tool_name is None: description = f'Validation feedback:\n{self.content}' else: description = self.content else: json_errors = error_details_ta.dump_json(self.content, exclude={'__all__': {'ctx'}}, indent=2) description = f'{len(self.content)} validation errors: {json_errors.decode()}' return f'{description}\n\nFix the errors and try again.' def otel_event(self, settings: InstrumentationSettings) -> Event: if self.tool_name is None: return Event('gen_ai.user.message', body={'content': self.model_response(), 'role': 'user'}) else: return Event( 'gen_ai.tool.message', body={ **({'content': self.model_response()} if settings.include_content else {}), 'role': 'tool', 'id': self.tool_call_id, 'name': self.tool_name, }, ) def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: if self.tool_name is None: return [_otel_messages.TextPart(type='text', content=self.model_response())] else: part = _otel_messages.ToolCallResponsePart( type='tool_call_response', id=self.tool_call_id, name=self.tool_name, ) if settings.include_content: part['result'] = self.model_response() return [part] __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: list[ErrorDetails] | str ``` Details of why and how the model should retry. If the retry was triggered by a ValidationError, this will be a list of error details. #### tool_name ```python tool_name: str | None = None ``` The name of the tool that was called, if any. #### tool_call_id ```python tool_call_id: str = field( default_factory=generate_tool_call_id ) ``` The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp, when the retry was triggered. 
#### part_kind ```python part_kind: Literal['retry-prompt'] = 'retry-prompt' ``` Part type identifier, this is available on all parts as a discriminator. #### model_response ```python model_response() -> str ``` Return a string message describing why the retry is requested. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def model_response(self) -> str: """Return a string message describing why the retry is requested.""" if isinstance(self.content, str): if self.tool_name is None: description = f'Validation feedback:\n{self.content}' else: description = self.content else: json_errors = error_details_ta.dump_json(self.content, exclude={'__all__': {'ctx'}}, indent=2) description = f'{len(self.content)} validation errors: {json_errors.decode()}' return f'{description}\n\nFix the errors and try again.' ``` ### ModelRequestPart ```python ModelRequestPart = Annotated[ SystemPromptPart | UserPromptPart | ToolReturnPart | RetryPromptPart, Discriminator("part_kind"), ] ``` A message part sent by Pydantic AI to a model. ### ModelRequest A request generated by Pydantic AI and sent to a model, e.g. a message from the Pydantic AI app to the model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ModelRequest: """A request generated by Pydantic AI and sent to a model, e.g. a message from the Pydantic AI app to the model.""" parts: Sequence[ModelRequestPart] """The parts of the user message.""" _: KW_ONLY instructions: str | None = None """The instructions for the model.""" kind: Literal['request'] = 'request' """Message type identifier, this is available on all parts as a discriminator.""" @classmethod def user_text_prompt(cls, user_prompt: str, *, instructions: str | None = None) -> ModelRequest: """Create a `ModelRequest` with a single user prompt as text.""" return cls(parts=[UserPromptPart(user_prompt)], instructions=instructions) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### parts ```python parts: Sequence[ModelRequestPart] ``` The parts of the user message. #### instructions ```python instructions: str | None = None ``` The instructions for the model. #### kind ```python kind: Literal['request'] = 'request' ``` Message type identifier, this is available on all parts as a discriminator. #### user_text_prompt ```python user_text_prompt( user_prompt: str, *, instructions: str | None = None ) -> ModelRequest ``` Create a `ModelRequest` with a single user prompt as text. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @classmethod def user_text_prompt(cls, user_prompt: str, *, instructions: str | None = None) -> ModelRequest: """Create a `ModelRequest` with a single user prompt as text.""" return cls(parts=[UserPromptPart(user_prompt)], instructions=instructions) ``` ### TextPart A plain text response from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class TextPart: """A plain text response from a model.""" content: str """The text content of the response.""" _: KW_ONLY id: str | None = None """An optional identifier of the text part.""" part_kind: Literal['text'] = 'text' """Part type identifier, this is available on all parts as a discriminator.""" def has_content(self) -> bool: """Return `True` if the text content is non-empty.""" return bool(self.content) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str ``` The text content of the response. 
#### id ```python id: str | None = None ``` An optional identifier of the text part. #### part_kind ```python part_kind: Literal['text'] = 'text' ``` Part type identifier, this is available on all parts as a discriminator. #### has_content ```python has_content() -> bool ``` Return `True` if the text content is non-empty. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the text content is non-empty.""" return bool(self.content) ``` ### ThinkingPart A thinking response from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ThinkingPart: """A thinking response from a model.""" content: str """The thinking content of the response.""" _: KW_ONLY id: str | None = None """The identifier of the thinking part.""" signature: str | None = None """The signature of the thinking. Supported by: * Anthropic (corresponds to the `signature` field) * Bedrock (corresponds to the `signature` field) * Google (corresponds to the `thought_signature` field) * OpenAI (corresponds to the `encrypted_content` field) """ provider_name: str | None = None """The name of the provider that generated the response. Signatures are only sent back to the same provider. """ part_kind: Literal['thinking'] = 'thinking' """Part type identifier, this is available on all parts as a discriminator.""" def has_content(self) -> bool: """Return `True` if the thinking content is non-empty.""" return bool(self.content) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: str ``` The thinking content of the response. #### id ```python id: str | None = None ``` The identifier of the thinking part. #### signature ```python signature: str | None = None ``` The signature of the thinking. Supported by: - Anthropic (corresponds to the `signature` field) - Bedrock (corresponds to the `signature` field) - Google (corresponds to the `thought_signature` field) - OpenAI (corresponds to the `encrypted_content` field) #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. Signatures are only sent back to the same provider. #### part_kind ```python part_kind: Literal['thinking'] = 'thinking' ``` Part type identifier, this is available on all parts as a discriminator. #### has_content ```python has_content() -> bool ``` Return `True` if the thinking content is non-empty. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the thinking content is non-empty.""" return bool(self.content) ``` ### FilePart A file response from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class FilePart: """A file response from a model.""" content: Annotated[BinaryContent, pydantic.AfterValidator(BinaryImage.narrow_type)] """The file content of the response.""" _: KW_ONLY id: str | None = None """The identifier of the file part.""" provider_name: str | None = None """The name of the provider that generated the response. """ part_kind: Literal['file'] = 'file' """Part type identifier, this is available on all parts as a discriminator.""" def has_content(self) -> bool: """Return `True` if the file content is non-empty.""" return bool(self.content) # pragma: no cover __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content ```python content: Annotated[ BinaryContent, AfterValidator(narrow_type) ] ``` The file content of the response. 
#### id ```python id: str | None = None ``` The identifier of the file part. #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. #### part_kind ```python part_kind: Literal['file'] = 'file' ``` Part type identifier, this is available on all parts as a discriminator. #### has_content ```python has_content() -> bool ``` Return `True` if the file content is non-empty. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the file content is non-empty.""" return bool(self.content) # pragma: no cover ``` ### BaseToolCallPart A tool call from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BaseToolCallPart: """A tool call from a model.""" tool_name: str """The name of the tool to call.""" args: str | dict[str, Any] | None = None """The arguments to pass to the tool. This is stored either as a JSON string or a Python dictionary depending on how data was received. """ tool_call_id: str = field(default_factory=_generate_tool_call_id) """The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. """ _: KW_ONLY id: str | None = None """An optional identifier of the tool call part, separate from the tool call ID. This is used by some APIs like OpenAI Responses.""" def args_as_dict(self) -> dict[str, Any]: """Return the arguments as a Python dictionary. This is just for convenience with models that require dicts as input. """ if not self.args: return {} if isinstance(self.args, dict): return self.args args = pydantic_core.from_json(self.args) assert isinstance(args, dict), 'args should be a dict' return cast(dict[str, Any], args) def args_as_json_str(self) -> str: """Return the arguments as a JSON string. This is just for convenience with models that require JSON strings as input. """ if not self.args: return '{}' if isinstance(self.args, str): return self.args return pydantic_core.to_json(self.args).decode() def has_content(self) -> bool: """Return `True` if the arguments contain any data.""" if isinstance(self.args, dict): # TODO: This should probably return True if you have the value False, or 0, etc. # It makes sense to me to ignore empty strings, but not sure about empty lists or dicts return any(self.args.values()) else: return bool(self.args) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name ```python tool_name: str ``` The name of the tool to call. #### args ```python args: str | dict[str, Any] | None = None ``` The arguments to pass to the tool. This is stored either as a JSON string or a Python dictionary depending on how data was received. #### tool_call_id ```python tool_call_id: str = field( default_factory=generate_tool_call_id ) ``` The tool call identifier, this is used by some models including OpenAI. In case the tool call id is not provided by the model, Pydantic AI will generate a random one. #### id ```python id: str | None = None ``` An optional identifier of the tool call part, separate from the tool call ID. This is used by some APIs like OpenAI Responses. #### args_as_dict ```python args_as_dict() -> dict[str, Any] ``` Return the arguments as a Python dictionary. This is just for convenience with models that require dicts as input. 
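A brief sketch (hypothetical tool name) of the two convenience conversions, which accept arguments in either of the stored forms:

```python
from pydantic_ai.messages import ToolCallPart

# Arguments may be stored as a JSON string, depending on the provider...
from_json = ToolCallPart(tool_name='get_weather', args='{"city": "London"}')
print(from_json.args_as_dict())      # {'city': 'London'}

# ...or as a Python dict.
from_dict = ToolCallPart(tool_name='get_weather', args={'city': 'London'})
print(from_dict.args_as_json_str())  # '{"city":"London"}'

# Missing args normalize to an empty dict / empty JSON object.
print(ToolCallPart(tool_name='get_weather').args_as_dict())  # {}
```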
Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def args_as_dict(self) -> dict[str, Any]: """Return the arguments as a Python dictionary. This is just for convenience with models that require dicts as input. """ if not self.args: return {} if isinstance(self.args, dict): return self.args args = pydantic_core.from_json(self.args) assert isinstance(args, dict), 'args should be a dict' return cast(dict[str, Any], args) ``` #### args_as_json_str ```python args_as_json_str() -> str ``` Return the arguments as a JSON string. This is just for convenience with models that require JSON strings as input. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def args_as_json_str(self) -> str: """Return the arguments as a JSON string. This is just for convenience with models that require JSON strings as input. """ if not self.args: return '{}' if isinstance(self.args, str): return self.args return pydantic_core.to_json(self.args).decode() ``` #### has_content ```python has_content() -> bool ``` Return `True` if the arguments contain any data. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def has_content(self) -> bool: """Return `True` if the arguments contain any data.""" if isinstance(self.args, dict): # TODO: This should probably return True if you have the value False, or 0, etc. # It makes sense to me to ignore empty strings, but not sure about empty lists or dicts return any(self.args.values()) else: return bool(self.args) ``` ### ToolCallPart Bases: `BaseToolCallPart` A tool call from a model. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ToolCallPart(BaseToolCallPart): """A tool call from a model.""" _: KW_ONLY part_kind: Literal['tool-call'] = 'tool-call' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### part_kind ```python part_kind: Literal['tool-call'] = 'tool-call' ``` Part type identifier, this is available on all parts as a discriminator. ### BuiltinToolCallPart Bases: `BaseToolCallPart` A tool call to a built-in tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class BuiltinToolCallPart(BaseToolCallPart): """A tool call to a built-in tool.""" _: KW_ONLY provider_name: str | None = None """The name of the provider that generated the response. Built-in tool calls are only sent back to the same provider. """ part_kind: Literal['builtin-tool-call'] = 'builtin-tool-call' """Part type identifier, this is available on all parts as a discriminator.""" ``` #### provider_name ```python provider_name: str | None = None ``` The name of the provider that generated the response. Built-in tool calls are only sent back to the same provider. #### part_kind ```python part_kind: Literal["builtin-tool-call"] = ( "builtin-tool-call" ) ``` Part type identifier, this is available on all parts as a discriminator. ### ModelResponsePart ```python ModelResponsePart = Annotated[ TextPart | ToolCallPart | BuiltinToolCallPart | BuiltinToolReturnPart | ThinkingPart | FilePart, Discriminator("part_kind"), ] ``` A message part returned by a model. ### ModelResponse A response from a model, e.g. a message from the model to the Pydantic AI app. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class ModelResponse: """A response from a model, e.g. 
a message from the model to the Pydantic AI app.""" parts: Sequence[ModelResponsePart] """The parts of the model message.""" _: KW_ONLY usage: RequestUsage = field(default_factory=RequestUsage) """Usage information for the request. This has a default to make tests easier, and to support loading old messages where usage will be missing. """ model_name: str | None = None """The name of the model that generated the response.""" timestamp: datetime = field(default_factory=_now_utc) """The timestamp of the response. If the model provides a timestamp in the response (as OpenAI does) that will be used. """ kind: Literal['response'] = 'response' """Message type identifier, this is available on all parts as a discriminator.""" provider_name: str | None = None """The name of the LLM provider that generated the response.""" provider_details: Annotated[ dict[str, Any] | None, # `vendor_details` is deprecated, but we still want to support deserializing model responses stored in a DB before the name was changed pydantic.Field(validation_alias=pydantic.AliasChoices('provider_details', 'vendor_details')), ] = None """Additional provider-specific details in a serializable format. This allows storing selected vendor-specific data that isn't mapped to standard ModelResponse fields. For OpenAI models, this may include 'logprobs', 'finish_reason', etc. """ provider_response_id: Annotated[ str | None, # `vendor_id` is deprecated, but we still want to support deserializing model responses stored in a DB before the name was changed pydantic.Field(validation_alias=pydantic.AliasChoices('provider_response_id', 'vendor_id')), ] = None """request ID as specified by the model provider. This can be used to track the specific request to the model.""" finish_reason: FinishReason | None = None """Reason the model finished generating the response, normalized to OpenTelemetry values.""" @property def text(self) -> str | None: """Get the text in the response.""" texts: list[str] = [] last_part: ModelResponsePart | None = None for part in self.parts: if isinstance(part, TextPart): # Adjacent text parts should be joined together, but if there are parts in between # (like built-in tool calls) they should have newlines between them if isinstance(last_part, TextPart): texts[-1] += part.content else: texts.append(part.content) last_part = part if not texts: return None return '\n\n'.join(texts) @property def thinking(self) -> str | None: """Get the thinking in the response.""" thinking_parts = [part.content for part in self.parts if isinstance(part, ThinkingPart)] if not thinking_parts: return None return '\n\n'.join(thinking_parts) @property def files(self) -> list[BinaryContent]: """Get the files in the response.""" return [part.content for part in self.parts if isinstance(part, FilePart)] @property def images(self) -> list[BinaryImage]: """Get the images in the response.""" return [file for file in self.files if isinstance(file, BinaryImage)] @property def tool_calls(self) -> list[ToolCallPart]: """Get the tool calls in the response.""" return [part for part in self.parts if isinstance(part, ToolCallPart)] @property def builtin_tool_calls(self) -> list[tuple[BuiltinToolCallPart, BuiltinToolReturnPart]]: """Get the builtin tool calls and results in the response.""" calls = [part for part in self.parts if isinstance(part, BuiltinToolCallPart)] if not calls: return [] returns_by_id = {part.tool_call_id: part for part in self.parts if isinstance(part, BuiltinToolReturnPart)} return [ (call_part, 
returns_by_id[call_part.tool_call_id]) for call_part in calls if call_part.tool_call_id in returns_by_id ] @deprecated('`price` is deprecated, use `cost` instead') def price(self) -> genai_types.PriceCalculation: # pragma: no cover return self.cost() def cost(self) -> genai_types.PriceCalculation: """Calculate the cost of the usage. Uses [`genai-prices`](https://github.com/pydantic/genai-prices). """ assert self.model_name, 'Model name is required to calculate price' return calc_price( self.usage, self.model_name, provider_id=self.provider_name, genai_request_timestamp=self.timestamp, ) def otel_events(self, settings: InstrumentationSettings) -> list[Event]: """Return OpenTelemetry events for the response.""" result: list[Event] = [] def new_event_body(): new_body: dict[str, Any] = {'role': 'assistant'} ev = Event('gen_ai.assistant.message', body=new_body) result.append(ev) return new_body body = new_event_body() for part in self.parts: if isinstance(part, ToolCallPart): body.setdefault('tool_calls', []).append( { 'id': part.tool_call_id, 'type': 'function', 'function': { 'name': part.tool_name, **({'arguments': part.args} if settings.include_content else {}), }, } ) elif isinstance(part, TextPart | ThinkingPart): kind = part.part_kind body.setdefault('content', []).append( {'kind': kind, **({'text': part.content} if settings.include_content else {})} ) elif isinstance(part, FilePart): body.setdefault('content', []).append( { 'kind': 'binary', 'media_type': part.content.media_type, **( {'binary_content': base64.b64encode(part.content.data).decode()} if settings.include_content and settings.include_binary_content else {} ), } ) if content := body.get('content'): text_content = content[0].get('text') if content == [{'kind': 'text', 'text': text_content}]: body['content'] = text_content return result def otel_message_parts(self, settings: InstrumentationSettings) -> list[_otel_messages.MessagePart]: parts: list[_otel_messages.MessagePart] = [] for part in self.parts: if isinstance(part, TextPart): parts.append( _otel_messages.TextPart( type='text', **({'content': part.content} if settings.include_content else {}), ) ) elif isinstance(part, ThinkingPart): parts.append( _otel_messages.ThinkingPart( type='thinking', **({'content': part.content} if settings.include_content else {}), ) ) elif isinstance(part, FilePart): converted_part = _otel_messages.BinaryDataPart(type='binary', media_type=part.content.media_type) if settings.include_content and settings.include_binary_content: converted_part['content'] = base64.b64encode(part.content.data).decode() parts.append(converted_part) elif isinstance(part, BaseToolCallPart): call_part = _otel_messages.ToolCallPart(type='tool_call', id=part.tool_call_id, name=part.tool_name) if isinstance(part, BuiltinToolCallPart): call_part['builtin'] = True if settings.include_content and part.args is not None: from .models.instrumented import InstrumentedModel if isinstance(part.args, str): call_part['arguments'] = part.args else: call_part['arguments'] = {k: InstrumentedModel.serialize_any(v) for k, v in part.args.items()} parts.append(call_part) elif isinstance(part, BuiltinToolReturnPart): return_part = _otel_messages.ToolCallResponsePart( type='tool_call_response', id=part.tool_call_id, name=part.tool_name, builtin=True, ) if settings.include_content and part.content is not None: # pragma: no branch from .models.instrumented import InstrumentedModel return_part['result'] = InstrumentedModel.serialize_any(part.content) parts.append(return_part) return parts 
@property @deprecated('`vendor_details` is deprecated, use `provider_details` instead') def vendor_details(self) -> dict[str, Any] | None: return self.provider_details @property @deprecated('`vendor_id` is deprecated, use `provider_response_id` instead') def vendor_id(self) -> str | None: return self.provider_response_id @property @deprecated('`provider_request_id` is deprecated, use `provider_response_id` instead') def provider_request_id(self) -> str | None: return self.provider_response_id __repr__ = _utils.dataclasses_no_defaults_repr ``` #### parts ```python parts: Sequence[ModelResponsePart] ``` The parts of the model message. #### usage ```python usage: RequestUsage = field(default_factory=RequestUsage) ``` Usage information for the request. This has a default to make tests easier, and to support loading old messages where usage will be missing. #### model_name ```python model_name: str | None = None ``` The name of the model that generated the response. #### timestamp ```python timestamp: datetime = field(default_factory=now_utc) ``` The timestamp of the response. If the model provides a timestamp in the response (as OpenAI does) that will be used. #### kind ```python kind: Literal['response'] = 'response' ``` Message type identifier, this is available on all parts as a discriminator. #### provider_name ```python provider_name: str | None = None ``` The name of the LLM provider that generated the response. #### provider_details ```python provider_details: Annotated[ dict[str, Any] | None, Field( validation_alias=AliasChoices( provider_details, vendor_details ) ), ] = None ``` Additional provider-specific details in a serializable format. This allows storing selected vendor-specific data that isn't mapped to standard ModelResponse fields. For OpenAI models, this may include 'logprobs', 'finish_reason', etc. #### provider_response_id ```python provider_response_id: Annotated[ str | None, Field( validation_alias=AliasChoices( provider_response_id, vendor_id ) ), ] = None ``` request ID as specified by the model provider. This can be used to track the specific request to the model. #### finish_reason ```python finish_reason: FinishReason | None = None ``` Reason the model finished generating the response, normalized to OpenTelemetry values. #### text ```python text: str | None ``` Get the text in the response. #### thinking ```python thinking: str | None ``` Get the thinking in the response. #### files ```python files: list[BinaryContent] ``` Get the files in the response. #### images ```python images: list[BinaryImage] ``` Get the images in the response. #### tool_calls ```python tool_calls: list[ToolCallPart] ``` Get the tool calls in the response. #### builtin_tool_calls ```python builtin_tool_calls: list[ tuple[BuiltinToolCallPart, BuiltinToolReturnPart] ] ``` Get the builtin tool calls and results in the response. #### price ```python price() -> PriceCalculation ``` Deprecated `price` is deprecated, use `cost` instead Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @deprecated('`price` is deprecated, use `cost` instead') def price(self) -> genai_types.PriceCalculation: # pragma: no cover return self.cost() ``` #### cost ```python cost() -> PriceCalculation ``` Calculate the cost of the usage. Uses [`genai-prices`](https://github.com/pydantic/genai-prices). Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def cost(self) -> genai_types.PriceCalculation: """Calculate the cost of the usage. 
Uses [`genai-prices`](https://github.com/pydantic/genai-prices). """ assert self.model_name, 'Model name is required to calculate price' return calc_price( self.usage, self.model_name, provider_id=self.provider_name, genai_request_timestamp=self.timestamp, ) ``` #### otel_events ```python otel_events( settings: InstrumentationSettings, ) -> list[Event] ``` Return OpenTelemetry events for the response. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def otel_events(self, settings: InstrumentationSettings) -> list[Event]: """Return OpenTelemetry events for the response.""" result: list[Event] = [] def new_event_body(): new_body: dict[str, Any] = {'role': 'assistant'} ev = Event('gen_ai.assistant.message', body=new_body) result.append(ev) return new_body body = new_event_body() for part in self.parts: if isinstance(part, ToolCallPart): body.setdefault('tool_calls', []).append( { 'id': part.tool_call_id, 'type': 'function', 'function': { 'name': part.tool_name, **({'arguments': part.args} if settings.include_content else {}), }, } ) elif isinstance(part, TextPart | ThinkingPart): kind = part.part_kind body.setdefault('content', []).append( {'kind': kind, **({'text': part.content} if settings.include_content else {})} ) elif isinstance(part, FilePart): body.setdefault('content', []).append( { 'kind': 'binary', 'media_type': part.content.media_type, **( {'binary_content': base64.b64encode(part.content.data).decode()} if settings.include_content and settings.include_binary_content else {} ), } ) if content := body.get('content'): text_content = content[0].get('text') if content == [{'kind': 'text', 'text': text_content}]: body['content'] = text_content return result ``` ### ModelMessage ```python ModelMessage = Annotated[ ModelRequest | ModelResponse, Discriminator("kind") ] ``` Any message sent to or returned by a model. ### ModelMessagesTypeAdapter ```python ModelMessagesTypeAdapter = TypeAdapter( list[ModelMessage], config=ConfigDict( defer_build=True, ser_json_bytes="base64", val_json_bytes="base64", ), ) ``` Pydantic TypeAdapter for (de)serializing messages. ### TextPartDelta A partial update (delta) for a `TextPart` to append new text content. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class TextPartDelta: """A partial update (delta) for a `TextPart` to append new text content.""" content_delta: str """The incremental text content to add to the existing `TextPart` content.""" _: KW_ONLY part_delta_kind: Literal['text'] = 'text' """Part delta type identifier, used as a discriminator.""" def apply(self, part: ModelResponsePart) -> TextPart: """Apply this text delta to an existing `TextPart`. Args: part: The existing model response part, which must be a `TextPart`. Returns: A new `TextPart` with updated text content. Raises: ValueError: If `part` is not a `TextPart`. """ if not isinstance(part, TextPart): raise ValueError('Cannot apply TextPartDeltas to non-TextParts') # pragma: no cover return replace(part, content=part.content + self.content_delta) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content_delta ```python content_delta: str ``` The incremental text content to add to the existing `TextPart` content. #### part_delta_kind ```python part_delta_kind: Literal['text'] = 'text' ``` Part delta type identifier, used as a discriminator. #### apply ```python apply(part: ModelResponsePart) -> TextPart ``` Apply this text delta to an existing `TextPart`. 
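For example, here's a minimal sketch of applying a delta by hand, using only the classes documented on this page (the streaming machinery normally does this for you):

```python
from pydantic_ai.messages import TextPart, TextPartDelta

part = TextPart(content='Hello')
delta = TextPartDelta(content_delta=', world')

# `apply` returns a new `TextPart`; the original part is left unchanged.
updated = delta.apply(part)
print(updated.content)
#> Hello, world
```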
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `part` | `ModelResponsePart` | The existing model response part, which must be a TextPart. | *required* | Returns: | Type | Description | | --- | --- | | `TextPart` | A new TextPart with updated text content. | Raises: | Type | Description | | --- | --- | | `ValueError` | If part is not a TextPart. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def apply(self, part: ModelResponsePart) -> TextPart: """Apply this text delta to an existing `TextPart`. Args: part: The existing model response part, which must be a `TextPart`. Returns: A new `TextPart` with updated text content. Raises: ValueError: If `part` is not a `TextPart`. """ if not isinstance(part, TextPart): raise ValueError('Cannot apply TextPartDeltas to non-TextParts') # pragma: no cover return replace(part, content=part.content + self.content_delta) ``` ### ThinkingPartDelta A partial update (delta) for a `ThinkingPart` to append new thinking content. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class ThinkingPartDelta: """A partial update (delta) for a `ThinkingPart` to append new thinking content.""" content_delta: str | None = None """The incremental thinking content to add to the existing `ThinkingPart` content.""" signature_delta: str | None = None """Optional signature delta. Note this is never treated as a delta — it can replace None. """ provider_name: str | None = None """Optional provider name for the thinking part. Signatures are only sent back to the same provider. """ part_delta_kind: Literal['thinking'] = 'thinking' """Part delta type identifier, used as a discriminator.""" @overload def apply(self, part: ModelResponsePart) -> ThinkingPart: ... @overload def apply(self, part: ModelResponsePart | ThinkingPartDelta) -> ThinkingPart | ThinkingPartDelta: ... def apply(self, part: ModelResponsePart | ThinkingPartDelta) -> ThinkingPart | ThinkingPartDelta: """Apply this thinking delta to an existing `ThinkingPart`. Args: part: The existing model response part, which must be a `ThinkingPart`. Returns: A new `ThinkingPart` with updated thinking content. Raises: ValueError: If `part` is not a `ThinkingPart`. """ if isinstance(part, ThinkingPart): new_content = part.content + self.content_delta if self.content_delta else part.content new_signature = self.signature_delta if self.signature_delta is not None else part.signature new_provider_name = self.provider_name if self.provider_name is not None else part.provider_name return replace(part, content=new_content, signature=new_signature, provider_name=new_provider_name) elif isinstance(part, ThinkingPartDelta): if self.content_delta is None and self.signature_delta is None: raise ValueError('Cannot apply ThinkingPartDelta with no content or signature') if self.content_delta is not None: part = replace(part, content_delta=(part.content_delta or '') + self.content_delta) if self.signature_delta is not None: part = replace(part, signature_delta=self.signature_delta) if self.provider_name is not None: part = replace(part, provider_name=self.provider_name) return part raise ValueError( # pragma: no cover f'Cannot apply ThinkingPartDeltas to non-ThinkingParts or non-ThinkingPartDeltas ({part=}, {self=})' ) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### content_delta ```python content_delta: str | None = None ``` The incremental thinking content to add to the existing `ThinkingPart` content. 
#### signature_delta ```python signature_delta: str | None = None ``` Optional signature delta. Note this is never treated as a delta — it can replace None. #### provider_name ```python provider_name: str | None = None ``` Optional provider name for the thinking part. Signatures are only sent back to the same provider. #### part_delta_kind ```python part_delta_kind: Literal['thinking'] = 'thinking' ``` Part delta type identifier, used as a discriminator. #### apply ```python apply(part: ModelResponsePart) -> ThinkingPart ``` ```python apply( part: ModelResponsePart | ThinkingPartDelta, ) -> ThinkingPart | ThinkingPartDelta ``` ```python apply( part: ModelResponsePart | ThinkingPartDelta, ) -> ThinkingPart | ThinkingPartDelta ``` Apply this thinking delta to an existing `ThinkingPart`. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `part` | `ModelResponsePart | ThinkingPartDelta` | The existing model response part, which must be a ThinkingPart. | *required* | Returns: | Type | Description | | --- | --- | | `ThinkingPart | ThinkingPartDelta` | A new ThinkingPart with updated thinking content. | Raises: | Type | Description | | --- | --- | | `ValueError` | If part is not a ThinkingPart. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def apply(self, part: ModelResponsePart | ThinkingPartDelta) -> ThinkingPart | ThinkingPartDelta: """Apply this thinking delta to an existing `ThinkingPart`. Args: part: The existing model response part, which must be a `ThinkingPart`. Returns: A new `ThinkingPart` with updated thinking content. Raises: ValueError: If `part` is not a `ThinkingPart`. """ if isinstance(part, ThinkingPart): new_content = part.content + self.content_delta if self.content_delta else part.content new_signature = self.signature_delta if self.signature_delta is not None else part.signature new_provider_name = self.provider_name if self.provider_name is not None else part.provider_name return replace(part, content=new_content, signature=new_signature, provider_name=new_provider_name) elif isinstance(part, ThinkingPartDelta): if self.content_delta is None and self.signature_delta is None: raise ValueError('Cannot apply ThinkingPartDelta with no content or signature') if self.content_delta is not None: part = replace(part, content_delta=(part.content_delta or '') + self.content_delta) if self.signature_delta is not None: part = replace(part, signature_delta=self.signature_delta) if self.provider_name is not None: part = replace(part, provider_name=self.provider_name) return part raise ValueError( # pragma: no cover f'Cannot apply ThinkingPartDeltas to non-ThinkingParts or non-ThinkingPartDeltas ({part=}, {self=})' ) ``` ### ToolCallPartDelta A partial update (delta) for a `ToolCallPart` to modify tool name, arguments, or tool call ID. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class ToolCallPartDelta: """A partial update (delta) for a `ToolCallPart` to modify tool name, arguments, or tool call ID.""" tool_name_delta: str | None = None """Incremental text to add to the existing tool name, if any.""" args_delta: str | dict[str, Any] | None = None """Incremental data to add to the tool arguments. If this is a string, it will be appended to existing JSON arguments. If this is a dict, it will be merged with existing dict arguments. """ tool_call_id: str | None = None """Optional tool call identifier, this is used by some models including OpenAI. 
Note this is never treated as a delta — it can replace None, but otherwise if a non-matching value is provided an error will be raised.""" part_delta_kind: Literal['tool_call'] = 'tool_call' """Part delta type identifier, used as a discriminator.""" def as_part(self) -> ToolCallPart | None: """Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`. Returns: A `ToolCallPart` if `tool_name_delta` is set, otherwise `None`. """ if self.tool_name_delta is None: return None return ToolCallPart(self.tool_name_delta, self.args_delta, self.tool_call_id or _generate_tool_call_id()) @overload def apply(self, part: ModelResponsePart) -> ToolCallPart | BuiltinToolCallPart: ... @overload def apply( self, part: ModelResponsePart | ToolCallPartDelta ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta: ... def apply( self, part: ModelResponsePart | ToolCallPartDelta ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta: """Apply this delta to a part or delta, returning a new part or delta with the changes applied. Args: part: The existing model response part or delta to update. Returns: Either a new `ToolCallPart` or `BuiltinToolCallPart`, or an updated `ToolCallPartDelta`. Raises: ValueError: If `part` is neither a `ToolCallPart`, `BuiltinToolCallPart`, nor a `ToolCallPartDelta`. UnexpectedModelBehavior: If applying JSON deltas to dict arguments or vice versa. """ if isinstance(part, ToolCallPart | BuiltinToolCallPart): return self._apply_to_part(part) if isinstance(part, ToolCallPartDelta): return self._apply_to_delta(part) raise ValueError( # pragma: no cover f'Can only apply ToolCallPartDeltas to ToolCallParts, BuiltinToolCallParts, or ToolCallPartDeltas, not {part}' ) def _apply_to_delta(self, delta: ToolCallPartDelta) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta: """Internal helper to apply this delta to another delta.""" if self.tool_name_delta: # Append incremental text to the existing tool_name_delta updated_tool_name_delta = (delta.tool_name_delta or '') + self.tool_name_delta delta = replace(delta, tool_name_delta=updated_tool_name_delta) if isinstance(self.args_delta, str): if isinstance(delta.args_delta, dict): raise UnexpectedModelBehavior( f'Cannot apply JSON deltas to non-JSON tool arguments ({delta=}, {self=})' ) updated_args_delta = (delta.args_delta or '') + self.args_delta delta = replace(delta, args_delta=updated_args_delta) elif isinstance(self.args_delta, dict): if isinstance(delta.args_delta, str): raise UnexpectedModelBehavior( f'Cannot apply dict deltas to non-dict tool arguments ({delta=}, {self=})' ) updated_args_delta = {**(delta.args_delta or {}), **self.args_delta} delta = replace(delta, args_delta=updated_args_delta) if self.tool_call_id: delta = replace(delta, tool_call_id=self.tool_call_id) # If we now have enough data to create a full ToolCallPart, do so if delta.tool_name_delta is not None: return ToolCallPart(delta.tool_name_delta, delta.args_delta, delta.tool_call_id or _generate_tool_call_id()) return delta def _apply_to_part(self, part: ToolCallPart | BuiltinToolCallPart) -> ToolCallPart | BuiltinToolCallPart: """Internal helper to apply this delta directly to a `ToolCallPart` or `BuiltinToolCallPart`.""" if self.tool_name_delta: # Append incremental text to the existing tool_name tool_name = part.tool_name + self.tool_name_delta part = replace(part, tool_name=tool_name) if isinstance(self.args_delta, str): if isinstance(part.args, dict): raise UnexpectedModelBehavior(f'Cannot apply JSON deltas to non-JSON 
tool arguments ({part=}, {self=})') updated_json = (part.args or '') + self.args_delta part = replace(part, args=updated_json) elif isinstance(self.args_delta, dict): if isinstance(part.args, str): raise UnexpectedModelBehavior(f'Cannot apply dict deltas to non-dict tool arguments ({part=}, {self=})') updated_dict = {**(part.args or {}), **self.args_delta} part = replace(part, args=updated_dict) if self.tool_call_id: part = replace(part, tool_call_id=self.tool_call_id) return part __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name_delta ```python tool_name_delta: str | None = None ``` Incremental text to add to the existing tool name, if any. #### args_delta ```python args_delta: str | dict[str, Any] | None = None ``` Incremental data to add to the tool arguments. If this is a string, it will be appended to existing JSON arguments. If this is a dict, it will be merged with existing dict arguments. #### tool_call_id ```python tool_call_id: str | None = None ``` Optional tool call identifier, this is used by some models including OpenAI. Note this is never treated as a delta — it can replace None, but otherwise if a non-matching value is provided an error will be raised. #### part_delta_kind ```python part_delta_kind: Literal['tool_call'] = 'tool_call' ``` Part delta type identifier, used as a discriminator. #### as_part ```python as_part() -> ToolCallPart | None ``` Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`. Returns: | Type | Description | | --- | --- | | `ToolCallPart | None` | A ToolCallPart if tool_name_delta is set, otherwise None. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def as_part(self) -> ToolCallPart | None: """Convert this delta to a fully formed `ToolCallPart` if possible, otherwise return `None`. Returns: A `ToolCallPart` if `tool_name_delta` is set, otherwise `None`. """ if self.tool_name_delta is None: return None return ToolCallPart(self.tool_name_delta, self.args_delta, self.tool_call_id or _generate_tool_call_id()) ``` #### apply ```python apply( part: ModelResponsePart, ) -> ToolCallPart | BuiltinToolCallPart ``` ```python apply( part: ModelResponsePart | ToolCallPartDelta, ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta ``` ```python apply( part: ModelResponsePart | ToolCallPartDelta, ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta ``` Apply this delta to a part or delta, returning a new part or delta with the changes applied. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `part` | `ModelResponsePart | ToolCallPartDelta` | The existing model response part or delta to update. | *required* | Returns: | Type | Description | | --- | --- | | `ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta` | Either a new ToolCallPart or BuiltinToolCallPart, or an updated ToolCallPartDelta. | Raises: | Type | Description | | --- | --- | | `ValueError` | If part is neither a ToolCallPart, BuiltinToolCallPart, nor a ToolCallPartDelta. | | `UnexpectedModelBehavior` | If applying JSON deltas to dict arguments or vice versa. | Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python def apply( self, part: ModelResponsePart | ToolCallPartDelta ) -> ToolCallPart | BuiltinToolCallPart | ToolCallPartDelta: """Apply this delta to a part or delta, returning a new part or delta with the changes applied. Args: part: The existing model response part or delta to update. 
Returns: Either a new `ToolCallPart` or `BuiltinToolCallPart`, or an updated `ToolCallPartDelta`. Raises: ValueError: If `part` is neither a `ToolCallPart`, `BuiltinToolCallPart`, nor a `ToolCallPartDelta`. UnexpectedModelBehavior: If applying JSON deltas to dict arguments or vice versa. """ if isinstance(part, ToolCallPart | BuiltinToolCallPart): return self._apply_to_part(part) if isinstance(part, ToolCallPartDelta): return self._apply_to_delta(part) raise ValueError( # pragma: no cover f'Can only apply ToolCallPartDeltas to ToolCallParts, BuiltinToolCallParts, or ToolCallPartDeltas, not {part}' ) ``` ### ModelResponsePartDelta ```python ModelResponsePartDelta = Annotated[ TextPartDelta | ThinkingPartDelta | ToolCallPartDelta, Discriminator("part_delta_kind"), ] ``` A partial update (delta) for any model response part. ### PartStartEvent An event indicating that a new part has started. If multiple `PartStartEvent`s are received with the same index, the new one should fully replace the old one. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class PartStartEvent: """An event indicating that a new part has started. If multiple `PartStartEvent`s are received with the same index, the new one should fully replace the old one. """ index: int """The index of the part within the overall response parts list.""" part: ModelResponsePart """The newly started `ModelResponsePart`.""" event_kind: Literal['part_start'] = 'part_start' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### index ```python index: int ``` The index of the part within the overall response parts list. #### part ```python part: ModelResponsePart ``` The newly started `ModelResponsePart`. #### event_kind ```python event_kind: Literal['part_start'] = 'part_start' ``` Event type identifier, used as a discriminator. ### PartDeltaEvent An event indicating a delta update for an existing part. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class PartDeltaEvent: """An event indicating a delta update for an existing part.""" index: int """The index of the part within the overall response parts list.""" delta: ModelResponsePartDelta """The delta to apply to the specified part.""" event_kind: Literal['part_delta'] = 'part_delta' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### index ```python index: int ``` The index of the part within the overall response parts list. #### delta ```python delta: ModelResponsePartDelta ``` The delta to apply to the specified part. #### event_kind ```python event_kind: Literal['part_delta'] = 'part_delta' ``` Event type identifier, used as a discriminator. ### FinalResultEvent An event indicating the response to the current model request matches the output schema and will produce a result. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False, kw_only=True) class FinalResultEvent: """An event indicating the response to the current model request matches the output schema and will produce a result.""" tool_name: str | None """The name of the output tool that was called. 
`None` if the result is from text content and not from a tool.""" tool_call_id: str | None """The tool call ID, if any, that this result is associated with.""" event_kind: Literal['final_result'] = 'final_result' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### tool_name ```python tool_name: str | None ``` The name of the output tool that was called. `None` if the result is from text content and not from a tool. #### tool_call_id ```python tool_call_id: str | None ``` The tool call ID, if any, that this result is associated with. #### event_kind ```python event_kind: Literal['final_result'] = 'final_result' ``` Event type identifier, used as a discriminator. ### ModelResponseStreamEvent ```python ModelResponseStreamEvent = Annotated[ PartStartEvent | PartDeltaEvent | FinalResultEvent, Discriminator("event_kind"), ] ``` An event in the model response stream, starting a new part, applying a delta to an existing one, or indicating the final result. ### FunctionToolCallEvent An event indicating the start to a call to a function tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class FunctionToolCallEvent: """An event indicating the start to a call to a function tool.""" part: ToolCallPart """The (function) tool call to make.""" _: KW_ONLY event_kind: Literal['function_tool_call'] = 'function_tool_call' """Event type identifier, used as a discriminator.""" @property def tool_call_id(self) -> str: """An ID used for matching details about the call to its result.""" return self.part.tool_call_id @property @deprecated('`call_id` is deprecated, use `tool_call_id` instead.') def call_id(self) -> str: """An ID used for matching details about the call to its result.""" return self.part.tool_call_id # pragma: no cover __repr__ = _utils.dataclasses_no_defaults_repr ``` #### part ```python part: ToolCallPart ``` The (function) tool call to make. #### event_kind ```python event_kind: Literal["function_tool_call"] = ( "function_tool_call" ) ``` Event type identifier, used as a discriminator. #### tool_call_id ```python tool_call_id: str ``` An ID used for matching details about the call to its result. #### call_id ```python call_id: str ``` An ID used for matching details about the call to its result. ### FunctionToolResultEvent An event indicating the result of a function tool call. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @dataclass(repr=False) class FunctionToolResultEvent: """An event indicating the result of a function tool call.""" result: ToolReturnPart | RetryPromptPart """The result of the call to the function tool.""" _: KW_ONLY content: str | Sequence[UserContent] | None = None """The content that will be sent to the model as a UserPromptPart following the result.""" event_kind: Literal['function_tool_result'] = 'function_tool_result' """Event type identifier, used as a discriminator.""" @property def tool_call_id(self) -> str: """An ID used to match the result to its original call.""" return self.result.tool_call_id __repr__ = _utils.dataclasses_no_defaults_repr ``` #### result ```python result: ToolReturnPart | RetryPromptPart ``` The result of the call to the function tool. #### content ```python content: str | Sequence[UserContent] | None = None ``` The content that will be sent to the model as a UserPromptPart following the result. 
#### event_kind ```python event_kind: Literal["function_tool_result"] = ( "function_tool_result" ) ``` Event type identifier, used as a discriminator. #### tool_call_id ```python tool_call_id: str ``` An ID used to match the result to its original call. ### BuiltinToolCallEvent Deprecated `BuiltinToolCallEvent` is deprecated, look for `PartStartEvent` and `PartDeltaEvent` with `BuiltinToolCallPart` instead. An event indicating the start to a call to a built-in tool. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @deprecated( '`BuiltinToolCallEvent` is deprecated, look for `PartStartEvent` and `PartDeltaEvent` with `BuiltinToolCallPart` instead.' ) @dataclass(repr=False) class BuiltinToolCallEvent: """An event indicating the start to a call to a built-in tool.""" part: BuiltinToolCallPart """The built-in tool call to make.""" _: KW_ONLY event_kind: Literal['builtin_tool_call'] = 'builtin_tool_call' """Event type identifier, used as a discriminator.""" ``` #### part ```python part: BuiltinToolCallPart ``` The built-in tool call to make. #### event_kind ```python event_kind: Literal["builtin_tool_call"] = ( "builtin_tool_call" ) ``` Event type identifier, used as a discriminator. ### BuiltinToolResultEvent Deprecated `BuiltinToolResultEvent` is deprecated, look for `PartStartEvent` and `PartDeltaEvent` with `BuiltinToolReturnPart` instead. An event indicating the result of a built-in tool call. Source code in `pydantic_ai_slim/pydantic_ai/messages.py` ```python @deprecated( '`BuiltinToolResultEvent` is deprecated, look for `PartStartEvent` and `PartDeltaEvent` with `BuiltinToolReturnPart` instead.' ) @dataclass(repr=False) class BuiltinToolResultEvent: """An event indicating the result of a built-in tool call.""" result: BuiltinToolReturnPart """The result of the call to the built-in tool.""" _: KW_ONLY event_kind: Literal['builtin_tool_result'] = 'builtin_tool_result' """Event type identifier, used as a discriminator.""" ``` #### result ```python result: BuiltinToolReturnPart ``` The result of the call to the built-in tool. #### event_kind ```python event_kind: Literal["builtin_tool_result"] = ( "builtin_tool_result" ) ``` Event type identifier, used as a discriminator. ### HandleResponseEvent ```python HandleResponseEvent = Annotated[ FunctionToolCallEvent | FunctionToolResultEvent | BuiltinToolCallEvent | BuiltinToolResultEvent, Discriminator("event_kind"), ] ``` An event yielded when handling a model response, indicating tool calls and results. ### AgentStreamEvent ```python AgentStreamEvent = Annotated[ ModelResponseStreamEvent | HandleResponseEvent, Discriminator("event_kind"), ] ``` An event in the agent stream: model response stream events and response-handling events. # `pydantic_ai.output` ### OutputDataT ```python OutputDataT = TypeVar( "OutputDataT", default=str, covariant=True ) ``` Covariant type variable for the output data type of a run. ### ToolOutput Bases: `Generic[OutputDataT]` Marker class to use a tool for output and optionally customize the tool. 
Example: tool_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'openai:gpt-4o', output_type=[ ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass(init=False) class ToolOutput(Generic[OutputDataT]): """Marker class to use a tool for output and optionally customize the tool. Example: ```python {title="tool_output.py"} from pydantic import BaseModel from pydantic_ai import Agent, ToolOutput class Fruit(BaseModel): name: str color: str class Vehicle(BaseModel): name: str wheels: int agent = Agent( 'openai:gpt-4o', output_type=[ ToolOutput(Fruit, name='return_fruit'), ToolOutput(Vehicle, name='return_vehicle'), ], ) result = agent.run_sync('What is a banana?') print(repr(result.output)) #> Fruit(name='banana', color='yellow') ``` """ output: OutputTypeOrFunction[OutputDataT] """An output type or function.""" name: str | None """The name of the tool that will be passed to the model. If not specified and only one output is provided, `final_result` will be used. If multiple outputs are provided, the name of the output type or function will be added to the tool name.""" description: str | None """The description of the tool that will be passed to the model. If not specified, the docstring of the output type or function will be used.""" max_retries: int | None """The maximum number of retries for the tool.""" strict: bool | None """Whether to use strict mode for the tool.""" def __init__( self, type_: OutputTypeOrFunction[OutputDataT], *, name: str | None = None, description: str | None = None, max_retries: int | None = None, strict: bool | None = None, ): self.output = type_ self.name = name self.description = description self.max_retries = max_retries self.strict = strict ```` #### output ```python output: OutputTypeOrFunction[OutputDataT] = type_ ``` An output type or function. #### name ```python name: str | None = name ``` The name of the tool that will be passed to the model. If not specified and only one output is provided, `final_result` will be used. If multiple outputs are provided, the name of the output type or function will be added to the tool name. #### description ```python description: str | None = description ``` The description of the tool that will be passed to the model. If not specified, the docstring of the output type or function will be used. #### max_retries ```python max_retries: int | None = max_retries ``` The maximum number of retries for the tool. #### strict ```python strict: bool | None = strict ``` Whether to use strict mode for the tool. ### NativeOutput Bases: `Generic[OutputDataT]` Marker class to use the model's native structured outputs functionality for outputs and optionally customize the name and description. Example: native_output.py ```python from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'openai:gpt-4o', output_type=NativeOutput( [Fruit, Vehicle], name='Fruit or vehicle', description='Return a fruit or vehicle.' 
), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass(init=False) class NativeOutput(Generic[OutputDataT]): """Marker class to use the model's native structured outputs functionality for outputs and optionally customize the name and description. Example: ```python {title="native_output.py" requires="tool_output.py"} from pydantic_ai import Agent, NativeOutput from tool_output import Fruit, Vehicle agent = Agent( 'openai:gpt-4o', output_type=NativeOutput( [Fruit, Vehicle], name='Fruit or vehicle', description='Return a fruit or vehicle.' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` """ outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] """The output types or functions.""" name: str | None """The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used.""" description: str | None """The description of the structured output that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used.""" strict: bool | None """Whether to use strict mode for the output, if the model supports it.""" def __init__( self, outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]], *, name: str | None = None, description: str | None = None, strict: bool | None = None, ): self.outputs = outputs self.name = name self.description = description self.strict = strict ```` #### outputs ```python outputs: ( OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] ) = outputs ``` The output types or functions. #### name ```python name: str | None = name ``` The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used. #### description ```python description: str | None = description ``` The description of the structured output that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used. #### strict ```python strict: bool | None = strict ``` Whether to use strict mode for the output, if the model supports it. ### PromptedOutput Bases: `Generic[OutputDataT]` Marker class to use a prompt to tell the model what to output and optionally customize the prompt. Example: prompted_output.py ```python from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'openai:gpt-4o', output_type=PromptedOutput( [Vehicle, Device], name='Vehicle or device', description='Return a vehicle or device.' 
), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'openai:gpt-4o', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass(init=False) class PromptedOutput(Generic[OutputDataT]): """Marker class to use a prompt to tell the model what to output and optionally customize the prompt. Example: ```python {title="prompted_output.py" requires="tool_output.py"} from pydantic import BaseModel from pydantic_ai import Agent, PromptedOutput from tool_output import Vehicle class Device(BaseModel): name: str kind: str agent = Agent( 'openai:gpt-4o', output_type=PromptedOutput( [Vehicle, Device], name='Vehicle or device', description='Return a vehicle or device.' ), ) result = agent.run_sync('What is a MacBook?') print(repr(result.output)) #> Device(name='MacBook', kind='laptop') agent = Agent( 'openai:gpt-4o', output_type=PromptedOutput( [Vehicle, Device], template='Gimme some JSON: {schema}' ), ) result = agent.run_sync('What is a Ford Explorer?') print(repr(result.output)) #> Vehicle(name='Ford Explorer', wheels=4) ``` """ outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] """The output types or functions.""" name: str | None """The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used.""" description: str | None """The description that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used.""" template: str | None """Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. If not specified, the default template specified on the model's profile will be used. """ def __init__( self, outputs: OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]], *, name: str | None = None, description: str | None = None, template: str | None = None, ): self.outputs = outputs self.name = name self.description = description self.template = template ```` #### outputs ```python outputs: ( OutputTypeOrFunction[OutputDataT] | Sequence[OutputTypeOrFunction[OutputDataT]] ) = outputs ``` The output types or functions. #### name ```python name: str | None = name ``` The name of the structured output that will be passed to the model. If not specified and only one output is provided, the name of the output type or function will be used. #### description ```python description: str | None = description ``` The description that will be passed to the model. If not specified and only one output is provided, the docstring of the output type or function will be used. #### template ```python template: str | None = template ``` Template for the prompt passed to the model. The '{schema}' placeholder will be replaced with the output JSON schema. If not specified, the default template specified on the model's profile will be used. ### TextOutput Bases: `Generic[OutputDataT]` Marker class to use text output for an output function taking a string argument. 
Example: ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'openai:gpt-4o', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python @dataclass class TextOutput(Generic[OutputDataT]): """Marker class to use text output for an output function taking a string argument. Example: ```python from pydantic_ai import Agent, TextOutput def split_into_words(text: str) -> list[str]: return text.split() agent = Agent( 'openai:gpt-4o', output_type=TextOutput(split_into_words), ) result = agent.run_sync('Who was Albert Einstein?') print(result.output) #> ['Albert', 'Einstein', 'was', 'a', 'German-born', 'theoretical', 'physicist.'] ``` """ output_function: TextOutputFunc[OutputDataT] """The function that will be called to process the model's plain text output. The function must take a single string argument.""" ```` #### output_function ```python output_function: TextOutputFunc[OutputDataT] ``` The function that will be called to process the model's plain text output. The function must take a single string argument. ### StructuredDict ```python StructuredDict( json_schema: JsonSchemaValue, name: str | None = None, description: str | None = None, ) -> type[JsonSchemaValue] ``` Returns a `dict[str, Any]` subclass with a JSON schema attached that will be used for structured output. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `json_schema` | `JsonSchemaValue` | A JSON schema of type object defining the structure of the dictionary content. | *required* | | `name` | `str | None` | Optional name of the structured output. If not provided, the title field of the JSON schema will be used if it's present. | `None` | | `description` | `str | None` | Optional description of the structured output. If not provided, the description field of the JSON schema will be used if it's present. | `None` | Example: structured_dict.py ```python from pydantic_ai import Agent, StructuredDict schema = { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] } agent = Agent('openai:gpt-4o', output_type=StructuredDict(schema)) result = agent.run_sync('Create a person') print(result.output) #> {'name': 'John Doe', 'age': 30} ``` Source code in `pydantic_ai_slim/pydantic_ai/output.py` ````python def StructuredDict( json_schema: JsonSchemaValue, name: str | None = None, description: str | None = None ) -> type[JsonSchemaValue]: """Returns a `dict[str, Any]` subclass with a JSON schema attached that will be used for structured output. Args: json_schema: A JSON schema of type `object` defining the structure of the dictionary content. name: Optional name of the structured output. If not provided, the `title` field of the JSON schema will be used if it's present. description: Optional description of the structured output. If not provided, the `description` field of the JSON schema will be used if it's present. 
Example: ```python {title="structured_dict.py"} from pydantic_ai import Agent, StructuredDict schema = { 'type': 'object', 'properties': { 'name': {'type': 'string'}, 'age': {'type': 'integer'} }, 'required': ['name', 'age'] } agent = Agent('openai:gpt-4o', output_type=StructuredDict(schema)) result = agent.run_sync('Create a person') print(result.output) #> {'name': 'John Doe', 'age': 30} ``` """ json_schema = _utils.check_object_json_schema(json_schema) # Pydantic `TypeAdapter` fails when `object.__get_pydantic_json_schema__` has `$defs`, so we inline them # See https://github.com/pydantic/pydantic/issues/12145 if '$defs' in json_schema: json_schema = InlineDefsJsonSchemaTransformer(json_schema).walk() if '$defs' in json_schema: raise exceptions.UserError( '`StructuredDict` does not currently support recursive `$ref`s and `$defs`. See https://github.com/pydantic/pydantic/issues/12145 for more information.' ) if name: json_schema['title'] = name if description: json_schema['description'] = description class _StructuredDict(JsonSchemaValue): __is_model_like__ = True @classmethod def __get_pydantic_core_schema__( cls, source_type: Any, handler: GetCoreSchemaHandler ) -> core_schema.CoreSchema: return core_schema.dict_schema( keys_schema=core_schema.str_schema(), values_schema=core_schema.any_schema(), ) @classmethod def __get_pydantic_json_schema__( cls, core_schema: core_schema.CoreSchema, handler: GetJsonSchemaHandler ) -> JsonSchemaValue: return json_schema return _StructuredDict ```` ### DeferredToolRequests Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a DeferredToolResults object with the same tool call IDs. See [deferred tools docs](../../deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class DeferredToolRequests: """Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a [`DeferredToolResults`][pydantic_ai.tools.DeferredToolResults] object with the same tool call IDs. See [deferred tools docs](../deferred-tools.md#deferred-tools) for more information. """ calls: list[ToolCallPart] = field(default_factory=list) """Tool calls that require external execution.""" approvals: list[ToolCallPart] = field(default_factory=list) """Tool calls that require human-in-the-loop approval.""" ``` #### calls ```python calls: list[ToolCallPart] = field(default_factory=list) ``` Tool calls that require external execution. #### approvals ```python approvals: list[ToolCallPart] = field(default_factory=list) ``` Tool calls that require human-in-the-loop approval. # `pydantic_ai.profiles` Describes how requests to and responses from specific models or families of models need to be constructed and processed to get the best results, independent of the model and provider classes used. 
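For example, here's a minimal sketch, using only the `ModelProfile` fields and methods documented below, of building a profile and layering overrides on top of it (models for known providers normally select an appropriate profile automatically):

```python
from pydantic_ai.profiles import ModelProfile

# A generic baseline profile.
base = ModelProfile(supports_json_object_output=True)

# `update` keeps `base` and applies the non-default values from the other profile.
merged = base.update(ModelProfile(default_structured_output_mode='prompted'))

print(merged.supports_json_object_output, merged.default_structured_output_mode)
#> True prompted
```

Model classes typically also accept a custom profile directly, which is useful for OpenAI-compatible endpoints that the built-in defaults don't cover; see the provider-specific model docs for details.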
Source code in `pydantic_ai_slim/pydantic_ai/profiles/__init__.py`

```python
@dataclass(kw_only=True)
class ModelProfile:
    """Describes how requests to and responses from specific models or families of models need to be constructed and processed to get the best results, independent of the model and provider classes used."""

    supports_tools: bool = True
    """Whether the model supports tools."""

    supports_json_schema_output: bool = False
    """Whether the model supports JSON schema output."""

    supports_json_object_output: bool = False
    """Whether the model supports JSON object output."""

    supports_image_output: bool = False
    """Whether the model supports image output."""

    default_structured_output_mode: StructuredOutputMode = 'tool'
    """The default structured output mode to use for the model."""

    prompted_output_template: str = dedent(
        """
        Always respond with a JSON object that's compatible with this schema:

        {schema}

        Don't include any text or Markdown fencing before or after.
        """
    )
    """The instructions template to use for prompted structured output. The '{schema}' placeholder will be replaced with the JSON schema for the output."""

    json_schema_transformer: type[JsonSchemaTransformer] | None = None
    """The transformer to use to make JSON schemas for tools and structured output compatible with the model."""

    thinking_tags: tuple[str, str] = ('<think>', '</think>')
    """The tags used to indicate thinking parts in the model's output. Defaults to ('<think>', '</think>')."""

    ignore_streamed_leading_whitespace: bool = False
    """Whether to ignore leading whitespace when streaming a response.

    This is a workaround for models that emit `\n\n\n` or an empty text part ahead of tool calls (e.g. Ollama + Qwen3),
    which we don't want to end up treating as a final result when using `run_stream` with `str` a valid `output_type`.

    This is currently only used by `OpenAIChatModel`, `HuggingFaceModel`, and `GroqModel`.
    """

    @classmethod
    def from_profile(cls, profile: ModelProfile | None) -> Self:
        """Build a ModelProfile subclass instance from a ModelProfile instance."""
        if isinstance(profile, cls):
            return profile
        return cls().update(profile)

    def update(self, profile: ModelProfile | None) -> Self:
        """Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance."""
        if not profile:
            return self
        field_names = set(f.name for f in fields(self))
        non_default_attrs = {
            f.name: getattr(profile, f.name)
            for f in fields(profile)
            if f.name in field_names and getattr(profile, f.name) != f.default
        }
        return replace(self, **non_default_attrs)
```

### supports_tools

```python
supports_tools: bool = True
```

Whether the model supports tools.

### supports_json_schema_output

```python
supports_json_schema_output: bool = False
```

Whether the model supports JSON schema output.

### supports_json_object_output

```python
supports_json_object_output: bool = False
```

Whether the model supports JSON object output.

### supports_image_output

```python
supports_image_output: bool = False
```

Whether the model supports image output.

### default_structured_output_mode

```python
default_structured_output_mode: StructuredOutputMode = (
    "tool"
)
```

The default structured output mode to use for the model.

### prompted_output_template

```python
prompted_output_template: str = dedent(
    "\n Always respond with a JSON object that's compatible with this schema:\n\n {schema}\n\n Don't include any text or Markdown fencing before or after.\n "
)
```

The instructions template to use for prompted structured output.
The '{schema}' placeholder will be replaced with the JSON schema for the output.

### json_schema_transformer

```python
json_schema_transformer: (
    type[JsonSchemaTransformer] | None
) = None
```

The transformer to use to make JSON schemas for tools and structured output compatible with the model.

### thinking_tags

```python
thinking_tags: tuple[str, str] = ('<think>', '</think>')
```

The tags used to indicate thinking parts in the model's output. Defaults to `('<think>', '</think>')`.

### ignore_streamed_leading_whitespace

```python
ignore_streamed_leading_whitespace: bool = False
```

Whether to ignore leading whitespace when streaming a response.

This is a workaround for models that emit `\n\n\n` or an empty text part ahead of tool calls (e.g. Ollama + Qwen3), which we don't want to end up treating as a final result when using `run_stream` with `str` as a valid `output_type`.

This is currently only used by `OpenAIChatModel`, `HuggingFaceModel`, and `GroqModel`.

### from_profile

```python
from_profile(profile: ModelProfile | None) -> Self
```

Build a ModelProfile subclass instance from a ModelProfile instance.

Source code in `pydantic_ai_slim/pydantic_ai/profiles/__init__.py`

```python
@classmethod
def from_profile(cls, profile: ModelProfile | None) -> Self:
    """Build a ModelProfile subclass instance from a ModelProfile instance."""
    if isinstance(profile, cls):
        return profile
    return cls().update(profile)
```

### update

```python
update(profile: ModelProfile | None) -> Self
```

Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance.

Source code in `pydantic_ai_slim/pydantic_ai/profiles/__init__.py`

```python
def update(self, profile: ModelProfile | None) -> Self:
    """Update this ModelProfile (subclass) instance with the non-default values from another ModelProfile instance."""
    if not profile:
        return self
    field_names = set(f.name for f in fields(self))
    non_default_attrs = {
        f.name: getattr(profile, f.name)
        for f in fields(profile)
        if f.name in field_names and getattr(profile, f.name) != f.default
    }
    return replace(self, **non_default_attrs)
```

### OpenAIModelProfile

Bases: `ModelProfile`

Profile for models used with `OpenAIChatModel`.

ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.

Source code in `pydantic_ai_slim/pydantic_ai/profiles/openai.py`

```python
@dataclass(kw_only=True)
class OpenAIModelProfile(ModelProfile):
    """Profile for models used with `OpenAIChatModel`.

    ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS.
    """

    openai_supports_strict_tool_definition: bool = True
    """This can be set by a provider or user if the OpenAI-"compatible" API doesn't support strict tool definitions."""

    openai_supports_sampling_settings: bool = True
    """Turn off to don't send sampling settings like `temperature` and `top_p` to models that don't support them, like OpenAI's o-series reasoning models."""

    openai_unsupported_model_settings: Sequence[str] = ()
    """A list of model settings that are not supported by this model."""

    # Some OpenAI-compatible providers (e.g. MoonshotAI) currently do **not** accept
    # `tool_choice="required"`. This flag lets the calling model know whether it's
    # safe to pass that value along. Default is `True` to preserve existing
    # behaviour for OpenAI itself and most providers.
openai_supports_tool_choice_required: bool = True """Whether the provider accepts the value ``tool_choice='required'`` in the request payload.""" openai_system_prompt_role: OpenAISystemPromptRole | None = None """The role to use for the system prompt message. If not provided, defaults to `'system'`.""" openai_chat_supports_web_search: bool = False """Whether the model supports web search in Chat Completions API.""" openai_supports_encrypted_reasoning_content: bool = False """Whether the model supports including encrypted reasoning content in the response.""" def __post_init__(self): # pragma: no cover if not self.openai_supports_sampling_settings: warnings.warn( 'The `openai_supports_sampling_settings` has no effect, and it will be removed in future versions. ' 'Use `openai_unsupported_model_settings` instead.', DeprecationWarning, ) ``` #### openai_supports_strict_tool_definition ```python openai_supports_strict_tool_definition: bool = True ``` This can be set by a provider or user if the OpenAI-"compatible" API doesn't support strict tool definitions. #### openai_supports_sampling_settings ```python openai_supports_sampling_settings: bool = True ``` Turn off to don't send sampling settings like `temperature` and `top_p` to models that don't support them, like OpenAI's o-series reasoning models. #### openai_unsupported_model_settings ```python openai_unsupported_model_settings: Sequence[str] = () ``` A list of model settings that are not supported by this model. #### openai_supports_tool_choice_required ```python openai_supports_tool_choice_required: bool = True ``` Whether the provider accepts the value `tool_choice='required'` in the request payload. #### openai_system_prompt_role ```python openai_system_prompt_role: OpenAISystemPromptRole | None = ( None ) ``` The role to use for the system prompt message. If not provided, defaults to `'system'`. #### openai_chat_supports_web_search ```python openai_chat_supports_web_search: bool = False ``` Whether the model supports web search in Chat Completions API. #### openai_supports_encrypted_reasoning_content ```python openai_supports_encrypted_reasoning_content: bool = False ``` Whether the model supports including encrypted reasoning content in the response. ### openai_model_profile ```python openai_model_profile(model_name: str) -> ModelProfile ``` Get the model profile for an OpenAI model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/openai.py` ```python def openai_model_profile(model_name: str) -> ModelProfile: """Get the model profile for an OpenAI model.""" is_reasoning_model = model_name.startswith('o') or model_name.startswith('gpt-5') # Check if the model supports web search (only specific search-preview models) supports_web_search = '-search-preview' in model_name # Structured Outputs (output mode 'native') is only supported with the gpt-4o-mini, gpt-4o-mini-2024-07-18, and gpt-4o-2024-08-06 model snapshots and later. # We leave it in here for all models because the `default_structured_output_mode` is `'tool'`, so `native` is only used # when the user specifically uses the `NativeOutput` marker, so an error from the API is acceptable. if is_reasoning_model: openai_unsupported_model_settings = ( 'temperature', 'top_p', 'presence_penalty', 'frequency_penalty', 'logit_bias', 'logprobs', 'top_logprobs', ) else: openai_unsupported_model_settings = () # The o1-mini model doesn't support the `system` role, so we default to `user`. # See https://github.com/pydantic/pydantic-ai/issues/974 for more details. 
openai_system_prompt_role = 'user' if model_name.startswith('o1-mini') else None return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, supports_json_schema_output=True, supports_json_object_output=True, supports_image_output=is_reasoning_model or '4.1' in model_name or '4o' in model_name, openai_unsupported_model_settings=openai_unsupported_model_settings, openai_system_prompt_role=openai_system_prompt_role, openai_chat_supports_web_search=supports_web_search, openai_supports_encrypted_reasoning_content=is_reasoning_model, ) ``` ### OpenAIJsonSchemaTransformer Bases: `JsonSchemaTransformer` Recursively handle the schema to make it compatible with OpenAI strict mode. See https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode for more details, but this basically just requires: * `additionalProperties` must be set to false for each object in the parameters * all fields in properties must be marked as required Source code in `pydantic_ai_slim/pydantic_ai/profiles/openai.py` ```python @dataclass(init=False) class OpenAIJsonSchemaTransformer(JsonSchemaTransformer): """Recursively handle the schema to make it compatible with OpenAI strict mode. See https://platform.openai.com/docs/guides/function-calling?api-mode=responses#strict-mode for more details, but this basically just requires: * `additionalProperties` must be set to false for each object in the parameters * all fields in properties must be marked as required """ def __init__(self, schema: JsonSchema, *, strict: bool | None = None): super().__init__(schema, strict=strict) self.root_ref = schema.get('$ref') def walk(self) -> JsonSchema: # Note: OpenAI does not support anyOf at the root in strict mode # However, we don't need to check for it here because we ensure in pydantic_ai._utils.check_object_json_schema # that the root schema either has type 'object' or is recursive. result = super().walk() # For recursive models, we need to tweak the schema to make it compatible with strict mode. # Because the following should never change the semantics of the schema we apply it unconditionally. if self.root_ref is not None: result.pop('$ref', None) # We replace references to the self.root_ref with just '#' in the transform method root_key = re.sub(r'^#/\$defs/', '', self.root_ref) result.update(self.defs.get(root_key) or {}) return result def transform(self, schema: JsonSchema) -> JsonSchema: # noqa C901 # Remove unnecessary keys schema.pop('title', None) schema.pop('$schema', None) schema.pop('discriminator', None) default = schema.get('default', _sentinel) if default is not _sentinel: # the "default" keyword is not allowed in strict mode, but including it makes some Ollama models behave # better, so we keep it around when not strict if self.strict is True: schema.pop('default', None) elif self.strict is None: # pragma: no branch self.is_strict_compatible = False if schema_ref := schema.get('$ref'): if schema_ref == self.root_ref: schema['$ref'] = '#' if len(schema) > 1: # OpenAI Strict mode doesn't support siblings to "$ref", but _does_ allow siblings to "anyOf". 
# So if there is a "description" field or any other extra info, we move the "$ref" into an "anyOf": schema['anyOf'] = [{'$ref': schema.pop('$ref')}] # Track strict-incompatible keys incompatible_values: dict[str, Any] = {} for key in _STRICT_INCOMPATIBLE_KEYS: value = schema.get(key, _sentinel) if value is not _sentinel: incompatible_values[key] = value if format := schema.get('format'): if format not in _STRICT_COMPATIBLE_STRING_FORMATS: incompatible_values['format'] = format description = schema.get('description') if incompatible_values: if self.strict is True: notes: list[str] = [] for key, value in incompatible_values.items(): schema.pop(key) notes.append(f'{key}={value}') notes_string = ', '.join(notes) schema['description'] = notes_string if not description else f'{description} ({notes_string})' elif self.strict is None: # pragma: no branch self.is_strict_compatible = False schema_type = schema.get('type') if 'oneOf' in schema: # OpenAI does not support oneOf in strict mode if self.strict is True: schema['anyOf'] = schema.pop('oneOf') else: self.is_strict_compatible = False if schema_type == 'object': if self.strict is True: # additional properties are disallowed schema['additionalProperties'] = False # all properties are required if 'properties' not in schema: schema['properties'] = dict[str, Any]() schema['required'] = list(schema['properties'].keys()) elif self.strict is None: if schema.get('additionalProperties', None) not in (None, False): self.is_strict_compatible = False else: # additional properties are disallowed by default schema['additionalProperties'] = False if 'properties' not in schema or 'required' not in schema: self.is_strict_compatible = False else: required = schema['required'] for k in schema['properties'].keys(): if k not in required: self.is_strict_compatible = False return schema ``` ### anthropic_model_profile ```python anthropic_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an Anthropic model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/anthropic.py` ```python def anthropic_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an Anthropic model.""" return ModelProfile(thinking_tags=('', '')) ``` ### google_model_profile ```python google_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a Google model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/google.py` ```python def google_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Google model.""" is_image_model = 'image' in model_name return ModelProfile( json_schema_transformer=GoogleJsonSchemaTransformer, supports_image_output=is_image_model, supports_json_schema_output=not is_image_model, supports_json_object_output=not is_image_model, supports_tools=not is_image_model, ) ``` ### GoogleJsonSchemaTransformer Bases: `JsonSchemaTransformer` Transforms the JSON Schema from Pydantic to be suitable for Gemini. Gemini which [supports](https://ai.google.dev/gemini-api/docs/function-calling#function_declarations) a subset of OpenAPI v3.0.3. Specifically: * gemini doesn't allow the `title` keyword to be set * gemini doesn't allow `$defs` — we need to inline the definitions where possible Source code in `pydantic_ai_slim/pydantic_ai/profiles/google.py` ```python class GoogleJsonSchemaTransformer(JsonSchemaTransformer): """Transforms the JSON Schema from Pydantic to be suitable for Gemini. 
Gemini which [supports](https://ai.google.dev/gemini-api/docs/function-calling#function_declarations) a subset of OpenAPI v3.0.3. Specifically: * gemini doesn't allow the `title` keyword to be set * gemini doesn't allow `$defs` — we need to inline the definitions where possible """ def __init__(self, schema: JsonSchema, *, strict: bool | None = None): super().__init__(schema, strict=strict, prefer_inlined_defs=True, simplify_nullable_unions=True) def transform(self, schema: JsonSchema) -> JsonSchema: # Note: we need to remove `additionalProperties: False` since it is currently mishandled by Gemini additional_properties = schema.pop( 'additionalProperties', None ) # don't pop yet so it's included in the warning if additional_properties: original_schema = {**schema, 'additionalProperties': additional_properties} warnings.warn( '`additionalProperties` is not supported by Gemini; it will be removed from the tool JSON schema.' f' Full schema: {self.schema}\n\n' f'Source of additionalProperties within the full schema: {original_schema}\n\n' 'If this came from a field with a type like `dict[str, MyType]`, that field will always be empty.\n\n' "If Google's APIs are updated to support this properly, please create an issue on the Pydantic AI GitHub" ' and we will fix this behavior.', UserWarning, ) schema.pop('title', None) schema.pop('$schema', None) if (const := schema.pop('const', None)) is not None: # Gemini doesn't support const, but it does support enum with a single value schema['enum'] = [const] schema.pop('discriminator', None) schema.pop('examples', None) # TODO: Should we use the trick from pydantic_ai.models.openai._OpenAIJsonSchema # where we add notes about these properties to the field description? schema.pop('exclusiveMaximum', None) schema.pop('exclusiveMinimum', None) # Gemini only supports string enums, so we need to convert any enum values to strings. # Pydantic will take care of transforming the transformed string values to the correct type. if enum := schema.get('enum'): schema['type'] = 'string' schema['enum'] = [str(val) for val in enum] type_ = schema.get('type') if 'oneOf' in schema and 'type' not in schema: # pragma: no cover # This gets hit when we have a discriminated union # Gemini returns an API error in this case even though it says in its error message it shouldn't... 
# Changing the oneOf to an anyOf prevents the API error and I think is functionally equivalent schema['anyOf'] = schema.pop('oneOf') if type_ == 'string' and (fmt := schema.pop('format', None)): description = schema.get('description') if description: schema['description'] = f'{description} (format: {fmt})' else: schema['description'] = f'Format: {fmt}' if '$ref' in schema: raise UserError(f'Recursive `$ref`s in JSON Schema are not supported by Gemini: {schema["$ref"]}') if 'prefixItems' in schema: # prefixItems is not currently supported in Gemini, so we convert it to items for best compatibility prefix_items = schema.pop('prefixItems') items = schema.get('items') unique_items = [items] if items is not None else [] for item in prefix_items: if item not in unique_items: unique_items.append(item) if len(unique_items) > 1: # pragma: no cover schema['items'] = {'anyOf': unique_items} elif len(unique_items) == 1: # pragma: no branch schema['items'] = unique_items[0] schema.setdefault('minItems', len(prefix_items)) if items is None: # pragma: no branch schema.setdefault('maxItems', len(prefix_items)) return schema ``` ### meta_model_profile ```python meta_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Meta model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/meta.py` ```python def meta_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Meta model.""" return ModelProfile(json_schema_transformer=InlineDefsJsonSchemaTransformer) ``` ### amazon_model_profile ```python amazon_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an Amazon model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/amazon.py` ```python def amazon_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an Amazon model.""" return ModelProfile(json_schema_transformer=InlineDefsJsonSchemaTransformer) ``` ### deepseek_model_profile ```python deepseek_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a DeepSeek model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/deepseek.py` ```python def deepseek_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a DeepSeek model.""" return ModelProfile(ignore_streamed_leading_whitespace='r1' in model_name) ``` ### grok_model_profile ```python grok_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Grok model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/grok.py` ```python def grok_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Grok model.""" return None ``` ### mistral_model_profile ```python mistral_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a Mistral model. Source code in `pydantic_ai_slim/pydantic_ai/profiles/mistral.py` ```python def mistral_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Mistral model.""" return None ``` ### qwen_model_profile ```python qwen_model_profile(model_name: str) -> ModelProfile | None ``` Get the model profile for a Qwen model. 
Source code in `pydantic_ai_slim/pydantic_ai/profiles/qwen.py` ```python def qwen_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Qwen model.""" if model_name.startswith('qwen-3-coder'): return OpenAIModelProfile( json_schema_transformer=InlineDefsJsonSchemaTransformer, openai_supports_tool_choice_required=False, openai_supports_strict_tool_definition=False, ignore_streamed_leading_whitespace=True, ) return ModelProfile( json_schema_transformer=InlineDefsJsonSchemaTransformer, ignore_streamed_leading_whitespace=True, ) ``` # `pydantic_ai.providers` Bases: `ABC`, `Generic[InterfaceClient]` Abstract class for a provider. The provider is in charge of providing an authenticated client to the API. Each provider only supports a specific interface. A interface can be supported by multiple providers. For example, the `OpenAIChatModel` interface can be supported by the `OpenAIProvider` and the `DeepSeekProvider`. Source code in `pydantic_ai_slim/pydantic_ai/providers/__init__.py` ```python class Provider(ABC, Generic[InterfaceClient]): """Abstract class for a provider. The provider is in charge of providing an authenticated client to the API. Each provider only supports a specific interface. A interface can be supported by multiple providers. For example, the `OpenAIChatModel` interface can be supported by the `OpenAIProvider` and the `DeepSeekProvider`. """ _client: InterfaceClient @property @abstractmethod def name(self) -> str: """The provider name.""" raise NotImplementedError() @property @abstractmethod def base_url(self) -> str: """The base URL for the provider API.""" raise NotImplementedError() @property @abstractmethod def client(self) -> InterfaceClient: """The client for the provider.""" raise NotImplementedError() def model_profile(self, model_name: str) -> ModelProfile | None: """The model profile for the named model, if available.""" return None # pragma: no cover def __repr__(self) -> str: return f'{self.__class__.__name__}(name={self.name}, base_url={self.base_url})' # pragma: lax no cover ``` ### name ```python name: str ``` The provider name. ### base_url ```python base_url: str ``` The base URL for the provider API. ### client ```python client: InterfaceClient ``` The client for the provider. ### model_profile ```python model_profile(model_name: str) -> ModelProfile | None ``` The model profile for the named model, if available. Source code in `pydantic_ai_slim/pydantic_ai/providers/__init__.py` ```python def model_profile(self, model_name: str) -> ModelProfile | None: """The model profile for the named model, if available.""" return None # pragma: no cover ``` ### GoogleProvider Bases: `Provider[Client]` Provider for Google. Source code in `pydantic_ai_slim/pydantic_ai/providers/google.py` ```python class GoogleProvider(Provider[Client]): """Provider for Google.""" @property def name(self) -> str: return 'google-vertex' if self._client._api_client.vertexai else 'google-gla' # type: ignore[reportPrivateUsage] @property def base_url(self) -> str: return str(self._client._api_client._http_options.base_url) # type: ignore[reportPrivateUsage] @property def client(self) -> Client: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return google_model_profile(model_name) @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__( self, *, credentials: Credentials | None = None, project: str | None = None, location: VertexAILocation | Literal['global'] | None = None, ) -> None: ... 
@overload def __init__(self, *, client: Client) -> None: ... @overload def __init__(self, *, vertexai: bool = False) -> None: ... def __init__( self, *, api_key: str | None = None, credentials: Credentials | None = None, project: str | None = None, location: VertexAILocation | Literal['global'] | None = None, client: Client | None = None, vertexai: bool | None = None, ) -> None: """Create a new Google provider. Args: api_key: The `API key `_ to use for authentication. It can also be set via the `GOOGLE_API_KEY` environment variable. Applies to the Gemini Developer API only. credentials: The credentials to use for authentication when calling the Vertex AI APIs. Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only. project: The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE_CLOUD_PROJECT). Applies to the Vertex AI API only. location: The location to send API requests to (for example, us-central1). Can be obtained from environment variables. Applies to the Vertex AI API only. client: A pre-initialized client to use. vertexai: Force the use of the Vertex AI API. If `False`, the Google Generative Language API will be used. Defaults to `False`. """ if client is None: # NOTE: We are keeping GEMINI_API_KEY for backwards compatibility. api_key = api_key or os.getenv('GOOGLE_API_KEY') or os.getenv('GEMINI_API_KEY') if vertexai is None: vertexai = bool(location or project or credentials) http_options: HttpOptionsDict = { 'headers': {'User-Agent': get_user_agent()}, 'async_client_args': {'transport': httpx.AsyncHTTPTransport()}, } if not vertexai: if api_key is None: raise UserError( # pragma: no cover 'Set the `GOOGLE_API_KEY` environment variable or pass it via `GoogleProvider(api_key=...)`' 'to use the Google Generative Language API.' ) self._client = Client(vertexai=vertexai, api_key=api_key, http_options=http_options) else: self._client = Client( vertexai=vertexai, project=project or os.getenv('GOOGLE_CLOUD_PROJECT'), # From https://github.com/pydantic/pydantic-ai/pull/2031/files#r2169682149: # Currently `us-central1` supports the most models by far of any region including `global`, but not # all of them. `us-central1` has all google models but is missing some Anthropic partner models, # which use `us-east5` instead. `global` has fewer models but higher availability. # For more details, check: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#available-regions location=location or os.getenv('GOOGLE_CLOUD_LOCATION') or 'us-central1', credentials=credentials, http_options=http_options, ) else: self._client = client ``` #### __init__ ```python __init__(*, api_key: str) -> None ``` ```python __init__( *, credentials: Credentials | None = None, project: str | None = None, location: ( VertexAILocation | Literal["global"] | None ) = None ) -> None ``` ```python __init__(*, client: Client) -> None ``` ```python __init__(*, vertexai: bool = False) -> None ``` ```python __init__( *, api_key: str | None = None, credentials: Credentials | None = None, project: str | None = None, location: ( VertexAILocation | Literal["global"] | None ) = None, client: Client | None = None, vertexai: bool | None = None ) -> None ``` Create a new Google provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | The API key \_ to use for authentication. 
It can also be set via the GOOGLE_API_KEY environment variable. Applies to the Gemini Developer API only. | `None` | | `credentials` | `Credentials | None` | The credentials to use for authentication when calling the Vertex AI APIs. Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only. | `None` | | `project` | `str | None` | The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE_CLOUD_PROJECT). Applies to the Vertex AI API only. | `None` | | `location` | `VertexAILocation | Literal['global'] | None` | The location to send API requests to (for example, us-central1). Can be obtained from environment variables. Applies to the Vertex AI API only. | `None` | | `client` | `Client | None` | A pre-initialized client to use. | `None` | | `vertexai` | `bool | None` | Force the use of the Vertex AI API. If False, the Google Generative Language API will be used. Defaults to False. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/google.py` ```python def __init__( self, *, api_key: str | None = None, credentials: Credentials | None = None, project: str | None = None, location: VertexAILocation | Literal['global'] | None = None, client: Client | None = None, vertexai: bool | None = None, ) -> None: """Create a new Google provider. Args: api_key: The `API key `_ to use for authentication. It can also be set via the `GOOGLE_API_KEY` environment variable. Applies to the Gemini Developer API only. credentials: The credentials to use for authentication when calling the Vertex AI APIs. Credentials can be obtained from environment variables and default credentials. For more information, see Set up Application Default Credentials. Applies to the Vertex AI API only. project: The Google Cloud project ID to use for quota. Can be obtained from environment variables (for example, GOOGLE_CLOUD_PROJECT). Applies to the Vertex AI API only. location: The location to send API requests to (for example, us-central1). Can be obtained from environment variables. Applies to the Vertex AI API only. client: A pre-initialized client to use. vertexai: Force the use of the Vertex AI API. If `False`, the Google Generative Language API will be used. Defaults to `False`. """ if client is None: # NOTE: We are keeping GEMINI_API_KEY for backwards compatibility. api_key = api_key or os.getenv('GOOGLE_API_KEY') or os.getenv('GEMINI_API_KEY') if vertexai is None: vertexai = bool(location or project or credentials) http_options: HttpOptionsDict = { 'headers': {'User-Agent': get_user_agent()}, 'async_client_args': {'transport': httpx.AsyncHTTPTransport()}, } if not vertexai: if api_key is None: raise UserError( # pragma: no cover 'Set the `GOOGLE_API_KEY` environment variable or pass it via `GoogleProvider(api_key=...)`' 'to use the Google Generative Language API.' ) self._client = Client(vertexai=vertexai, api_key=api_key, http_options=http_options) else: self._client = Client( vertexai=vertexai, project=project or os.getenv('GOOGLE_CLOUD_PROJECT'), # From https://github.com/pydantic/pydantic-ai/pull/2031/files#r2169682149: # Currently `us-central1` supports the most models by far of any region including `global`, but not # all of them. `us-central1` has all google models but is missing some Anthropic partner models, # which use `us-east5` instead. `global` has fewer models but higher availability. 
# For more details, check: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#available-regions location=location or os.getenv('GOOGLE_CLOUD_LOCATION') or 'us-central1', credentials=credentials, http_options=http_options, ) else: self._client = client ``` ### VertexAILocation ```python VertexAILocation = Literal[ "asia-east1", "asia-east2", "asia-northeast1", "asia-northeast3", "asia-south1", "asia-southeast1", "australia-southeast1", "europe-central2", "europe-north1", "europe-southwest1", "europe-west1", "europe-west2", "europe-west3", "europe-west4", "europe-west6", "europe-west8", "europe-west9", "me-central1", "me-central2", "me-west1", "northamerica-northeast1", "southamerica-east1", "us-central1", "us-east1", "us-east4", "us-east5", "us-south1", "us-west1", "us-west4", ] ``` Regions available for Vertex AI. More details [here](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#genai-locations). ### OpenAIProvider Bases: `Provider[AsyncOpenAI]` Provider for OpenAI API. Source code in `pydantic_ai_slim/pydantic_ai/providers/openai.py` ```python class OpenAIProvider(Provider[AsyncOpenAI]): """Provider for OpenAI API.""" @property def name(self) -> str: return 'openai' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return openai_model_profile(model_name) @overload def __init__(self, *, openai_client: AsyncOpenAI) -> None: ... @overload def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: None = None, http_client: httpx.AsyncClient | None = None, ) -> None: ... def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new OpenAI provider. Args: base_url: The base url for the OpenAI requests. If not provided, the `OPENAI_BASE_URL` environment variable will be used if available. Otherwise, defaults to OpenAI's base url. api_key: The API key to use for authentication, if not provided, the `OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. 
if api_key is None and 'OPENAI_API_KEY' not in os.environ and base_url is not None and openai_client is None: api_key = 'api-key-not-set' if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='openai') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` #### __init__ ```python __init__(*, openai_client: AsyncOpenAI) -> None ``` ```python __init__( base_url: str | None = None, api_key: str | None = None, openai_client: None = None, http_client: AsyncClient | None = None, ) -> None ``` ```python __init__( base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None, ) -> None ``` Create a new OpenAI provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `base_url` | `str | None` | The base url for the OpenAI requests. If not provided, the OPENAI_BASE_URL environment variable will be used if available. Otherwise, defaults to OpenAI's base url. | `None` | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the OPENAI_API_KEY environment variable will be used if available. | `None` | | `openai_client` | `AsyncOpenAI | None` | An existing AsyncOpenAI client to use. If provided, base_url, api_key, and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing httpx.AsyncClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/openai.py` ```python def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new OpenAI provider. Args: base_url: The base url for the OpenAI requests. If not provided, the `OPENAI_BASE_URL` environment variable will be used if available. Otherwise, defaults to OpenAI's base url. api_key: The API key to use for authentication, if not provided, the `OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. 
if api_key is None and 'OPENAI_API_KEY' not in os.environ and base_url is not None and openai_client is None: api_key = 'api-key-not-set' if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='openai') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` ### DeepSeekProvider Bases: `Provider[AsyncOpenAI]` Provider for DeepSeek API. Source code in `pydantic_ai_slim/pydantic_ai/providers/deepseek.py` ```python class DeepSeekProvider(Provider[AsyncOpenAI]): """Provider for DeepSeek API.""" @property def name(self) -> str: return 'deepseek' @property def base_url(self) -> str: return 'https://api.deepseek.com' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: profile = deepseek_model_profile(model_name) # As DeepSeekProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly. # This was not the case when using a DeepSeek model with another model class (e.g. BedrockConverseModel or GroqModel), # so we won't do this in `deepseek_model_profile` unless we learn it's always needed. return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('DEEPSEEK_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `DEEPSEEK_API_KEY` environment variable or pass it via `DeepSeekProvider(api_key=...)`' 'to use the DeepSeek provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='deepseek') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` ### BedrockModelProfile Bases: `ModelProfile` Profile for models used with BedrockModel. ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python @dataclass(kw_only=True) class BedrockModelProfile(ModelProfile): """Profile for models used with BedrockModel. ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. """ bedrock_supports_tool_choice: bool = False bedrock_tool_result_format: Literal['text', 'json'] = 'text' bedrock_send_back_thinking_parts: bool = False ``` ### bedrock_amazon_model_profile ```python bedrock_amazon_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an Amazon model used via Bedrock. 
Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python def bedrock_amazon_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an Amazon model used via Bedrock.""" profile = amazon_model_profile(model_name) if 'nova' in model_name: return BedrockModelProfile(bedrock_supports_tool_choice=True).update(profile) return profile ``` ### bedrock_deepseek_model_profile ```python bedrock_deepseek_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a DeepSeek model used via Bedrock. Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python def bedrock_deepseek_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a DeepSeek model used via Bedrock.""" profile = deepseek_model_profile(model_name) if 'r1' in model_name: return BedrockModelProfile(bedrock_send_back_thinking_parts=True).update(profile) return profile # pragma: no cover ``` ### BedrockProvider Bases: `Provider[BaseClient]` Provider for AWS Bedrock. Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python class BedrockProvider(Provider[BaseClient]): """Provider for AWS Bedrock.""" @property def name(self) -> str: return 'bedrock' @property def base_url(self) -> str: return self._client.meta.endpoint_url @property def client(self) -> BaseClient: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile: dict[str, Callable[[str], ModelProfile | None]] = { 'anthropic': lambda model_name: BedrockModelProfile( bedrock_supports_tool_choice=True, bedrock_send_back_thinking_parts=True ).update(anthropic_model_profile(model_name)), 'mistral': lambda model_name: BedrockModelProfile(bedrock_tool_result_format='json').update( mistral_model_profile(model_name) ), 'cohere': cohere_model_profile, 'amazon': bedrock_amazon_model_profile, 'meta': meta_model_profile, 'deepseek': bedrock_deepseek_model_profile, } # Split the model name into parts parts = model_name.split('.', 2) # Handle regional prefixes (e.g. "us.") if len(parts) > 2 and len(parts[0]) == 2: parts = parts[1:] if len(parts) < 2: return None provider = parts[0] model_name_with_version = parts[1] # Remove version suffix if it matches the format (e.g. "-v1:0" or "-v14") version_match = re.match(r'(.+)-v\d+(?::\d+)?$', model_name_with_version) if version_match: model_name = version_match.group(1) else: model_name = model_name_with_version if provider in provider_to_profile: return provider_to_profile[provider](model_name) return None @overload def __init__(self, *, bedrock_client: BaseClient) -> None: ... @overload def __init__( self, *, region_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None: ... def __init__( self, *, bedrock_client: BaseClient | None = None, region_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None: """Initialize the Bedrock provider. Args: bedrock_client: A boto3 client for Bedrock Runtime. If provided, other arguments are ignored. region_name: The AWS region name. aws_access_key_id: The AWS access key ID. 
aws_secret_access_key: The AWS secret access key. aws_session_token: The AWS session token. profile_name: The AWS profile name. aws_read_timeout: The read timeout for Bedrock client. aws_connect_timeout: The connect timeout for Bedrock client. """ if bedrock_client is not None: self._client = bedrock_client else: try: read_timeout = aws_read_timeout or float(os.getenv('AWS_READ_TIMEOUT', 300)) connect_timeout = aws_connect_timeout or float(os.getenv('AWS_CONNECT_TIMEOUT', 60)) session = boto3.Session( aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key, aws_session_token=aws_session_token, region_name=region_name, profile_name=profile_name, ) self._client = session.client( # type: ignore[reportUnknownMemberType] 'bedrock-runtime', config=Config(read_timeout=read_timeout, connect_timeout=connect_timeout), ) except NoRegionError as exc: # pragma: no cover raise UserError('You must provide a `region_name` or a boto3 client for Bedrock Runtime.') from exc ``` #### __init__ ```python __init__(*, bedrock_client: BaseClient) -> None ``` ```python __init__( *, region_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None ) -> None ``` ```python __init__( *, bedrock_client: BaseClient | None = None, region_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None ) -> None ``` Initialize the Bedrock provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `bedrock_client` | `BaseClient | None` | A boto3 client for Bedrock Runtime. If provided, other arguments are ignored. | `None` | | `region_name` | `str | None` | The AWS region name. | `None` | | `aws_access_key_id` | `str | None` | The AWS access key ID. | `None` | | `aws_secret_access_key` | `str | None` | The AWS secret access key. | `None` | | `aws_session_token` | `str | None` | The AWS session token. | `None` | | `profile_name` | `str | None` | The AWS profile name. | `None` | | `aws_read_timeout` | `float | None` | The read timeout for Bedrock client. | `None` | | `aws_connect_timeout` | `float | None` | The connect timeout for Bedrock client. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/bedrock.py` ```python def __init__( self, *, bedrock_client: BaseClient | None = None, region_name: str | None = None, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, profile_name: str | None = None, aws_read_timeout: float | None = None, aws_connect_timeout: float | None = None, ) -> None: """Initialize the Bedrock provider. Args: bedrock_client: A boto3 client for Bedrock Runtime. If provided, other arguments are ignored. region_name: The AWS region name. aws_access_key_id: The AWS access key ID. aws_secret_access_key: The AWS secret access key. aws_session_token: The AWS session token. profile_name: The AWS profile name. aws_read_timeout: The read timeout for Bedrock client. aws_connect_timeout: The connect timeout for Bedrock client. 
""" if bedrock_client is not None: self._client = bedrock_client else: try: read_timeout = aws_read_timeout or float(os.getenv('AWS_READ_TIMEOUT', 300)) connect_timeout = aws_connect_timeout or float(os.getenv('AWS_CONNECT_TIMEOUT', 60)) session = boto3.Session( aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key, aws_session_token=aws_session_token, region_name=region_name, profile_name=profile_name, ) self._client = session.client( # type: ignore[reportUnknownMemberType] 'bedrock-runtime', config=Config(read_timeout=read_timeout, connect_timeout=connect_timeout), ) except NoRegionError as exc: # pragma: no cover raise UserError('You must provide a `region_name` or a boto3 client for Bedrock Runtime.') from exc ``` ### groq_moonshotai_model_profile ```python groq_moonshotai_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for an MoonshotAI model used with the Groq provider. Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python def groq_moonshotai_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for an MoonshotAI model used with the Groq provider.""" return ModelProfile(supports_json_object_output=True, supports_json_schema_output=True).update( moonshotai_model_profile(model_name) ) ``` ### meta_groq_model_profile ```python meta_groq_model_profile( model_name: str, ) -> ModelProfile | None ``` Get the model profile for a Meta model used with the Groq provider. Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python def meta_groq_model_profile(model_name: str) -> ModelProfile | None: """Get the model profile for a Meta model used with the Groq provider.""" if model_name in {'llama-4-maverick-17b-128e-instruct', 'llama-4-scout-17b-16e-instruct'}: return ModelProfile(supports_json_object_output=True, supports_json_schema_output=True).update( meta_model_profile(model_name) ) else: return meta_model_profile(model_name) ``` ### GroqProvider Bases: `Provider[AsyncGroq]` Provider for Groq API. Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python class GroqProvider(Provider[AsyncGroq]): """Provider for Groq API.""" @property def name(self) -> str: return 'groq' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncGroq: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = { 'llama': meta_model_profile, 'meta-llama/': meta_groq_model_profile, 'gemma': google_model_profile, 'qwen': qwen_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'moonshotai/': groq_moonshotai_model_profile, 'compound-': groq_model_profile, 'openai/': openai_model_profile, } for prefix, profile_func in prefix_to_profile.items(): model_name = model_name.lower() if model_name.startswith(prefix): if prefix.endswith('/'): model_name = model_name[len(prefix) :] return profile_func(model_name) return None @overload def __init__(self, *, groq_client: AsyncGroq | None = None) -> None: ... @overload def __init__( self, *, api_key: str | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None ) -> None: ... def __init__( self, *, api_key: str | None = None, base_url: str | None = None, groq_client: AsyncGroq | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Groq provider. 
Args: api_key: The API key to use for authentication, if not provided, the `GROQ_API_KEY` environment variable will be used if available. base_url: The base url for the Groq requests. If not provided, the `GROQ_BASE_URL` environment variable will be used if available. Otherwise, defaults to Groq's base url. groq_client: An existing [`AsyncGroq`](https://github.com/groq/groq-python?tab=readme-ov-file#async-usage) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `AsyncHTTPClient` to use for making HTTP requests. """ if groq_client is not None: assert http_client is None, 'Cannot provide both `groq_client` and `http_client`' assert api_key is None, 'Cannot provide both `groq_client` and `api_key`' assert base_url is None, 'Cannot provide both `groq_client` and `base_url`' self._client = groq_client else: api_key = api_key or os.getenv('GROQ_API_KEY') base_url = base_url or os.getenv('GROQ_BASE_URL', 'https://api.groq.com') if not api_key: raise UserError( 'Set the `GROQ_API_KEY` environment variable or pass it via `GroqProvider(api_key=...)`' 'to use the Groq provider.' ) elif http_client is not None: self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='groq') self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) ``` #### __init__ ```python __init__(*, groq_client: AsyncGroq | None = None) -> None ``` ```python __init__( *, api_key: str | None = None, base_url: str | None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, base_url: str | None = None, groq_client: AsyncGroq | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Groq provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the GROQ_API_KEY environment variable will be used if available. | `None` | | `base_url` | `str | None` | The base url for the Groq requests. If not provided, the GROQ_BASE_URL environment variable will be used if available. Otherwise, defaults to Groq's base url. | `None` | | `groq_client` | `AsyncGroq | None` | An existing AsyncGroq client to use. If provided, api_key and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing AsyncHTTPClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/groq.py` ```python def __init__( self, *, api_key: str | None = None, base_url: str | None = None, groq_client: AsyncGroq | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Groq provider. Args: api_key: The API key to use for authentication, if not provided, the `GROQ_API_KEY` environment variable will be used if available. base_url: The base url for the Groq requests. If not provided, the `GROQ_BASE_URL` environment variable will be used if available. Otherwise, defaults to Groq's base url. groq_client: An existing [`AsyncGroq`](https://github.com/groq/groq-python?tab=readme-ov-file#async-usage) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `AsyncHTTPClient` to use for making HTTP requests. 
""" if groq_client is not None: assert http_client is None, 'Cannot provide both `groq_client` and `http_client`' assert api_key is None, 'Cannot provide both `groq_client` and `api_key`' assert base_url is None, 'Cannot provide both `groq_client` and `base_url`' self._client = groq_client else: api_key = api_key or os.getenv('GROQ_API_KEY') base_url = base_url or os.getenv('GROQ_BASE_URL', 'https://api.groq.com') if not api_key: raise UserError( 'Set the `GROQ_API_KEY` environment variable or pass it via `GroqProvider(api_key=...)`' 'to use the Groq provider.' ) elif http_client is not None: self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='groq') self._client = AsyncGroq(base_url=base_url, api_key=api_key, http_client=http_client) ``` ### AzureProvider Bases: `Provider[AsyncOpenAI]` Provider for Azure OpenAI API. See for more information. Source code in `pydantic_ai_slim/pydantic_ai/providers/azure.py` ```python class AzureProvider(Provider[AsyncOpenAI]): """Provider for Azure OpenAI API. See for more information. """ @property def name(self) -> str: return 'azure' @property def base_url(self) -> str: assert self._base_url is not None return self._base_url @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: model_name = model_name.lower() prefix_to_profile = { 'llama': meta_model_profile, 'meta-': meta_model_profile, 'deepseek': deepseek_model_profile, 'mistralai-': mistral_model_profile, 'mistral': mistral_model_profile, 'cohere-': cohere_model_profile, 'grok': grok_model_profile, } for prefix, profile_func in prefix_to_profile.items(): if model_name.startswith(prefix): if prefix.endswith('-'): model_name = model_name[len(prefix) :] profile = profile_func(model_name) # As AzureProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) # OpenAI models are unprefixed return openai_model_profile(model_name) @overload def __init__(self, *, openai_client: AsyncAzureOpenAI) -> None: ... @overload def __init__( self, *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: ... def __init__( self, *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, openai_client: AsyncAzureOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Azure provider. Args: azure_endpoint: The Azure endpoint to use for authentication, if not provided, the `AZURE_OPENAI_ENDPOINT` environment variable will be used if available. api_version: The API version to use for authentication, if not provided, the `OPENAI_API_VERSION` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `AZURE_OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncAzureOpenAI`](https://github.com/openai/openai-python#microsoft-azure-openai) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. 
""" if openai_client is not None: assert azure_endpoint is None, 'Cannot provide both `openai_client` and `azure_endpoint`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._base_url = str(openai_client.base_url) self._client = openai_client else: azure_endpoint = azure_endpoint or os.getenv('AZURE_OPENAI_ENDPOINT') if not azure_endpoint: raise UserError( 'Must provide one of the `azure_endpoint` argument or the `AZURE_OPENAI_ENDPOINT` environment variable' ) if not api_key and 'AZURE_OPENAI_API_KEY' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_key` argument or the `AZURE_OPENAI_API_KEY` environment variable' ) if not api_version and 'OPENAI_API_VERSION' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_version` argument or the `OPENAI_API_VERSION` environment variable' ) http_client = http_client or cached_async_http_client(provider='azure') self._client = AsyncAzureOpenAI( azure_endpoint=azure_endpoint, api_key=api_key, api_version=api_version, http_client=http_client, ) self._base_url = str(self._client.base_url) ``` #### __init__ ```python __init__(*, openai_client: AsyncAzureOpenAI) -> None ``` ```python __init__( *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, openai_client: AsyncAzureOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Azure provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `azure_endpoint` | `str | None` | The Azure endpoint to use for authentication, if not provided, the AZURE_OPENAI_ENDPOINT environment variable will be used if available. | `None` | | `api_version` | `str | None` | The API version to use for authentication, if not provided, the OPENAI_API_VERSION environment variable will be used if available. | `None` | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the AZURE_OPENAI_API_KEY environment variable will be used if available. | `None` | | `openai_client` | `AsyncAzureOpenAI | None` | An existing AsyncAzureOpenAI client to use. If provided, base_url, api_key, and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing httpx.AsyncClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/azure.py` ```python def __init__( self, *, azure_endpoint: str | None = None, api_version: str | None = None, api_key: str | None = None, openai_client: AsyncAzureOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Azure provider. Args: azure_endpoint: The Azure endpoint to use for authentication, if not provided, the `AZURE_OPENAI_ENDPOINT` environment variable will be used if available. api_version: The API version to use for authentication, if not provided, the `OPENAI_API_VERSION` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `AZURE_OPENAI_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncAzureOpenAI`](https://github.com/openai/openai-python#microsoft-azure-openai) client to use. 
If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert azure_endpoint is None, 'Cannot provide both `openai_client` and `azure_endpoint`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._base_url = str(openai_client.base_url) self._client = openai_client else: azure_endpoint = azure_endpoint or os.getenv('AZURE_OPENAI_ENDPOINT') if not azure_endpoint: raise UserError( 'Must provide one of the `azure_endpoint` argument or the `AZURE_OPENAI_ENDPOINT` environment variable' ) if not api_key and 'AZURE_OPENAI_API_KEY' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_key` argument or the `AZURE_OPENAI_API_KEY` environment variable' ) if not api_version and 'OPENAI_API_VERSION' not in os.environ: # pragma: no cover raise UserError( 'Must provide one of the `api_version` argument or the `OPENAI_API_VERSION` environment variable' ) http_client = http_client or cached_async_http_client(provider='azure') self._client = AsyncAzureOpenAI( azure_endpoint=azure_endpoint, api_key=api_key, api_version=api_version, http_client=http_client, ) self._base_url = str(self._client.base_url) ``` ### CohereProvider Bases: `Provider[AsyncClientV2]` Provider for Cohere API. Source code in `pydantic_ai_slim/pydantic_ai/providers/cohere.py` ```python class CohereProvider(Provider[AsyncClientV2]): """Provider for Cohere API.""" @property def name(self) -> str: return 'cohere' @property def base_url(self) -> str: client_wrapper = self.client._client_wrapper # type: ignore return str(client_wrapper.get_base_url()) @property def client(self) -> AsyncClientV2: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return cohere_model_profile(model_name) def __init__( self, *, api_key: str | None = None, cohere_client: AsyncClientV2 | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Cohere provider. Args: api_key: The API key to use for authentication, if not provided, the `CO_API_KEY` environment variable will be used if available. cohere_client: An existing [AsyncClientV2](https://github.com/cohere-ai/cohere-python) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if cohere_client is not None: assert http_client is None, 'Cannot provide both `cohere_client` and `http_client`' assert api_key is None, 'Cannot provide both `cohere_client` and `api_key`' self._client = cohere_client else: api_key = api_key or os.getenv('CO_API_KEY') if not api_key: raise UserError( 'Set the `CO_API_KEY` environment variable or pass it via `CohereProvider(api_key=...)`' 'to use the Cohere provider.' ) base_url = os.getenv('CO_BASE_URL') if http_client is not None: self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) else: http_client = cached_async_http_client(provider='cohere') self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) ``` #### __init__ ```python __init__( *, api_key: str | None = None, cohere_client: AsyncClientV2 | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Cohere provider. 
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the CO_API_KEY environment variable will be used if available. | `None` | | `cohere_client` | `AsyncClientV2 | None` | An existing AsyncClientV2 client to use. If provided, api_key and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing httpx.AsyncClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/cohere.py` ```python def __init__( self, *, api_key: str | None = None, cohere_client: AsyncClientV2 | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Cohere provider. Args: api_key: The API key to use for authentication, if not provided, the `CO_API_KEY` environment variable will be used if available. cohere_client: An existing [AsyncClientV2](https://github.com/cohere-ai/cohere-python) client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if cohere_client is not None: assert http_client is None, 'Cannot provide both `cohere_client` and `http_client`' assert api_key is None, 'Cannot provide both `cohere_client` and `api_key`' self._client = cohere_client else: api_key = api_key or os.getenv('CO_API_KEY') if not api_key: raise UserError( 'Set the `CO_API_KEY` environment variable or pass it via `CohereProvider(api_key=...)`' 'to use the Cohere provider.' ) base_url = os.getenv('CO_BASE_URL') if http_client is not None: self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) else: http_client = cached_async_http_client(provider='cohere') self._client = AsyncClientV2(api_key=api_key, httpx_client=http_client, base_url=base_url) ``` Bases: `Provider[AsyncOpenAI]` Provider for Cerebras API. Source code in `pydantic_ai_slim/pydantic_ai/providers/cerebras.py` ```python class CerebrasProvider(Provider[AsyncOpenAI]): """Provider for Cerebras API.""" @property def name(self) -> str: return 'cerebras' @property def base_url(self) -> str: return 'https://api.cerebras.ai/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = {'llama': meta_model_profile, 'qwen': qwen_model_profile, 'gpt-oss': harmony_model_profile} profile = None for prefix, profile_func in prefix_to_profile.items(): model_name = model_name.lower() if model_name.startswith(prefix): profile = profile_func(model_name) # According to https://inference-docs.cerebras.ai/resources/openai#currently-unsupported-openai-features, # Cerebras doesn't support some model settings. unsupported_model_settings = ( 'frequency_penalty', 'logit_bias', 'presence_penalty', 'parallel_tool_calls', 'service_tier', ) return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_unsupported_model_settings=unsupported_model_settings, ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... 
def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('CEREBRAS_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `CEREBRAS_API_KEY` environment variable or pass it via `CerebrasProvider(api_key=...)` ' 'to use the Cerebras provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='cerebras') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[Mistral]` Provider for Mistral API. Source code in `pydantic_ai_slim/pydantic_ai/providers/mistral.py` ```python class MistralProvider(Provider[Mistral]): """Provider for Mistral API.""" @property def name(self) -> str: return 'mistral' @property def base_url(self) -> str: return self.client.sdk_configuration.get_server_details()[0] @property def client(self) -> Mistral: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: return mistral_model_profile(model_name) @overload def __init__(self, *, mistral_client: Mistral | None = None) -> None: ... @overload def __init__(self, *, api_key: str | None = None, http_client: httpx.AsyncClient | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, mistral_client: Mistral | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Mistral provider. Args: api_key: The API key to use for authentication, if not provided, the `MISTRAL_API_KEY` environment variable will be used if available. mistral_client: An existing `Mistral` client to use, if provided, `api_key` and `http_client` must be `None`. base_url: The base url for the Mistral requests. http_client: An existing async client to use for making HTTP requests. """ if mistral_client is not None: assert http_client is None, 'Cannot provide both `mistral_client` and `http_client`' assert api_key is None, 'Cannot provide both `mistral_client` and `api_key`' assert base_url is None, 'Cannot provide both `mistral_client` and `base_url`' self._client = mistral_client else: api_key = api_key or os.getenv('MISTRAL_API_KEY') if not api_key: raise UserError( 'Set the `MISTRAL_API_KEY` environment variable or pass it via `MistralProvider(api_key=...)`' 'to use the Mistral provider.' ) elif http_client is not None: self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) else: http_client = cached_async_http_client(provider='mistral') self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) ``` ### __init__ ```python __init__(*, mistral_client: Mistral | None = None) -> None ``` ```python __init__( *, api_key: str | None = None, http_client: AsyncClient | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, mistral_client: Mistral | None = None, base_url: str | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new Mistral provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the MISTRAL_API_KEY environment variable will be used if available. 
| `None` | | `mistral_client` | `Mistral | None` | An existing Mistral client to use, if provided, api_key and http_client must be None. | `None` | | `base_url` | `str | None` | The base url for the Mistral requests. | `None` | | `http_client` | `AsyncClient | None` | An existing async client to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/mistral.py` ```python def __init__( self, *, api_key: str | None = None, mistral_client: Mistral | None = None, base_url: str | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Mistral provider. Args: api_key: The API key to use for authentication, if not provided, the `MISTRAL_API_KEY` environment variable will be used if available. mistral_client: An existing `Mistral` client to use, if provided, `api_key` and `http_client` must be `None`. base_url: The base url for the Mistral requests. http_client: An existing async client to use for making HTTP requests. """ if mistral_client is not None: assert http_client is None, 'Cannot provide both `mistral_client` and `http_client`' assert api_key is None, 'Cannot provide both `mistral_client` and `api_key`' assert base_url is None, 'Cannot provide both `mistral_client` and `base_url`' self._client = mistral_client else: api_key = api_key or os.getenv('MISTRAL_API_KEY') if not api_key: raise UserError( 'Set the `MISTRAL_API_KEY` environment variable or pass it via `MistralProvider(api_key=...)`' 'to use the Mistral provider.' ) elif http_client is not None: self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) else: http_client = cached_async_http_client(provider='mistral') self._client = Mistral(api_key=api_key, async_client=http_client, server_url=base_url) ``` Bases: `Provider[AsyncOpenAI]` Provider for Fireworks AI API. Source code in `pydantic_ai_slim/pydantic_ai/providers/fireworks.py` ```python class FireworksProvider(Provider[AsyncOpenAI]): """Provider for Fireworks AI API.""" @property def name(self) -> str: return 'fireworks' @property def base_url(self) -> str: return 'https://api.fireworks.ai/inference/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = { 'llama': meta_model_profile, 'qwen': qwen_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'gemma': google_model_profile, } prefix = 'accounts/fireworks/models/' profile = None if model_name.startswith(prefix): model_name = model_name[len(prefix) :] for provider, profile_func in prefix_to_profile.items(): if model_name.startswith(provider): profile = profile_func(model_name) break # As the Fireworks API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... 
def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('FIREWORKS_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `FIREWORKS_API_KEY` environment variable or pass it via `FireworksProvider(api_key=...)`' 'to use the Fireworks AI provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='fireworks') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Grok API. Source code in `pydantic_ai_slim/pydantic_ai/providers/grok.py` ```python class GrokProvider(Provider[AsyncOpenAI]): """Provider for Grok API.""" @property def name(self) -> str: return 'grok' @property def base_url(self) -> str: return 'https://api.x.ai/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: profile = grok_model_profile(model_name) # As the Grok API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly. # Also, Grok does not support strict tool definitions: https://github.com/pydantic/pydantic-ai/issues/1846 return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_supports_strict_tool_definition=False ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('GROK_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `GROK_API_KEY` environment variable or pass it via `GrokProvider(api_key=...)`' 'to use the Grok provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='grok') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Together AI API. 
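For illustration, here is a minimal sketch of wiring this provider into an agent via the OpenAI-compatible model class referenced elsewhere on this page; the model id is a placeholder and `TOGETHER_API_KEY` is assumed to be set in the environment.

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.together import TogetherProvider

# With no api_key argument, TogetherProvider falls back to the TOGETHER_API_KEY environment variable.
model = OpenAIChatModel(
    'meta-llama/Llama-3.3-70B-Instruct-Turbo',  # placeholder Together model id
    provider=TogetherProvider(),
)
agent = Agent(model, instructions='Be concise.')
```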
Source code in `pydantic_ai_slim/pydantic_ai/providers/together.py` ```python class TogetherProvider(Provider[AsyncOpenAI]): """Provider for Together AI API.""" @property def name(self) -> str: return 'together' @property def base_url(self) -> str: return 'https://api.together.xyz/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'deepseek-ai': deepseek_model_profile, 'google': google_model_profile, 'qwen': qwen_model_profile, 'meta-llama': meta_model_profile, 'mistralai': mistral_model_profile, } profile = None model_name = model_name.lower() provider, model_name = model_name.split('/', 1) if provider in provider_to_profile: profile = provider_to_profile[provider](model_name) # As the Together API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('TOGETHER_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `TOGETHER_API_KEY` environment variable or pass it via `TogetherProvider(api_key=...)`' 'to use the Together AI provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='together') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Heroku API. Source code in `pydantic_ai_slim/pydantic_ai/providers/heroku.py` ```python class HerokuProvider(Provider[AsyncOpenAI]): """Provider for Heroku API.""" @property def name(self) -> str: return 'heroku' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: # As the Heroku API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer. return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... 
def __init__( self, *, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: if openai_client is not None: assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client else: api_key = api_key or os.getenv('HEROKU_INFERENCE_KEY') if not api_key: raise UserError( 'Set the `HEROKU_INFERENCE_KEY` environment variable or pass it via `HerokuProvider(api_key=...)`' 'to use the Heroku provider.' ) base_url = base_url or os.getenv('HEROKU_INFERENCE_URL', 'https://us.inference.heroku.com') base_url = base_url.rstrip('/') + '/v1' if http_client is not None: self._client = AsyncOpenAI(api_key=api_key, http_client=http_client, base_url=base_url) else: http_client = cached_async_http_client(provider='heroku') self._client = AsyncOpenAI(api_key=api_key, http_client=http_client, base_url=base_url) ``` Bases: `Provider[AsyncOpenAI]` Provider for GitHub Models API. GitHub Models provides access to various AI models through an OpenAI-compatible API. See for more information. Source code in `pydantic_ai_slim/pydantic_ai/providers/github.py` ```python class GitHubProvider(Provider[AsyncOpenAI]): """Provider for GitHub Models API. GitHub Models provides access to various AI models through an OpenAI-compatible API. See for more information. """ @property def name(self) -> str: return 'github' @property def base_url(self) -> str: return 'https://models.github.ai/inference' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'xai': grok_model_profile, 'meta': meta_model_profile, 'microsoft': openai_model_profile, 'mistral-ai': mistral_model_profile, 'cohere': cohere_model_profile, 'deepseek': deepseek_model_profile, } profile = None # If the model name does not contain a provider prefix, we assume it's an OpenAI model if '/' not in model_name: return openai_model_profile(model_name) provider, model_name = model_name.lower().split('/', 1) if provider in provider_to_profile: model_name, *_ = model_name.split(':', 1) # drop tags profile = provider_to_profile[provider](model_name) # As GitHubProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new GitHub Models provider. Args: api_key: The GitHub token to use for authentication. If not provided, the `GITHUB_API_KEY` environment variable will be used if available. openai_client: An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. 
""" api_key = api_key or os.getenv('GITHUB_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `GITHUB_API_KEY` environment variable or pass it via `GitHubProvider(api_key=...)`' ' to use the GitHub Models provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='github') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` ### __init__ ```python __init__() -> None ``` ```python __init__(*, api_key: str) -> None ``` ```python __init__(*, api_key: str, http_client: AsyncClient) -> None ``` ```python __init__( *, openai_client: AsyncOpenAI | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Create a new GitHub Models provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | The GitHub token to use for authentication. If not provided, the GITHUB_API_KEY environment variable will be used if available. | `None` | | `openai_client` | `AsyncOpenAI | None` | An existing AsyncOpenAI client to use. If provided, api_key and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing httpx.AsyncClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/github.py` ```python def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new GitHub Models provider. Args: api_key: The GitHub token to use for authentication. If not provided, the `GITHUB_API_KEY` environment variable will be used if available. openai_client: An existing `AsyncOpenAI` client to use. If provided, `api_key` and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ api_key = api_key or os.getenv('GITHUB_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `GITHUB_API_KEY` environment variable or pass it via `GitHubProvider(api_key=...)`' ' to use the GitHub Models provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='github') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for OpenRouter API. 
Source code in `pydantic_ai_slim/pydantic_ai/providers/openrouter.py` ```python class OpenRouterProvider(Provider[AsyncOpenAI]): """Provider for OpenRouter API.""" @property def name(self) -> str: return 'openrouter' @property def base_url(self) -> str: return 'https://openrouter.ai/api/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'google': google_model_profile, 'openai': openai_model_profile, 'anthropic': anthropic_model_profile, 'mistralai': mistral_model_profile, 'qwen': qwen_model_profile, 'x-ai': grok_model_profile, 'cohere': cohere_model_profile, 'amazon': amazon_model_profile, 'deepseek': deepseek_model_profile, 'meta-llama': meta_model_profile, 'moonshotai': moonshotai_model_profile, } profile = None provider, model_name = model_name.split('/', 1) if provider in provider_to_profile: model_name, *_ = model_name.split(':', 1) # drop tags profile = provider_to_profile[provider](model_name) # As OpenRouterProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('OPENROUTER_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `OPENROUTER_API_KEY` environment variable or pass it via `OpenRouterProvider(api_key=...)`' 'to use the OpenRouter provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='openrouter') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for Vercel AI Gateway API. 
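A usage sketch under the same OpenAI-compatible pattern; the model id is a placeholder, and `model_profile` above requires the `'provider/model'` format:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.vercel import VercelProvider

# With no api_key, the provider looks for VERCEL_AI_GATEWAY_API_KEY or VERCEL_OIDC_TOKEN.
model = OpenAIChatModel(
    'openai/gpt-4o',  # placeholder 'provider/model' id
    provider=VercelProvider(),
)
agent = Agent(model)
```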
Source code in `pydantic_ai_slim/pydantic_ai/providers/vercel.py` ```python class VercelProvider(Provider[AsyncOpenAI]): """Provider for Vercel AI Gateway API.""" @property def name(self) -> str: return 'vercel' @property def base_url(self) -> str: return 'https://ai-gateway.vercel.sh/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'anthropic': anthropic_model_profile, 'bedrock': amazon_model_profile, 'cohere': cohere_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'openai': openai_model_profile, 'vertex': google_model_profile, 'xai': grok_model_profile, } profile = None try: provider, model_name = model_name.split('/', 1) except ValueError: raise UserError(f"Model name must be in 'provider/model' format, got: {model_name!r}") if provider in provider_to_profile: profile = provider_to_profile[provider](model_name) # As VercelProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: # Support Vercel AI Gateway's standard environment variables api_key = api_key or os.getenv('VERCEL_AI_GATEWAY_API_KEY') or os.getenv('VERCEL_OIDC_TOKEN') if not api_key and openai_client is None: raise UserError( 'Set the `VERCEL_AI_GATEWAY_API_KEY` or `VERCEL_OIDC_TOKEN` environment variable ' 'or pass the API key via `VercelProvider(api_key=...)` to use the Vercel provider.' ) default_headers = {'http-referer': 'https://ai.pydantic.dev/', 'x-title': 'pydantic-ai'} if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=default_headers ) else: http_client = cached_async_http_client(provider='vercel') self._client = AsyncOpenAI( base_url=self.base_url, api_key=api_key, http_client=http_client, default_headers=default_headers ) ``` Bases: `Provider[AsyncInferenceClient]` Provider for Hugging Face. 
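A usage sketch, assuming the provider is paired with `HuggingFaceModel` and `HF_TOKEN` is set; the Hub model id is a placeholder:

```python
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

# With no api_key, HuggingFaceProvider falls back to the HF_TOKEN environment variable.
model = HuggingFaceModel(
    'Qwen/Qwen3-235B-A22B',  # placeholder Hub model id
    provider=HuggingFaceProvider(),
)
agent = Agent(model)
```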
Source code in `pydantic_ai_slim/pydantic_ai/providers/huggingface.py` ```python class HuggingFaceProvider(Provider[AsyncInferenceClient]): """Provider for Hugging Face.""" @property def name(self) -> str: return 'huggingface' @property def base_url(self) -> str: return self.client.model # type: ignore @property def client(self) -> AsyncInferenceClient: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'deepseek-ai': deepseek_model_profile, 'google': google_model_profile, 'qwen': qwen_model_profile, 'meta-llama': meta_model_profile, 'mistralai': mistral_model_profile, 'moonshotai': moonshotai_model_profile, } if '/' not in model_name: return None model_name = model_name.lower() provider, model_name = model_name.split('/', 1) if provider in provider_to_profile: return provider_to_profile[provider](model_name) return None @overload def __init__(self, *, base_url: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, provider_name: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, hf_client: AsyncInferenceClient, api_key: str | None = None) -> None: ... @overload def __init__(self, *, hf_client: AsyncInferenceClient, base_url: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, hf_client: AsyncInferenceClient, provider_name: str, api_key: str | None = None) -> None: ... @overload def __init__(self, *, api_key: str | None = None) -> None: ... def __init__( self, base_url: str | None = None, api_key: str | None = None, hf_client: AsyncInferenceClient | None = None, http_client: AsyncClient | None = None, provider_name: str | None = None, ) -> None: """Create a new Hugging Face provider. Args: base_url: The base url for the Hugging Face requests. api_key: The API key to use for authentication, if not provided, the `HF_TOKEN` environment variable will be used if available. hf_client: An existing [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client to use. If not provided, a new instance will be created. http_client: (currently ignored) An existing `httpx.AsyncClient` to use for making HTTP requests. provider_name : Name of the provider to use for inference. available providers can be found in the [HF Inference Providers documentation](https://huggingface.co/docs/inference-providers/index#partners). defaults to "auto", which will select the first available provider for the model, the first of the providers available for the model, sorted by the user's order in https://hf.co/settings/inference-providers. If `base_url` is passed, then `provider_name` is not used. """ api_key = api_key or os.getenv('HF_TOKEN') if api_key is None: raise UserError( 'Set the `HF_TOKEN` environment variable or pass it via `HuggingFaceProvider(api_key=...)`' 'to use the HuggingFace provider.' 
) if http_client is not None: raise ValueError('`http_client` is ignored for HuggingFace provider, please use `hf_client` instead.') if base_url is not None and provider_name is not None: raise ValueError('Cannot provide both `base_url` and `provider_name`.') if hf_client is None: self._client = AsyncInferenceClient(api_key=api_key, provider=provider_name, base_url=base_url) # type: ignore else: self._client = hf_client ``` ### __init__ ```python __init__( *, base_url: str, api_key: str | None = None ) -> None ``` ```python __init__( *, provider_name: str, api_key: str | None = None ) -> None ``` ```python __init__( *, hf_client: AsyncInferenceClient, api_key: str | None = None ) -> None ``` ```python __init__( *, hf_client: AsyncInferenceClient, base_url: str, api_key: str | None = None ) -> None ``` ```python __init__( *, hf_client: AsyncInferenceClient, provider_name: str, api_key: str | None = None ) -> None ``` ```python __init__(*, api_key: str | None = None) -> None ``` ```python __init__( base_url: str | None = None, api_key: str | None = None, hf_client: AsyncInferenceClient | None = None, http_client: AsyncClient | None = None, provider_name: str | None = None, ) -> None ``` Create a new Hugging Face provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `base_url` | `str | None` | The base url for the Hugging Face requests. | `None` | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the HF_TOKEN environment variable will be used if available. | `None` | | `hf_client` | `AsyncInferenceClient | None` | An existing AsyncInferenceClient client to use. If not provided, a new instance will be created. | `None` | | `http_client` | `AsyncClient | None` | (currently ignored) An existing httpx.AsyncClient to use for making HTTP requests. | `None` | | `provider_name` | | Name of the provider to use for inference. available providers can be found in the HF Inference Providers documentation. defaults to "auto", which will select the first available provider for the model, the first of the providers available for the model, sorted by the user's order in https://hf.co/settings/inference-providers. If base_url is passed, then provider_name is not used. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/huggingface.py` ```python def __init__( self, base_url: str | None = None, api_key: str | None = None, hf_client: AsyncInferenceClient | None = None, http_client: AsyncClient | None = None, provider_name: str | None = None, ) -> None: """Create a new Hugging Face provider. Args: base_url: The base url for the Hugging Face requests. api_key: The API key to use for authentication, if not provided, the `HF_TOKEN` environment variable will be used if available. hf_client: An existing [`AsyncInferenceClient`](https://huggingface.co/docs/huggingface_hub/v0.29.3/en/package_reference/inference_client#huggingface_hub.AsyncInferenceClient) client to use. If not provided, a new instance will be created. http_client: (currently ignored) An existing `httpx.AsyncClient` to use for making HTTP requests. provider_name : Name of the provider to use for inference. available providers can be found in the [HF Inference Providers documentation](https://huggingface.co/docs/inference-providers/index#partners). defaults to "auto", which will select the first available provider for the model, the first of the providers available for the model, sorted by the user's order in https://hf.co/settings/inference-providers. 
If `base_url` is passed, then `provider_name` is not used. """ api_key = api_key or os.getenv('HF_TOKEN') if api_key is None: raise UserError( 'Set the `HF_TOKEN` environment variable or pass it via `HuggingFaceProvider(api_key=...)`' 'to use the HuggingFace provider.' ) if http_client is not None: raise ValueError('`http_client` is ignored for HuggingFace provider, please use `hf_client` instead.') if base_url is not None and provider_name is not None: raise ValueError('Cannot provide both `base_url` and `provider_name`.') if hf_client is None: self._client = AsyncInferenceClient(api_key=api_key, provider=provider_name, base_url=base_url) # type: ignore else: self._client = hf_client ``` Bases: `Provider[AsyncOpenAI]` Provider for MoonshotAI platform (Kimi models). Source code in `pydantic_ai_slim/pydantic_ai/providers/moonshotai.py` ```python class MoonshotAIProvider(Provider[AsyncOpenAI]): """Provider for MoonshotAI platform (Kimi models).""" @property def name(self) -> str: return 'moonshotai' @property def base_url(self) -> str: # OpenAI-compatible endpoint, see MoonshotAI docs return 'https://api.moonshot.ai/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: profile = moonshotai_model_profile(model_name) # As the MoonshotAI API is OpenAI-compatible, let's assume we also need OpenAIJsonSchemaTransformer, # unless json_schema_transformer is set explicitly. # Also, MoonshotAI does not support strict tool definitions # https://platform.moonshot.ai/docs/guide/migrating-from-openai-to-kimi#about-tool_choice # "Please note that the current version of Kimi API does not support the tool_choice=required parameter." return OpenAIModelProfile( json_schema_transformer=OpenAIJsonSchemaTransformer, openai_supports_tool_choice_required=False, supports_json_object_output=True, ).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('MOONSHOTAI_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `MOONSHOTAI_API_KEY` environment variable or pass it via ' '`MoonshotAIProvider(api_key=...)` to use the MoonshotAI provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='moonshotai') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for local or remote Ollama API. 
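A usage sketch for a locally running Ollama server; the base URL and model name are assumptions about your local setup:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.ollama import OllamaProvider

# Ollama exposes an OpenAI-compatible endpoint; adjust base_url if your server runs elsewhere.
model = OpenAIChatModel(
    'llama3.2',  # any model you have pulled locally
    provider=OllamaProvider(base_url='http://localhost:11434/v1'),
)
agent = Agent(model)
```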
Source code in `pydantic_ai_slim/pydantic_ai/providers/ollama.py` ```python class OllamaProvider(Provider[AsyncOpenAI]): """Provider for local or remote Ollama API.""" @property def name(self) -> str: return 'ollama' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: prefix_to_profile = { 'llama': meta_model_profile, 'gemma': google_model_profile, 'qwen': qwen_model_profile, 'qwq': qwen_model_profile, 'deepseek': deepseek_model_profile, 'mistral': mistral_model_profile, 'command': cohere_model_profile, 'gpt-oss': harmony_model_profile, } profile = None for prefix, profile_func in prefix_to_profile.items(): model_name = model_name.lower() if model_name.startswith(prefix): profile = profile_func(model_name) # As OllamaProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Ollama provider. Args: base_url: The base url for the Ollama requests. If not provided, the `OLLAMA_BASE_URL` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `OLLAMA_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client else: base_url = base_url or os.getenv('OLLAMA_BASE_URL') if not base_url: raise UserError( 'Set the `OLLAMA_BASE_URL` environment variable or pass it via `OllamaProvider(base_url=...)`' 'to use the Ollama provider.' ) # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. api_key = api_key or os.getenv('OLLAMA_API_KEY') or 'api-key-not-set' if http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='ollama') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` ### __init__ ```python __init__( base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None, ) -> None ``` Create a new Ollama provider. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `base_url` | `str | None` | The base url for the Ollama requests. If not provided, the OLLAMA_BASE_URL environment variable will be used if available. 
| `None` | | `api_key` | `str | None` | The API key to use for authentication, if not provided, the OLLAMA_API_KEY environment variable will be used if available. | `None` | | `openai_client` | `AsyncOpenAI | None` | An existing AsyncOpenAI client to use. If provided, base_url, api_key, and http_client must be None. | `None` | | `http_client` | `AsyncClient | None` | An existing httpx.AsyncClient to use for making HTTP requests. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/ollama.py` ```python def __init__( self, base_url: str | None = None, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: """Create a new Ollama provider. Args: base_url: The base url for the Ollama requests. If not provided, the `OLLAMA_BASE_URL` environment variable will be used if available. api_key: The API key to use for authentication, if not provided, the `OLLAMA_API_KEY` environment variable will be used if available. openai_client: An existing [`AsyncOpenAI`](https://github.com/openai/openai-python?tab=readme-ov-file#async-usage) client to use. If provided, `base_url`, `api_key`, and `http_client` must be `None`. http_client: An existing `httpx.AsyncClient` to use for making HTTP requests. """ if openai_client is not None: assert base_url is None, 'Cannot provide both `openai_client` and `base_url`' assert http_client is None, 'Cannot provide both `openai_client` and `http_client`' assert api_key is None, 'Cannot provide both `openai_client` and `api_key`' self._client = openai_client else: base_url = base_url or os.getenv('OLLAMA_BASE_URL') if not base_url: raise UserError( 'Set the `OLLAMA_BASE_URL` environment variable or pass it via `OllamaProvider(base_url=...)`' 'to use the Ollama provider.' ) # This is a workaround for the OpenAI client requiring an API key, whilst locally served, # openai compatible models do not always need an API key, but a placeholder (non-empty) key is required. api_key = api_key or os.getenv('OLLAMA_API_KEY') or 'api-key-not-set' if http_client is not None: self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='ollama') self._client = AsyncOpenAI(base_url=base_url, api_key=api_key, http_client=http_client) ``` Bases: `Provider[AsyncOpenAI]` Provider for LiteLLM API. 
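An illustrative sketch; the model id and `api_base` are placeholders, with `api_base` typically pointing at a LiteLLM proxy or other compatible endpoint:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.litellm import LiteLLMProvider

# If api_key is None, LiteLLM resolves credentials from its own environment variables.
model = OpenAIChatModel(
    'anthropic/claude-sonnet-4',  # placeholder 'provider/model' id
    provider=LiteLLMProvider(api_key='your-api-key', api_base='http://localhost:4000'),
)
agent = Agent(model)
```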
Source code in `pydantic_ai_slim/pydantic_ai/providers/litellm.py` ```python class LiteLLMProvider(Provider[AsyncOpenAI]): """Provider for LiteLLM API.""" @property def name(self) -> str: return 'litellm' @property def base_url(self) -> str: return str(self.client.base_url) @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: # Map provider prefixes to their profile functions provider_to_profile = { 'anthropic': anthropic_model_profile, 'openai': openai_model_profile, 'google': google_model_profile, 'mistralai': mistral_model_profile, 'mistral': mistral_model_profile, 'cohere': cohere_model_profile, 'amazon': amazon_model_profile, 'bedrock': amazon_model_profile, 'meta-llama': meta_model_profile, 'meta': meta_model_profile, 'groq': groq_model_profile, 'deepseek': deepseek_model_profile, 'moonshotai': moonshotai_model_profile, 'x-ai': grok_model_profile, 'qwen': qwen_model_profile, } profile = None # Check if model name contains a provider prefix (e.g., "anthropic/claude-3") if '/' in model_name: provider_prefix, model_suffix = model_name.split('/', 1) if provider_prefix in provider_to_profile: profile = provider_to_profile[provider_prefix](model_suffix) # If no profile found, default to OpenAI profile if profile is None: profile = openai_model_profile(model_name) # As LiteLLMProvider is used with OpenAIModel, which uses OpenAIJsonSchemaTransformer, # we maintain that behavior return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__( self, *, api_key: str | None = None, api_base: str | None = None, ) -> None: ... @overload def __init__( self, *, api_key: str | None = None, api_base: str | None = None, http_client: AsyncHTTPClient, ) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI) -> None: ... def __init__( self, *, api_key: str | None = None, api_base: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncHTTPClient | None = None, ) -> None: """Initialize a LiteLLM provider. Args: api_key: API key for the model provider. If None, LiteLLM will try to get it from environment variables. api_base: Base URL for the model provider. Use this for custom endpoints or self-hosted models. openai_client: Pre-configured OpenAI client. If provided, other parameters are ignored. http_client: Custom HTTP client to use. """ if openai_client is not None: self._client = openai_client return # Create OpenAI client that will be used with LiteLLM's completion function # The actual API calls will be intercepted and routed through LiteLLM if http_client is not None: self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) else: http_client = cached_async_http_client(provider='litellm') self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) ``` ### __init__ ```python __init__( *, api_key: str | None = None, api_base: str | None = None ) -> None ``` ```python __init__( *, api_key: str | None = None, api_base: str | None = None, http_client: AsyncClient ) -> None ``` ```python __init__(*, openai_client: AsyncOpenAI) -> None ``` ```python __init__( *, api_key: str | None = None, api_base: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncClient | None = None ) -> None ``` Initialize a LiteLLM provider. 
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `api_key` | `str | None` | API key for the model provider. If None, LiteLLM will try to get it from environment variables. | `None` | | `api_base` | `str | None` | Base URL for the model provider. Use this for custom endpoints or self-hosted models. | `None` | | `openai_client` | `AsyncOpenAI | None` | Pre-configured OpenAI client. If provided, other parameters are ignored. | `None` | | `http_client` | `AsyncClient | None` | Custom HTTP client to use. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/providers/litellm.py` ```python def __init__( self, *, api_key: str | None = None, api_base: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: AsyncHTTPClient | None = None, ) -> None: """Initialize a LiteLLM provider. Args: api_key: API key for the model provider. If None, LiteLLM will try to get it from environment variables. api_base: Base URL for the model provider. Use this for custom endpoints or self-hosted models. openai_client: Pre-configured OpenAI client. If provided, other parameters are ignored. http_client: Custom HTTP client to use. """ if openai_client is not None: self._client = openai_client return # Create OpenAI client that will be used with LiteLLM's completion function # The actual API calls will be intercepted and routed through LiteLLM if http_client is not None: self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) else: http_client = cached_async_http_client(provider='litellm') self._client = AsyncOpenAI( base_url=api_base, api_key=api_key or 'litellm-placeholder', http_client=http_client ) ``` Bases: `Provider[AsyncOpenAI]` Provider for Nebius AI Studio API. Source code in `pydantic_ai_slim/pydantic_ai/providers/nebius.py` ```python class NebiusProvider(Provider[AsyncOpenAI]): """Provider for Nebius AI Studio API.""" @property def name(self) -> str: return 'nebius' @property def base_url(self) -> str: return 'https://api.studio.nebius.com/v1' @property def client(self) -> AsyncOpenAI: return self._client def model_profile(self, model_name: str) -> ModelProfile | None: provider_to_profile = { 'meta-llama': meta_model_profile, 'deepseek-ai': deepseek_model_profile, 'qwen': qwen_model_profile, 'google': google_model_profile, 'openai': harmony_model_profile, # used for gpt-oss models on Nebius 'mistralai': mistral_model_profile, 'moonshotai': moonshotai_model_profile, } profile = None try: model_name = model_name.lower() provider, model_name = model_name.split('/', 1) except ValueError: raise UserError(f"Model name must be in 'provider/model' format, got: {model_name!r}") if provider in provider_to_profile: profile = provider_to_profile[provider](model_name) # As NebiusProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer, # we need to maintain that behavior unless json_schema_transformer is set explicitly return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile) @overload def __init__(self) -> None: ... @overload def __init__(self, *, api_key: str) -> None: ... @overload def __init__(self, *, api_key: str, http_client: httpx.AsyncClient) -> None: ... @overload def __init__(self, *, openai_client: AsyncOpenAI | None = None) -> None: ... 
def __init__( self, *, api_key: str | None = None, openai_client: AsyncOpenAI | None = None, http_client: httpx.AsyncClient | None = None, ) -> None: api_key = api_key or os.getenv('NEBIUS_API_KEY') if not api_key and openai_client is None: raise UserError( 'Set the `NEBIUS_API_KEY` environment variable or pass it via ' '`NebiusProvider(api_key=...)` to use the Nebius AI Studio provider.' ) if openai_client is not None: self._client = openai_client elif http_client is not None: self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) else: http_client = cached_async_http_client(provider='nebius') self._client = AsyncOpenAI(base_url=self.base_url, api_key=api_key, http_client=http_client) ``` # `pydantic_ai.result` ### StreamedRunResult Bases: `Generic[AgentDepsT, OutputDataT]` Result of a streamed run that returns structured data via a tool call. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @dataclass(init=False) class StreamedRunResult(Generic[AgentDepsT, OutputDataT]): """Result of a streamed run that returns structured data via a tool call.""" _all_messages: list[_messages.ModelMessage] _new_message_index: int _stream_response: AgentStream[AgentDepsT, OutputDataT] | None = None _on_complete: Callable[[], Awaitable[None]] | None = None _run_result: AgentRunResult[OutputDataT] | None = None is_complete: bool = field(default=False, init=False) """Whether the stream has all been received. This is set to `True` when one of [`stream_output`][pydantic_ai.result.StreamedRunResult.stream_output], [`stream_text`][pydantic_ai.result.StreamedRunResult.stream_text], [`stream_responses`][pydantic_ai.result.StreamedRunResult.stream_responses] or [`get_output`][pydantic_ai.result.StreamedRunResult.get_output] completes. """ @overload def __init__( self, all_messages: list[_messages.ModelMessage], new_message_index: int, stream_response: AgentStream[AgentDepsT, OutputDataT] | None, on_complete: Callable[[], Awaitable[None]] | None, ) -> None: ... @overload def __init__( self, all_messages: list[_messages.ModelMessage], new_message_index: int, *, run_result: AgentRunResult[OutputDataT], ) -> None: ... def __init__( self, all_messages: list[_messages.ModelMessage], new_message_index: int, stream_response: AgentStream[AgentDepsT, OutputDataT] | None = None, on_complete: Callable[[], Awaitable[None]] | None = None, run_result: AgentRunResult[OutputDataT] | None = None, ) -> None: self._all_messages = all_messages self._new_message_index = new_message_index self._stream_response = stream_response self._on_complete = on_complete self._run_result = run_result def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. 
""" # this is a method to be consistent with the other methods if output_tool_return_content is not None: raise NotImplementedError('Setting output tool return content is not supported for this result type.') return self._all_messages def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return all messages from [`all_messages`][pydantic_ai.result.StreamedRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return new messages from [`new_messages`][pydantic_ai.result.StreamedRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) @deprecated('`StreamedRunResult.stream` is deprecated, use `stream_output` instead.') async def stream(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: async for output in self.stream_output(debounce_by=debounce_by): yield output async def stream_output(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: """Stream the output as an async iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Args: debounce_by: by how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the response data. 
""" if self._run_result is not None: yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for output in self._stream_response.stream_output(debounce_by=debounce_by): yield output await self._marked_completed(self.response) else: raise ValueError('No stream response or run result provided') # pragma: no cover async def stream_text(self, *, delta: bool = False, debounce_by: float | None = 0.1) -> AsyncIterator[str]: """Stream the text result as an async iterable. !!! note Result validators will NOT be called on the text result if `delta=True`. Args: delta: if `True`, yield each chunk of text as it is received, if `False` (default), yield the full text up to the current point. debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. """ if self._run_result is not None: # pragma: no cover # We can't really get here, as `_run_result` is only set in `run_stream` when `CallToolsNode` produces `DeferredToolRequests` output # as a result of a tool function raising `CallDeferred` or `ApprovalRequired`. # That'll change if we ever support something like `raise EndRun(output: OutputT)` where `OutputT` could be `str`. if not isinstance(self._run_result.output, str): raise exceptions.UserError('stream_text() can only be used with text responses') yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for text in self._stream_response.stream_text(delta=delta, debounce_by=debounce_by): yield text await self._marked_completed(self.response) else: raise ValueError('No stream response or run result provided') # pragma: no cover @deprecated('`StreamedRunResult.stream_structured` is deprecated, use `stream_responses` instead.') async def stream_structured( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: async for msg, last in self.stream_responses(debounce_by=debounce_by): yield msg, last async def stream_responses( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: """Stream the response as an async iterable of Structured LLM Messages. Args: debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the structured response message and whether that is the last message. 
""" if self._run_result is not None: yield self.response, True await self._marked_completed() elif self._stream_response is not None: # if the message currently has any parts with content, yield before streaming async for msg in self._stream_response.stream_responses(debounce_by=debounce_by): yield msg, False msg = self.response yield msg, True await self._marked_completed(msg) else: raise ValueError('No stream response or run result provided') # pragma: no cover async def get_output(self) -> OutputDataT: """Stream the whole response, validate and return it.""" if self._run_result is not None: output = self._run_result.output await self._marked_completed() return output elif self._stream_response is not None: output = await self._stream_response.get_output() await self._marked_completed(self.response) return output else: raise ValueError('No stream response or run result provided') # pragma: no cover @property def response(self) -> _messages.ModelResponse: """Return the current state of the response.""" if self._run_result is not None: return self._run_result.response elif self._stream_response is not None: return self._stream_response.get() else: raise ValueError('No stream response or run result provided') # pragma: no cover # TODO (v2): Make this a property def usage(self) -> RunUsage: """Return the usage of the whole run. !!! note This won't return the full usage until the stream is finished. """ if self._run_result is not None: return self._run_result.usage() elif self._stream_response is not None: return self._stream_response.usage() else: raise ValueError('No stream response or run result provided') # pragma: no cover # TODO (v2): Make this a property def timestamp(self) -> datetime: """Get the timestamp of the response.""" if self._run_result is not None: return self._run_result.timestamp() elif self._stream_response is not None: return self._stream_response.timestamp() else: raise ValueError('No stream response or run result provided') # pragma: no cover @deprecated('`validate_structured_output` is deprecated, use `validate_response_output` instead.') async def validate_structured_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: return await self.validate_response_output(message, allow_partial=allow_partial) async def validate_response_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: """Validate a structured result message.""" if self._run_result is not None: return self._run_result.output elif self._stream_response is not None: return await self._stream_response.validate_response_output(message, allow_partial=allow_partial) else: raise ValueError('No stream response or run result provided') # pragma: no cover async def _marked_completed(self, message: _messages.ModelResponse | None = None) -> None: self.is_complete = True if message is not None: self._all_messages.append(message) if self._on_complete is not None: await self._on_complete() ``` #### is_complete ```python is_complete: bool = field(default=False, init=False) ``` Whether the stream has all been received. This is set to `True` when one of stream_output, stream_text, stream_responses or get_output completes. #### all_messages ```python all_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return the history of \_messages. 
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `list[ModelMessage]` | List of messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ # this is a method to be consistent with the other methods if output_tool_return_content is not None: raise NotImplementedError('Setting output tool return content is not supported for this result type.') return self._all_messages ``` #### all_messages_json ```python all_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return all messages from all_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `bytes` | JSON bytes representing the messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return all messages from [`all_messages`][pydantic_ai.result.StreamedRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) ``` #### new_messages ```python new_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `list[ModelMessage]` | List of new messages. 
| Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] ``` #### new_messages_json ```python new_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return new messages from new_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `bytes` | JSON bytes representing the new messages. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: # pragma: no cover """Return new messages from [`new_messages`][pydantic_ai.result.StreamedRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) ``` #### stream ```python stream( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[OutputDataT] ``` Deprecated `StreamedRunResult.stream` is deprecated, use `stream_output` instead. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @deprecated('`StreamedRunResult.stream` is deprecated, use `stream_output` instead.') async def stream(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: async for output in self.stream_output(debounce_by=debounce_by): yield output ``` #### stream_output ```python stream_output( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[OutputDataT] ``` Stream the output as an async iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `debounce_by` | `float | None` | by how much (if at all) to debounce/group the output chunks by. None means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. | `0.1` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[OutputDataT]` | An async iterable of the response data. 
| Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def stream_output(self, *, debounce_by: float | None = 0.1) -> AsyncIterator[OutputDataT]: """Stream the output as an async iterable. The pydantic validator for structured data will be called in [partial mode](https://docs.pydantic.dev/dev/concepts/experimental/#partial-validation) on each iteration. Args: debounce_by: by how much (if at all) to debounce/group the output chunks by. `None` means no debouncing. Debouncing is particularly important for long structured outputs to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the response data. """ if self._run_result is not None: yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for output in self._stream_response.stream_output(debounce_by=debounce_by): yield output await self._marked_completed(self.response) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### stream_text ```python stream_text( *, delta: bool = False, debounce_by: float | None = 0.1 ) -> AsyncIterator[str] ``` Stream the text result as an async iterable. Note Result validators will NOT be called on the text result if `delta=True`. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `delta` | `bool` | if True, yield each chunk of text as it is received, if False (default), yield the full text up to the current point. | `False` | | `debounce_by` | `float | None` | by how much (if at all) to debounce/group the response chunks by. None means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. | `0.1` | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def stream_text(self, *, delta: bool = False, debounce_by: float | None = 0.1) -> AsyncIterator[str]: """Stream the text result as an async iterable. !!! note Result validators will NOT be called on the text result if `delta=True`. Args: delta: if `True`, yield each chunk of text as it is received, if `False` (default), yield the full text up to the current point. debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. """ if self._run_result is not None: # pragma: no cover # We can't really get here, as `_run_result` is only set in `run_stream` when `CallToolsNode` produces `DeferredToolRequests` output # as a result of a tool function raising `CallDeferred` or `ApprovalRequired`. # That'll change if we ever support something like `raise EndRun(output: OutputT)` where `OutputT` could be `str`. 
if not isinstance(self._run_result.output, str): raise exceptions.UserError('stream_text() can only be used with text responses') yield self._run_result.output await self._marked_completed() elif self._stream_response is not None: async for text in self._stream_response.stream_text(delta=delta, debounce_by=debounce_by): yield text await self._marked_completed(self.response) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### stream_structured ```python stream_structured( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[ModelResponse, bool]] ``` Deprecated `StreamedRunResult.stream_structured` is deprecated, use `stream_responses` instead. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @deprecated('`StreamedRunResult.stream_structured` is deprecated, use `stream_responses` instead.') async def stream_structured( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: async for msg, last in self.stream_responses(debounce_by=debounce_by): yield msg, last ``` #### stream_responses ```python stream_responses( *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[ModelResponse, bool]] ``` Stream the response as an async iterable of Structured LLM Messages. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `debounce_by` | `float | None` | by how much (if at all) to debounce/group the response chunks by. None means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. | `0.1` | Returns: | Type | Description | | --- | --- | | `AsyncIterator[tuple[ModelResponse, bool]]` | An async iterable of the structured response message and whether that is the last message. | Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def stream_responses( self, *, debounce_by: float | None = 0.1 ) -> AsyncIterator[tuple[_messages.ModelResponse, bool]]: """Stream the response as an async iterable of Structured LLM Messages. Args: debounce_by: by how much (if at all) to debounce/group the response chunks by. `None` means no debouncing. Debouncing is particularly important for long structured responses to reduce the overhead of performing validation as each token is received. Returns: An async iterable of the structured response message and whether that is the last message. """ if self._run_result is not None: yield self.response, True await self._marked_completed() elif self._stream_response is not None: # if the message currently has any parts with content, yield before streaming async for msg in self._stream_response.stream_responses(debounce_by=debounce_by): yield msg, False msg = self.response yield msg, True await self._marked_completed(msg) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### get_output ```python get_output() -> OutputDataT ``` Stream the whole response, validate and return it. 
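For instance, when streaming a run with `Agent.run_stream`, calling `get_output()` consumes the remaining stream, validates the result, and returns the final output. A minimal sketch (the model string, instructions, and prompt below are illustrative only):

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o', instructions='Be concise.')


async def main():
    async with agent.run_stream('What is the capital of France?') as result:
        # Consume the whole stream, validate the output, and mark the run as complete.
        output = await result.get_output()
        print(output)
```

Once `get_output()` returns, `is_complete` is `True` and the full message history is available via `all_messages()`.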
Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def get_output(self) -> OutputDataT: """Stream the whole response, validate and return it.""" if self._run_result is not None: output = self._run_result.output await self._marked_completed() return output elif self._stream_response is not None: output = await self._stream_response.get_output() await self._marked_completed(self.response) return output else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### response ```python response: ModelResponse ``` Return the current state of the response. #### usage ```python usage() -> RunUsage ``` Return the usage of the whole run. Note This won't return the full usage until the stream is finished. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def usage(self) -> RunUsage: """Return the usage of the whole run. !!! note This won't return the full usage until the stream is finished. """ if self._run_result is not None: return self._run_result.usage() elif self._stream_response is not None: return self._stream_response.usage() else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### timestamp ```python timestamp() -> datetime ``` Get the timestamp of the response. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python def timestamp(self) -> datetime: """Get the timestamp of the response.""" if self._run_result is not None: return self._run_result.timestamp() elif self._stream_response is not None: return self._stream_response.timestamp() else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` #### validate_structured_output ```python validate_structured_output( message: ModelResponse, *, allow_partial: bool = False ) -> OutputDataT ``` Deprecated `validate_structured_output` is deprecated, use `validate_response_output` instead. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python @deprecated('`validate_structured_output` is deprecated, use `validate_response_output` instead.') async def validate_structured_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: return await self.validate_response_output(message, allow_partial=allow_partial) ``` #### validate_response_output ```python validate_response_output( message: ModelResponse, *, allow_partial: bool = False ) -> OutputDataT ``` Validate a structured result message. Source code in `pydantic_ai_slim/pydantic_ai/result.py` ```python async def validate_response_output( self, message: _messages.ModelResponse, *, allow_partial: bool = False ) -> OutputDataT: """Validate a structured result message.""" if self._run_result is not None: return self._run_result.output elif self._stream_response is not None: return await self._stream_response.validate_response_output(message, allow_partial=allow_partial) else: raise ValueError('No stream response or run result provided') # pragma: no cover ``` # `pydantic_ai.retries` Retries utilities based on tenacity, especially for HTTP requests. This module provides HTTP transport wrappers and wait strategies that integrate with the tenacity library to add retry capabilities to HTTP requests. The transports can be used with HTTP clients that support custom transports (such as httpx), while the wait strategies can be used with any tenacity retry decorator. 
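For example, the `wait_retry_after` strategy can be plugged directly into tenacity's `@retry` decorator, independently of the HTTP transports. A minimal sketch (the `fetch_data` function and its URL parameter are hypothetical, used only for illustration):

```python
import httpx
from tenacity import retry, retry_if_exception_type, stop_after_attempt

from pydantic_ai.retries import wait_retry_after


@retry(
    retry=retry_if_exception_type(httpx.HTTPStatusError),
    wait=wait_retry_after(max_wait=60),  # honor Retry-After headers, capped at 60s
    stop=stop_after_attempt(5),
    reraise=True,
)
def fetch_data(url: str) -> dict:
    response = httpx.get(url)
    response.raise_for_status()  # raises HTTPStatusError on 4xx/5xx, which triggers a retry
    return response.json()
```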
The module includes: - TenacityTransport: Synchronous HTTP transport with retry capabilities - AsyncTenacityTransport: Asynchronous HTTP transport with retry capabilities - wait_retry_after: Wait strategy that respects HTTP Retry-After headers ### RetryConfig Bases: `TypedDict` The configuration for tenacity-based retrying. These are precisely the arguments to the tenacity `retry` decorator, and they are generally used internally by passing them to that decorator via `@retry(**config)` or similar. All fields are optional, and if not provided, the default values from the `tenacity.retry` decorator will be used. Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ```python class RetryConfig(TypedDict, total=False): """The configuration for tenacity-based retrying. These are precisely the arguments to the tenacity `retry` decorator, and they are generally used internally by passing them to that decorator via `@retry(**config)` or similar. All fields are optional, and if not provided, the default values from the `tenacity.retry` decorator will be used. """ sleep: Callable[[int | float], None | Awaitable[None]] """A sleep strategy to use for sleeping between retries. Tenacity's default for this argument is `tenacity.nap.sleep`.""" stop: StopBaseT """ A stop strategy to determine when to stop retrying. Tenacity's default for this argument is `tenacity.stop.stop_never`.""" wait: WaitBaseT """ A wait strategy to determine how long to wait between retries. Tenacity's default for this argument is `tenacity.wait.wait_none`.""" retry: SyncRetryBaseT | RetryBaseT """A retry strategy to determine which exceptions should trigger a retry. Tenacity's default for this argument is `tenacity.retry.retry_if_exception_type()`.""" before: Callable[[RetryCallState], None | Awaitable[None]] """ A callable that is called before each retry attempt. Tenacity's default for this argument is `tenacity.before.before_nothing`.""" after: Callable[[RetryCallState], None | Awaitable[None]] """ A callable that is called after each retry attempt. Tenacity's default for this argument is `tenacity.after.after_nothing`.""" before_sleep: Callable[[RetryCallState], None | Awaitable[None]] | None """ An optional callable that is called before sleeping between retries. Tenacity's default for this argument is `None`.""" reraise: bool """Whether to reraise the last exception if the retry attempts are exhausted, or raise a RetryError instead. Tenacity's default for this argument is `False`.""" retry_error_cls: type[RetryError] """The exception class to raise when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `tenacity.RetryError`.""" retry_error_callback: Callable[[RetryCallState], Any | Awaitable[Any]] | None """An optional callable that is called when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `None`.""" ``` #### sleep ```python sleep: Callable[[int | float], None | Awaitable[None]] ``` A sleep strategy to use for sleeping between retries. Tenacity's default for this argument is `tenacity.nap.sleep`. #### stop ```python stop: StopBaseT ``` A stop strategy to determine when to stop retrying. Tenacity's default for this argument is `tenacity.stop.stop_never`. #### wait ```python wait: WaitBaseT ``` A wait strategy to determine how long to wait between retries. Tenacity's default for this argument is `tenacity.wait.wait_none`. 
#### retry ```python retry: SyncRetryBaseT | RetryBaseT ``` A retry strategy to determine which exceptions should trigger a retry. Tenacity's default for this argument is `tenacity.retry.retry_if_exception_type()`. #### before ```python before: Callable[[RetryCallState], None | Awaitable[None]] ``` A callable that is called before each retry attempt. Tenacity's default for this argument is `tenacity.before.before_nothing`. #### after ```python after: Callable[[RetryCallState], None | Awaitable[None]] ``` A callable that is called after each retry attempt. Tenacity's default for this argument is `tenacity.after.after_nothing`. #### before_sleep ```python before_sleep: ( Callable[[RetryCallState], None | Awaitable[None]] | None ) ``` An optional callable that is called before sleeping between retries. Tenacity's default for this argument is `None`. #### reraise ```python reraise: bool ``` Whether to reraise the last exception if the retry attempts are exhausted, or raise a RetryError instead. Tenacity's default for this argument is `False`. #### retry_error_cls ```python retry_error_cls: type[RetryError] ``` The exception class to raise when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `tenacity.RetryError`. #### retry_error_callback ```python retry_error_callback: ( Callable[[RetryCallState], Any | Awaitable[Any]] | None ) ``` An optional callable that is called when the retry attempts are exhausted and `reraise` is False. Tenacity's default for this argument is `None`. ### TenacityTransport Bases: `BaseTransport` Synchronous HTTP transport with tenacity-based retry functionality. This transport wraps another BaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `wrapped` | `BaseTransport | None` | The underlying transport to wrap and add retry functionality to. | `None` | | `config` | `RetryConfig` | The arguments to use for the tenacity retry decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. | *required* | | `validate_response` | `Callable[[Response], Any] | None` | Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed.
| `None` | Example ```python from httpx import Client, HTTPStatusError, HTTPTransport from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import RetryConfig, TenacityTransport, wait_retry_after transport = TenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), HTTPTransport(), validate_response=lambda r: r.raise_for_status() ) client = Client(transport=transport) ``` Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ````python class TenacityTransport(BaseTransport): """Synchronous HTTP transport with tenacity-based retry functionality. This transport wraps another BaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Args: wrapped: The underlying transport to wrap and add retry functionality to. config: The arguments to use for the tenacity `retry` decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. validate_response: Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. Example: ```python from httpx import Client, HTTPStatusError, HTTPTransport from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import RetryConfig, TenacityTransport, wait_retry_after transport = TenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), HTTPTransport(), validate_response=lambda r: r.raise_for_status() ) client = Client(transport=transport) ``` """ def __init__( self, config: RetryConfig, wrapped: BaseTransport | None = None, validate_response: Callable[[Response], Any] | None = None, ): self.config = config self.wrapped = wrapped or HTTPTransport() self.validate_response = validate_response def handle_request(self, request: Request) -> Response: """Handle an HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. 
""" @retry(**self.config) def handle_request(req: Request) -> Response: response = self.wrapped.handle_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: response.close() raise return response return handle_request(request) def __enter__(self) -> TenacityTransport: self.wrapped.__enter__() return self def __exit__( self, exc_type: type[BaseException] | None = None, exc_value: BaseException | None = None, traceback: TracebackType | None = None, ) -> None: self.wrapped.__exit__(exc_type, exc_value, traceback) def close(self) -> None: self.wrapped.close() # pragma: no cover ```` #### handle_request ```python handle_request(request: Request) -> Response ``` Handle an HTTP request with retry logic. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `request` | `Request` | The HTTP request to handle. | *required* | Returns: | Type | Description | | --- | --- | | `Response` | The HTTP response. | Raises: | Type | Description | | --- | --- | | `RuntimeError` | If the retry controller did not make any attempts. | | `Exception` | Any exception raised by the wrapped transport or validation function. | Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ```python def handle_request(self, request: Request) -> Response: """Handle an HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. """ @retry(**self.config) def handle_request(req: Request) -> Response: response = self.wrapped.handle_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: response.close() raise return response return handle_request(request) ``` ### AsyncTenacityTransport Bases: `AsyncBaseTransport` Asynchronous HTTP transport with tenacity-based retry functionality. This transport wraps another AsyncBaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `wrapped` | `AsyncBaseTransport | None` | The underlying async transport to wrap and add retry functionality to. | `None` | | `config` | `RetryConfig` | The arguments to use for the tenacity retry decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. | *required* | | `validate_response` | `Callable[[Response], Any] | None` | Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. 
| `None` | Example ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ````python class AsyncTenacityTransport(AsyncBaseTransport): """Asynchronous HTTP transport with tenacity-based retry functionality. This transport wraps another AsyncBaseTransport and adds retry capabilities using the tenacity library. It can be configured to retry requests based on various conditions such as specific exception types, response status codes, or custom validation logic. The transport works by intercepting HTTP requests and responses, allowing the tenacity controller to determine when and how to retry failed requests. The validate_response function can be used to convert HTTP responses into exceptions that trigger retries. Args: wrapped: The underlying async transport to wrap and add retry functionality to. config: The arguments to use for the tenacity `retry` decorator, including retry conditions, wait strategy, stop conditions, etc. See the tenacity docs for more info. validate_response: Optional callable that takes a Response and can raise an exception to be handled by the controller if the response should trigger a retry. Common use case is to raise exceptions for certain HTTP status codes. If None, no response validation is performed. Example: ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=300), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` """ def __init__( self, config: RetryConfig, wrapped: AsyncBaseTransport | None = None, validate_response: Callable[[Response], Any] | None = None, ): self.config = config self.wrapped = wrapped or AsyncHTTPTransport() self.validate_response = validate_response async def handle_async_request(self, request: Request) -> Response: """Handle an async HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. 
""" @retry(**self.config) async def handle_async_request(req: Request) -> Response: response = await self.wrapped.handle_async_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: await response.aclose() raise return response return await handle_async_request(request) async def __aenter__(self) -> AsyncTenacityTransport: await self.wrapped.__aenter__() return self async def __aexit__( self, exc_type: type[BaseException] | None = None, exc_value: BaseException | None = None, traceback: TracebackType | None = None, ) -> None: await self.wrapped.__aexit__(exc_type, exc_value, traceback) async def aclose(self) -> None: await self.wrapped.aclose() ```` #### handle_async_request ```python handle_async_request(request: Request) -> Response ``` Handle an async HTTP request with retry logic. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `request` | `Request` | The HTTP request to handle. | *required* | Returns: | Type | Description | | --- | --- | | `Response` | The HTTP response. | Raises: | Type | Description | | --- | --- | | `RuntimeError` | If the retry controller did not make any attempts. | | `Exception` | Any exception raised by the wrapped transport or validation function. | Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ```python async def handle_async_request(self, request: Request) -> Response: """Handle an async HTTP request with retry logic. Args: request: The HTTP request to handle. Returns: The HTTP response. Raises: RuntimeError: If the retry controller did not make any attempts. Exception: Any exception raised by the wrapped transport or validation function. """ @retry(**self.config) async def handle_async_request(req: Request) -> Response: response = await self.wrapped.handle_async_request(req) # this is normally set by httpx _after_ calling this function, but we want the request in the validator: response.request = req if self.validate_response: try: self.validate_response(response) except Exception: await response.aclose() raise return response return await handle_async_request(request) ``` ### wait_retry_after ```python wait_retry_after( fallback_strategy: ( Callable[[RetryCallState], float] | None ) = None, max_wait: float = 300, ) -> Callable[[RetryCallState], float] ``` Create a tenacity-compatible wait strategy that respects HTTP Retry-After headers. This wait strategy checks if the exception contains an HTTPStatusError with a Retry-After header, and if so, waits for the time specified in the header. If no header is present or parsing fails, it falls back to the provided strategy. The Retry-After header can be in two formats: - An integer representing seconds to wait - An HTTP date string representing when to retry Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `fallback_strategy` | `Callable[[RetryCallState], float] | None` | Wait strategy to use when no Retry-After header is present or parsing fails. Defaults to exponential backoff with max 60s. | `None` | | `max_wait` | `float` | Maximum time to wait in seconds, regardless of header value. Defaults to 300 (5 minutes). | `300` | Returns: | Type | Description | | --- | --- | | `Callable[[RetryCallState], float]` | A wait function that can be used with tenacity retry decorators. 
| Example ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=120), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` Source code in `pydantic_ai_slim/pydantic_ai/retries.py` ````python def wait_retry_after( fallback_strategy: Callable[[RetryCallState], float] | None = None, max_wait: float = 300 ) -> Callable[[RetryCallState], float]: """Create a tenacity-compatible wait strategy that respects HTTP Retry-After headers. This wait strategy checks if the exception contains an HTTPStatusError with a Retry-After header, and if so, waits for the time specified in the header. If no header is present or parsing fails, it falls back to the provided strategy. The Retry-After header can be in two formats: - An integer representing seconds to wait - An HTTP date string representing when to retry Args: fallback_strategy: Wait strategy to use when no Retry-After header is present or parsing fails. Defaults to exponential backoff with max 60s. max_wait: Maximum time to wait in seconds, regardless of header value. Defaults to 300 (5 minutes). Returns: A wait function that can be used with tenacity retry decorators. Example: ```python from httpx import AsyncClient, HTTPStatusError from tenacity import retry_if_exception_type, stop_after_attempt from pydantic_ai.retries import AsyncTenacityTransport, RetryConfig, wait_retry_after transport = AsyncTenacityTransport( RetryConfig( retry=retry_if_exception_type(HTTPStatusError), wait=wait_retry_after(max_wait=120), stop=stop_after_attempt(5), reraise=True ), validate_response=lambda r: r.raise_for_status() ) client = AsyncClient(transport=transport) ``` """ if fallback_strategy is None: fallback_strategy = wait_exponential(multiplier=1, max=60) def wait_func(state: RetryCallState) -> float: exc = state.outcome.exception() if state.outcome else None if isinstance(exc, HTTPStatusError): retry_after = exc.response.headers.get('retry-after') if retry_after: try: # Try parsing as seconds first wait_seconds = int(retry_after) return min(float(wait_seconds), max_wait) except ValueError: # Try parsing as HTTP date try: retry_time = cast(datetime, parsedate_to_datetime(retry_after)) assert isinstance(retry_time, datetime) now = datetime.now(timezone.utc) wait_seconds = (retry_time - now).total_seconds() if wait_seconds > 0: return min(wait_seconds, max_wait) except (ValueError, TypeError, AssertionError): # If date parsing fails, fall back to fallback strategy pass # Use fallback strategy return fallback_strategy(state) return wait_func ```` # `pydantic_ai.run` ### AgentRun Bases: `Generic[AgentDepsT, OutputDataT]` A stateful, async-iterable run of an Agent. You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an End is reached, the run finishes and result becomes available. 
Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the next method for more granular control. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ````python @dataclasses.dataclass(repr=False) class AgentRun(Generic[AgentDepsT, OutputDataT]): """A stateful, async-iterable run of an [`Agent`][pydantic_ai.agent.Agent]. You generally obtain an `AgentRun` instance by calling `async with my_agent.iter(...) as agent_run:`. Once you have an instance, you can use it to iterate through the run's nodes as they execute. When an [`End`][pydantic_graph.nodes.End] is reached, the run finishes and [`result`][pydantic_ai.agent.AgentRun.result] becomes available. Example: ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4o') async def main(): nodes = [] # Iterate through the run, recording each node along the way: async with agent.iter('What is the capital of France?') as agent_run: async for node in agent_run: nodes.append(node) print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print(agent_run.result.output) #> The capital of France is Paris. ``` You can also manually drive the iteration using the [`next`][pydantic_ai.agent.AgentRun.next] method for more granular control. """ _graph_run: GraphRun[ _agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any], FinalResult[OutputDataT] ] @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... 
def _traceparent(self, *, required: bool = True) -> str | None: traceparent = self._graph_run._traceparent(required=False) # type: ignore[reportPrivateUsage] if traceparent is None and required: # pragma: no cover raise AttributeError('No span was created for this agent run') return traceparent @property def ctx(self) -> GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]]: """The current context of the agent run.""" return GraphRunContext[_agent_graph.GraphAgentState, _agent_graph.GraphAgentDeps[AgentDepsT, Any]]( state=self._graph_run.state, deps=self._graph_run.deps ) @property def next_node( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. """ next_node = self._graph_run.next_node if isinstance(next_node, End): return next_node if _agent_graph.is_agent_node(next_node): return next_node raise exceptions.AgentRunError(f'Unexpected node type: {type(next_node)}') # pragma: no cover @property def result(self) -> AgentRunResult[OutputDataT] | None: """The final result of the run if it has ended, otherwise `None`. Once the run returns an [`End`][pydantic_graph.nodes.End] node, `result` is populated with an [`AgentRunResult`][pydantic_ai.agent.AgentRunResult]. """ graph_run_result = self._graph_run.result if graph_run_result is None: return None return AgentRunResult( graph_run_result.output.output, graph_run_result.output.tool_name, graph_run_result.state, self._graph_run.deps.new_message_index, self._traceparent(required=False), ) def __aiter__( self, ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]: """Provide async-iteration over the nodes in the agent run.""" return self async def __anext__( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Advance to the next node automatically based on the last returned node.""" next_node = await self._graph_run.__anext__() if _agent_graph.is_agent_node(node=next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node async def next( self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`][pydantic_graph.nodes.End] node. 
Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Args: node: The node to run next in the graph. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ # Note: It might be nice to expose a synchronous interface for iteration, but we shouldn't do it # on this class, or else IDEs won't warn you if you accidentally use `for` instead of `async for` to iterate. next_node = await self._graph_run.next(node) if _agent_graph.is_agent_node(next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node # TODO (v2): Make this a property def usage(self) -> _usage.RunUsage: """Get usage statistics for the run so far, including token usage, model requests, and so on.""" return self._graph_run.state.usage def __repr__(self) -> str: # pragma: no cover result = self._graph_run.result result_repr = '' if result is None else repr(result.output) return f'<{type(self).__name__} result={result_repr} usage={self.usage()}>' ```` #### ctx ```python ctx: GraphRunContext[ GraphAgentState, GraphAgentDeps[AgentDepsT, Any] ] ``` The current context of the agent run. #### next_node ```python next_node: ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` The next node that will be run in the agent graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. #### result ```python result: AgentRunResult[OutputDataT] | None ``` The final result of the run if it has ended, otherwise `None`. Once the run returns an End node, `result` is populated with an AgentRunResult. #### __aiter__ ```python __aiter__() -> ( AsyncIterator[ AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ] ) ``` Provide async-iteration over the nodes in the agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def __aiter__( self, ) -> AsyncIterator[_agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]]: """Provide async-iteration over the nodes in the agent run.""" return self ``` #### __anext__ ```python __anext__() -> ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` Advance to the next node automatically based on the last returned node. 
Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python async def __anext__( self, ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Advance to the next node automatically based on the last returned node.""" next_node = await self._graph_run.__anext__() if _agent_graph.is_agent_node(node=next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node ``` #### next ```python next( node: AgentNode[AgentDepsT, OutputDataT], ) -> ( AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]] ) ``` Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an End node. Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `node` | `AgentNode[AgentDepsT, OutputDataT]` | The node to run next in the graph. | *required* | Returns: | Type | Description | | --- | --- | | `AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]` | The next node returned by the graph logic, or an End node if | | `AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]` | the run has completed. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ````python async def next( self, node: _agent_graph.AgentNode[AgentDepsT, OutputDataT], ) -> _agent_graph.AgentNode[AgentDepsT, OutputDataT] | End[FinalResult[OutputDataT]]: """Manually drive the agent run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The agent run should be stopped when you return an [`End`][pydantic_graph.nodes.End] node. 
Example: ```python from pydantic_ai import Agent from pydantic_graph import End agent = Agent('openai:gpt-4o') async def main(): async with agent.iter('What is the capital of France?') as agent_run: next_node = agent_run.next_node # start with the first node nodes = [next_node] while not isinstance(next_node, End): next_node = await agent_run.next(next_node) nodes.append(next_node) # Once `next_node` is an End, we've finished: print(nodes) ''' [ UserPromptNode( user_prompt='What is the capital of France?', instructions_functions=[], system_prompts=(), system_prompt_functions=[], system_prompt_dynamic_functions={}, ), ModelRequestNode( request=ModelRequest( parts=[ UserPromptPart( content='What is the capital of France?', timestamp=datetime.datetime(...), ) ] ) ), CallToolsNode( model_response=ModelResponse( parts=[TextPart(content='The capital of France is Paris.')], usage=RequestUsage(input_tokens=56, output_tokens=7), model_name='gpt-4o', timestamp=datetime.datetime(...), ) ), End(data=FinalResult(output='The capital of France is Paris.')), ] ''' print('Final result:', agent_run.result.output) #> Final result: The capital of France is Paris. ``` Args: node: The node to run next in the graph. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ # Note: It might be nice to expose a synchronous interface for iteration, but we shouldn't do it # on this class, or else IDEs won't warn you if you accidentally use `for` instead of `async for` to iterate. next_node = await self._graph_run.next(node) if _agent_graph.is_agent_node(next_node): return next_node assert isinstance(next_node, End), f'Unexpected node type: {type(next_node)}' return next_node ```` #### usage ```python usage() -> RunUsage ``` Get usage statistics for the run so far, including token usage, model requests, and so on. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def usage(self) -> _usage.RunUsage: """Get usage statistics for the run so far, including token usage, model requests, and so on.""" return self._graph_run.state.usage ``` ### AgentRunResult Bases: `Generic[OutputDataT]` The final result of an agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python @dataclasses.dataclass class AgentRunResult(Generic[OutputDataT]): """The final result of an agent run.""" output: OutputDataT """The output data from the agent run.""" _output_tool_name: str | None = dataclasses.field(repr=False, compare=False, default=None) _state: _agent_graph.GraphAgentState = dataclasses.field( repr=False, compare=False, default_factory=_agent_graph.GraphAgentState ) _new_message_index: int = dataclasses.field(repr=False, compare=False, default=0) _traceparent_value: str | None = dataclasses.field(repr=False, compare=False, default=None) @overload def _traceparent(self, *, required: Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: if self._traceparent_value is None and required: # pragma: no cover raise AttributeError('No span was created for this agent run') return self._traceparent_value def _set_output_tool_return(self, return_content: str) -> list[_messages.ModelMessage]: """Set return content for the output tool. Useful if you want to continue the conversation and want to set the response to the output tool call. 
""" if not self._output_tool_name: raise ValueError('Cannot set output tool return content when the return type is `str`.') messages = self._state.message_history last_message = messages[-1] for idx, part in enumerate(last_message.parts): if isinstance(part, _messages.ToolReturnPart) and part.tool_name == self._output_tool_name: # Only do deepcopy when we have to modify copied_messages = list(messages) copied_last = deepcopy(last_message) copied_last.parts[idx].content = return_content # type: ignore[misc] copied_messages[-1] = copied_last return copied_messages raise LookupError(f'No tool call found with tool name {self._output_tool_name!r}.') def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ if output_tool_return_content is not None: return self._set_output_tool_return(output_tool_return_content) else: return self._state.message_history def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) @property def response(self) -> _messages.ModelResponse: """Return the last response from the message history.""" # The response may not be the very last item if it contained an output tool call. 
See `CallToolsNode._handle_final_result`. for message in reversed(self.all_messages()): if isinstance(message, _messages.ModelResponse): return message raise ValueError('No response found in the message history') # pragma: no cover # TODO (v2): Make this a property def usage(self) -> _usage.RunUsage: """Return the usage of the whole run.""" return self._state.usage # TODO (v2): Make this a property def timestamp(self) -> datetime: """Return the timestamp of last response.""" return self.response.timestamp ``` #### output ```python output: OutputDataT ``` The output data from the agent run. #### all_messages ```python all_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return the history of \_messages. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `list[ModelMessage]` | List of messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return the history of _messages. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of messages. """ if output_tool_return_content is not None: return self._set_output_tool_return(output_tool_return_content) else: return self._state.message_history ``` #### all_messages_json ```python all_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return all messages from all_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `bytes` | JSON bytes representing the messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def all_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return all messages from [`all_messages`][pydantic_ai.agent.AgentRunResult.all_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the messages. 
""" return _messages.ModelMessagesTypeAdapter.dump_json( self.all_messages(output_tool_return_content=output_tool_return_content) ) ``` #### new_messages ```python new_messages( *, output_tool_return_content: str | None = None ) -> list[ModelMessage] ``` Return new messages associated with this run. Messages from older runs are excluded. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `list[ModelMessage]` | List of new messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages(self, *, output_tool_return_content: str | None = None) -> list[_messages.ModelMessage]: """Return new messages associated with this run. Messages from older runs are excluded. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: List of new messages. """ return self.all_messages(output_tool_return_content=output_tool_return_content)[self._new_message_index :] ``` #### new_messages_json ```python new_messages_json( *, output_tool_return_content: str | None = None ) -> bytes ``` Return new messages from new_messages as JSON bytes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `output_tool_return_content` | `str | None` | The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If None, the last message will not be modified. | `None` | Returns: | Type | Description | | --- | --- | | `bytes` | JSON bytes representing the new messages. | Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def new_messages_json(self, *, output_tool_return_content: str | None = None) -> bytes: """Return new messages from [`new_messages`][pydantic_ai.agent.AgentRunResult.new_messages] as JSON bytes. Args: output_tool_return_content: The return content of the tool call to set in the last message. This provides a convenient way to modify the content of the output tool call if you want to continue the conversation and want to set the response to the output tool call. If `None`, the last message will not be modified. Returns: JSON bytes representing the new messages. """ return _messages.ModelMessagesTypeAdapter.dump_json( self.new_messages(output_tool_return_content=output_tool_return_content) ) ``` #### response ```python response: ModelResponse ``` Return the last response from the message history. #### usage ```python usage() -> RunUsage ``` Return the usage of the whole run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def usage(self) -> _usage.RunUsage: """Return the usage of the whole run.""" return self._state.usage ``` #### timestamp ```python timestamp() -> datetime ``` Return the timestamp of last response. 
Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python def timestamp(self) -> datetime: """Return the timestamp of last response.""" return self.response.timestamp ``` ### AgentRunResultEvent Bases: `Generic[OutputDataT]` An event indicating the agent run ended and containing the final result of the agent run. Source code in `pydantic_ai_slim/pydantic_ai/run.py` ```python @dataclasses.dataclass(repr=False) class AgentRunResultEvent(Generic[OutputDataT]): """An event indicating the agent run ended and containing the final result of the agent run.""" result: AgentRunResult[OutputDataT] """The result of the run.""" _: dataclasses.KW_ONLY event_kind: Literal['agent_run_result'] = 'agent_run_result' """Event type identifier, used as a discriminator.""" __repr__ = _utils.dataclasses_no_defaults_repr ``` #### result ```python result: AgentRunResult[OutputDataT] ``` The result of the run. #### event_kind ```python event_kind: Literal["agent_run_result"] = "agent_run_result" ``` Event type identifier, used as a discriminator. # `pydantic_ai.settings` ### ModelSettings Bases: `TypedDict` Settings to configure an LLM. Here we include only settings which apply to multiple models / model providers, though not all of these settings are supported by all models. Source code in `pydantic_ai_slim/pydantic_ai/settings.py` ```python class ModelSettings(TypedDict, total=False): """Settings to configure an LLM. Here we include only settings which apply to multiple models / model providers, though not all of these settings are supported by all models. """ max_tokens: int """The maximum number of tokens to generate before stopping. Supported by: * Gemini * Anthropic * OpenAI * Groq * Cohere * Mistral * Bedrock * MCP Sampling """ temperature: float """Amount of randomness injected into the response. Use `temperature` closer to `0.0` for analytical / multiple choice, and closer to a model's maximum `temperature` for creative and generative tasks. Note that even with `temperature` of `0.0`, the results will not be fully deterministic. Supported by: * Gemini * Anthropic * OpenAI * Groq * Cohere * Mistral * Bedrock """ top_p: float """An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should either alter `temperature` or `top_p`, but not both. Supported by: * Gemini * Anthropic * OpenAI * Groq * Cohere * Mistral * Bedrock """ timeout: float | Timeout """Override the client-level default timeout for a request, in seconds. Supported by: * Gemini * Anthropic * OpenAI * Groq * Mistral """ parallel_tool_calls: bool """Whether to allow parallel tool calls. Supported by: * OpenAI (some models, not o1) * Groq * Anthropic """ seed: int """The random seed to use for the model, theoretically allowing for deterministic results. Supported by: * OpenAI * Groq * Cohere * Mistral * Gemini """ presence_penalty: float """Penalize new tokens based on whether they have appeared in the text so far. Supported by: * OpenAI * Groq * Cohere * Gemini * Mistral """ frequency_penalty: float """Penalize new tokens based on their existing frequency in the text so far. Supported by: * OpenAI * Groq * Cohere * Gemini * Mistral """ logit_bias: dict[str, int] """Modify the likelihood of specified tokens appearing in the completion. Supported by: * OpenAI * Groq """ stop_sequences: list[str] """Sequences that will cause the model to stop generating. 
Supported by: * OpenAI * Anthropic * Bedrock * Mistral * Groq * Cohere * Google """ extra_headers: dict[str, str] """Extra headers to send to the model. Supported by: * OpenAI * Anthropic * Groq """ extra_body: object """Extra body to send to the model. Supported by: * OpenAI * Anthropic * Groq """ ``` #### max_tokens ```python max_tokens: int ``` The maximum number of tokens to generate before stopping. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock - MCP Sampling #### temperature ```python temperature: float ``` Amount of randomness injected into the response. Use `temperature` closer to `0.0` for analytical / multiple choice, and closer to a model's maximum `temperature` for creative and generative tasks. Note that even with `temperature` of `0.0`, the results will not be fully deterministic. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock #### top_p ```python top_p: float ``` An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should either alter `temperature` or `top_p`, but not both. Supported by: - Gemini - Anthropic - OpenAI - Groq - Cohere - Mistral - Bedrock #### timeout ```python timeout: float | Timeout ``` Override the client-level default timeout for a request, in seconds. Supported by: - Gemini - Anthropic - OpenAI - Groq - Mistral #### parallel_tool_calls ```python parallel_tool_calls: bool ``` Whether to allow parallel tool calls. Supported by: - OpenAI (some models, not o1) - Groq - Anthropic #### seed ```python seed: int ``` The random seed to use for the model, theoretically allowing for deterministic results. Supported by: - OpenAI - Groq - Cohere - Mistral - Gemini #### presence_penalty ```python presence_penalty: float ``` Penalize new tokens based on whether they have appeared in the text so far. Supported by: - OpenAI - Groq - Cohere - Gemini - Mistral #### frequency_penalty ```python frequency_penalty: float ``` Penalize new tokens based on their existing frequency in the text so far. Supported by: - OpenAI - Groq - Cohere - Gemini - Mistral #### logit_bias ```python logit_bias: dict[str, int] ``` Modify the likelihood of specified tokens appearing in the completion. Supported by: - OpenAI - Groq #### stop_sequences ```python stop_sequences: list[str] ``` Sequences that will cause the model to stop generating. Supported by: - OpenAI - Anthropic - Bedrock - Mistral - Groq - Cohere - Google #### extra_headers ```python extra_headers: dict[str, str] ``` Extra headers to send to the model. Supported by: - OpenAI - Anthropic - Groq #### extra_body ```python extra_body: object ``` Extra body to send to the model. Supported by: - OpenAI - Anthropic - Groq # `pydantic_ai.tools` ### AgentDepsT ```python AgentDepsT = TypeVar( "AgentDepsT", default=None, contravariant=True ) ``` Type variable for agent dependencies. ### RunContext Bases: `Generic[AgentDepsT]` Information about the current call. 
Source code in `pydantic_ai_slim/pydantic_ai/_run_context.py` ```python @dataclasses.dataclass(repr=False, kw_only=True) class RunContext(Generic[AgentDepsT]): """Information about the current call.""" deps: AgentDepsT """Dependencies for the agent.""" model: Model """The model used in this run.""" usage: RunUsage """LLM usage associated with the run.""" prompt: str | Sequence[_messages.UserContent] | None = None """The original user prompt passed to the run.""" messages: list[_messages.ModelMessage] = field(default_factory=list) """Messages exchanged in the conversation so far.""" tracer: Tracer = field(default_factory=NoOpTracer) """The tracer to use for tracing the run.""" trace_include_content: bool = False """Whether to include the content of the messages in the trace.""" instrumentation_version: int = DEFAULT_INSTRUMENTATION_VERSION """Instrumentation settings version, if instrumentation is enabled.""" retries: dict[str, int] = field(default_factory=dict) """Number of retries for each tool so far.""" tool_call_id: str | None = None """The ID of the tool call.""" tool_name: str | None = None """Name of the tool being called.""" retry: int = 0 """Number of retries of this tool so far.""" max_retries: int = 0 """The maximum number of retries of this tool.""" run_step: int = 0 """The current step in the run.""" tool_call_approved: bool = False """Whether a tool call that required approval has now been approved.""" @property def last_attempt(self) -> bool: """Whether this is the last attempt at running this tool before an error is raised.""" return self.retry == self.max_retries __repr__ = _utils.dataclasses_no_defaults_repr ``` #### deps ```python deps: AgentDepsT ``` Dependencies for the agent. #### model ```python model: Model ``` The model used in this run. #### usage ```python usage: RunUsage ``` LLM usage associated with the run. #### prompt ```python prompt: str | Sequence[UserContent] | None = None ``` The original user prompt passed to the run. #### messages ```python messages: list[ModelMessage] = field(default_factory=list) ``` Messages exchanged in the conversation so far. #### tracer ```python tracer: Tracer = field(default_factory=NoOpTracer) ``` The tracer to use for tracing the run. #### trace_include_content ```python trace_include_content: bool = False ``` Whether to include the content of the messages in the trace. #### instrumentation_version ```python instrumentation_version: int = ( DEFAULT_INSTRUMENTATION_VERSION ) ``` Instrumentation settings version, if instrumentation is enabled. #### retries ```python retries: dict[str, int] = field(default_factory=dict) ``` Number of retries for each tool so far. #### tool_call_id ```python tool_call_id: str | None = None ``` The ID of the tool call. #### tool_name ```python tool_name: str | None = None ``` Name of the tool being called. #### retry ```python retry: int = 0 ``` Number of retries of this tool so far. #### max_retries ```python max_retries: int = 0 ``` The maximum number of retries of this tool. #### run_step ```python run_step: int = 0 ``` The current step in the run. #### tool_call_approved ```python tool_call_approved: bool = False ``` Whether a tool call that required approval has now been approved. #### last_attempt ```python last_attempt: bool ``` Whether this is the last attempt at running this tool before an error is raised. ### ToolParams ```python ToolParams = ParamSpec('ToolParams', default=...) ``` Retrieval function param spec. 
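To make the `RunContext` fields above concrete, here is a small sketch of a tool that reads its dependencies and retry state from `ctx`; the `'test'` model, the `dict` dependencies, and the `lookup_user` tool are illustrative:

```python
from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent('test', deps_type=dict)


@agent.tool
async def lookup_user(ctx: RunContext[dict], user_id: str) -> str:
    """Look up a user's name by ID."""
    # ctx.deps is whatever was passed as `deps=` when the agent was run.
    if name := ctx.deps.get(user_id):
        return name
    # ctx.retry counts previous attempts for this tool; ctx.last_attempt is
    # True once ctx.retry == ctx.max_retries, so we stop asking for retries.
    if ctx.last_attempt:
        return 'unknown user'
    raise ModelRetry(f'No user with ID {user_id!r}, try a different ID.')


result = agent.run_sync('Who is user 123?', deps={'123': 'Alice'})
print(result.output)
```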
### SystemPromptFunc ```python SystemPromptFunc: TypeAlias = ( Callable[[RunContext[AgentDepsT]], str] | Callable[[RunContext[AgentDepsT]], Awaitable[str]] | Callable[[], str] | Callable[[], Awaitable[str]] ) ``` A function that may or may not take `RunContext` as an argument, and may or may not be async. Usage `SystemPromptFunc[AgentDepsT]`. ### ToolFuncContext ```python ToolFuncContext: TypeAlias = Callable[ Concatenate[RunContext[AgentDepsT], ToolParams], Any ] ``` A tool function that takes `RunContext` as the first argument. Usage `ToolFuncContext[AgentDepsT, ToolParams]`. ### ToolFuncPlain ```python ToolFuncPlain: TypeAlias = Callable[ToolParams, Any] ``` A tool function that does not take `RunContext` as the first argument. Usage `ToolFuncPlain[ToolParams]`. ### ToolFuncEither ```python ToolFuncEither: TypeAlias = ( ToolFuncContext[AgentDepsT, ToolParams] | ToolFuncPlain[ToolParams] ) ``` Either kind of tool function. This is just a union of ToolFuncContext and ToolFuncPlain. Usage `ToolFuncEither[AgentDepsT, ToolParams]`. ### ToolPrepareFunc ```python ToolPrepareFunc: TypeAlias = Callable[ [RunContext[AgentDepsT], "ToolDefinition"], Awaitable["ToolDefinition | None"], ] ``` Definition of a function that can prepare a tool definition at call time. See [tool docs](../../tools-advanced/#tool-prepare) for more information. Example — here `only_if_42` is valid as a `ToolPrepareFunc`: ```python from pydantic_ai import RunContext, Tool from pydantic_ai.tools import ToolDefinition async def only_if_42( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: if ctx.deps == 42: return tool_def def hitchhiker(ctx: RunContext[int], answer: str) -> str: return f'{ctx.deps} {answer}' hitchhiker = Tool(hitchhiker, prepare=only_if_42) ``` Usage `ToolPrepareFunc[AgentDepsT]`. ### ToolsPrepareFunc ```python ToolsPrepareFunc: TypeAlias = Callable[ [RunContext[AgentDepsT], list["ToolDefinition"]], Awaitable["list[ToolDefinition] | None"], ] ``` Definition of a function that can prepare the tool definition of all tools for each step. This is useful if you want to customize the definition of multiple tools or you want to register a subset of tools for a given step. Example — here `turn_on_strict_if_openai` is valid as a `ToolsPrepareFunc`: ```python from dataclasses import replace from pydantic_ai import Agent, RunContext from pydantic_ai.tools import ToolDefinition async def turn_on_strict_if_openai( ctx: RunContext[None], tool_defs: list[ToolDefinition] ) -> list[ToolDefinition] | None: if ctx.model.system == 'openai': return [replace(tool_def, strict=True) for tool_def in tool_defs] return tool_defs agent = Agent('openai:gpt-4o', prepare_tools=turn_on_strict_if_openai) ``` Usage `ToolsPrepareFunc[AgentDepsT]`. ### DocstringFormat ```python DocstringFormat: TypeAlias = Literal[ "google", "numpy", "sphinx", "auto" ] ``` Supported docstring formats. - `'google'` — [Google-style](https://google.github.io/styleguide/pyguide.html#381-docstrings) docstrings. - `'numpy'` — [Numpy-style](https://numpydoc.readthedocs.io/en/latest/format.html) docstrings. - `'sphinx'` — [Sphinx-style](https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html#the-sphinx-docstring-format) docstrings. - `'auto'` — Automatically infer the format based on the structure of the docstring. ### DeferredToolRequests Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools.
Results can be passed to the next agent run using a DeferredToolResults object with the same tool call IDs. See [deferred tools docs](../../deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class DeferredToolRequests: """Tool calls that require approval or external execution. This can be used as an agent's `output_type` and will be used as the output of the agent run if the model called any deferred tools. Results can be passed to the next agent run using a [`DeferredToolResults`][pydantic_ai.tools.DeferredToolResults] object with the same tool call IDs. See [deferred tools docs](../deferred-tools.md#deferred-tools) for more information. """ calls: list[ToolCallPart] = field(default_factory=list) """Tool calls that require external execution.""" approvals: list[ToolCallPart] = field(default_factory=list) """Tool calls that require human-in-the-loop approval.""" ``` #### calls ```python calls: list[ToolCallPart] = field(default_factory=list) ``` Tool calls that require external execution. #### approvals ```python approvals: list[ToolCallPart] = field(default_factory=list) ``` Tool calls that require human-in-the-loop approval. ### ToolApproved Indicates that a tool call has been approved and that the tool function should be executed. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class ToolApproved: """Indicates that a tool call has been approved and that the tool function should be executed.""" override_args: dict[str, Any] | None = None """Optional tool call arguments to use instead of the original arguments.""" kind: Literal['tool-approved'] = 'tool-approved' ``` #### override_args ```python override_args: dict[str, Any] | None = None ``` Optional tool call arguments to use instead of the original arguments. ### ToolDenied Indicates that a tool call has been denied and that a denial message should be returned to the model. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass class ToolDenied: """Indicates that a tool call has been denied and that a denial message should be returned to the model.""" message: str = 'The tool call was denied.' """The message to return to the model.""" _: KW_ONLY kind: Literal['tool-denied'] = 'tool-denied' ``` #### message ```python message: str = 'The tool call was denied.' ``` The message to return to the model. ### DeferredToolResults Results for deferred tool calls from a previous run that required approval or external execution. The tool call IDs need to match those from the DeferredToolRequests output object from the previous run. See [deferred tools docs](../../deferred-tools/#deferred-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(kw_only=True) class DeferredToolResults: """Results for deferred tool calls from a previous run that required approval or external execution. The tool call IDs need to match those from the [`DeferredToolRequests`][pydantic_ai.output.DeferredToolRequests] output object from the previous run. See [deferred tools docs](../deferred-tools.md#deferred-tools) for more information. 
""" calls: dict[str, DeferredToolCallResult | Any] = field(default_factory=dict) """Map of tool call IDs to results for tool calls that required external execution.""" approvals: dict[str, bool | DeferredToolApprovalResult] = field(default_factory=dict) """Map of tool call IDs to results for tool calls that required human-in-the-loop approval.""" ``` #### calls ```python calls: dict[str, DeferredToolCallResult | Any] = field( default_factory=dict ) ``` Map of tool call IDs to results for tool calls that required external execution. #### approvals ```python approvals: dict[str, bool | DeferredToolApprovalResult] = ( field(default_factory=dict) ) ``` Map of tool call IDs to results for tool calls that required human-in-the-loop approval. ### Tool Bases: `Generic[AgentDepsT]` A tool function for an agent. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ````python @dataclass(init=False) class Tool(Generic[AgentDepsT]): """A tool function for an agent.""" function: ToolFuncEither[AgentDepsT] takes_ctx: bool max_retries: int | None name: str description: str | None prepare: ToolPrepareFunc[AgentDepsT] | None docstring_format: DocstringFormat require_parameter_descriptions: bool strict: bool | None sequential: bool requires_approval: bool metadata: dict[str, Any] | None function_schema: _function_schema.FunctionSchema """ The base JSON schema for the tool's parameters. This schema may be modified by the `prepare` function or by the Model class prior to including it in an API request. """ def __init__( self, function: ToolFuncEither[AgentDepsT], *, takes_ctx: bool | None = None, max_retries: int | None = None, name: str | None = None, description: str | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, function_schema: _function_schema.FunctionSchema | None = None, ): """Create a new tool instance. Example usage: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' agent = Agent('test', tools=[Tool(my_tool)]) ``` or with a custom prepare method: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool from pydantic_ai.tools import ToolDefinition async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' async def prep_my_tool( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: # only register the tool if `deps == 42` if ctx.deps == 42: return tool_def agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)]) ``` Args: function: The Python function to call as the tool. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] first argument, this is inferred if unset. max_retries: Maximum number of retries allowed for this tool, set to the agent default if `None`. name: Name of the tool, inferred from the function if `None`. description: Description of the tool, inferred from the function if `None`. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. 
docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. function_schema: The function schema to use for the tool. If not provided, it will be generated. """ self.function = function self.function_schema = function_schema or _function_schema.function_schema( function, schema_generator, takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, ) self.takes_ctx = self.function_schema.takes_ctx self.max_retries = max_retries self.name = name or function.__name__ self.description = description or self.function_schema.description self.prepare = prepare self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.strict = strict self.sequential = sequential self.requires_approval = requires_approval self.metadata = metadata @classmethod def from_schema( cls, function: Callable[..., Any], name: str, description: str | None, json_schema: JsonSchemaValue, takes_ctx: bool = False, sequential: bool = False, ) -> Self: """Creates a Pydantic tool from a function and a JSON schema. Args: function: The function to call. This will be called with keywords only, and no validation of the arguments will be performed. name: The unique name of the tool that clearly communicates its purpose description: Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description. json_schema: The schema for the function arguments takes_ctx: An optional boolean parameter indicating whether the function accepts the context object as an argument. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. Returns: A Pydantic tool that calls the function """ function_schema = _function_schema.FunctionSchema( function=function, description=description, validator=SchemaValidator(schema=core_schema.any_schema()), json_schema=json_schema, takes_ctx=takes_ctx, is_async=_utils.is_async_callable(function), ) return cls( function, takes_ctx=takes_ctx, name=name, description=description, function_schema=function_schema, sequential=sequential, ) @property def tool_def(self): return ToolDefinition( name=self.name, description=self.description, parameters_json_schema=self.function_schema.json_schema, strict=self.strict, sequential=self.sequential, metadata=self.metadata, ) async def prepare_tool_def(self, ctx: RunContext[AgentDepsT]) -> ToolDefinition | None: """Get the tool definition. By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set. 
Returns: return a `ToolDefinition` or `None` if the tools should not be registered for this run. """ base_tool_def = self.tool_def if self.requires_approval and not ctx.tool_call_approved: base_tool_def = replace(base_tool_def, kind='unapproved') if self.prepare is not None: return await self.prepare(ctx, base_tool_def) else: return base_tool_def ```` #### __init__ ```python __init__( function: ToolFuncEither[AgentDepsT], *, takes_ctx: bool | None = None, max_retries: int | None = None, name: str | None = None, description: str | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, function_schema: FunctionSchema | None = None ) ``` Create a new tool instance. Example usage: ```python from pydantic_ai import Agent, RunContext, Tool async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' agent = Agent('test', tools=[Tool(my_tool)]) ``` or with a custom prepare method: ```python from pydantic_ai import Agent, RunContext, Tool from pydantic_ai.tools import ToolDefinition async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' async def prep_my_tool( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: # only register the tool if `deps == 42` if ctx.deps == 42: return tool_def agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)]) ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `function` | `ToolFuncEither[AgentDepsT]` | The Python function to call as the tool. | *required* | | `takes_ctx` | `bool | None` | Whether the function takes a RunContext first argument, this is inferred if unset. | `None` | | `max_retries` | `int | None` | Maximum number of retries allowed for this tool, set to the agent default if None. | `None` | | `name` | `str | None` | Name of the tool, inferred from the function if None. | `None` | | `description` | `str | None` | Description of the tool, inferred from the function if None. | `None` | | `prepare` | `ToolPrepareFunc[AgentDepsT] | None` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | `None` | | `docstring_format` | `DocstringFormat` | The format of the docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use. Defaults to GenerateToolJsonSchema. | `GenerateToolJsonSchema` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | `None` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. 
| `False` | | `metadata` | `dict[str, Any] | None` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. | `None` | | `function_schema` | `FunctionSchema | None` | The function schema to use for the tool. If not provided, it will be generated. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ````python def __init__( self, function: ToolFuncEither[AgentDepsT], *, takes_ctx: bool | None = None, max_retries: int | None = None, name: str | None = None, description: str | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, function_schema: _function_schema.FunctionSchema | None = None, ): """Create a new tool instance. Example usage: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' agent = Agent('test', tools=[Tool(my_tool)]) ``` or with a custom prepare method: ```python {noqa="I001"} from pydantic_ai import Agent, RunContext, Tool from pydantic_ai.tools import ToolDefinition async def my_tool(ctx: RunContext[int], x: int, y: int) -> str: return f'{ctx.deps} {x} {y}' async def prep_my_tool( ctx: RunContext[int], tool_def: ToolDefinition ) -> ToolDefinition | None: # only register the tool if `deps == 42` if ctx.deps == 42: return tool_def agent = Agent('test', tools=[Tool(my_tool, prepare=prep_my_tool)]) ``` Args: function: The Python function to call as the tool. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] first argument, this is inferred if unset. max_retries: Maximum number of retries allowed for this tool, set to the agent default if `None`. name: Name of the tool, inferred from the function if `None`. description: Description of the tool, inferred from the function if `None`. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. schema_generator: The JSON schema generator class to use. Defaults to `GenerateToolJsonSchema`. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. function_schema: The function schema to use for the tool. If not provided, it will be generated. 
""" self.function = function self.function_schema = function_schema or _function_schema.function_schema( function, schema_generator, takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, ) self.takes_ctx = self.function_schema.takes_ctx self.max_retries = max_retries self.name = name or function.__name__ self.description = description or self.function_schema.description self.prepare = prepare self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.strict = strict self.sequential = sequential self.requires_approval = requires_approval self.metadata = metadata ```` #### function_schema ```python function_schema: FunctionSchema = ( function_schema or function_schema( function, schema_generator, takes_ctx=takes_ctx, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, ) ) ``` The base JSON schema for the tool's parameters. This schema may be modified by the `prepare` function or by the Model class prior to including it in an API request. #### from_schema ```python from_schema( function: Callable[..., Any], name: str, description: str | None, json_schema: JsonSchemaValue, takes_ctx: bool = False, sequential: bool = False, ) -> Self ``` Creates a Pydantic tool from a function and a JSON schema. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `function` | `Callable[..., Any]` | The function to call. This will be called with keywords only, and no validation of the arguments will be performed. | *required* | | `name` | `str` | The unique name of the tool that clearly communicates its purpose | *required* | | `description` | `str | None` | Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description. | *required* | | `json_schema` | `JsonSchemaValue` | The schema for the function arguments | *required* | | `takes_ctx` | `bool` | An optional boolean parameter indicating whether the function accepts the context object as an argument. | `False` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. | `False` | Returns: | Type | Description | | --- | --- | | `Self` | A Pydantic tool that calls the function | Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @classmethod def from_schema( cls, function: Callable[..., Any], name: str, description: str | None, json_schema: JsonSchemaValue, takes_ctx: bool = False, sequential: bool = False, ) -> Self: """Creates a Pydantic tool from a function and a JSON schema. Args: function: The function to call. This will be called with keywords only, and no validation of the arguments will be performed. name: The unique name of the tool that clearly communicates its purpose description: Used to tell the model how/when/why to use the tool. You can provide few-shot examples as a part of the description. json_schema: The schema for the function arguments takes_ctx: An optional boolean parameter indicating whether the function accepts the context object as an argument. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. 
Returns: A Pydantic tool that calls the function """ function_schema = _function_schema.FunctionSchema( function=function, description=description, validator=SchemaValidator(schema=core_schema.any_schema()), json_schema=json_schema, takes_ctx=takes_ctx, is_async=_utils.is_async_callable(function), ) return cls( function, takes_ctx=takes_ctx, name=name, description=description, function_schema=function_schema, sequential=sequential, ) ``` #### prepare_tool_def ```python prepare_tool_def( ctx: RunContext[AgentDepsT], ) -> ToolDefinition | None ``` Get the tool definition. By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set. Returns: | Type | Description | | --- | --- | | `ToolDefinition | None` | return a ToolDefinition or None if the tools should not be registered for this run. | Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python async def prepare_tool_def(self, ctx: RunContext[AgentDepsT]) -> ToolDefinition | None: """Get the tool definition. By default, this method creates a tool definition, then either returns it, or calls `self.prepare` if it's set. Returns: return a `ToolDefinition` or `None` if the tools should not be registered for this run. """ base_tool_def = self.tool_def if self.requires_approval and not ctx.tool_call_approved: base_tool_def = replace(base_tool_def, kind='unapproved') if self.prepare is not None: return await self.prepare(ctx, base_tool_def) else: return base_tool_def ``` ### ObjectJsonSchema ```python ObjectJsonSchema: TypeAlias = dict[str, Any] ``` Type representing JSON schema of an object, e.g. where `"type": "object"`. This type is used to define tools parameters (aka arguments) in ToolDefinition. With PEP-728 this should be a TypedDict with `type: Literal['object']`, and `extra_parts=Any` ### ToolDefinition Definition of a tool passed to a model. This is used for both function tools and output tools. Source code in `pydantic_ai_slim/pydantic_ai/tools.py` ```python @dataclass(repr=False, kw_only=True) class ToolDefinition: """Definition of a tool passed to a model. This is used for both function tools and output tools. """ name: str """The name of the tool.""" parameters_json_schema: ObjectJsonSchema = field(default_factory=lambda: {'type': 'object', 'properties': {}}) """The JSON schema for the tool's parameters.""" description: str | None = None """The description of the tool.""" outer_typed_dict_key: str | None = None """The key in the outer [TypedDict] that wraps an output tool. This will only be set for output tools which don't have an `object` JSON schema. """ strict: bool | None = None """Whether to enforce (vendor-specific) strict JSON schema validation for tool calls. Setting this to `True` while using a supported model generally imposes some restrictions on the tool's JSON schema in exchange for guaranteeing the API responses strictly match that schema. When `False`, the model may be free to generate other properties or types (depending on the vendor). When `None` (the default), the value will be inferred based on the compatibility of the parameters_json_schema. Note: this is currently only supported by OpenAI models. 
""" sequential: bool = False """Whether this tool requires a sequential/serial execution environment.""" kind: ToolKind = field(default='function') """The kind of tool: - `'function'`: a tool that will be executed by Pydantic AI during an agent run and has its result returned to the model - `'output'`: a tool that passes through an output value that ends the run - `'external'`: a tool whose result will be produced outside of the Pydantic AI agent run in which it was called, because it depends on an upstream service (or user) or could take longer to generate than it's reasonable to keep the agent process running. See the [tools documentation](../deferred-tools.md#deferred-tools) for more info. - `'unapproved'`: a tool that requires human-in-the-loop approval. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. """ metadata: dict[str, Any] | None = None """Tool metadata that can be set by the toolset this tool came from. It is not sent to the model, but can be used for filtering and tool behavior customization. For MCP tools, this contains the `meta`, `annotations`, and `output_schema` fields from the tool definition. """ @property def defer(self) -> bool: """Whether calls to this tool will be deferred. See the [tools documentation](../deferred-tools.md#deferred-tools) for more info. """ return self.kind in ('external', 'unapproved') __repr__ = _utils.dataclasses_no_defaults_repr ``` #### name ```python name: str ``` The name of the tool. #### parameters_json_schema ```python parameters_json_schema: ObjectJsonSchema = field( default_factory=lambda: { "type": "object", "properties": {}, } ) ``` The JSON schema for the tool's parameters. #### description ```python description: str | None = None ``` The description of the tool. #### outer_typed_dict_key ```python outer_typed_dict_key: str | None = None ``` The key in the outer [TypedDict] that wraps an output tool. This will only be set for output tools which don't have an `object` JSON schema. #### strict ```python strict: bool | None = None ``` Whether to enforce (vendor-specific) strict JSON schema validation for tool calls. Setting this to `True` while using a supported model generally imposes some restrictions on the tool's JSON schema in exchange for guaranteeing the API responses strictly match that schema. When `False`, the model may be free to generate other properties or types (depending on the vendor). When `None` (the default), the value will be inferred based on the compatibility of the parameters_json_schema. Note: this is currently only supported by OpenAI models. #### sequential ```python sequential: bool = False ``` Whether this tool requires a sequential/serial execution environment. #### kind ```python kind: ToolKind = field(default='function') ``` The kind of tool: - `'function'`: a tool that will be executed by Pydantic AI during an agent run and has its result returned to the model - `'output'`: a tool that passes through an output value that ends the run - `'external'`: a tool whose result will be produced outside of the Pydantic AI agent run in which it was called, because it depends on an upstream service (or user) or could take longer to generate than it's reasonable to keep the agent process running. See the [tools documentation](../../deferred-tools/#deferred-tools) for more info. - `'unapproved'`: a tool that requires human-in-the-loop approval. See the [tools documentation](../../deferred-tools/#human-in-the-loop-tool-approval) for more info. 
#### metadata ```python metadata: dict[str, Any] | None = None ``` Tool metadata that can be set by the toolset this tool came from. It is not sent to the model, but can be used for filtering and tool behavior customization. For MCP tools, this contains the `meta`, `annotations`, and `output_schema` fields from the tool definition. #### defer ```python defer: bool ``` Whether calls to this tool will be deferred. See the [tools documentation](../../deferred-tools/#deferred-tools) for more info. # `pydantic_ai.toolsets` ### AbstractToolset Bases: `ABC`, `Generic[AgentDepsT]` A toolset is a collection of tools that can be used by an agent. It is responsible for: - Listing the tools it contains - Validating the arguments of the tools - Calling the tools See [toolset docs](../../toolsets/) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python class AbstractToolset(ABC, Generic[AgentDepsT]): """A toolset is a collection of tools that can be used by an agent. It is responsible for: - Listing the tools it contains - Validating the arguments of the tools - Calling the tools See [toolset docs](../toolsets.md) for more information. """ @property @abstractmethod def id(self) -> str | None: """An ID for the toolset that is unique among all toolsets registered with the same agent. If you're implementing a concrete implementation that users can instantiate more than once, you should let them optionally pass a custom ID to the constructor and return that here. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. """ raise NotImplementedError() @property def label(self) -> str: """The name of the toolset for use in error messages.""" label = self.__class__.__name__ if self.id: # pragma: no branch label += f' {self.id!r}' return label @property def tool_name_conflict_hint(self) -> str: """A hint for how to avoid name conflicts with other toolsets for use in error messages.""" return 'Rename the tool or wrap the toolset in a `PrefixedToolset` to avoid name conflicts.' async def __aenter__(self) -> Self: """Enter the toolset context. This is where you can set up network connections in a concrete implementation. """ return self async def __aexit__(self, *args: Any) -> bool | None: """Exit the toolset context. This is where you can tear down network connections in a concrete implementation. """ return None @abstractmethod async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: """The tools that are available in this toolset.""" raise NotImplementedError() @abstractmethod async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: """Call a tool with the given arguments. Args: name: The name of the tool to call. tool_args: The arguments to pass to the tool. ctx: The run context. tool: The tool definition returned by [`get_tools`][pydantic_ai.toolsets.AbstractToolset.get_tools] that was called. """ raise NotImplementedError() def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: """Run a visitor function on all "leaf" toolsets (i.e. 
those that implement their own tool listing and calling).""" visitor(self) def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: """Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function.""" return visitor(self) def filtered( self, filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool] ) -> FilteredToolset[AgentDepsT]: """Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition. See [toolset docs](../toolsets.md#filtering-tools) for more information. """ from .filtered import FilteredToolset return FilteredToolset(self, filter_func) def prefixed(self, prefix: str) -> PrefixedToolset[AgentDepsT]: """Returns a new toolset that prefixes the names of this toolset's tools. See [toolset docs](../toolsets.md#prefixing-tool-names) for more information. """ from .prefixed import PrefixedToolset return PrefixedToolset(self, prefix) def prepared(self, prepare_func: ToolsPrepareFunc[AgentDepsT]) -> PreparedToolset[AgentDepsT]: """Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../toolsets.md#preparing-tool-definitions) for more information. """ from .prepared import PreparedToolset return PreparedToolset(self, prepare_func) def renamed(self, name_map: dict[str, str]) -> RenamedToolset[AgentDepsT]: """Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names. See [toolset docs](../toolsets.md#renaming-tools) for more information. """ from .renamed import RenamedToolset return RenamedToolset(self, name_map) def approval_required( self, approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = ( lambda ctx, tool_def, tool_args: True ), ) -> ApprovalRequiredToolset[AgentDepsT]: """Returns a new toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../toolsets.md#requiring-tool-approval) for more information. """ from .approval_required import ApprovalRequiredToolset return ApprovalRequiredToolset(self, approval_required_func) ``` #### id ```python id: str | None ``` An ID for the toolset that is unique among all toolsets registered with the same agent. If you're implementing a concrete implementation that users can instantiate more than once, you should let them optionally pass a custom ID to the constructor and return that here. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. #### label ```python label: str ``` The name of the toolset for use in error messages. #### tool_name_conflict_hint ```python tool_name_conflict_hint: str ``` A hint for how to avoid name conflicts with other toolsets for use in error messages. #### __aenter__ ```python __aenter__() -> Self ``` Enter the toolset context. This is where you can set up network connections in a concrete implementation. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python async def __aenter__(self) -> Self: """Enter the toolset context. This is where you can set up network connections in a concrete implementation. 
""" return self ``` #### __aexit__ ```python __aexit__(*args: Any) -> bool | None ``` Exit the toolset context. This is where you can tear down network connections in a concrete implementation. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python async def __aexit__(self, *args: Any) -> bool | None: """Exit the toolset context. This is where you can tear down network connections in a concrete implementation. """ return None ``` #### get_tools ```python get_tools( ctx: RunContext[AgentDepsT], ) -> dict[str, ToolsetTool[AgentDepsT]] ``` The tools that are available in this toolset. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python @abstractmethod async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: """The tools that are available in this toolset.""" raise NotImplementedError() ``` #### call_tool ```python call_tool( name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT], ) -> Any ``` Call a tool with the given arguments. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str` | The name of the tool to call. | *required* | | `tool_args` | `dict[str, Any]` | The arguments to pass to the tool. | *required* | | `ctx` | `RunContext[AgentDepsT]` | The run context. | *required* | | `tool` | `ToolsetTool[AgentDepsT]` | The tool definition returned by get_tools that was called. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python @abstractmethod async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: """Call a tool with the given arguments. Args: name: The name of the tool to call. tool_args: The arguments to pass to the tool. ctx: The run context. tool: The tool definition returned by [`get_tools`][pydantic_ai.toolsets.AbstractToolset.get_tools] that was called. """ raise NotImplementedError() ``` #### apply ```python apply( visitor: Callable[[AbstractToolset[AgentDepsT]], None], ) -> None ``` Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling). Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: """Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling).""" visitor(self) ``` #### visit_and_replace ```python visit_and_replace( visitor: Callable[ [AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT], ], ) -> AbstractToolset[AgentDepsT] ``` Run a visitor function on all "leaf" toolsets (i.e. those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: """Run a visitor function on all "leaf" toolsets (i.e. 
those that implement their own tool listing and calling) and replace them in the hierarchy with the result of the function.""" return visitor(self) ``` #### filtered ```python filtered( filter_func: Callable[ [RunContext[AgentDepsT], ToolDefinition], bool ], ) -> FilteredToolset[AgentDepsT] ``` Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition. See [toolset docs](../../toolsets/#filtering-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def filtered( self, filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool] ) -> FilteredToolset[AgentDepsT]: """Returns a new toolset that filters this toolset's tools using a filter function that takes the agent context and the tool definition. See [toolset docs](../toolsets.md#filtering-tools) for more information. """ from .filtered import FilteredToolset return FilteredToolset(self, filter_func) ``` #### prefixed ```python prefixed(prefix: str) -> PrefixedToolset[AgentDepsT] ``` Returns a new toolset that prefixes the names of this toolset's tools. See [toolset docs](../../toolsets/#prefixing-tool-names) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def prefixed(self, prefix: str) -> PrefixedToolset[AgentDepsT]: """Returns a new toolset that prefixes the names of this toolset's tools. See [toolset docs](../toolsets.md#prefixing-tool-names) for more information. """ from .prefixed import PrefixedToolset return PrefixedToolset(self, prefix) ``` #### prepared ```python prepared( prepare_func: ToolsPrepareFunc[AgentDepsT], ) -> PreparedToolset[AgentDepsT] ``` Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../../toolsets/#preparing-tool-definitions) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def prepared(self, prepare_func: ToolsPrepareFunc[AgentDepsT]) -> PreparedToolset[AgentDepsT]: """Returns a new toolset that prepares this toolset's tools using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../toolsets.md#preparing-tool-definitions) for more information. """ from .prepared import PreparedToolset return PreparedToolset(self, prepare_func) ``` #### renamed ```python renamed( name_map: dict[str, str], ) -> RenamedToolset[AgentDepsT] ``` Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names. See [toolset docs](../../toolsets/#renaming-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def renamed(self, name_map: dict[str, str]) -> RenamedToolset[AgentDepsT]: """Returns a new toolset that renames this toolset's tools using a dictionary mapping new names to original names. See [toolset docs](../toolsets.md#renaming-tools) for more information. """ from .renamed import RenamedToolset return RenamedToolset(self, name_map) ``` #### approval_required ```python approval_required( approval_required_func: Callable[ [ RunContext[AgentDepsT], ToolDefinition, dict[str, Any], ], bool, ] = lambda ctx, tool_def, tool_args: True ) -> ApprovalRequiredToolset[AgentDepsT] ``` Returns a new toolset that requires (some) calls to tools it contains to be approved. 
See [toolset docs](../../toolsets/#requiring-tool-approval) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/abstract.py` ```python def approval_required( self, approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = ( lambda ctx, tool_def, tool_args: True ), ) -> ApprovalRequiredToolset[AgentDepsT]: """Returns a new toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../toolsets.md#requiring-tool-approval) for more information. """ from .approval_required import ApprovalRequiredToolset return ApprovalRequiredToolset(self, approval_required_func) ``` ### CombinedToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that combines multiple toolsets. See [toolset docs](../../toolsets/#combining-toolsets) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/combined.py` ```python @dataclass class CombinedToolset(AbstractToolset[AgentDepsT]): """A toolset that combines multiple toolsets. See [toolset docs](../toolsets.md#combining-toolsets) for more information. """ toolsets: Sequence[AbstractToolset[AgentDepsT]] _enter_lock: Lock = field(compare=False, init=False, default_factory=Lock) _entered_count: int = field(init=False, default=0) _exit_stack: AsyncExitStack | None = field(init=False, default=None) @property def id(self) -> str | None: return None # pragma: no cover @property def label(self) -> str: return f'{self.__class__.__name__}({", ".join(toolset.label for toolset in self.toolsets)})' # pragma: no cover async def __aenter__(self) -> Self: async with self._enter_lock: if self._entered_count == 0: async with AsyncExitStack() as exit_stack: for toolset in self.toolsets: await exit_stack.enter_async_context(toolset) self._exit_stack = exit_stack.pop_all() self._entered_count += 1 return self async def __aexit__(self, *args: Any) -> bool | None: async with self._enter_lock: self._entered_count -= 1 if self._entered_count == 0 and self._exit_stack is not None: await self._exit_stack.aclose() self._exit_stack = None async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: toolsets_tools = await asyncio.gather(*(toolset.get_tools(ctx) for toolset in self.toolsets)) all_tools: dict[str, ToolsetTool[AgentDepsT]] = {} for toolset, tools in zip(self.toolsets, toolsets_tools): for name, tool in tools.items(): tool_toolset = tool.toolset if existing_tool := all_tools.get(name): capitalized_toolset_label = tool_toolset.label[0].upper() + tool_toolset.label[1:] raise UserError( f'{capitalized_toolset_label} defines a tool whose name conflicts with existing tool from {existing_tool.toolset.label}: {name!r}. 
{toolset.tool_name_conflict_hint}' ) all_tools[name] = _CombinedToolsetTool( toolset=tool_toolset, tool_def=tool.tool_def, max_retries=tool.max_retries, args_validator=tool.args_validator, source_toolset=toolset, source_tool=tool, ) return all_tools async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: assert isinstance(tool, _CombinedToolsetTool) return await tool.source_toolset.call_tool(name, tool_args, ctx, tool.source_tool) def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: for toolset in self.toolsets: toolset.apply(visitor) def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: return replace(self, toolsets=[toolset.visit_and_replace(visitor) for toolset in self.toolsets]) ``` ### ExternalToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that holds tools whose results will be produced outside of the Pydantic AI agent run in which they were called. See [toolset docs](../../toolsets/#external-toolset) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/external.py` ```python class ExternalToolset(AbstractToolset[AgentDepsT]): """A toolset that holds tools whose results will be produced outside of the Pydantic AI agent run in which they were called. See [toolset docs](../toolsets.md#external-toolset) for more information. """ tool_defs: list[ToolDefinition] _id: str | None def __init__(self, tool_defs: list[ToolDefinition], *, id: str | None = None): self.tool_defs = tool_defs self._id = id @property def id(self) -> str | None: return self._id async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return { tool_def.name: ToolsetTool( toolset=self, tool_def=replace(tool_def, kind='external'), max_retries=0, args_validator=TOOL_SCHEMA_VALIDATOR, ) for tool_def in self.tool_defs } async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: raise NotImplementedError('External tools cannot be called directly') ``` ### ApprovalRequiredToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../../toolsets/#requiring-tool-approval) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/approval_required.py` ```python @dataclass class ApprovalRequiredToolset(WrapperToolset[AgentDepsT]): """A toolset that requires (some) calls to tools it contains to be approved. See [toolset docs](../toolsets.md#requiring-tool-approval) for more information. """ approval_required_func: Callable[[RunContext[AgentDepsT], ToolDefinition, dict[str, Any]], bool] = ( lambda ctx, tool_def, tool_args: True ) async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: if not ctx.tool_call_approved and self.approval_required_func(ctx, tool.tool_def, tool_args): raise ApprovalRequired return await super().call_tool(name, tool_args, ctx, tool) ``` ### FilteredToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that filters the tools it contains using a filter function that takes the agent context and the tool definition. See [toolset docs](../../toolsets/#filtering-tools) for more information. 
Source code in `pydantic_ai_slim/pydantic_ai/toolsets/filtered.py` ```python @dataclass class FilteredToolset(WrapperToolset[AgentDepsT]): """A toolset that filters the tools it contains using a filter function that takes the agent context and the tool definition. See [toolset docs](../toolsets.md#filtering-tools) for more information. """ filter_func: Callable[[RunContext[AgentDepsT], ToolDefinition], bool] async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return { name: tool for name, tool in (await super().get_tools(ctx)).items() if self.filter_func(ctx, tool.tool_def) } ``` ### FunctionToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that lets Python functions be used as tools. See [toolset docs](../../toolsets/#function-toolset) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ````python class FunctionToolset(AbstractToolset[AgentDepsT]): """A toolset that lets Python functions be used as tools. See [toolset docs](../toolsets.md#function-toolset) for more information. """ tools: dict[str, Tool[Any]] max_retries: int _id: str | None docstring_format: DocstringFormat require_parameter_descriptions: bool schema_generator: type[GenerateJsonSchema] def __init__( self, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = [], *, max_retries: int = 1, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, id: str | None = None, ): """Build a new function toolset. Args: tools: The tools to add to the toolset. max_retries: The maximum number of retries for each tool during a run. Applies to all tools, unless overridden when adding a tool. docstring_format: Format of tool docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. Applies to all tools, unless overridden when adding a tool. require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. Applies to all tools, unless overridden when adding a tool. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. Applies to all tools, unless overridden when adding a tool. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. Applies to all tools, unless overridden when adding a tool. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. Applies to all tools, unless overridden when adding a tool, which will be merged with the toolset's metadata. id: An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. 
""" self.max_retries = max_retries self._id = id self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.schema_generator = schema_generator self.strict = strict self.sequential = sequential self.requires_approval = requires_approval self.metadata = metadata self.tools = {} for tool in tools: if isinstance(tool, Tool): self.add_tool(tool) else: self.add_function(tool) @property def id(self) -> str | None: return self._id @overload def tool(self, func: ToolFuncEither[AgentDepsT, ToolParams], /) -> ToolFuncEither[AgentDepsT, ToolParams]: ... @overload def tool( self, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, ) -> Callable[[ToolFuncEither[AgentDepsT, ToolParams]], ToolFuncEither[AgentDepsT, ToolParams]]: ... def tool( self, func: ToolFuncEither[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool` is obscured. Example: ```python from pydantic_ai import Agent, FunctionToolset, RunContext toolset = FunctionToolset() @toolset.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @toolset.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y agent = Agent('test', toolsets=[toolset], deps_type=int) result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool,defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. 
schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. If `None`, the default value is determined by the toolset. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. """ def tool_decorator( func_: ToolFuncEither[AgentDepsT, ToolParams], ) -> ToolFuncEither[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self.add_function( func=func_, takes_ctx=None, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, ) return func_ return tool_decorator if func is None else tool_decorator(func) def add_function( self, func: ToolFuncEither[AgentDepsT, ToolParams], takes_ctx: bool | None = None, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, ) -> None: """Add a function as a tool to the toolset. Can take a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). Args: func: The tool function to register. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. If `None`, this is inferred from the function signature. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. 
strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. If `None`, the default value is determined by the toolset. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. """ if docstring_format is None: docstring_format = self.docstring_format if require_parameter_descriptions is None: require_parameter_descriptions = self.require_parameter_descriptions if schema_generator is None: schema_generator = self.schema_generator if strict is None: strict = self.strict if sequential is None: sequential = self.sequential if requires_approval is None: requires_approval = self.requires_approval tool = Tool[AgentDepsT]( func, takes_ctx=takes_ctx, name=name, description=description, max_retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, ) self.add_tool(tool) def add_tool(self, tool: Tool[AgentDepsT]) -> None: """Add a tool to the toolset. Args: tool: The tool to add. """ if tool.name in self.tools: raise UserError(f'Tool name conflicts with existing tool: {tool.name!r}') if tool.max_retries is None: tool.max_retries = self.max_retries if self.metadata is not None: tool.metadata = self.metadata | (tool.metadata or {}) self.tools[tool.name] = tool async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: tools: dict[str, ToolsetTool[AgentDepsT]] = {} for original_name, tool in self.tools.items(): max_retries = tool.max_retries if tool.max_retries is not None else self.max_retries run_context = replace( ctx, tool_name=original_name, retry=ctx.retries.get(original_name, 0), max_retries=max_retries, ) tool_def = await tool.prepare_tool_def(run_context) if not tool_def: continue new_name = tool_def.name if new_name in tools: if new_name != original_name: raise UserError(f'Renaming tool {original_name!r} to {new_name!r} conflicts with existing tool.') else: raise UserError(f'Tool name conflicts with previously renamed tool: {new_name!r}.') tools[new_name] = FunctionToolsetTool( toolset=self, tool_def=tool_def, max_retries=max_retries, args_validator=tool.function_schema.validator, call_func=tool.function_schema.call, is_async=tool.function_schema.is_async, ) return tools async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: assert isinstance(tool, FunctionToolsetTool) return await tool.call_func(tool_args, ctx) ```` #### __init__ ```python __init__( tools: Sequence[ Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...] 
] = [], *, max_retries: int = 1, docstring_format: DocstringFormat = "auto", require_parameter_descriptions: bool = False, schema_generator: type[ GenerateJsonSchema ] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, id: str | None = None ) ``` Build a new function toolset. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `tools` | `Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]]` | The tools to add to the toolset. | `[]` | | `max_retries` | `int` | The maximum number of retries for each tool during a run. Applies to all tools, unless overridden when adding a tool. | `1` | | `docstring_format` | `DocstringFormat` | Format of tool docstring, see DocstringFormat. Defaults to 'auto', such that the format is inferred from the structure of the docstring. Applies to all tools, unless overridden when adding a tool. | `'auto'` | | `require_parameter_descriptions` | `bool` | If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool. | `False` | | `schema_generator` | `type[GenerateJsonSchema]` | The JSON schema generator class to use for this tool. Defaults to GenerateToolJsonSchema. Applies to all tools, unless overridden when adding a tool. | `GenerateToolJsonSchema` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. | `None` | | `sequential` | `bool` | Whether the function requires a sequential/serial execution environment. Defaults to False. Applies to all tools, unless overridden when adding a tool. | `False` | | `requires_approval` | `bool` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. Applies to all tools, unless overridden when adding a tool. | `False` | | `metadata` | `dict[str, Any] | None` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. Applies to all tools, unless overridden when adding a tool, which will be merged with the toolset's metadata. | `None` | | `id` | `str | None` | An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ```python def __init__( self, tools: Sequence[Tool[AgentDepsT] | ToolFuncEither[AgentDepsT, ...]] = [], *, max_retries: int = 1, docstring_format: DocstringFormat = 'auto', require_parameter_descriptions: bool = False, schema_generator: type[GenerateJsonSchema] = GenerateToolJsonSchema, strict: bool | None = None, sequential: bool = False, requires_approval: bool = False, metadata: dict[str, Any] | None = None, id: str | None = None, ): """Build a new function toolset. Args: tools: The tools to add to the toolset. max_retries: The maximum number of retries for each tool during a run. Applies to all tools, unless overridden when adding a tool. docstring_format: Format of tool docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. Defaults to `'auto'`, such that the format is inferred from the structure of the docstring. Applies to all tools, unless overridden when adding a tool. 
require_parameter_descriptions: If True, raise an error if a parameter description is missing. Defaults to False. Applies to all tools, unless overridden when adding a tool. schema_generator: The JSON schema generator class to use for this tool. Defaults to `GenerateToolJsonSchema`. Applies to all tools, unless overridden when adding a tool. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. Applies to all tools, unless overridden when adding a tool. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. Applies to all tools, unless overridden when adding a tool. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. Applies to all tools, unless overridden when adding a tool, which will be merged with the toolset's metadata. id: An optional unique ID for the toolset. A toolset needs to have an ID in order to be used in a durable execution environment like Temporal, in which case the ID will be used to identify the toolset's activities within the workflow. """ self.max_retries = max_retries self._id = id self.docstring_format = docstring_format self.require_parameter_descriptions = require_parameter_descriptions self.schema_generator = schema_generator self.strict = strict self.sequential = sequential self.requires_approval = requires_approval self.metadata = metadata self.tools = {} for tool in tools: if isinstance(tool, Tool): self.add_tool(tool) else: self.add_function(tool) ``` #### tool ```python tool( func: ToolFuncEither[AgentDepsT, ToolParams], ) -> ToolFuncEither[AgentDepsT, ToolParams] ``` ```python tool( *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: ( type[GenerateJsonSchema] | None ) = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None ) -> Callable[ [ToolFuncEither[AgentDepsT, ToolParams]], ToolFuncEither[AgentDepsT, ToolParams], ] ``` ```python tool( func: ( ToolFuncEither[AgentDepsT, ToolParams] | None ) = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: ( type[GenerateJsonSchema] | None ) = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, ) -> Any ``` Decorator to register a tool function which takes RunContext as its first argument. Can decorate a sync or async functions. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../../tools/#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool` is obscured. 
Example: ```python from pydantic_ai import Agent, FunctionToolset, RunContext toolset = FunctionToolset() @toolset.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @toolset.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y agent = Agent('test', toolsets=[toolset], deps_type=int) result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `ToolFuncEither[AgentDepsT, ToolParams] | None` | The tool function to register. | `None` | | `name` | `str | None` | The name of the tool, defaults to the function name. | `None` | | `description` | `str | None` | The description of the tool,defaults to the function docstring. | `None` | | `retries` | `int | None` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. | `None` | | `prepare` | `ToolPrepareFunc[AgentDepsT] | None` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | `None` | | `docstring_format` | `DocstringFormat | None` | The format of the docstring, see DocstringFormat. If None, the default value is determined by the toolset. | `None` | | `require_parameter_descriptions` | `bool | None` | If True, raise an error if a parameter description is missing. If None, the default value is determined by the toolset. | `None` | | `schema_generator` | `type[GenerateJsonSchema] | None` | The JSON schema generator class to use for this tool. If None, the default value is determined by the toolset. | `None` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. If None, the default value is determined by the toolset. | `None` | | `sequential` | `bool | None` | Whether the function requires a sequential/serial execution environment. Defaults to False. If None, the default value is determined by the toolset. | `None` | | `requires_approval` | `bool | None` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. If None, the default value is determined by the toolset. | `None` | | `metadata` | `dict[str, Any] | None` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If None, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ````python def tool( self, func: ToolFuncEither[AgentDepsT, ToolParams] | None = None, /, *, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, ) -> Any: """Decorator to register a tool function which takes [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. Can decorate a sync or async functions. 
The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). We can't add overloads for every possible signature of tool, since the return type is a recursive union so the signature of functions decorated with `@toolset.tool` is obscured. Example: ```python from pydantic_ai import Agent, FunctionToolset, RunContext toolset = FunctionToolset() @toolset.tool def foobar(ctx: RunContext[int], x: int) -> int: return ctx.deps + x @toolset.tool(retries=2) async def spam(ctx: RunContext[str], y: float) -> float: return ctx.deps + y agent = Agent('test', toolsets=[toolset], deps_type=int) result = agent.run_sync('foobar', deps=1) print(result.output) #> {"foobar":1,"spam":1.0} ``` Args: func: The tool function to register. name: The name of the tool, defaults to the function name. description: The description of the tool,defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. If `None`, the default value is determined by the toolset. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. 
""" def tool_decorator( func_: ToolFuncEither[AgentDepsT, ToolParams], ) -> ToolFuncEither[AgentDepsT, ToolParams]: # noinspection PyTypeChecker self.add_function( func=func_, takes_ctx=None, name=name, description=description, retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, ) return func_ return tool_decorator if func is None else tool_decorator(func) ```` #### add_function ```python add_function( func: ToolFuncEither[AgentDepsT, ToolParams], takes_ctx: bool | None = None, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: ( type[GenerateJsonSchema] | None ) = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, ) -> None ``` Add a function as a tool to the toolset. Can take a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../../tools/#function-tools-and-schema). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `func` | `ToolFuncEither[AgentDepsT, ToolParams]` | The tool function to register. | *required* | | `takes_ctx` | `bool | None` | Whether the function takes a RunContext as its first argument. If None, this is inferred from the function signature. | `None` | | `name` | `str | None` | The name of the tool, defaults to the function name. | `None` | | `description` | `str | None` | The description of the tool, defaults to the function docstring. | `None` | | `retries` | `int | None` | The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. | `None` | | `prepare` | `ToolPrepareFunc[AgentDepsT] | None` | custom method to prepare the tool definition for each step, return None to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See ToolPrepareFunc. | `None` | | `docstring_format` | `DocstringFormat | None` | The format of the docstring, see DocstringFormat. If None, the default value is determined by the toolset. | `None` | | `require_parameter_descriptions` | `bool | None` | If True, raise an error if a parameter description is missing. If None, the default value is determined by the toolset. | `None` | | `schema_generator` | `type[GenerateJsonSchema] | None` | The JSON schema generator class to use for this tool. If None, the default value is determined by the toolset. | `None` | | `strict` | `bool | None` | Whether to enforce JSON schema compliance (only affects OpenAI). See ToolDefinition for more info. If None, the default value is determined by the toolset. | `None` | | `sequential` | `bool | None` | Whether the function requires a sequential/serial execution environment. Defaults to False. If None, the default value is determined by the toolset. | `None` | | `requires_approval` | `bool | None` | Whether this tool requires human-in-the-loop approval. Defaults to False. See the tools documentation for more info. If None, the default value is determined by the toolset. 
| `None` | | `metadata` | `dict[str, Any] | None` | Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If None, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ```python def add_function( self, func: ToolFuncEither[AgentDepsT, ToolParams], takes_ctx: bool | None = None, name: str | None = None, description: str | None = None, retries: int | None = None, prepare: ToolPrepareFunc[AgentDepsT] | None = None, docstring_format: DocstringFormat | None = None, require_parameter_descriptions: bool | None = None, schema_generator: type[GenerateJsonSchema] | None = None, strict: bool | None = None, sequential: bool | None = None, requires_approval: bool | None = None, metadata: dict[str, Any] | None = None, ) -> None: """Add a function as a tool to the toolset. Can take a sync or async function. The docstring is inspected to extract both the tool description and description of each parameter, [learn more](../tools.md#function-tools-and-schema). Args: func: The tool function to register. takes_ctx: Whether the function takes a [`RunContext`][pydantic_ai.tools.RunContext] as its first argument. If `None`, this is inferred from the function signature. name: The name of the tool, defaults to the function name. description: The description of the tool, defaults to the function docstring. retries: The number of retries to allow for this tool, defaults to the agent's default retries, which defaults to 1. prepare: custom method to prepare the tool definition for each step, return `None` to omit this tool from a given step. This is useful if you want to customise a tool at call time, or omit it completely from a step. See [`ToolPrepareFunc`][pydantic_ai.tools.ToolPrepareFunc]. docstring_format: The format of the docstring, see [`DocstringFormat`][pydantic_ai.tools.DocstringFormat]. If `None`, the default value is determined by the toolset. require_parameter_descriptions: If True, raise an error if a parameter description is missing. If `None`, the default value is determined by the toolset. schema_generator: The JSON schema generator class to use for this tool. If `None`, the default value is determined by the toolset. strict: Whether to enforce JSON schema compliance (only affects OpenAI). See [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] for more info. If `None`, the default value is determined by the toolset. sequential: Whether the function requires a sequential/serial execution environment. Defaults to False. If `None`, the default value is determined by the toolset. requires_approval: Whether this tool requires human-in-the-loop approval. Defaults to False. See the [tools documentation](../deferred-tools.md#human-in-the-loop-tool-approval) for more info. If `None`, the default value is determined by the toolset. metadata: Optional metadata for the tool. This is not sent to the model but can be used for filtering and tool behavior customization. If `None`, the default value is determined by the toolset. If provided, it will be merged with the toolset's metadata. 
""" if docstring_format is None: docstring_format = self.docstring_format if require_parameter_descriptions is None: require_parameter_descriptions = self.require_parameter_descriptions if schema_generator is None: schema_generator = self.schema_generator if strict is None: strict = self.strict if sequential is None: sequential = self.sequential if requires_approval is None: requires_approval = self.requires_approval tool = Tool[AgentDepsT]( func, takes_ctx=takes_ctx, name=name, description=description, max_retries=retries, prepare=prepare, docstring_format=docstring_format, require_parameter_descriptions=require_parameter_descriptions, schema_generator=schema_generator, strict=strict, sequential=sequential, requires_approval=requires_approval, metadata=metadata, ) self.add_tool(tool) ``` #### add_tool ```python add_tool(tool: Tool[AgentDepsT]) -> None ``` Add a tool to the toolset. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `tool` | `Tool[AgentDepsT]` | The tool to add. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/toolsets/function.py` ```python def add_tool(self, tool: Tool[AgentDepsT]) -> None: """Add a tool to the toolset. Args: tool: The tool to add. """ if tool.name in self.tools: raise UserError(f'Tool name conflicts with existing tool: {tool.name!r}') if tool.max_retries is None: tool.max_retries = self.max_retries if self.metadata is not None: tool.metadata = self.metadata | (tool.metadata or {}) self.tools[tool.name] = tool ``` ### PrefixedToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that prefixes the names of the tools it contains. See [toolset docs](../../toolsets/#prefixing-tool-names) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/prefixed.py` ```python @dataclass class PrefixedToolset(WrapperToolset[AgentDepsT]): """A toolset that prefixes the names of the tools it contains. See [toolset docs](../toolsets.md#prefixing-tool-names) for more information. """ prefix: str @property def tool_name_conflict_hint(self) -> str: return 'Change the `prefix` attribute to avoid name conflicts.' async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return { new_name: replace( tool, toolset=self, tool_def=replace(tool.tool_def, name=new_name), ) for name, tool in (await super().get_tools(ctx)).items() if (new_name := f'{self.prefix}_{name}') } async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: original_name = name.removeprefix(self.prefix + '_') ctx = replace(ctx, tool_name=original_name) tool = replace(tool, tool_def=replace(tool.tool_def, name=original_name)) return await super().call_tool(original_name, tool_args, ctx, tool) ``` ### RenamedToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that renames the tools it contains using a dictionary mapping new names to original names. See [toolset docs](../../toolsets/#renaming-tools) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/renamed.py` ```python @dataclass class RenamedToolset(WrapperToolset[AgentDepsT]): """A toolset that renames the tools it contains using a dictionary mapping new names to original names. See [toolset docs](../toolsets.md#renaming-tools) for more information. 
""" name_map: dict[str, str] async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: original_to_new_name_map = {v: k for k, v in self.name_map.items()} original_tools = await super().get_tools(ctx) tools: dict[str, ToolsetTool[AgentDepsT]] = {} for original_name, tool in original_tools.items(): new_name = original_to_new_name_map.get(original_name, None) if new_name: tools[new_name] = replace( tool, toolset=self, tool_def=replace(tool.tool_def, name=new_name), ) else: tools[original_name] = tool return tools async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: original_name = self.name_map.get(name, name) ctx = replace(ctx, tool_name=original_name) tool = replace(tool, tool_def=replace(tool.tool_def, name=original_name)) return await super().call_tool(original_name, tool_args, ctx, tool) ``` ### PreparedToolset Bases: `WrapperToolset[AgentDepsT]` A toolset that prepares the tools it contains using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../../toolsets/#preparing-tool-definitions) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/prepared.py` ```python @dataclass class PreparedToolset(WrapperToolset[AgentDepsT]): """A toolset that prepares the tools it contains using a prepare function that takes the agent context and the original tool definitions. See [toolset docs](../toolsets.md#preparing-tool-definitions) for more information. """ prepare_func: ToolsPrepareFunc[AgentDepsT] async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: original_tools = await super().get_tools(ctx) original_tool_defs = [tool.tool_def for tool in original_tools.values()] prepared_tool_defs_by_name = { tool_def.name: tool_def for tool_def in (await self.prepare_func(ctx, original_tool_defs) or []) } if len(prepared_tool_defs_by_name.keys() - original_tools.keys()) > 0: raise UserError( 'Prepare function cannot add or rename tools. Use `FunctionToolset.add_function()` or `RenamedToolset` instead.' ) return { name: replace(original_tools[name], tool_def=tool_def) for name, tool_def in prepared_tool_defs_by_name.items() } ``` ### WrapperToolset Bases: `AbstractToolset[AgentDepsT]` A toolset that wraps another toolset and delegates to it. See [toolset docs](../../toolsets/#wrapping-a-toolset) for more information. Source code in `pydantic_ai_slim/pydantic_ai/toolsets/wrapper.py` ```python @dataclass class WrapperToolset(AbstractToolset[AgentDepsT]): """A toolset that wraps another toolset and delegates to it. See [toolset docs](../toolsets.md#wrapping-a-toolset) for more information. 
""" wrapped: AbstractToolset[AgentDepsT] @property def id(self) -> str | None: return None # pragma: no cover @property def label(self) -> str: return f'{self.__class__.__name__}({self.wrapped.label})' async def __aenter__(self) -> Self: await self.wrapped.__aenter__() return self async def __aexit__(self, *args: Any) -> bool | None: return await self.wrapped.__aexit__(*args) async def get_tools(self, ctx: RunContext[AgentDepsT]) -> dict[str, ToolsetTool[AgentDepsT]]: return await self.wrapped.get_tools(ctx) async def call_tool( self, name: str, tool_args: dict[str, Any], ctx: RunContext[AgentDepsT], tool: ToolsetTool[AgentDepsT] ) -> Any: return await self.wrapped.call_tool(name, tool_args, ctx, tool) def apply(self, visitor: Callable[[AbstractToolset[AgentDepsT]], None]) -> None: self.wrapped.apply(visitor) def visit_and_replace( self, visitor: Callable[[AbstractToolset[AgentDepsT]], AbstractToolset[AgentDepsT]] ) -> AbstractToolset[AgentDepsT]: return replace(self, wrapped=self.wrapped.visit_and_replace(visitor)) ``` ### ToolsetFunc ```python ToolsetFunc: TypeAlias = Callable[ [RunContext[AgentDepsT]], AbstractToolset[AgentDepsT] | None | Awaitable[AbstractToolset[AgentDepsT] | None], ] ``` A sync/async function which takes a run context and returns a toolset. # `pydantic_ai.usage` ### RequestUsage Bases: `UsageBase` LLM usage associated with a single request. This is an implementation of `genai_prices.types.AbstractUsage` so it can be used to calculate the price of the request using [genai-prices](https://github.com/pydantic/genai-prices). Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) class RequestUsage(UsageBase): """LLM usage associated with a single request. This is an implementation of `genai_prices.types.AbstractUsage` so it can be used to calculate the price of the request using [genai-prices](https://github.com/pydantic/genai-prices). """ @property def requests(self): return 1 def incr(self, incr_usage: RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ return _incr_usage_tokens(self, incr_usage) def __add__(self, other: RequestUsage) -> RequestUsage: """Add two RequestUsages together. This is provided so it's trivial to sum usage information from multiple parts of a response. **WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations. """ new_usage = copy(self) new_usage.incr(other) return new_usage @classmethod def extract( cls, data: Any, *, provider: str, provider_url: str, provider_fallback: str, api_flavor: str | None = None, details: dict[str, Any] | None = None, ) -> RequestUsage: """Extract usage information from the response data using genai-prices. Args: data: The response data from the model API. provider: The actual provider ID provider_url: The provider base_url provider_fallback: The fallback provider ID to use if the actual provider is not found in genai-prices. For example, an OpenAI model should set this to "openai" in case it has an obscure provider ID. api_flavor: The API flavor to use when extracting usage information, e.g. 'chat' or 'responses' for OpenAI. details: Becomes the `details` field on the returned `RequestUsage` for convenience. 
""" details = details or {} for provider_id, provider_api_url in [(None, provider_url), (provider, None), (provider_fallback, None)]: try: provider_obj = get_snapshot().find_provider(None, provider_id, provider_api_url) _model_ref, extracted_usage = provider_obj.extract_usage(data, api_flavor=api_flavor) return cls(**{k: v for k, v in extracted_usage.__dict__.items() if v is not None}, details=details) except Exception: pass return cls(details=details) ``` #### incr ```python incr(incr_usage: RequestUsage) -> None ``` Increment the usage in place. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `incr_usage` | `RequestUsage` | The usage to increment by. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def incr(self, incr_usage: RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ return _incr_usage_tokens(self, incr_usage) ``` #### __add__ ```python __add__(other: RequestUsage) -> RequestUsage ``` Add two RequestUsages together. This is provided so it's trivial to sum usage information from multiple parts of a response. **WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def __add__(self, other: RequestUsage) -> RequestUsage: """Add two RequestUsages together. This is provided so it's trivial to sum usage information from multiple parts of a response. **WARNING:** this CANNOT be used to sum multiple requests without breaking some pricing calculations. """ new_usage = copy(self) new_usage.incr(other) return new_usage ``` #### extract ```python extract( data: Any, *, provider: str, provider_url: str, provider_fallback: str, api_flavor: str | None = None, details: dict[str, Any] | None = None ) -> RequestUsage ``` Extract usage information from the response data using genai-prices. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `data` | `Any` | The response data from the model API. | *required* | | `provider` | `str` | The actual provider ID | *required* | | `provider_url` | `str` | The provider base_url | *required* | | `provider_fallback` | `str` | The fallback provider ID to use if the actual provider is not found in genai-prices. For example, an OpenAI model should set this to "openai" in case it has an obscure provider ID. | *required* | | `api_flavor` | `str | None` | The API flavor to use when extracting usage information, e.g. 'chat' or 'responses' for OpenAI. | `None` | | `details` | `dict[str, Any] | None` | Becomes the details field on the returned RequestUsage for convenience. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @classmethod def extract( cls, data: Any, *, provider: str, provider_url: str, provider_fallback: str, api_flavor: str | None = None, details: dict[str, Any] | None = None, ) -> RequestUsage: """Extract usage information from the response data using genai-prices. Args: data: The response data from the model API. provider: The actual provider ID provider_url: The provider base_url provider_fallback: The fallback provider ID to use if the actual provider is not found in genai-prices. For example, an OpenAI model should set this to "openai" in case it has an obscure provider ID. api_flavor: The API flavor to use when extracting usage information, e.g. 'chat' or 'responses' for OpenAI. details: Becomes the `details` field on the returned `RequestUsage` for convenience. 
""" details = details or {} for provider_id, provider_api_url in [(None, provider_url), (provider, None), (provider_fallback, None)]: try: provider_obj = get_snapshot().find_provider(None, provider_id, provider_api_url) _model_ref, extracted_usage = provider_obj.extract_usage(data, api_flavor=api_flavor) return cls(**{k: v for k, v in extracted_usage.__dict__.items() if v is not None}, details=details) except Exception: pass return cls(details=details) ``` ### RunUsage Bases: `UsageBase` LLM usage associated with an agent run. Responsibility for calculating request usage is on the model; Pydantic AI simply sums the usage information across requests. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) class RunUsage(UsageBase): """LLM usage associated with an agent run. Responsibility for calculating request usage is on the model; Pydantic AI simply sums the usage information across requests. """ requests: int = 0 """Number of requests made to the LLM API.""" tool_calls: int = 0 """Number of successful tool calls executed during the run.""" input_tokens: int = 0 """Total number of input/prompt tokens.""" cache_write_tokens: int = 0 """Total number of tokens written to the cache.""" cache_read_tokens: int = 0 """Total number of tokens read from the cache.""" input_audio_tokens: int = 0 """Total number of audio input tokens.""" cache_audio_read_tokens: int = 0 """Total number of audio tokens read from the cache.""" output_tokens: int = 0 """Total number of output/completion tokens.""" details: dict[str, int] = dataclasses.field(default_factory=dict) """Any extra details returned by the model.""" def incr(self, incr_usage: RunUsage | RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ if isinstance(incr_usage, RunUsage): self.requests += incr_usage.requests self.tool_calls += incr_usage.tool_calls return _incr_usage_tokens(self, incr_usage) def __add__(self, other: RunUsage | RequestUsage) -> RunUsage: """Add two RunUsages together. This is provided so it's trivial to sum usage information from multiple runs. """ new_usage = copy(self) new_usage.incr(other) return new_usage ``` #### requests ```python requests: int = 0 ``` Number of requests made to the LLM API. #### tool_calls ```python tool_calls: int = 0 ``` Number of successful tool calls executed during the run. #### input_tokens ```python input_tokens: int = 0 ``` Total number of input/prompt tokens. #### cache_write_tokens ```python cache_write_tokens: int = 0 ``` Total number of tokens written to the cache. #### cache_read_tokens ```python cache_read_tokens: int = 0 ``` Total number of tokens read from the cache. #### input_audio_tokens ```python input_audio_tokens: int = 0 ``` Total number of audio input tokens. #### cache_audio_read_tokens ```python cache_audio_read_tokens: int = 0 ``` Total number of audio tokens read from the cache. #### output_tokens ```python output_tokens: int = 0 ``` Total number of output/completion tokens. #### details ```python details: dict[str, int] = field(default_factory=dict) ``` Any extra details returned by the model. #### incr ```python incr(incr_usage: RunUsage | RequestUsage) -> None ``` Increment the usage in place. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `incr_usage` | `RunUsage | RequestUsage` | The usage to increment by. 
| *required* | Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def incr(self, incr_usage: RunUsage | RequestUsage) -> None: """Increment the usage in place. Args: incr_usage: The usage to increment by. """ if isinstance(incr_usage, RunUsage): self.requests += incr_usage.requests self.tool_calls += incr_usage.tool_calls return _incr_usage_tokens(self, incr_usage) ``` #### __add__ ```python __add__(other: RunUsage | RequestUsage) -> RunUsage ``` Add two RunUsages together. This is provided so it's trivial to sum usage information from multiple runs. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def __add__(self, other: RunUsage | RequestUsage) -> RunUsage: """Add two RunUsages together. This is provided so it's trivial to sum usage information from multiple runs. """ new_usage = copy(self) new_usage.incr(other) return new_usage ``` ### Usage Bases: `RunUsage` Deprecated `Usage` is deprecated, use `RunUsage` instead Deprecated alias for `RunUsage`. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) @deprecated('`Usage` is deprecated, use `RunUsage` instead') class Usage(RunUsage): """Deprecated alias for `RunUsage`.""" ``` ### UsageLimits Limits on model usage. The request count is tracked by pydantic_ai, and the request limit is checked before each request to the model. Token counts are provided in responses from the model, and the token limits are checked after each response. Each of the limits can be set to `None` to disable that limit. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python @dataclass(repr=False, kw_only=True) class UsageLimits: """Limits on model usage. The request count is tracked by pydantic_ai, and the request limit is checked before each request to the model. Token counts are provided in responses from the model, and the token limits are checked after each response. Each of the limits can be set to `None` to disable that limit. """ request_limit: int | None = 50 """The maximum number of requests allowed to the model.""" tool_calls_limit: int | None = None """The maximum number of successful tool calls allowed to be executed.""" input_tokens_limit: int | None = None """The maximum number of input/prompt tokens allowed.""" output_tokens_limit: int | None = None """The maximum number of output/response tokens allowed.""" total_tokens_limit: int | None = None """The maximum number of tokens allowed in requests and responses combined.""" count_tokens_before_request: bool = False """If True, perform a token counting pass before sending the request to the model, to enforce `request_tokens_limit` ahead of time. 
This may incur additional overhead (from calling the model's `count_tokens` API before making the actual request) and is disabled by default.""" @property @deprecated('`request_tokens_limit` is deprecated, use `input_tokens_limit` instead') def request_tokens_limit(self) -> int | None: return self.input_tokens_limit @property @deprecated('`response_tokens_limit` is deprecated, use `output_tokens_limit` instead') def response_tokens_limit(self) -> int | None: return self.output_tokens_limit @overload def __init__( self, *, request_limit: int | None = 50, tool_calls_limit: int | None = None, input_tokens_limit: int | None = None, output_tokens_limit: int | None = None, total_tokens_limit: int | None = None, count_tokens_before_request: bool = False, ) -> None: self.request_limit = request_limit self.tool_calls_limit = tool_calls_limit self.input_tokens_limit = input_tokens_limit self.output_tokens_limit = output_tokens_limit self.total_tokens_limit = total_tokens_limit self.count_tokens_before_request = count_tokens_before_request @overload @deprecated( 'Use `input_tokens_limit` instead of `request_tokens_limit` and `output_tokens_limit` and `total_tokens_limit`' ) def __init__( self, *, request_limit: int | None = 50, tool_calls_limit: int | None = None, request_tokens_limit: int | None = None, response_tokens_limit: int | None = None, total_tokens_limit: int | None = None, count_tokens_before_request: bool = False, ) -> None: self.request_limit = request_limit self.tool_calls_limit = tool_calls_limit self.input_tokens_limit = request_tokens_limit self.output_tokens_limit = response_tokens_limit self.total_tokens_limit = total_tokens_limit self.count_tokens_before_request = count_tokens_before_request def __init__( self, *, request_limit: int | None = 50, tool_calls_limit: int | None = None, input_tokens_limit: int | None = None, output_tokens_limit: int | None = None, total_tokens_limit: int | None = None, count_tokens_before_request: bool = False, # deprecated: request_tokens_limit: int | None = None, response_tokens_limit: int | None = None, ): self.request_limit = request_limit self.tool_calls_limit = tool_calls_limit self.input_tokens_limit = input_tokens_limit or request_tokens_limit self.output_tokens_limit = output_tokens_limit or response_tokens_limit self.total_tokens_limit = total_tokens_limit self.count_tokens_before_request = count_tokens_before_request def has_token_limits(self) -> bool: """Returns `True` if this instance places any limits on token counts. If this returns `False`, the `check_tokens` method will never raise an error. This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator. 
""" return any( limit is not None for limit in (self.input_tokens_limit, self.output_tokens_limit, self.total_tokens_limit) ) def check_before_request(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits.""" request_limit = self.request_limit if request_limit is not None and usage.requests >= request_limit: raise UsageLimitExceeded(f'The next request would exceed the request_limit of {request_limit}') input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded( f'The next request would exceed the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded( # pragma: lax no cover f'The next request would exceed the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})' ) def check_tokens(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits.""" input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded(f'Exceeded the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})') output_tokens = usage.output_tokens if self.output_tokens_limit is not None and output_tokens > self.output_tokens_limit: raise UsageLimitExceeded( f'Exceeded the output_tokens_limit of {self.output_tokens_limit} ({output_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded(f'Exceeded the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})') def check_before_tool_call(self, projected_usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next tool call(s) would exceed the tool call limit.""" tool_calls_limit = self.tool_calls_limit tool_calls = projected_usage.tool_calls if tool_calls_limit is not None and tool_calls > tool_calls_limit: raise UsageLimitExceeded( f'The next tool call(s) would exceed the tool_calls_limit of {tool_calls_limit} ({tool_calls=}).' ) __repr__ = _utils.dataclasses_no_defaults_repr ``` #### request_limit ```python request_limit: int | None = request_limit ``` The maximum number of requests allowed to the model. #### tool_calls_limit ```python tool_calls_limit: int | None = tool_calls_limit ``` The maximum number of successful tool calls allowed to be executed. #### input_tokens_limit ```python input_tokens_limit: int | None = ( input_tokens_limit or request_tokens_limit ) ``` The maximum number of input/prompt tokens allowed. #### output_tokens_limit ```python output_tokens_limit: int | None = ( output_tokens_limit or response_tokens_limit ) ``` The maximum number of output/response tokens allowed. #### total_tokens_limit ```python total_tokens_limit: int | None = total_tokens_limit ``` The maximum number of tokens allowed in requests and responses combined. #### count_tokens_before_request ```python count_tokens_before_request: bool = ( count_tokens_before_request ) ``` If True, perform a token counting pass before sending the request to the model, to enforce `request_tokens_limit` ahead of time. This may incur additional overhead (from calling the model's `count_tokens` API before making the actual request) and is disabled by default. 
#### has_token_limits ```python has_token_limits() -> bool ``` Returns `True` if this instance places any limits on token counts. If this returns `False`, the `check_tokens` method will never raise an error. This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def has_token_limits(self) -> bool: """Returns `True` if this instance places any limits on token counts. If this returns `False`, the `check_tokens` method will never raise an error. This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator. """ return any( limit is not None for limit in (self.input_tokens_limit, self.output_tokens_limit, self.total_tokens_limit) ) ``` #### check_before_request ```python check_before_request(usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def check_before_request(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next request would exceed any of the limits.""" request_limit = self.request_limit if request_limit is not None and usage.requests >= request_limit: raise UsageLimitExceeded(f'The next request would exceed the request_limit of {request_limit}') input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded( f'The next request would exceed the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded( # pragma: lax no cover f'The next request would exceed the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})' ) ``` #### check_tokens ```python check_tokens(usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits. Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def check_tokens(self, usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the usage exceeds any of the token limits.""" input_tokens = usage.input_tokens if self.input_tokens_limit is not None and input_tokens > self.input_tokens_limit: raise UsageLimitExceeded(f'Exceeded the input_tokens_limit of {self.input_tokens_limit} ({input_tokens=})') output_tokens = usage.output_tokens if self.output_tokens_limit is not None and output_tokens > self.output_tokens_limit: raise UsageLimitExceeded( f'Exceeded the output_tokens_limit of {self.output_tokens_limit} ({output_tokens=})' ) total_tokens = usage.total_tokens if self.total_tokens_limit is not None and total_tokens > self.total_tokens_limit: raise UsageLimitExceeded(f'Exceeded the total_tokens_limit of {self.total_tokens_limit} ({total_tokens=})') ``` #### check_before_tool_call ```python check_before_tool_call(projected_usage: RunUsage) -> None ``` Raises a `UsageLimitExceeded` exception if the next tool call(s) would exceed the tool call limit. 
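When one of these checks fails, a `UsageLimitExceeded` exception is raised. A short sketch of catching it for a tool-call cap, assuming an agent with at least one tool registered (registration elided):

```python
from pydantic_ai import Agent
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits

agent = Agent('anthropic:claude-sonnet-4-0')  # tools would be registered on this agent

try:
    # Allow at most one successful tool call during the run.
    agent.run_sync(
        'Look up the current exchange rate.',
        usage_limits=UsageLimits(tool_calls_limit=1),
    )
except UsageLimitExceeded as e:
    print(e)
    # e.g. 'The next tool call(s) would exceed the tool_calls_limit of 1 (tool_calls=1).'
```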
Source code in `pydantic_ai_slim/pydantic_ai/usage.py` ```python def check_before_tool_call(self, projected_usage: RunUsage) -> None: """Raises a `UsageLimitExceeded` exception if the next tool call(s) would exceed the tool call limit.""" tool_calls_limit = self.tool_calls_limit tool_calls = projected_usage.tool_calls if tool_calls_limit is not None and tool_calls > tool_calls_limit: raise UsageLimitExceeded( f'The next tool call(s) would exceed the tool_calls_limit of {tool_calls_limit} ({tool_calls=}).' ) ``` # `pydantic_ai.models.anthropic` ## Setup For details on how to set up authentication with this model, see [model configuration for Anthropic](../../../models/anthropic/). ### LatestAnthropicModelNames ```python LatestAnthropicModelNames = ModelParam ``` Latest Anthropic models. ### AnthropicModelName ```python AnthropicModelName = str | LatestAnthropicModelNames ``` Possible Anthropic model names. Since Anthropic supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Anthropic docs](https://docs.anthropic.com/en/docs/about-claude/models) for a full list. ### AnthropicModelSettings Bases: `ModelSettings` Settings used for an Anthropic model request. Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python class AnthropicModelSettings(ModelSettings, total=False): """Settings used for an Anthropic model request.""" # ALL FIELDS MUST BE `anthropic_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. anthropic_metadata: BetaMetadataParam """An object describing metadata about the request. Contains `user_id`, an external identifier for the user who is associated with the request. """ anthropic_thinking: BetaThinkingConfigParam """Determine whether the model should generate a thinking block. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking) for more information. """ ``` #### anthropic_metadata ```python anthropic_metadata: BetaMetadataParam ``` An object describing metadata about the request. Contains `user_id`, an external identifier for the user who is associated with the request. #### anthropic_thinking ```python anthropic_thinking: BetaThinkingConfigParam ``` Determine whether the model should generate a thinking block. See [the Anthropic docs](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking) for more information. ### AnthropicModel Bases: `Model` A model that uses the Anthropic API. Internally, this uses the [Anthropic Python client](https://github.com/anthropics/anthropic-sdk-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python @dataclass(init=False) class AnthropicModel(Model): """A model that uses the Anthropic API. Internally, this uses the [Anthropic Python client](https://github.com/anthropics/anthropic-sdk-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncAnthropicClient = field(repr=False) _model_name: AnthropicModelName = field(repr=False) _provider: Provider[AsyncAnthropicClient] = field(repr=False) def __init__( self, model_name: AnthropicModelName, *, provider: Literal['anthropic'] | Provider[AsyncAnthropicClient] = 'anthropic', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Anthropic model. 
Args: model_name: The name of the Anthropic model to use. List of model names available [here](https://docs.anthropic.com/en/docs/about-claude/models). provider: The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of `Provider[AsyncAnthropicClient]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return str(self.client.base_url) @property def model_name(self) -> AnthropicModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._messages_create( messages, False, cast(AnthropicModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._messages_create( messages, True, cast(AnthropicModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[BetaRawMessageStreamEvent]: pass @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> BetaMessage: pass async def _messages_create( self, messages: list[ModelMessage], stream: bool, model_settings: AnthropicModelSettings, model_request_parameters: ModelRequestParameters, ) -> BetaMessage | AsyncStream[BetaRawMessageStreamEvent]: # standalone function to make it easier to override tools = self._get_tools(model_request_parameters) tools, beta_features = self._add_builtin_tools(tools, model_request_parameters) tool_choice: BetaToolChoiceParam | None if not tools: tool_choice = None else: if not model_request_parameters.allow_text_output: tool_choice = {'type': 'any'} if (thinking := model_settings.get('anthropic_thinking')) and thinking.get('type') == 'enabled': raise UserError( 'Anthropic does not support thinking and output tools at the same time. Use `output_type=PromptedOutput(...)` instead.' 
) else: tool_choice = {'type': 'auto'} if (allow_parallel_tool_calls := model_settings.get('parallel_tool_calls')) is not None: tool_choice['disable_parallel_tool_use'] = not allow_parallel_tool_calls system_prompt, anthropic_messages = await self._map_message(messages) try: extra_headers = model_settings.get('extra_headers', {}) extra_headers.setdefault('User-Agent', get_user_agent()) if beta_features: if 'anthropic-beta' in extra_headers: beta_features.insert(0, extra_headers['anthropic-beta']) extra_headers['anthropic-beta'] = ','.join(beta_features) return await self.client.beta.messages.create( max_tokens=model_settings.get('max_tokens', 4096), system=system_prompt or OMIT, messages=anthropic_messages, model=self._model_name, tools=tools or OMIT, tool_choice=tool_choice or OMIT, stream=stream, thinking=model_settings.get('anthropic_thinking', OMIT), stop_sequences=model_settings.get('stop_sequences', OMIT), temperature=model_settings.get('temperature', OMIT), top_p=model_settings.get('top_p', OMIT), timeout=model_settings.get('timeout', NOT_GIVEN), metadata=model_settings.get('anthropic_metadata', OMIT), extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover def _process_response(self, response: BetaMessage) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" items: list[ModelResponsePart] = [] for item in response.content: if isinstance(item, BetaTextBlock): items.append(TextPart(content=item.text)) elif isinstance(item, BetaServerToolUseBlock): items.append(_map_server_tool_use_block(item, self.system)) elif isinstance(item, BetaWebSearchToolResultBlock): items.append(_map_web_search_tool_result_block(item, self.system)) elif isinstance(item, BetaCodeExecutionToolResultBlock): items.append(_map_code_execution_tool_result_block(item, self.system)) elif isinstance(item, BetaRedactedThinkingBlock): items.append( ThinkingPart(id='redacted_thinking', content='', signature=item.data, provider_name=self.system) ) elif isinstance(item, BetaThinkingBlock): items.append(ThinkingPart(content=item.thinking, signature=item.signature, provider_name=self.system)) else: assert isinstance(item, BetaToolUseBlock), f'unexpected item type {type(item)}' items.append( ToolCallPart( tool_name=item.name, args=cast(dict[str, Any], item.input), tool_call_id=item.id, ) ) finish_reason: FinishReason | None = None provider_details: dict[str, Any] | None = None if raw_finish_reason := response.stop_reason: # pragma: no branch provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=_map_usage(response, self._provider.name, self._provider.base_url, self._model_name), model_name=response.model, provider_response_id=response.id, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncStream[BetaRawMessageStreamEvent], model_request_parameters: ModelRequestParameters ) -> StreamedResponse: peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior('Streamed response ended without content or tool calls') # pragma: no cover assert isinstance(first_chunk, 
BetaRawMessageStartEvent) return AnthropicStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.message.model, _response=peekable_response, _timestamp=_utils.now_utc(), _provider_name=self._provider.name, _provider_url=self._provider.base_url, ) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[BetaToolUnionParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _add_builtin_tools( self, tools: list[BetaToolUnionParam], model_request_parameters: ModelRequestParameters ) -> tuple[list[BetaToolUnionParam], list[str]]: beta_features: list[str] = [] for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): user_location = UserLocation(type='approximate', **tool.user_location) if tool.user_location else None tools.append( BetaWebSearchTool20250305Param( name='web_search', type='web_search_20250305', max_uses=tool.max_uses, allowed_domains=tool.allowed_domains, blocked_domains=tool.blocked_domains, user_location=user_location, ) ) elif isinstance(tool, CodeExecutionTool): # pragma: no branch tools.append(BetaCodeExecutionTool20250522Param(name='code_execution', type='code_execution_20250522')) beta_features.append('code-execution-2025-05-22') elif isinstance(tool, MemoryTool): # pragma: no branch if 'memory' not in model_request_parameters.tool_defs: raise UserError("Built-in `MemoryTool` requires a 'memory' tool to be defined.") # Replace the memory tool definition with the built-in memory tool tools = [tool for tool in tools if tool['name'] != 'memory'] tools.append(BetaMemoryTool20250818Param(name='memory', type='memory_20250818')) beta_features.append('context-management-2025-06-27') else: # pragma: no cover raise UserError( f'`{tool.__class__.__name__}` is not supported by `AnthropicModel`. If it should be, please file an issue.' 
) return tools, beta_features async def _map_message(self, messages: list[ModelMessage]) -> tuple[str, list[BetaMessageParam]]: # noqa: C901 """Just maps a `pydantic_ai.Message` to a `anthropic.types.MessageParam`.""" system_prompt_parts: list[str] = [] anthropic_messages: list[BetaMessageParam] = [] for m in messages: if isinstance(m, ModelRequest): user_content_params: list[BetaContentBlockParam] = [] for request_part in m.parts: if isinstance(request_part, SystemPromptPart): system_prompt_parts.append(request_part.content) elif isinstance(request_part, UserPromptPart): async for content in self._map_user_prompt(request_part): user_content_params.append(content) elif isinstance(request_part, ToolReturnPart): tool_result_block_param = BetaToolResultBlockParam( tool_use_id=_guard_tool_call_id(t=request_part), type='tool_result', content=request_part.model_response_str(), is_error=False, ) user_content_params.append(tool_result_block_param) elif isinstance(request_part, RetryPromptPart): # pragma: no branch if request_part.tool_name is None: text = request_part.model_response() # pragma: no cover retry_param = BetaTextBlockParam(type='text', text=text) # pragma: no cover else: retry_param = BetaToolResultBlockParam( tool_use_id=_guard_tool_call_id(t=request_part), type='tool_result', content=request_part.model_response(), is_error=True, ) user_content_params.append(retry_param) if len(user_content_params) > 0: anthropic_messages.append(BetaMessageParam(role='user', content=user_content_params)) elif isinstance(m, ModelResponse): assistant_content_params: list[ BetaTextBlockParam | BetaToolUseBlockParam | BetaServerToolUseBlockParam | BetaWebSearchToolResultBlockParam | BetaCodeExecutionToolResultBlockParam | BetaThinkingBlockParam | BetaRedactedThinkingBlockParam ] = [] for response_part in m.parts: if isinstance(response_part, TextPart): if response_part.content: assistant_content_params.append(BetaTextBlockParam(text=response_part.content, type='text')) elif isinstance(response_part, ToolCallPart): tool_use_block_param = BetaToolUseBlockParam( id=_guard_tool_call_id(t=response_part), type='tool_use', name=response_part.tool_name, input=response_part.args_as_dict(), ) assistant_content_params.append(tool_use_block_param) elif isinstance(response_part, ThinkingPart): if ( response_part.provider_name == self.system and response_part.signature is not None ): # pragma: no branch if response_part.id == 'redacted_thinking': assistant_content_params.append( BetaRedactedThinkingBlockParam( data=response_part.signature, type='redacted_thinking', ) ) else: assistant_content_params.append( BetaThinkingBlockParam( thinking=response_part.content, signature=response_part.signature, type='thinking', ) ) elif response_part.content: # pragma: no branch start_tag, end_tag = self.profile.thinking_tags assistant_content_params.append( BetaTextBlockParam( text='\n'.join([start_tag, response_part.content, end_tag]), type='text' ) ) elif isinstance(response_part, BuiltinToolCallPart): if response_part.provider_name == self.system: tool_use_id = _guard_tool_call_id(t=response_part) if response_part.tool_name == WebSearchTool.kind: server_tool_use_block_param = BetaServerToolUseBlockParam( id=tool_use_id, type='server_tool_use', name='web_search', input=response_part.args_as_dict(), ) assistant_content_params.append(server_tool_use_block_param) elif response_part.tool_name == CodeExecutionTool.kind: # pragma: no branch server_tool_use_block_param = BetaServerToolUseBlockParam( id=tool_use_id, 
type='server_tool_use', name='code_execution', input=response_part.args_as_dict(), ) assistant_content_params.append(server_tool_use_block_param) elif isinstance(response_part, BuiltinToolReturnPart): if response_part.provider_name == self.system: tool_use_id = _guard_tool_call_id(t=response_part) if response_part.tool_name in ( WebSearchTool.kind, 'web_search_tool_result', # Backward compatibility ) and isinstance(response_part.content, dict | list): assistant_content_params.append( BetaWebSearchToolResultBlockParam( tool_use_id=tool_use_id, type='web_search_tool_result', content=cast( BetaWebSearchToolResultBlockParamContentParam, response_part.content, # pyright: ignore[reportUnknownMemberType] ), ) ) elif response_part.tool_name in ( # pragma: no branch CodeExecutionTool.kind, 'code_execution_tool_result', # Backward compatibility ) and isinstance(response_part.content, dict): assistant_content_params.append( BetaCodeExecutionToolResultBlockParam( tool_use_id=tool_use_id, type='code_execution_tool_result', content=cast( BetaCodeExecutionToolResultBlockParamContentParam, response_part.content, # pyright: ignore[reportUnknownMemberType] ), ) ) elif isinstance(response_part, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. pass else: assert_never(response_part) if len(assistant_content_params) > 0: anthropic_messages.append(BetaMessageParam(role='assistant', content=assistant_content_params)) else: assert_never(m) if instructions := self._get_instructions(messages): system_prompt_parts.insert(0, instructions) system_prompt = '\n\n'.join(system_prompt_parts) return system_prompt, anthropic_messages @staticmethod async def _map_user_prompt( part: UserPromptPart, ) -> AsyncGenerator[BetaContentBlockParam]: if isinstance(part.content, str): if part.content: # Only yield non-empty text yield BetaTextBlockParam(text=part.content, type='text') else: for item in part.content: if isinstance(item, str): if item: # Only yield non-empty text yield BetaTextBlockParam(text=item, type='text') elif isinstance(item, BinaryContent): if item.is_image: yield BetaImageBlockParam( source={'data': io.BytesIO(item.data), 'media_type': item.media_type, 'type': 'base64'}, # type: ignore type='image', ) elif item.media_type == 'application/pdf': yield BetaBase64PDFBlockParam( source=BetaBase64PDFSourceParam( data=io.BytesIO(item.data), media_type='application/pdf', type='base64', ), type='document', ) else: raise RuntimeError('Only images and PDFs are supported for binary content') elif isinstance(item, ImageUrl): yield BetaImageBlockParam(source={'type': 'url', 'url': item.url}, type='image') elif isinstance(item, DocumentUrl): if item.media_type == 'application/pdf': yield BetaBase64PDFBlockParam(source={'url': item.url, 'type': 'url'}, type='document') elif item.media_type == 'text/plain': downloaded_item = await download_item(item, data_format='text') yield BetaBase64PDFBlockParam( source=BetaPlainTextSourceParam( data=downloaded_item['data'], media_type=item.media_type, type='text' ), type='document', ) else: # pragma: no cover raise RuntimeError(f'Unsupported media type: {item.media_type}') else: raise RuntimeError(f'Unsupported content type: {type(item)}') # pragma: no cover @staticmethod def _map_tool_definition(f: ToolDefinition) -> BetaToolParam: return { 'name': f.name, 'description': f.description or '', 'input_schema': f.parameters_json_schema, } ``` #### __init__ ```python __init__( model_name: AnthropicModelName, *, provider: 
( Literal["anthropic"] | Provider[AsyncAnthropicClient] ) = "anthropic", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize an Anthropic model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `AnthropicModelName` | The name of the Anthropic model to use. List of model names available here. | *required* | | `provider` | `Literal['anthropic'] | Provider[AsyncAnthropicClient]` | The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of Provider[AsyncAnthropicClient]. If not provided, the other parameters will be used. | `'anthropic'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Default model settings for this model instance. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python def __init__( self, model_name: AnthropicModelName, *, provider: Literal['anthropic'] | Provider[AsyncAnthropicClient] = 'anthropic', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Anthropic model. Args: model_name: The name of the Anthropic model to use. List of model names available [here](https://docs.anthropic.com/en/docs/about-claude/models). provider: The provider to use for the Anthropic API. Can be either the string 'anthropic' or an instance of `Provider[AsyncAnthropicClient]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: AnthropicModelName ``` The model name. #### system ```python system: str ``` The model provider. ### AnthropicStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Anthropic models. 
Source code in `pydantic_ai_slim/pydantic_ai/models/anthropic.py` ```python @dataclass class AnthropicStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Anthropic models.""" _model_name: AnthropicModelName _response: AsyncIterable[BetaRawMessageStreamEvent] _timestamp: datetime _provider_name: str _provider_url: str async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 current_block: BetaContentBlock | None = None async for event in self._response: if isinstance(event, BetaRawMessageStartEvent): self._usage = _map_usage(event, self._provider_name, self._provider_url, self._model_name) self.provider_response_id = event.message.id elif isinstance(event, BetaRawContentBlockStartEvent): current_block = event.content_block if isinstance(current_block, BetaTextBlock) and current_block.text: maybe_event = self._parts_manager.handle_text_delta( vendor_part_id=event.index, content=current_block.text ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(current_block, BetaThinkingBlock): yield self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, content=current_block.thinking, signature=current_block.signature, provider_name=self.provider_name, ) elif isinstance(current_block, BetaRedactedThinkingBlock): yield self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, id='redacted_thinking', signature=current_block.data, provider_name=self.provider_name, ) elif isinstance(current_block, BetaToolUseBlock): maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=event.index, tool_name=current_block.name, args=cast(dict[str, Any], current_block.input) or None, tool_call_id=current_block.id, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(current_block, BetaServerToolUseBlock): yield self._parts_manager.handle_part( vendor_part_id=event.index, part=_map_server_tool_use_block(current_block, self.provider_name), ) elif isinstance(current_block, BetaWebSearchToolResultBlock): yield self._parts_manager.handle_part( vendor_part_id=event.index, part=_map_web_search_tool_result_block(current_block, self.provider_name), ) elif isinstance(current_block, BetaCodeExecutionToolResultBlock): yield self._parts_manager.handle_part( vendor_part_id=event.index, part=_map_code_execution_tool_result_block(current_block, self.provider_name), ) elif isinstance(event, BetaRawContentBlockDeltaEvent): if isinstance(event.delta, BetaTextDelta): maybe_event = self._parts_manager.handle_text_delta( vendor_part_id=event.index, content=event.delta.text ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(event.delta, BetaThinkingDelta): yield self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, content=event.delta.thinking, provider_name=self.provider_name, ) elif isinstance(event.delta, BetaSignatureDelta): yield self._parts_manager.handle_thinking_delta( vendor_part_id=event.index, signature=event.delta.signature, provider_name=self.provider_name, ) elif isinstance(event.delta, BetaInputJSONDelta): maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=event.index, args=event.delta.partial_json, ) if maybe_event is not None: # pragma: no branch yield maybe_event # TODO(Marcelo): We need to handle citations. 
elif isinstance(event.delta, BetaCitationsDelta): pass elif isinstance(event, BetaRawMessageDeltaEvent): self._usage = _map_usage(event, self._provider_name, self._provider_url, self._model_name, self._usage) if raw_finish_reason := event.delta.stop_reason: # pragma: no branch self.provider_details = {'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) elif isinstance(event, BetaRawContentBlockStopEvent | BetaRawMessageStopEvent): # pragma: no branch current_block = None @property def model_name(self) -> AnthropicModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: AnthropicModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models` Logic related to making requests to an LLM. The aim here is to make a common interface for different LLMs, so that the rest of the code can be agnostic to the specific LLM being used. ### KnownModelName ```python KnownModelName = TypeAliasType( "KnownModelName", Literal[ "anthropic:claude-3-5-haiku-20241022", "anthropic:claude-3-5-haiku-latest", "anthropic:claude-3-5-sonnet-20240620", "anthropic:claude-3-5-sonnet-20241022", "anthropic:claude-3-5-sonnet-latest", "anthropic:claude-3-7-sonnet-20250219", "anthropic:claude-3-7-sonnet-latest", "anthropic:claude-3-haiku-20240307", "anthropic:claude-3-opus-20240229", "anthropic:claude-3-opus-latest", "anthropic:claude-4-opus-20250514", "anthropic:claude-4-sonnet-20250514", "anthropic:claude-opus-4-0", "anthropic:claude-opus-4-1-20250805", "anthropic:claude-opus-4-20250514", "anthropic:claude-sonnet-4-0", "anthropic:claude-sonnet-4-20250514", "anthropic:claude-sonnet-4-5", "anthropic:claude-sonnet-4-5-20250929", "bedrock:amazon.titan-tg1-large", "bedrock:amazon.titan-text-lite-v1", "bedrock:amazon.titan-text-express-v1", "bedrock:us.amazon.nova-pro-v1:0", "bedrock:us.amazon.nova-lite-v1:0", "bedrock:us.amazon.nova-micro-v1:0", "bedrock:anthropic.claude-3-5-sonnet-20241022-v2:0", "bedrock:us.anthropic.claude-3-5-sonnet-20241022-v2:0", "bedrock:anthropic.claude-3-5-haiku-20241022-v1:0", "bedrock:us.anthropic.claude-3-5-haiku-20241022-v1:0", "bedrock:anthropic.claude-instant-v1", "bedrock:anthropic.claude-v2:1", "bedrock:anthropic.claude-v2", "bedrock:anthropic.claude-3-sonnet-20240229-v1:0", "bedrock:us.anthropic.claude-3-sonnet-20240229-v1:0", "bedrock:anthropic.claude-3-haiku-20240307-v1:0", "bedrock:us.anthropic.claude-3-haiku-20240307-v1:0", "bedrock:anthropic.claude-3-opus-20240229-v1:0", "bedrock:us.anthropic.claude-3-opus-20240229-v1:0", "bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0", "bedrock:us.anthropic.claude-3-5-sonnet-20240620-v1:0", "bedrock:anthropic.claude-3-7-sonnet-20250219-v1:0", "bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0", "bedrock:anthropic.claude-opus-4-20250514-v1:0", "bedrock:us.anthropic.claude-opus-4-20250514-v1:0", "bedrock:anthropic.claude-sonnet-4-20250514-v1:0", "bedrock:us.anthropic.claude-sonnet-4-20250514-v1:0", "bedrock:cohere.command-text-v14", "bedrock:cohere.command-r-v1:0", "bedrock:cohere.command-r-plus-v1:0", "bedrock:cohere.command-light-text-v14", 
"bedrock:meta.llama3-8b-instruct-v1:0", "bedrock:meta.llama3-70b-instruct-v1:0", "bedrock:meta.llama3-1-8b-instruct-v1:0", "bedrock:us.meta.llama3-1-8b-instruct-v1:0", "bedrock:meta.llama3-1-70b-instruct-v1:0", "bedrock:us.meta.llama3-1-70b-instruct-v1:0", "bedrock:meta.llama3-1-405b-instruct-v1:0", "bedrock:us.meta.llama3-2-11b-instruct-v1:0", "bedrock:us.meta.llama3-2-90b-instruct-v1:0", "bedrock:us.meta.llama3-2-1b-instruct-v1:0", "bedrock:us.meta.llama3-2-3b-instruct-v1:0", "bedrock:us.meta.llama3-3-70b-instruct-v1:0", "bedrock:mistral.mistral-7b-instruct-v0:2", "bedrock:mistral.mixtral-8x7b-instruct-v0:1", "bedrock:mistral.mistral-large-2402-v1:0", "bedrock:mistral.mistral-large-2407-v1:0", "cerebras:gpt-oss-120b", "cerebras:llama3.1-8b", "cerebras:llama-3.3-70b", "cerebras:llama-4-scout-17b-16e-instruct", "cerebras:llama-4-maverick-17b-128e-instruct", "cerebras:qwen-3-235b-a22b-instruct-2507", "cerebras:qwen-3-32b", "cerebras:qwen-3-coder-480b", "cerebras:qwen-3-235b-a22b-thinking-2507", "cohere:c4ai-aya-expanse-32b", "cohere:c4ai-aya-expanse-8b", "cohere:command", "cohere:command-light", "cohere:command-light-nightly", "cohere:command-nightly", "cohere:command-r", "cohere:command-r-03-2024", "cohere:command-r-08-2024", "cohere:command-r-plus", "cohere:command-r-plus-04-2024", "cohere:command-r-plus-08-2024", "cohere:command-r7b-12-2024", "deepseek:deepseek-chat", "deepseek:deepseek-reasoner", "google-gla:gemini-2.0-flash", "google-gla:gemini-2.0-flash-lite", "google-gla:gemini-2.5-flash", "google-gla:gemini-2.5-flash-preview-09-2025", "google-gla:gemini-flash-latest", "google-gla:gemini-2.5-flash-lite", "google-gla:gemini-2.5-flash-lite-preview-09-2025", "google-gla:gemini-flash-lite-latest", "google-gla:gemini-2.5-pro", "google-vertex:gemini-2.0-flash", "google-vertex:gemini-2.0-flash-lite", "google-vertex:gemini-2.5-flash", "google-vertex:gemini-2.5-flash-preview-09-2025", "google-vertex:gemini-flash-latest", "google-vertex:gemini-2.5-flash-lite", "google-vertex:gemini-2.5-flash-lite-preview-09-2025", "google-vertex:gemini-flash-lite-latest", "google-vertex:gemini-2.5-pro", "grok:grok-4", "grok:grok-4-0709", "grok:grok-3", "grok:grok-3-mini", "grok:grok-3-fast", "grok:grok-3-mini-fast", "grok:grok-2-vision-1212", "grok:grok-2-image-1212", "groq:distil-whisper-large-v3-en", "groq:gemma2-9b-it", "groq:llama-3.3-70b-versatile", "groq:llama-3.1-8b-instant", "groq:llama-guard-3-8b", "groq:llama3-70b-8192", "groq:llama3-8b-8192", "groq:moonshotai/kimi-k2-instruct", "groq:whisper-large-v3", "groq:whisper-large-v3-turbo", "groq:playai-tts", "groq:playai-tts-arabic", "groq:qwen-qwq-32b", "groq:mistral-saba-24b", "groq:qwen-2.5-coder-32b", "groq:qwen-2.5-32b", "groq:deepseek-r1-distill-qwen-32b", "groq:deepseek-r1-distill-llama-70b", "groq:llama-3.3-70b-specdec", "groq:llama-3.2-1b-preview", "groq:llama-3.2-3b-preview", "groq:llama-3.2-11b-vision-preview", "groq:llama-3.2-90b-vision-preview", "heroku:claude-3-5-haiku", "heroku:claude-3-5-sonnet-latest", "heroku:claude-3-7-sonnet", "heroku:claude-4-sonnet", "heroku:claude-3-haiku", "heroku:gpt-oss-120b", "heroku:nova-lite", "heroku:nova-pro", "huggingface:Qwen/QwQ-32B", "huggingface:Qwen/Qwen2.5-72B-Instruct", "huggingface:Qwen/Qwen3-235B-A22B", "huggingface:Qwen/Qwen3-32B", "huggingface:deepseek-ai/DeepSeek-R1", "huggingface:meta-llama/Llama-3.3-70B-Instruct", "huggingface:meta-llama/Llama-4-Maverick-17B-128E-Instruct", "huggingface:meta-llama/Llama-4-Scout-17B-16E-Instruct", "mistral:codestral-latest", "mistral:mistral-large-latest", 
"mistral:mistral-moderation-latest", "mistral:mistral-small-latest", "moonshotai:moonshot-v1-8k", "moonshotai:moonshot-v1-32k", "moonshotai:moonshot-v1-128k", "moonshotai:moonshot-v1-8k-vision-preview", "moonshotai:moonshot-v1-32k-vision-preview", "moonshotai:moonshot-v1-128k-vision-preview", "moonshotai:kimi-latest", "moonshotai:kimi-thinking-preview", "moonshotai:kimi-k2-0711-preview", "openai:chatgpt-4o-latest", "openai:codex-mini-latest", "openai:gpt-3.5-turbo", "openai:gpt-3.5-turbo-0125", "openai:gpt-3.5-turbo-0301", "openai:gpt-3.5-turbo-0613", "openai:gpt-3.5-turbo-1106", "openai:gpt-3.5-turbo-16k", "openai:gpt-3.5-turbo-16k-0613", "openai:gpt-4", "openai:gpt-4-0125-preview", "openai:gpt-4-0314", "openai:gpt-4-0613", "openai:gpt-4-1106-preview", "openai:gpt-4-32k", "openai:gpt-4-32k-0314", "openai:gpt-4-32k-0613", "openai:gpt-4-turbo", "openai:gpt-4-turbo-2024-04-09", "openai:gpt-4-turbo-preview", "openai:gpt-4-vision-preview", "openai:gpt-4.1", "openai:gpt-4.1-2025-04-14", "openai:gpt-4.1-mini", "openai:gpt-4.1-mini-2025-04-14", "openai:gpt-4.1-nano", "openai:gpt-4.1-nano-2025-04-14", "openai:gpt-4o", "openai:gpt-4o-2024-05-13", "openai:gpt-4o-2024-08-06", "openai:gpt-4o-2024-11-20", "openai:gpt-4o-audio-preview", "openai:gpt-4o-audio-preview-2024-10-01", "openai:gpt-4o-audio-preview-2024-12-17", "openai:gpt-4o-audio-preview-2025-06-03", "openai:gpt-4o-mini", "openai:gpt-4o-mini-2024-07-18", "openai:gpt-4o-mini-audio-preview", "openai:gpt-4o-mini-audio-preview-2024-12-17", "openai:gpt-4o-mini-search-preview", "openai:gpt-4o-mini-search-preview-2025-03-11", "openai:gpt-4o-search-preview", "openai:gpt-4o-search-preview-2025-03-11", "openai:gpt-5", "openai:gpt-5-2025-08-07", "openai:o1", "openai:gpt-5-chat-latest", "openai:o1-2024-12-17", "openai:gpt-5-mini", "openai:o1-mini", "openai:gpt-5-mini-2025-08-07", "openai:o1-mini-2024-09-12", "openai:gpt-5-nano", "openai:o1-preview", "openai:gpt-5-nano-2025-08-07", "openai:o1-preview-2024-09-12", "openai:o1-pro", "openai:o1-pro-2025-03-19", "openai:o3", "openai:o3-2025-04-16", "openai:o3-deep-research", "openai:o3-deep-research-2025-06-26", "openai:o3-mini", "openai:o3-mini-2025-01-31", "openai:o4-mini", "openai:o4-mini-2025-04-16", "openai:o4-mini-deep-research", "openai:o4-mini-deep-research-2025-06-26", "openai:o3-pro", "openai:o3-pro-2025-06-10", "openai:computer-use-preview", "openai:computer-use-preview-2025-03-11", "test", ], ) ``` Known model names that can be used with the `model` parameter of Agent. `KnownModelName` is provided as a concise way to specify a model. ### ModelRequestParameters Configuration for an agent's request to a model, specifically related to tools and output handling. 
Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @dataclass(repr=False, kw_only=True) class ModelRequestParameters: """Configuration for an agent's request to a model, specifically related to tools and output handling.""" function_tools: list[ToolDefinition] = field(default_factory=list) builtin_tools: list[AbstractBuiltinTool] = field(default_factory=list) output_mode: OutputMode = 'text' output_object: OutputObjectDefinition | None = None output_tools: list[ToolDefinition] = field(default_factory=list) allow_text_output: bool = True allow_image_output: bool = False @cached_property def tool_defs(self) -> dict[str, ToolDefinition]: return {tool_def.name: tool_def for tool_def in [*self.function_tools, *self.output_tools]} __repr__ = _utils.dataclasses_no_defaults_repr ``` ### Model Bases: `ABC` Abstract class for a model. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python class Model(ABC): """Abstract class for a model.""" _profile: ModelProfileSpec | None = None _settings: ModelSettings | None = None def __init__( self, *, settings: ModelSettings | None = None, profile: ModelProfileSpec | None = None, ) -> None: """Initialize the model with optional settings and profile. Args: settings: Model-specific settings that will be used as defaults for this model. profile: The model profile to use. """ self._settings = settings self._profile = profile @property def settings(self) -> ModelSettings | None: """Get the model settings.""" return self._settings @abstractmethod async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a request to the model.""" raise NotImplementedError() async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage: """Make a request to the model for counting tokens.""" # This method is not required, but you need to implement it if you want to support `UsageLimits.count_tokens_before_request`. raise NotImplementedError(f'Token counting ahead of the request is not supported by {self.__class__.__name__}') @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a request to the model and return a streaming response.""" # This method is not required, but you need to implement it if you want to support streamed responses raise NotImplementedError(f'Streamed requests not supported by this {self.__class__.__name__}') # yield is required to make this a generator for type checking # noinspection PyUnreachableCode yield # pragma: no cover def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters: """Customize the request parameters for the model. This method can be overridden by subclasses to modify the request parameters before sending them to the model. In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons. 
""" if transformer := self.profile.json_schema_transformer: model_request_parameters = replace( model_request_parameters, function_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.function_tools], output_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.output_tools], ) if output_object := model_request_parameters.output_object: model_request_parameters = replace( model_request_parameters, output_object=_customize_output_object(transformer, output_object), ) return model_request_parameters def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters]: """Prepare request inputs before they are passed to the provider. This merges the given ``model_settings`` with the model's own ``settings`` attribute and ensures ``customize_request_parameters`` is applied to the resolved [`ModelRequestParameters`][pydantic_ai.models.ModelRequestParameters]. Subclasses can override this method if they need to customize the preparation flow further, but most implementations should simply call ``self.prepare_request(...)`` at the start of their ``request`` (and related) methods. """ merged_settings = merge_model_settings(self.settings, model_settings) customized_parameters = self.customize_request_parameters(model_request_parameters) return merged_settings, customized_parameters @property @abstractmethod def model_name(self) -> str: """The model name.""" raise NotImplementedError() @cached_property def profile(self) -> ModelProfile: """The model profile.""" _profile = self._profile if callable(_profile): _profile = _profile(self.model_name) if _profile is None: return DEFAULT_PROFILE return _profile @property @abstractmethod def system(self) -> str: """The model provider, ex: openai. Use to populate the `gen_ai.system` OpenTelemetry semantic convention attribute, so should use well-known values listed in https://opentelemetry.io/docs/specs/semconv/attributes-registry/gen-ai/#gen-ai-system when applicable. """ raise NotImplementedError() @property def base_url(self) -> str | None: """The base URL for the provider API, if available.""" return None @staticmethod def _get_instructions(messages: list[ModelMessage]) -> str | None: """Get instructions from the first ModelRequest found when iterating messages in reverse. In the case that a "mock" request was generated to include a tool-return part for a result tool, we want to use the instructions from the second-to-most-recent request (which should correspond to the original request that generated the response that resulted in the tool-return part). """ last_two_requests: list[ModelRequest] = [] for message in reversed(messages): if isinstance(message, ModelRequest): last_two_requests.append(message) if len(last_two_requests) == 2: break if message.instructions is not None: return message.instructions # If we don't have two requests, and we didn't already return instructions, there are definitely not any: if len(last_two_requests) != 2: return None most_recent_request = last_two_requests[0] second_most_recent_request = last_two_requests[1] # If we've gotten this far and the most recent request consists of only tool-return parts or retry-prompt parts, # we use the instructions from the second-to-most-recent request. 
This is necessary because when handling # result tools, we generate a "mock" ModelRequest with a tool-return part for it, and that ModelRequest will not # have the relevant instructions from the agent. # While it's possible that you could have a message history where the most recent request has only tool returns, # I believe there is no way to achieve that would _change_ the instructions without manually crafting the most # recent message. That might make sense in principle for some usage pattern, but it's enough of an edge case # that I think it's not worth worrying about, since you can work around this by inserting another ModelRequest # with no parts at all immediately before the request that has the tool calls (that works because we only look # at the two most recent ModelRequests here). # If you have a use case where this causes pain, please open a GitHub issue and we can discuss alternatives. if all(p.part_kind == 'tool-return' or p.part_kind == 'retry-prompt' for p in most_recent_request.parts): return second_most_recent_request.instructions return None ``` #### __init__ ```python __init__( *, settings: ModelSettings | None = None, profile: ModelProfileSpec | None = None ) -> None ``` Initialize the model with optional settings and profile. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | | `profile` | `ModelProfileSpec | None` | The model profile to use. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def __init__( self, *, settings: ModelSettings | None = None, profile: ModelProfileSpec | None = None, ) -> None: """Initialize the model with optional settings and profile. Args: settings: Model-specific settings that will be used as defaults for this model. profile: The model profile to use. """ self._settings = settings self._profile = profile ``` #### settings ```python settings: ModelSettings | None ``` Get the model settings. #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a request to the model. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @abstractmethod async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a request to the model.""" raise NotImplementedError() ``` #### count_tokens ```python count_tokens( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage ``` Make a request to the model for counting tokens. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> RequestUsage: """Make a request to the model for counting tokens.""" # This method is not required, but you need to implement it if you want to support `UsageLimits.count_tokens_before_request`. 
raise NotImplementedError(f'Token counting ahead of the request is not supported by {self.__class__.__name__}') ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a request to the model and return a streaming response. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a request to the model and return a streaming response.""" # This method is not required, but you need to implement it if you want to support streamed responses raise NotImplementedError(f'Streamed requests not supported by this {self.__class__.__name__}') # yield is required to make this a generator for type checking # noinspection PyUnreachableCode yield # pragma: no cover ``` #### customize_request_parameters ```python customize_request_parameters( model_request_parameters: ModelRequestParameters, ) -> ModelRequestParameters ``` Customize the request parameters for the model. This method can be overridden by subclasses to modify the request parameters before sending them to the model. In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters: """Customize the request parameters for the model. This method can be overridden by subclasses to modify the request parameters before sending them to the model. In particular, this method can be used to make modifications to the generated tool JSON schemas if necessary for vendor/model-specific reasons. """ if transformer := self.profile.json_schema_transformer: model_request_parameters = replace( model_request_parameters, function_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.function_tools], output_tools=[_customize_tool_def(transformer, t) for t in model_request_parameters.output_tools], ) if output_object := model_request_parameters.output_object: model_request_parameters = replace( model_request_parameters, output_object=_customize_output_object(transformer, output_object), ) return model_request_parameters ``` #### prepare_request ```python prepare_request( model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters] ``` Prepare request inputs before they are passed to the provider. This merges the given `model_settings` with the model's own `settings` attribute and ensures `customize_request_parameters` is applied to the resolved ModelRequestParameters. Subclasses can override this method if they need to customize the preparation flow further, but most implementations should simply call `self.prepare_request(...)` at the start of their `request` (and related) methods. 
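A sketch of a custom `Model` subclass following that advice; the response content is hard-coded purely for illustration:

```python
from pydantic_ai.messages import ModelMessage, ModelResponse, TextPart
from pydantic_ai.models import Model, ModelRequestParameters
from pydantic_ai.settings import ModelSettings


class EchoModel(Model):
    """Toy model that ignores the prompt and returns a fixed reply."""

    @property
    def model_name(self) -> str:
        return 'echo'

    @property
    def system(self) -> str:
        return 'echo-provider'

    async def request(
        self,
        messages: list[ModelMessage],
        model_settings: ModelSettings | None,
        model_request_parameters: ModelRequestParameters,
    ) -> ModelResponse:
        # Merge per-request settings with this model's defaults and apply
        # `customize_request_parameters` before talking to the "API".
        model_settings, model_request_parameters = self.prepare_request(
            model_settings, model_request_parameters
        )
        return ModelResponse(parts=[TextPart(content='hello from EchoModel')])
```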
Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters]: """Prepare request inputs before they are passed to the provider. This merges the given ``model_settings`` with the model's own ``settings`` attribute and ensures ``customize_request_parameters`` is applied to the resolved [`ModelRequestParameters`][pydantic_ai.models.ModelRequestParameters]. Subclasses can override this method if they need to customize the preparation flow further, but most implementations should simply call ``self.prepare_request(...)`` at the start of their ``request`` (and related) methods. """ merged_settings = merge_model_settings(self.settings, model_settings) customized_parameters = self.customize_request_parameters(model_request_parameters) return merged_settings, customized_parameters ``` #### model_name ```python model_name: str ``` The model name. #### profile ```python profile: ModelProfile ``` The model profile. #### system ```python system: str ``` The model provider, ex: openai. Use to populate the `gen_ai.system` OpenTelemetry semantic convention attribute, so should use well-known values listed in https://opentelemetry.io/docs/specs/semconv/attributes-registry/gen-ai/#gen-ai-system when applicable. #### base_url ```python base_url: str | None ``` The base URL for the provider API, if available. ### StreamedResponse Bases: `ABC` Streamed response from an LLM when calling a tool. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @dataclass class StreamedResponse(ABC): """Streamed response from an LLM when calling a tool.""" model_request_parameters: ModelRequestParameters final_result_event: FinalResultEvent | None = field(default=None, init=False) provider_response_id: str | None = field(default=None, init=False) provider_details: dict[str, Any] | None = field(default=None, init=False) finish_reason: FinishReason | None = field(default=None, init=False) _parts_manager: ModelResponsePartsManager = field(default_factory=ModelResponsePartsManager, init=False) _event_iterator: AsyncIterator[ModelResponseStreamEvent] | None = field(default=None, init=False) _usage: RequestUsage = field(default_factory=RequestUsage, init=False) def __aiter__(self) -> AsyncIterator[ModelResponseStreamEvent]: """Stream the response as an async iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a [`FinalResultEvent`][pydantic_ai.messages.FinalResultEvent] if/when the first match is found. 
""" if self._event_iterator is None: async def iterator_with_final_event( iterator: AsyncIterator[ModelResponseStreamEvent], ) -> AsyncIterator[ModelResponseStreamEvent]: async for event in iterator: yield event if ( final_result_event := _get_final_result_event(event, self.model_request_parameters) ) is not None: self.final_result_event = final_result_event yield final_result_event break # If we broke out of the above loop, we need to yield the rest of the events # If we didn't, this will just be a no-op async for event in iterator: yield event self._event_iterator = iterator_with_final_event(self._get_event_iterator()) return self._event_iterator @abstractmethod async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: """Return an async iterator of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This method should be implemented by subclasses to translate the vendor-specific stream of events into pydantic_ai-format events. It should use the `_parts_manager` to handle deltas, and should update the `_usage` attributes as it goes. """ raise NotImplementedError() # noinspection PyUnreachableCode yield def get(self) -> ModelResponse: """Build a [`ModelResponse`][pydantic_ai.messages.ModelResponse] from the data received from the stream so far.""" return ModelResponse( parts=self._parts_manager.get_parts(), model_name=self.model_name, timestamp=self.timestamp, usage=self.usage(), provider_name=self.provider_name, provider_response_id=self.provider_response_id, provider_details=self.provider_details, finish_reason=self.finish_reason, ) # TODO (v2): Make this a property def usage(self) -> RequestUsage: """Get the usage of the response so far. This will not be the final usage until the stream is exhausted.""" return self._usage @property @abstractmethod def model_name(self) -> str: """Get the model name of the response.""" raise NotImplementedError() @property @abstractmethod def provider_name(self) -> str | None: """Get the provider name.""" raise NotImplementedError() @property @abstractmethod def timestamp(self) -> datetime: """Get the timestamp of the response.""" raise NotImplementedError() ``` #### __aiter__ ```python __aiter__() -> AsyncIterator[ModelResponseStreamEvent] ``` Stream the response as an async iterable of ModelResponseStreamEvents. This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a FinalResultEvent if/when the first match is found. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def __aiter__(self) -> AsyncIterator[ModelResponseStreamEvent]: """Stream the response as an async iterable of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This proxies the `_event_iterator()` and emits all events, while also checking for matches on the result schema and emitting a [`FinalResultEvent`][pydantic_ai.messages.FinalResultEvent] if/when the first match is found. 
""" if self._event_iterator is None: async def iterator_with_final_event( iterator: AsyncIterator[ModelResponseStreamEvent], ) -> AsyncIterator[ModelResponseStreamEvent]: async for event in iterator: yield event if ( final_result_event := _get_final_result_event(event, self.model_request_parameters) ) is not None: self.final_result_event = final_result_event yield final_result_event break # If we broke out of the above loop, we need to yield the rest of the events # If we didn't, this will just be a no-op async for event in iterator: yield event self._event_iterator = iterator_with_final_event(self._get_event_iterator()) return self._event_iterator ``` #### get ```python get() -> ModelResponse ``` Build a ModelResponse from the data received from the stream so far. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def get(self) -> ModelResponse: """Build a [`ModelResponse`][pydantic_ai.messages.ModelResponse] from the data received from the stream so far.""" return ModelResponse( parts=self._parts_manager.get_parts(), model_name=self.model_name, timestamp=self.timestamp, usage=self.usage(), provider_name=self.provider_name, provider_response_id=self.provider_response_id, provider_details=self.provider_details, finish_reason=self.finish_reason, ) ``` #### usage ```python usage() -> RequestUsage ``` Get the usage of the response so far. This will not be the final usage until the stream is exhausted. Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def usage(self) -> RequestUsage: """Get the usage of the response so far. This will not be the final usage until the stream is exhausted.""" return self._usage ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str | None ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. ### ALLOW_MODEL_REQUESTS ```python ALLOW_MODEL_REQUESTS = True ``` Whether to allow requests to models. This global setting allows you to disable request to most models, e.g. to make sure you don't accidentally make costly requests to a model during tests. The testing models TestModel and FunctionModel are no affected by this setting. ### check_allow_model_requests ```python check_allow_model_requests() -> None ``` Check if model requests are allowed. If you're defining your own models that have costs or latency associated with their use, you should call this in Model.request and Model.request_stream. Raises: | Type | Description | | --- | --- | | `RuntimeError` | If model requests are not allowed. | Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python def check_allow_model_requests() -> None: """Check if model requests are allowed. If you're defining your own models that have costs or latency associated with their use, you should call this in [`Model.request`][pydantic_ai.models.Model.request] and [`Model.request_stream`][pydantic_ai.models.Model.request_stream]. Raises: RuntimeError: If model requests are not allowed. """ if not ALLOW_MODEL_REQUESTS: raise RuntimeError('Model requests are not allowed, since ALLOW_MODEL_REQUESTS is False') ``` ### override_allow_model_requests ```python override_allow_model_requests( allow_model_requests: bool, ) -> Iterator[None] ``` Context manager to temporarily override ALLOW_MODEL_REQUESTS. 
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `allow_model_requests` | `bool` | Whether to allow model requests within the context. | *required* | Source code in `pydantic_ai_slim/pydantic_ai/models/__init__.py` ```python @contextmanager def override_allow_model_requests(allow_model_requests: bool) -> Iterator[None]: """Context manager to temporarily override [`ALLOW_MODEL_REQUESTS`][pydantic_ai.models.ALLOW_MODEL_REQUESTS]. Args: allow_model_requests: Whether to allow model requests within the context. """ global ALLOW_MODEL_REQUESTS old_value = ALLOW_MODEL_REQUESTS ALLOW_MODEL_REQUESTS = allow_model_requests # pyright: ignore[reportConstantRedefinition] try: yield finally: ALLOW_MODEL_REQUESTS = old_value # pyright: ignore[reportConstantRedefinition] ``` # `pydantic_ai.models.bedrock` ## Setup For details on how to set up authentication with this model, see [model configuration for Bedrock](../../../models/bedrock/). ### LatestBedrockModelNames ```python LatestBedrockModelNames = Literal[ "amazon.titan-tg1-large", "amazon.titan-text-lite-v1", "amazon.titan-text-express-v1", "us.amazon.nova-pro-v1:0", "us.amazon.nova-lite-v1:0", "us.amazon.nova-micro-v1:0", "anthropic.claude-3-5-sonnet-20241022-v2:0", "us.anthropic.claude-3-5-sonnet-20241022-v2:0", "anthropic.claude-3-5-haiku-20241022-v1:0", "us.anthropic.claude-3-5-haiku-20241022-v1:0", "anthropic.claude-instant-v1", "anthropic.claude-v2:1", "anthropic.claude-v2", "anthropic.claude-3-sonnet-20240229-v1:0", "us.anthropic.claude-3-sonnet-20240229-v1:0", "anthropic.claude-3-haiku-20240307-v1:0", "us.anthropic.claude-3-haiku-20240307-v1:0", "anthropic.claude-3-opus-20240229-v1:0", "us.anthropic.claude-3-opus-20240229-v1:0", "anthropic.claude-3-5-sonnet-20240620-v1:0", "us.anthropic.claude-3-5-sonnet-20240620-v1:0", "anthropic.claude-3-7-sonnet-20250219-v1:0", "us.anthropic.claude-3-7-sonnet-20250219-v1:0", "anthropic.claude-opus-4-20250514-v1:0", "us.anthropic.claude-opus-4-20250514-v1:0", "anthropic.claude-sonnet-4-20250514-v1:0", "us.anthropic.claude-sonnet-4-20250514-v1:0", "cohere.command-text-v14", "cohere.command-r-v1:0", "cohere.command-r-plus-v1:0", "cohere.command-light-text-v14", "meta.llama3-8b-instruct-v1:0", "meta.llama3-70b-instruct-v1:0", "meta.llama3-1-8b-instruct-v1:0", "us.meta.llama3-1-8b-instruct-v1:0", "meta.llama3-1-70b-instruct-v1:0", "us.meta.llama3-1-70b-instruct-v1:0", "meta.llama3-1-405b-instruct-v1:0", "us.meta.llama3-2-11b-instruct-v1:0", "us.meta.llama3-2-90b-instruct-v1:0", "us.meta.llama3-2-1b-instruct-v1:0", "us.meta.llama3-2-3b-instruct-v1:0", "us.meta.llama3-3-70b-instruct-v1:0", "mistral.mistral-7b-instruct-v0:2", "mistral.mixtral-8x7b-instruct-v0:1", "mistral.mistral-large-2402-v1:0", "mistral.mistral-large-2407-v1:0", ] ``` Latest Bedrock models. ### BedrockModelName ```python BedrockModelName = str | LatestBedrockModelNames ``` Possible Bedrock model names. Since Bedrock supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Bedrock docs](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) for a full list. ### BedrockModelSettings Bases: `ModelSettings` Settings for Bedrock models. See [the Bedrock Converse API docs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_RequestSyntax) for a full list. 
See [the boto3 implementation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html) of the Bedrock Converse API. Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python class BedrockModelSettings(ModelSettings, total=False): """Settings for Bedrock models. See [the Bedrock Converse API docs](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html#API_runtime_Converse_RequestSyntax) for a full list. See [the boto3 implementation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html) of the Bedrock Converse API. """ # ALL FIELDS MUST BE `bedrock_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. bedrock_guardrail_config: GuardrailConfigurationTypeDef """Content moderation and safety settings for Bedrock API requests. See more about it on . """ bedrock_performance_configuration: PerformanceConfigurationTypeDef """Performance optimization settings for model inference. See more about it on . """ bedrock_request_metadata: dict[str, str] """Additional metadata to attach to Bedrock API requests. See more about it on . """ bedrock_additional_model_response_fields_paths: list[str] """JSON paths to extract additional fields from model responses. See more about it on . """ bedrock_prompt_variables: Mapping[str, PromptVariableValuesTypeDef] """Variables for substitution into prompt templates. See more about it on . """ bedrock_additional_model_requests_fields: Mapping[str, Any] """Additional model-specific parameters to include in requests. See more about it on . """ ``` #### bedrock_guardrail_config ```python bedrock_guardrail_config: GuardrailConfigurationTypeDef ``` Content moderation and safety settings for Bedrock API requests. See more about it on . #### bedrock_performance_configuration ```python bedrock_performance_configuration: ( PerformanceConfigurationTypeDef ) ``` Performance optimization settings for model inference. See more about it on . #### bedrock_request_metadata ```python bedrock_request_metadata: dict[str, str] ``` Additional metadata to attach to Bedrock API requests. See more about it on . #### bedrock_additional_model_response_fields_paths ```python bedrock_additional_model_response_fields_paths: list[str] ``` JSON paths to extract additional fields from model responses. See more about it on . #### bedrock_prompt_variables ```python bedrock_prompt_variables: Mapping[ str, PromptVariableValuesTypeDef ] ``` Variables for substitution into prompt templates. See more about it on . #### bedrock_additional_model_requests_fields ```python bedrock_additional_model_requests_fields: Mapping[str, Any] ``` Additional model-specific parameters to include in requests. See more about it on . ### BedrockConverseModel Bases: `Model` A model that uses the Bedrock Converse API. Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python @dataclass(init=False) class BedrockConverseModel(Model): """A model that uses the Bedrock Converse API.""" client: BedrockRuntimeClient _model_name: BedrockModelName = field(repr=False) _provider: Provider[BaseClient] = field(repr=False) def __init__( self, model_name: BedrockModelName, *, provider: Literal['bedrock'] | Provider[BaseClient] = 'bedrock', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Bedrock model. Args: model_name: The name of the model to use. model_name: The name of the Bedrock model to use. 
List of model names available [here](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). provider: The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of `Provider[BaseClient]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = cast('BedrockRuntimeClient', provider.client) super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return str(self.client.meta.endpoint_url) @property def model_name(self) -> str: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[ToolTypeDef]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] @staticmethod def _map_tool_definition(f: ToolDefinition) -> ToolTypeDef: tool_spec: ToolSpecificationTypeDef = {'name': f.name, 'inputSchema': {'json': f.parameters_json_schema}} if f.description: # pragma: no branch tool_spec['description'] = f.description return {'toolSpec': tool_spec} async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) settings = cast(BedrockModelSettings, model_settings or {}) response = await self._messages_create(messages, False, settings, model_request_parameters) model_response = await self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) settings = cast(BedrockModelSettings, model_settings or {}) response = await self._messages_create(messages, True, settings, model_request_parameters) yield BedrockStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self.model_name, _event_stream=response['stream'], _provider_name=self._provider.name, _provider_response_id=response.get('ResponseMetadata', {}).get('RequestId', None), ) async def _process_response(self, response: ConverseResponseTypeDef) -> ModelResponse: items: list[ModelResponsePart] = [] if message := response['output'].get('message'): # pragma: no branch for item in message['content']: if reasoning_content := item.get('reasoningContent'): if redacted_content := reasoning_content.get('redactedContent'): items.append( ThinkingPart( id='redacted_content', content='', signature=redacted_content.decode('utf-8'), provider_name=self.system, ) ) elif reasoning_text := reasoning_content.get('reasoningText'): # pragma: no branch signature = reasoning_text.get('signature') items.append( ThinkingPart( content=reasoning_text['text'], signature=signature, provider_name=self.system if signature else None, ) ) if text := 
item.get('text'): items.append(TextPart(content=text)) elif tool_use := item.get('toolUse'): items.append( ToolCallPart( tool_name=tool_use['name'], args=tool_use['input'], tool_call_id=tool_use['toolUseId'], ), ) u = usage.RequestUsage( input_tokens=response['usage']['inputTokens'], output_tokens=response['usage']['outputTokens'], ) response_id = response.get('ResponseMetadata', {}).get('RequestId', None) raw_finish_reason = response['stopReason'] provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=u, model_name=self.model_name, provider_response_id=response_id, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: BedrockModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ConverseStreamResponseTypeDef: pass @overload async def _messages_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: BedrockModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ConverseResponseTypeDef: pass async def _messages_create( self, messages: list[ModelMessage], stream: bool, model_settings: BedrockModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ConverseResponseTypeDef | ConverseStreamResponseTypeDef: system_prompt, bedrock_messages = await self._map_messages(messages) inference_config = self._map_inference_config(model_settings) params: ConverseRequestTypeDef = { 'modelId': self.model_name, 'messages': bedrock_messages, 'system': system_prompt, 'inferenceConfig': inference_config, } tool_config = self._map_tool_config(model_request_parameters) if tool_config: params['toolConfig'] = tool_config if model_request_parameters.builtin_tools: raise UserError('Bedrock does not support built-in tools') # Bedrock supports a set of specific extra parameters if model_settings: if guardrail_config := model_settings.get('bedrock_guardrail_config', None): params['guardrailConfig'] = guardrail_config if performance_configuration := model_settings.get('bedrock_performance_configuration', None): params['performanceConfig'] = performance_configuration if request_metadata := model_settings.get('bedrock_request_metadata', None): params['requestMetadata'] = request_metadata if additional_model_response_fields_paths := model_settings.get( 'bedrock_additional_model_response_fields_paths', None ): params['additionalModelResponseFieldPaths'] = additional_model_response_fields_paths if additional_model_requests_fields := model_settings.get('bedrock_additional_model_requests_fields', None): params['additionalModelRequestFields'] = additional_model_requests_fields if prompt_variables := model_settings.get('bedrock_prompt_variables', None): params['promptVariables'] = prompt_variables if stream: model_response = await anyio.to_thread.run_sync(functools.partial(self.client.converse_stream, **params)) else: model_response = await anyio.to_thread.run_sync(functools.partial(self.client.converse, **params)) return model_response @staticmethod def _map_inference_config( model_settings: ModelSettings | None, ) -> InferenceConfigurationTypeDef: model_settings = model_settings or {} inference_config: InferenceConfigurationTypeDef = {} if max_tokens := model_settings.get('max_tokens'): inference_config['maxTokens'] = max_tokens if (temperature := model_settings.get('temperature')) is 
not None: inference_config['temperature'] = temperature if top_p := model_settings.get('top_p'): inference_config['topP'] = top_p if stop_sequences := model_settings.get('stop_sequences'): inference_config['stopSequences'] = stop_sequences return inference_config def _map_tool_config(self, model_request_parameters: ModelRequestParameters) -> ToolConfigurationTypeDef | None: tools = self._get_tools(model_request_parameters) if not tools: return None tool_choice: ToolChoiceTypeDef if not model_request_parameters.allow_text_output: tool_choice = {'any': {}} else: tool_choice = {'auto': {}} tool_config: ToolConfigurationTypeDef = {'tools': tools} if tool_choice and BedrockModelProfile.from_profile(self.profile).bedrock_supports_tool_choice: tool_config['toolChoice'] = tool_choice return tool_config async def _map_messages( # noqa: C901 self, messages: list[ModelMessage] ) -> tuple[list[SystemContentBlockTypeDef], list[MessageUnionTypeDef]]: """Maps a `pydantic_ai.Message` to the Bedrock `MessageUnionTypeDef`. Groups consecutive ToolReturnPart objects into a single user message as required by Bedrock Claude/Nova models. """ profile = BedrockModelProfile.from_profile(self.profile) system_prompt: list[SystemContentBlockTypeDef] = [] bedrock_messages: list[MessageUnionTypeDef] = [] document_count: Iterator[int] = count(1) for message in messages: if isinstance(message, ModelRequest): for part in message.parts: if isinstance(part, SystemPromptPart) and part.content: system_prompt.append({'text': part.content}) elif isinstance(part, UserPromptPart): bedrock_messages.extend(await self._map_user_prompt(part, document_count)) elif isinstance(part, ToolReturnPart): assert part.tool_call_id is not None bedrock_messages.append( { 'role': 'user', 'content': [ { 'toolResult': { 'toolUseId': part.tool_call_id, 'content': [ {'text': part.model_response_str()} if profile.bedrock_tool_result_format == 'text' else {'json': part.model_response_object()} ], 'status': 'success', } } ], } ) elif isinstance(part, RetryPromptPart): # TODO(Marcelo): We need to add a test here. 
if part.tool_name is None: # pragma: no cover bedrock_messages.append({'role': 'user', 'content': [{'text': part.model_response()}]}) else: assert part.tool_call_id is not None bedrock_messages.append( { 'role': 'user', 'content': [ { 'toolResult': { 'toolUseId': part.tool_call_id, 'content': [{'text': part.model_response()}], 'status': 'error', } } ], } ) elif isinstance(message, ModelResponse): content: list[ContentBlockOutputTypeDef] = [] for item in message.parts: if isinstance(item, TextPart): content.append({'text': item.content}) elif isinstance(item, ThinkingPart): if ( item.provider_name == self.system and item.signature and BedrockModelProfile.from_profile(self.profile).bedrock_send_back_thinking_parts ): if item.id == 'redacted_content': reasoning_content: ReasoningContentBlockOutputTypeDef = { 'redactedContent': item.signature.encode('utf-8'), } else: reasoning_content: ReasoningContentBlockOutputTypeDef = { 'reasoningText': { 'text': item.content, 'signature': item.signature, } } content.append({'reasoningContent': reasoning_content}) else: start_tag, end_tag = self.profile.thinking_tags content.append({'text': '\n'.join([start_tag, item.content, end_tag])}) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): pass else: assert isinstance(item, ToolCallPart) content.append(self._map_tool_call(item)) bedrock_messages.append({'role': 'assistant', 'content': content}) else: assert_never(message) # Merge together sequential user messages. processed_messages: list[MessageUnionTypeDef] = [] last_message: dict[str, Any] | None = None for current_message in bedrock_messages: if ( last_message is not None and current_message['role'] == last_message['role'] and current_message['role'] == 'user' ): # Add the new user content onto the existing user message. last_content = list(last_message['content']) last_content.extend(current_message['content']) last_message['content'] = last_content continue # Add the entire message to the list of messages. 
processed_messages.append(current_message) last_message = cast(dict[str, Any], current_message) if instructions := self._get_instructions(messages): system_prompt.insert(0, {'text': instructions}) return system_prompt, processed_messages @staticmethod async def _map_user_prompt(part: UserPromptPart, document_count: Iterator[int]) -> list[MessageUnionTypeDef]: content: list[ContentBlockUnionTypeDef] = [] if isinstance(part.content, str): content.append({'text': part.content}) else: for item in part.content: if isinstance(item, str): content.append({'text': item}) elif isinstance(item, BinaryContent): format = item.format if item.is_document: name = f'Document {next(document_count)}' assert format in ('pdf', 'txt', 'csv', 'doc', 'docx', 'xls', 'xlsx', 'html', 'md') content.append({'document': {'name': name, 'format': format, 'source': {'bytes': item.data}}}) elif item.is_image: assert format in ('jpeg', 'png', 'gif', 'webp') content.append({'image': {'format': format, 'source': {'bytes': item.data}}}) elif item.is_video: assert format in ('mkv', 'mov', 'mp4', 'webm', 'flv', 'mpeg', 'mpg', 'wmv', 'three_gp') content.append({'video': {'format': format, 'source': {'bytes': item.data}}}) else: raise NotImplementedError('Binary content is not supported yet.') elif isinstance(item, ImageUrl | DocumentUrl | VideoUrl): downloaded_item = await download_item(item, data_format='bytes', type_format='extension') format = downloaded_item['data_type'] if item.kind == 'image-url': format = item.media_type.split('/')[1] assert format in ('jpeg', 'png', 'gif', 'webp'), f'Unsupported image format: {format}' image: ImageBlockTypeDef = {'format': format, 'source': {'bytes': downloaded_item['data']}} content.append({'image': image}) elif item.kind == 'document-url': name = f'Document {next(document_count)}' document: DocumentBlockTypeDef = { 'name': name, 'format': item.format, 'source': {'bytes': downloaded_item['data']}, } content.append({'document': document}) elif item.kind == 'video-url': # pragma: no branch format = item.media_type.split('/')[1] assert format in ( 'mkv', 'mov', 'mp4', 'webm', 'flv', 'mpeg', 'mpg', 'wmv', 'three_gp', ), f'Unsupported video format: {format}' video: VideoBlockTypeDef = {'format': format, 'source': {'bytes': downloaded_item['data']}} content.append({'video': video}) elif isinstance(item, AudioUrl): # pragma: no cover raise NotImplementedError('Audio is not supported yet.') else: assert_never(item) return [{'role': 'user', 'content': content}] @staticmethod def _map_tool_call(t: ToolCallPart) -> ContentBlockOutputTypeDef: return { 'toolUse': {'toolUseId': _utils.guard_tool_call_id(t=t), 'name': t.tool_name, 'input': t.args_as_dict()} } ``` #### __init__ ```python __init__( model_name: BedrockModelName, *, provider: ( Literal["bedrock"] | Provider[BaseClient] ) = "bedrock", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Bedrock model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `BedrockModelName` | The name of the model to use. | *required* | | `model_name` | `BedrockModelName` | The name of the Bedrock model to use. List of model names available here. | *required* | | `provider` | `Literal['bedrock'] | Provider[BaseClient]` | The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of Provider[BaseClient]. If not provided, a new provider will be created using the other parameters. 
| `'bedrock'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python def __init__( self, model_name: BedrockModelName, *, provider: Literal['bedrock'] | Provider[BaseClient] = 'bedrock', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Bedrock model. Args: model_name: The name of the model to use. model_name: The name of the Bedrock model to use. List of model names available [here](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html). provider: The provider to use for authentication and API access. Can be either the string 'bedrock' or an instance of `Provider[BaseClient]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = cast('BedrockRuntimeClient', provider.client) super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: str ``` The model name. #### system ```python system: str ``` The model provider. ### BedrockStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Bedrock models. Source code in `pydantic_ai_slim/pydantic_ai/models/bedrock.py` ```python @dataclass class BedrockStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Bedrock models.""" _model_name: BedrockModelName _event_stream: EventStream[ConverseStreamOutputTypeDef] _provider_name: str _timestamp: datetime = field(default_factory=_utils.now_utc) _provider_response_id: str | None = None async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 """Return an async iterator of [`ModelResponseStreamEvent`][pydantic_ai.messages.ModelResponseStreamEvent]s. This method should be implemented by subclasses to translate the vendor-specific stream of events into pydantic_ai-format events. 
""" if self._provider_response_id is not None: # pragma: no cover self.provider_response_id = self._provider_response_id chunk: ConverseStreamOutputTypeDef tool_id: str | None = None async for chunk in _AsyncIteratorWrapper(self._event_stream): match chunk: case {'messageStart': _}: continue case {'messageStop': message_stop}: raw_finish_reason = message_stop['stopReason'] self.provider_details = {'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) case {'metadata': metadata}: if 'usage' in metadata: # pragma: no branch self._usage += self._map_usage(metadata) case {'contentBlockStart': content_block_start}: index = content_block_start['contentBlockIndex'] start = content_block_start['start'] if 'toolUse' in start: # pragma: no branch tool_use_start = start['toolUse'] tool_id = tool_use_start['toolUseId'] tool_name = tool_use_start['name'] maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=index, tool_name=tool_name, args=None, tool_call_id=tool_id, ) if maybe_event: # pragma: no branch yield maybe_event case {'contentBlockDelta': content_block_delta}: index = content_block_delta['contentBlockIndex'] delta = content_block_delta['delta'] if 'reasoningContent' in delta: if redacted_content := delta['reasoningContent'].get('redactedContent'): yield self._parts_manager.handle_thinking_delta( vendor_part_id=index, id='redacted_content', signature=redacted_content.decode('utf-8'), provider_name=self.provider_name, ) else: signature = delta['reasoningContent'].get('signature') yield self._parts_manager.handle_thinking_delta( vendor_part_id=index, content=delta['reasoningContent'].get('text'), signature=signature, provider_name=self.provider_name if signature else None, ) if 'text' in delta: maybe_event = self._parts_manager.handle_text_delta(vendor_part_id=index, content=delta['text']) if maybe_event is not None: # pragma: no branch yield maybe_event if 'toolUse' in delta: tool_use = delta['toolUse'] maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=index, tool_name=tool_use.get('name'), args=tool_use.get('input'), tool_call_id=tool_id, ) if maybe_event: # pragma: no branch yield maybe_event case _: pass # pyright wants match statements to be exhaustive @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: return self._timestamp def _map_usage(self, metadata: ConverseStreamMetadataEventTypeDef) -> usage.RequestUsage: return usage.RequestUsage( input_tokens=metadata['usage']['inputTokens'], output_tokens=metadata['usage']['outputTokens'], ) ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. # `pydantic_ai.models.cohere` ## Setup For details on how to set up authentication with this model, see [model configuration for Cohere](../../../models/cohere/). ### LatestCohereModelNames ```python LatestCohereModelNames = Literal[ "c4ai-aya-expanse-32b", "c4ai-aya-expanse-8b", "command", "command-light", "command-light-nightly", "command-nightly", "command-r", "command-r-03-2024", "command-r-08-2024", "command-r-plus", "command-r-plus-04-2024", "command-r-plus-08-2024", "command-r7b-12-2024", ] ``` Latest Cohere models. 
### CohereModelName ```python CohereModelName = str | LatestCohereModelNames ``` Possible Cohere model names. Since Cohere supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [Cohere's docs](https://docs.cohere.com/v2/docs/models) for a list of all available models. ### CohereModelSettings Bases: `ModelSettings` Settings used for a Cohere model request. Source code in `pydantic_ai_slim/pydantic_ai/models/cohere.py` ```python class CohereModelSettings(ModelSettings, total=False): """Settings used for a Cohere model request.""" ``` ### CohereModel Bases: `Model` A model that uses the Cohere API. Internally, this uses the [Cohere Python client](https://github.com/cohere-ai/cohere-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/cohere.py` ```python @dataclass(init=False) class CohereModel(Model): """A model that uses the Cohere API. Internally, this uses the [Cohere Python client]( https://github.com/cohere-ai/cohere-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncClientV2 = field(repr=False) _model_name: CohereModelName = field(repr=False) _provider: Provider[AsyncClientV2] = field(repr=False) def __init__( self, model_name: CohereModelName, *, provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Cohere model. Args: model_name: The name of the Cohere model to use. List of model names available [here](https://docs.cohere.com/docs/models#command). provider: The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of `Provider[AsyncClientV2]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: client_wrapper = self.client._client_wrapper # type: ignore return str(client_wrapper.get_base_url()) @property def model_name(self) -> CohereModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._chat(messages, cast(CohereModelSettings, model_settings or {}), model_request_parameters) model_response = self._process_response(response) return model_response async def _chat( self, messages: list[ModelMessage], model_settings: CohereModelSettings, model_request_parameters: ModelRequestParameters, ) -> V2ChatResponse: tools = self._get_tools(model_request_parameters) if model_request_parameters.builtin_tools: raise UserError('Cohere does not support built-in tools') cohere_messages = self._map_messages(messages) try: return await self.client.chat( model=self._model_name, messages=cohere_messages, tools=tools or OMIT, max_tokens=model_settings.get('max_tokens', OMIT), stop_sequences=model_settings.get('stop_sequences', OMIT), temperature=model_settings.get('temperature', OMIT), p=model_settings.get('top_p', OMIT), seed=model_settings.get('seed', OMIT), presence_penalty=model_settings.get('presence_penalty', OMIT), frequency_penalty=model_settings.get('frequency_penalty', OMIT), ) except ApiError as e: if (status_code := e.status_code) and status_code >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover def _process_response(self, response: V2ChatResponse) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" parts: list[ModelResponsePart] = [] if response.message.content is not None: for content in response.message.content: if content.type == 'text': parts.append(TextPart(content=content.text)) elif content.type == 'thinking': # pragma: no branch parts.append(ThinkingPart(content=content.thinking)) for c in response.message.tool_calls or []: if c.function and c.function.name and c.function.arguments: # pragma: no branch parts.append( ToolCallPart( tool_name=c.function.name, args=c.function.arguments, tool_call_id=c.id or _generate_tool_call_id(), ) ) raw_finish_reason = response.finish_reason provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=parts, usage=_map_usage(response), model_name=self._model_name, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) def _map_messages(self, messages: list[ModelMessage]) -> list[ChatMessageV2]: """Just maps a `pydantic_ai.Message` to a `cohere.ChatMessageV2`.""" cohere_messages: list[ChatMessageV2] = [] for message in messages: if isinstance(message, ModelRequest): cohere_messages.extend(self._map_user_message(message)) elif isinstance(message, ModelResponse): texts: list[str] = [] thinking: list[str] = [] tool_calls: list[ToolCallV2] 
= [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif isinstance(item, ThinkingPart): thinking.append(item.content) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # This is currently never returned from cohere pass elif isinstance(item, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. pass else: assert_never(item) message_param = AssistantChatMessageV2(role='assistant') if texts or thinking: contents: list[AssistantMessageV2ContentItem] = [] if thinking: contents.append(ThinkingAssistantMessageV2ContentItem(thinking='\n\n'.join(thinking))) if texts: # pragma: no branch contents.append(TextAssistantMessageV2ContentItem(text='\n\n'.join(texts))) message_param.content = contents if tool_calls: message_param.tool_calls = tool_calls cohere_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages): cohere_messages.insert(0, SystemChatMessageV2(role='system', content=instructions)) return cohere_messages def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[ToolV2]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] @staticmethod def _map_tool_call(t: ToolCallPart) -> ToolCallV2: return ToolCallV2( id=_guard_tool_call_id(t=t), type='function', function=ToolCallV2Function( name=t.tool_name, arguments=t.args_as_json_str(), ), ) @staticmethod def _map_tool_definition(f: ToolDefinition) -> ToolV2: return ToolV2( type='function', function=ToolV2Function( name=f.name, description=f.description, parameters=f.parameters_json_schema, ), ) @classmethod def _map_user_message(cls, message: ModelRequest) -> Iterable[ChatMessageV2]: for part in message.parts: if isinstance(part, SystemPromptPart): yield SystemChatMessageV2(role='system', content=part.content) elif isinstance(part, UserPromptPart): if isinstance(part.content, str): yield UserChatMessageV2(role='user', content=part.content) else: raise RuntimeError('Cohere does not yet support multi-modal inputs.') elif isinstance(part, ToolReturnPart): yield ToolChatMessageV2( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield UserChatMessageV2(role='user', content=part.model_response()) # pragma: no cover else: yield ToolChatMessageV2( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response(), ) else: assert_never(part) ``` #### __init__ ```python __init__( model_name: CohereModelName, *, provider: ( Literal["cohere"] | Provider[AsyncClientV2] ) = "cohere", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize an Cohere model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `CohereModelName` | The name of the Cohere model to use. List of model names available here. | *required* | | `provider` | `Literal['cohere'] | Provider[AsyncClientV2]` | The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of Provider[AsyncClientV2]. If not provided, a new provider will be created using the other parameters. | `'cohere'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. 
Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/cohere.py` ```python def __init__( self, model_name: CohereModelName, *, provider: Literal['cohere'] | Provider[AsyncClientV2] = 'cohere', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an Cohere model. Args: model_name: The name of the Cohere model to use. List of model names available [here](https://docs.cohere.com/docs/models#command). provider: The provider to use for authentication and API access. Can be either the string 'cohere' or an instance of `Provider[AsyncClientV2]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: CohereModelName ``` The model name. #### system ```python system: str ``` The model provider. # pydantic_ai.models.fallback ### FallbackModel Bases: `Model` A model that uses one or more fallback models upon failure. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python @dataclass(init=False) class FallbackModel(Model): """A model that uses one or more fallback models upon failure. Apart from `__init__`, all methods are private or match those of the base class. """ models: list[Model] _model_name: str = field(repr=False) _fallback_on: Callable[[Exception], bool] def __init__( self, default_model: Model | KnownModelName | str, *fallback_models: Model | KnownModelName | str, fallback_on: Callable[[Exception], bool] | tuple[type[Exception], ...] = (ModelHTTPError,), ): """Initialize a fallback model instance. Args: default_model: The name or instance of the default model to use. fallback_models: The names or instances of the fallback models to use upon failure. fallback_on: A callable or tuple of exceptions that should trigger a fallback. """ super().__init__() self.models = [infer_model(default_model), *[infer_model(m) for m in fallback_models]] if isinstance(fallback_on, tuple): self._fallback_on = _default_fallback_condition_factory(fallback_on) else: self._fallback_on = fallback_on @property def model_name(self) -> str: """The model name.""" return f'fallback:{",".join(model.model_name for model in self.models)}' @property def system(self) -> str: return f'fallback:{",".join(model.system for model in self.models)}' @property def base_url(self) -> str | None: return self.models[0].base_url async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Try each model in sequence until one succeeds. In case of failure, raise a FallbackExceptionGroup with all exceptions. 
""" exceptions: list[Exception] = [] for model in self.models: try: response = await model.request(messages, model_settings, model_request_parameters) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc self._set_span_attributes(model) return response raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Try each model in sequence until one succeeds.""" exceptions: list[Exception] = [] for model in self.models: async with AsyncExitStack() as stack: try: response = await stack.enter_async_context( model.request_stream(messages, model_settings, model_request_parameters, run_context) ) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc # pragma: no cover self._set_span_attributes(model) yield response return raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) def _set_span_attributes(self, model: Model): with suppress(Exception): span = get_current_span() if span.is_recording(): attributes = getattr(span, 'attributes', {}) if attributes.get('gen_ai.request.model') == self.model_name: # pragma: no branch span.set_attributes(InstrumentedModel.model_attributes(model)) ``` #### __init__ ```python __init__( default_model: Model | KnownModelName | str, *fallback_models: Model | KnownModelName | str, fallback_on: ( Callable[[Exception], bool] | tuple[type[Exception], ...] ) = (ModelHTTPError,) ) ``` Initialize a fallback model instance. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `default_model` | `Model | KnownModelName | str` | The name or instance of the default model to use. | *required* | | `fallback_models` | `Model | KnownModelName | str` | The names or instances of the fallback models to use upon failure. | `()` | | `fallback_on` | `Callable[[Exception], bool] | tuple[type[Exception], ...]` | A callable or tuple of exceptions that should trigger a fallback. | `(ModelHTTPError,)` | Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python def __init__( self, default_model: Model | KnownModelName | str, *fallback_models: Model | KnownModelName | str, fallback_on: Callable[[Exception], bool] | tuple[type[Exception], ...] = (ModelHTTPError,), ): """Initialize a fallback model instance. Args: default_model: The name or instance of the default model to use. fallback_models: The names or instances of the fallback models to use upon failure. fallback_on: A callable or tuple of exceptions that should trigger a fallback. """ super().__init__() self.models = [infer_model(default_model), *[infer_model(m) for m in fallback_models]] if isinstance(fallback_on, tuple): self._fallback_on = _default_fallback_condition_factory(fallback_on) else: self._fallback_on = fallback_on ``` #### model_name ```python model_name: str ``` The model name. #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Try each model in sequence until one succeeds. In case of failure, raise a FallbackExceptionGroup with all exceptions. 
Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Try each model in sequence until one succeeds. In case of failure, raise a FallbackExceptionGroup with all exceptions. """ exceptions: list[Exception] = [] for model in self.models: try: response = await model.request(messages, model_settings, model_request_parameters) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc self._set_span_attributes(model) return response raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Try each model in sequence until one succeeds. Source code in `pydantic_ai_slim/pydantic_ai/models/fallback.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Try each model in sequence until one succeeds.""" exceptions: list[Exception] = [] for model in self.models: async with AsyncExitStack() as stack: try: response = await stack.enter_async_context( model.request_stream(messages, model_settings, model_request_parameters, run_context) ) except Exception as exc: if self._fallback_on(exc): exceptions.append(exc) continue raise exc # pragma: no cover self._set_span_attributes(model) yield response return raise FallbackExceptionGroup('All models from FallbackModel failed', exceptions) ``` # `pydantic_ai.models.function` A model controlled by a local function. FunctionModel is similar to [`TestModel`](../test/), but allows greater control over the model's behavior. Its primary use case is for more advanced unit testing than is possible with `TestModel`. Here's a minimal example: function_model_usage.py ```py from pydantic_ai import Agent from pydantic_ai import ModelMessage, ModelResponse, TextPart from pydantic_ai.models.function import FunctionModel, AgentInfo my_agent = Agent('openai:gpt-4o') async def model_function( messages: list[ModelMessage], info: AgentInfo ) -> ModelResponse: print(messages) """ [ ModelRequest( parts=[ UserPromptPart( content='Testing my agent...', timestamp=datetime.datetime(...), ) ] ) ] """ print(info) """ AgentInfo( function_tools=[], allow_text_output=True, output_tools=[], model_settings=None ) """ return ModelResponse(parts=[TextPart('hello world')]) async def test_my_agent(): """Unit test for my_agent, to be run by pytest.""" with my_agent.override(model=FunctionModel(model_function)): result = await my_agent.run('Testing my agent...') assert result.output == 'hello world' ``` See [Unit testing with `FunctionModel`](../../../testing/#unit-testing-with-functionmodel) for detailed documentation. ### FunctionModel Bases: `Model` A model controlled by a local function. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass(init=False) class FunctionModel(Model): """A model controlled by a local function. 
Apart from `__init__`, all methods are private or match those of the base class. """ function: FunctionDef | None stream_function: StreamFunctionDef | None _model_name: str = field(repr=False) _system: str = field(default='function', repr=False) @overload def __init__( self, function: FunctionDef, *, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... @overload def __init__( self, *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... @overload def __init__( self, function: FunctionDef, *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... def __init__( self, function: FunctionDef | None = None, *, stream_function: StreamFunctionDef | None = None, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a `FunctionModel`. Either `function` or `stream_function` must be provided, providing both is allowed. Args: function: The function to call for non-streamed requests. stream_function: The function to call for streamed requests. model_name: The name of the model. If not provided, a name is generated from the function names. profile: The model profile to use. settings: Model-specific settings that will be used as defaults for this model. """ if function is None and stream_function is None: raise TypeError('Either `function` or `stream_function` must be provided') self.function = function self.stream_function = stream_function function_name = self.function.__name__ if self.function is not None else '' stream_function_name = self.stream_function.__name__ if self.stream_function is not None else '' self._model_name = model_name or f'function:{function_name}:{stream_function_name}' # Use a default profile that supports JSON schema and object output if none provided if profile is None: profile = ModelProfile( supports_json_schema_output=True, supports_json_object_output=True, ) super().__init__(settings=settings, profile=profile) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) agent_info = AgentInfo( function_tools=model_request_parameters.function_tools, allow_text_output=model_request_parameters.allow_text_output, output_tools=model_request_parameters.output_tools, model_settings=model_settings, ) assert self.function is not None, 'FunctionModel must receive a `function` to support non-streamed requests' if inspect.iscoroutinefunction(self.function): response = await self.function(messages, agent_info) else: response_ = await _utils.run_in_executor(self.function, messages, agent_info) assert isinstance(response_, ModelResponse), response_ response = response_ response.model_name = self._model_name # Add usage data if not already present if not response.usage.has_values(): # pragma: no branch response.usage = _estimate_usage(chain(messages, [response])) return response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: 
model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) agent_info = AgentInfo( function_tools=model_request_parameters.function_tools, allow_text_output=model_request_parameters.allow_text_output, output_tools=model_request_parameters.output_tools, model_settings=model_settings, ) assert self.stream_function is not None, ( 'FunctionModel must receive a `stream_function` to support streamed requests' ) response_stream = PeekableAsyncStream(self.stream_function(messages, agent_info)) first = await response_stream.peek() if isinstance(first, _utils.Unset): raise ValueError('Stream function must return at least one item') yield FunctionStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self._model_name, _iter=response_stream, ) @property def model_name(self) -> str: """The model name.""" return self._model_name @property def system(self) -> str: """The system / model provider.""" return self._system ``` #### __init__ ```python __init__( function: FunctionDef, *, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( function: FunctionDef, *, stream_function: StreamFunctionDef, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( function: FunctionDef | None = None, *, stream_function: StreamFunctionDef | None = None, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a `FunctionModel`. Either `function` or `stream_function` must be provided, providing both is allowed. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `function` | `FunctionDef | None` | The function to call for non-streamed requests. | `None` | | `stream_function` | `StreamFunctionDef | None` | The function to call for streamed requests. | `None` | | `model_name` | `str | None` | The name of the model. If not provided, a name is generated from the function names. | `None` | | `profile` | `ModelProfileSpec | None` | The model profile to use. | `None` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python def __init__( self, function: FunctionDef | None = None, *, stream_function: StreamFunctionDef | None = None, model_name: str | None = None, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a `FunctionModel`. Either `function` or `stream_function` must be provided, providing both is allowed. Args: function: The function to call for non-streamed requests. stream_function: The function to call for streamed requests. model_name: The name of the model. If not provided, a name is generated from the function names. profile: The model profile to use. settings: Model-specific settings that will be used as defaults for this model. 
""" if function is None and stream_function is None: raise TypeError('Either `function` or `stream_function` must be provided') self.function = function self.stream_function = stream_function function_name = self.function.__name__ if self.function is not None else '' stream_function_name = self.stream_function.__name__ if self.stream_function is not None else '' self._model_name = model_name or f'function:{function_name}:{stream_function_name}' # Use a default profile that supports JSON schema and object output if none provided if profile is None: profile = ModelProfile( supports_json_schema_output=True, supports_json_object_output=True, ) super().__init__(settings=settings, profile=profile) ``` #### model_name ```python model_name: str ``` The model name. #### system ```python system: str ``` The system / model provider. ### AgentInfo Information about an agent. This is passed as the second to functions used within FunctionModel. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass(frozen=True, kw_only=True) class AgentInfo: """Information about an agent. This is passed as the second to functions used within [`FunctionModel`][pydantic_ai.models.function.FunctionModel]. """ function_tools: list[ToolDefinition] """The function tools available on this agent. These are the tools registered via the [`tool`][pydantic_ai.Agent.tool] and [`tool_plain`][pydantic_ai.Agent.tool_plain] decorators. """ allow_text_output: bool """Whether a plain text output is allowed.""" output_tools: list[ToolDefinition] """The tools that can called to produce the final output of the run.""" model_settings: ModelSettings | None """The model settings passed to the run call.""" ``` #### function_tools ```python function_tools: list[ToolDefinition] ``` The function tools available on this agent. These are the tools registered via the tool and tool_plain decorators. #### allow_text_output ```python allow_text_output: bool ``` Whether a plain text output is allowed. #### output_tools ```python output_tools: list[ToolDefinition] ``` The tools that can called to produce the final output of the run. #### model_settings ```python model_settings: ModelSettings | None ``` The model settings passed to the run call. ### DeltaToolCall Incremental change to a tool call. Used to describe a chunk when streaming structured responses. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass class DeltaToolCall: """Incremental change to a tool call. Used to describe a chunk when streaming structured responses. """ name: str | None = None """Incremental change to the name of the tool.""" json_args: str | None = None """Incremental change to the arguments as JSON""" _: KW_ONLY tool_call_id: str | None = None """Incremental change to the tool call ID.""" ``` #### name ```python name: str | None = None ``` Incremental change to the name of the tool. #### json_args ```python json_args: str | None = None ``` Incremental change to the arguments as JSON #### tool_call_id ```python tool_call_id: str | None = None ``` Incremental change to the tool call ID. ### DeltaThinkingPart Incremental change to a thinking part. Used to describe a chunk when streaming thinking responses. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass(kw_only=True) class DeltaThinkingPart: """Incremental change to a thinking part. Used to describe a chunk when streaming thinking responses. 
""" content: str | None = None """Incremental change to the thinking content.""" signature: str | None = None """Incremental change to the thinking signature.""" ``` #### content ```python content: str | None = None ``` Incremental change to the thinking content. #### signature ```python signature: str | None = None ``` Incremental change to the thinking signature. ### DeltaToolCalls ```python DeltaToolCalls: TypeAlias = dict[int, DeltaToolCall] ``` A mapping of tool call IDs to incremental changes. ### DeltaThinkingCalls ```python DeltaThinkingCalls: TypeAlias = dict[int, DeltaThinkingPart] ``` A mapping of thinking call IDs to incremental changes. ### FunctionDef ```python FunctionDef: TypeAlias = Callable[ [list[ModelMessage], AgentInfo], ModelResponse | Awaitable[ModelResponse], ] ``` A function used to generate a non-streamed response. ### StreamFunctionDef ```python StreamFunctionDef: TypeAlias = Callable[ [list[ModelMessage], AgentInfo], AsyncIterator[ str | DeltaToolCalls | DeltaThinkingCalls | BuiltinToolCallsReturns ], ] ``` A function used to generate a streamed response. While this is defined as having return type of `AsyncIterator[str | DeltaToolCalls | DeltaThinkingCalls | BuiltinTools]`, it should really be considered as `AsyncIterator[str] | AsyncIterator[DeltaToolCalls] | AsyncIterator[DeltaThinkingCalls]`, E.g. you need to yield all text, all `DeltaToolCalls`, all `DeltaThinkingCalls`, or all `BuiltinToolCallsReturns`, not mix them. ### FunctionStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for FunctionModel. Source code in `pydantic_ai_slim/pydantic_ai/models/function.py` ```python @dataclass class FunctionStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for [FunctionModel][pydantic_ai.models.function.FunctionModel].""" _model_name: str _iter: AsyncIterator[str | DeltaToolCalls | DeltaThinkingCalls | BuiltinToolCallsReturns] _timestamp: datetime = field(default_factory=_utils.now_utc) def __post_init__(self): self._usage += _estimate_usage([]) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: async for item in self._iter: if isinstance(item, str): response_tokens = _estimate_string_tokens(item) self._usage += usage.RequestUsage(output_tokens=response_tokens) maybe_event = self._parts_manager.handle_text_delta(vendor_part_id='content', content=item) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(item, dict) and item: for dtc_index, delta in item.items(): if isinstance(delta, DeltaThinkingPart): if delta.content: # pragma: no branch response_tokens = _estimate_string_tokens(delta.content) self._usage += usage.RequestUsage(output_tokens=response_tokens) yield self._parts_manager.handle_thinking_delta( vendor_part_id=dtc_index, content=delta.content, signature=delta.signature, provider_name='function' if delta.signature else None, ) elif isinstance(delta, DeltaToolCall): if delta.json_args: response_tokens = _estimate_string_tokens(delta.json_args) self._usage += usage.RequestUsage(output_tokens=response_tokens) maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=dtc_index, tool_name=delta.name, args=delta.json_args, tool_call_id=delta.tool_call_id, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(delta, BuiltinToolCallPart): if content := delta.args_as_json_str(): # pragma: no branch response_tokens = _estimate_string_tokens(content) self._usage += 
usage.RequestUsage(output_tokens=response_tokens) yield self._parts_manager.handle_part(vendor_part_id=dtc_index, part=delta) elif isinstance(delta, BuiltinToolReturnPart): if content := delta.model_response_str(): # pragma: no branch response_tokens = _estimate_string_tokens(content) self._usage += usage.RequestUsage(output_tokens=response_tokens) yield self._parts_manager.handle_part(vendor_part_id=dtc_index, part=delta) else: assert_never(delta) @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> None: """Get the provider name.""" return None @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: None ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.google` Interface that uses the [`google-genai`](https://pypi.org/project/google-genai/) package under the hood to access Google's Gemini models via both the Generative Language API and Vertex AI. ## Setup For details on how to set up authentication with this model, see [model configuration for Google](../../../models/google/). ### LatestGoogleModelNames ```python LatestGoogleModelNames = Literal[ "gemini-2.0-flash", "gemini-2.0-flash-lite", "gemini-2.5-flash", "gemini-2.5-flash-preview-09-2025", "gemini-flash-latest", "gemini-2.5-flash-lite", "gemini-2.5-flash-lite-preview-09-2025", "gemini-flash-lite-latest", "gemini-2.5-pro", ] ``` Latest Gemini models. ### GoogleModelName ```python GoogleModelName = str | LatestGoogleModelNames ``` Possible Gemini model names. Since Gemini supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the Gemini API docs](https://ai.google.dev/gemini-api/docs/models/gemini#model-variations) for a full list. ### GoogleModelSettings Bases: `ModelSettings` Settings used for a Gemini model request. Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python class GoogleModelSettings(ModelSettings, total=False): """Settings used for a Gemini model request.""" # ALL FIELDS MUST BE `gemini_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. google_safety_settings: list[SafetySettingDict] """The safety settings to use for the model. See for more information. """ google_thinking_config: ThinkingConfigDict """The thinking configuration to use for the model. See for more information. """ google_labels: dict[str, str] """User-defined metadata to break down billed charges. Only supported by the Vertex AI API. See the [Gemini API docs](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls) for use cases and limitations. """ google_video_resolution: MediaResolution """The video resolution to use for the model. See for more information. """ google_cached_content: str """The name of the cached content to use for the model. See for more information. """ ``` #### google_safety_settings ```python google_safety_settings: list[SafetySettingDict] ``` The safety settings to use for the model. See for more information. #### google_thinking_config ```python google_thinking_config: ThinkingConfigDict ``` The thinking configuration to use for the model. See for more information. 
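To illustrate how these `google_`-prefixed fields are used in practice, here is a minimal sketch that combines the safety and thinking settings above with an agent. It assumes the `google-genai` package is installed and Gemini credentials are configured in the environment; the model name, prompt, and setting values are illustrative only.

```python
from google.genai.types import HarmBlockThreshold, HarmCategory

from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel, GoogleModelSettings

settings = GoogleModelSettings(
    # Plain `ModelSettings` fields can be mixed in, since `GoogleModelSettings` extends `ModelSettings`.
    temperature=0.2,
    # list[SafetySettingDict]
    google_safety_settings=[
        {
            'category': HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
            'threshold': HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        }
    ],
    # ThinkingConfigDict
    google_thinking_config={'thinking_budget': 2048},
)

model = GoogleModel('gemini-2.5-flash')
agent = Agent(model, model_settings=settings)
result = agent.run_sync('Summarise the rules of tic-tac-toe in one sentence.')
print(result.output)
```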
#### google_labels ```python google_labels: dict[str, str] ``` User-defined metadata to break down billed charges. Only supported by the Vertex AI API. See the [Gemini API docs](https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/add-labels-to-api-calls) for use cases and limitations. #### google_video_resolution ```python google_video_resolution: MediaResolution ``` The video resolution to use for the model. See for more information. #### google_cached_content ```python google_cached_content: str ``` The name of the cached content to use for the model. See for more information. ### GoogleModel Bases: `Model` A model that uses Gemini via `generativelanguage.googleapis.com` API. This is implemented from scratch rather than using a dedicated SDK, good API documentation is available [here](https://ai.google.dev/api). Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python @dataclass(init=False) class GoogleModel(Model): """A model that uses Gemini via `generativelanguage.googleapis.com` API. This is implemented from scratch rather than using a dedicated SDK, good API documentation is available [here](https://ai.google.dev/api). Apart from `__init__`, all methods are private or match those of the base class. """ client: Client = field(repr=False) _model_name: GoogleModelName = field(repr=False) _provider: Provider[Client] = field(repr=False) _url: str | None = field(repr=False) def __init__( self, model_name: GoogleModelName, *, provider: Literal['google-gla', 'google-vertex'] | Provider[Client] = 'google-gla', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Gemini model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'google-gla' or 'google-vertex' or an instance of `Provider[httpx.AsyncClient]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: The model settings to use. Defaults to None. 
""" self._model_name = model_name if isinstance(provider, str): provider = GoogleProvider(vertexai=provider == 'google-vertex') self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return self._provider.base_url @property def model_name(self) -> GoogleModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) model_settings = cast(GoogleModelSettings, model_settings or {}) response = await self._generate_content(messages, False, model_settings, model_request_parameters) return self._process_response(response) async def count_tokens( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> usage.RequestUsage: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) model_settings = cast(GoogleModelSettings, model_settings or {}) contents, generation_config = await self._build_content_and_config( messages, model_settings, model_request_parameters ) # Annoyingly, the type of `GenerateContentConfigDict.get` is "partially `Unknown`" because `response_schema` includes `typing._UnionGenericAlias`, # so without this we'd need `pyright: ignore[reportUnknownMemberType]` on every line and wouldn't get type checking anyway. generation_config = cast(dict[str, Any], generation_config) config = CountTokensConfigDict( http_options=generation_config.get('http_options'), ) if self._provider.name != 'google-gla': # The fields are not supported by the Gemini API per https://github.com/googleapis/python-genai/blob/7e4ec284dc6e521949626f3ed54028163ef9121d/google/genai/models.py#L1195-L1214 config.update( # pragma: lax no cover system_instruction=generation_config.get('system_instruction'), tools=cast(list[ToolDict], generation_config.get('tools')), # Annoyingly, GenerationConfigDict has fewer fields than GenerateContentConfigDict, and no extra fields are allowed. 
generation_config=GenerationConfigDict( temperature=generation_config.get('temperature'), top_p=generation_config.get('top_p'), max_output_tokens=generation_config.get('max_output_tokens'), stop_sequences=generation_config.get('stop_sequences'), presence_penalty=generation_config.get('presence_penalty'), frequency_penalty=generation_config.get('frequency_penalty'), seed=generation_config.get('seed'), thinking_config=generation_config.get('thinking_config'), media_resolution=generation_config.get('media_resolution'), response_mime_type=generation_config.get('response_mime_type'), response_schema=generation_config.get('response_schema'), ), ) response = await self.client.aio.models.count_tokens( model=self._model_name, contents=contents, config=config, ) if response.total_tokens is None: raise UnexpectedModelBehavior( # pragma: no cover 'Total tokens missing from Gemini response', str(response) ) return usage.RequestUsage( input_tokens=response.total_tokens, ) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) model_settings = cast(GoogleModelSettings, model_settings or {}) response = await self._generate_content(messages, True, model_settings, model_request_parameters) yield await self._process_streamed_response(response, model_request_parameters) # type: ignore def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[ToolDict] | None: tools: list[ToolDict] = [ ToolDict(function_declarations=[_function_declaration_from_tool(t)]) for t in model_request_parameters.tool_defs.values() ] if model_request_parameters.builtin_tools: if model_request_parameters.output_tools: raise UserError( 'Gemini does not support output tools and built-in tools at the same time. Use `output_type=PromptedOutput(...)` instead.' ) if model_request_parameters.function_tools: raise UserError('Gemini does not support user tools and built-in tools at the same time.') for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): tools.append(ToolDict(google_search=GoogleSearchDict())) elif isinstance(tool, UrlContextTool): tools.append(ToolDict(url_context=UrlContextDict())) elif isinstance(tool, CodeExecutionTool): tools.append(ToolDict(code_execution=ToolCodeExecutionDict())) elif isinstance(tool, ImageGenerationTool): # pragma: no branch if not self.profile.supports_image_output: raise UserError( "`ImageGenerationTool` is not supported by this model. Use a model with 'image' in the name instead." ) else: # pragma: no cover raise UserError( f'`{tool.__class__.__name__}` is not supported by `GoogleModel`. If it should be, please file an issue.' 
) return tools or None def _get_tool_config( self, model_request_parameters: ModelRequestParameters, tools: list[ToolDict] | None ) -> ToolConfigDict | None: if not model_request_parameters.allow_text_output and tools: names: list[str] = [] for tool in tools: for function_declaration in tool.get('function_declarations') or []: if name := function_declaration.get('name'): # pragma: no branch names.append(name) return _tool_config(names) else: return None @overload async def _generate_content( self, messages: list[ModelMessage], stream: Literal[False], model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> GenerateContentResponse: ... @overload async def _generate_content( self, messages: list[ModelMessage], stream: Literal[True], model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> Awaitable[AsyncIterator[GenerateContentResponse]]: ... async def _generate_content( self, messages: list[ModelMessage], stream: bool, model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> GenerateContentResponse | Awaitable[AsyncIterator[GenerateContentResponse]]: contents, config = await self._build_content_and_config(messages, model_settings, model_request_parameters) func = self.client.aio.models.generate_content_stream if stream else self.client.aio.models.generate_content return await func(model=self._model_name, contents=contents, config=config) # type: ignore async def _build_content_and_config( self, messages: list[ModelMessage], model_settings: GoogleModelSettings, model_request_parameters: ModelRequestParameters, ) -> tuple[list[ContentUnionDict], GenerateContentConfigDict]: tools = self._get_tools(model_request_parameters) if tools and not self.profile.supports_tools: raise UserError('Tools are not supported by this model.') response_mime_type = None response_schema = None if model_request_parameters.output_mode == 'native': if tools: raise UserError( 'Gemini does not support `NativeOutput` and tools at the same time. Use `output_type=ToolOutput(...)` instead.' 
) response_mime_type = 'application/json' output_object = model_request_parameters.output_object assert output_object is not None response_schema = self._map_response_schema(output_object) elif model_request_parameters.output_mode == 'prompted' and not tools: if not self.profile.supports_json_object_output: raise UserError('JSON output is not supported by this model.') response_mime_type = 'application/json' tool_config = self._get_tool_config(model_request_parameters, tools) system_instruction, contents = await self._map_messages(messages) modalities = [Modality.TEXT.value] if self.profile.supports_image_output: modalities.append(Modality.IMAGE.value) http_options: HttpOptionsDict = { 'headers': {'Content-Type': 'application/json', 'User-Agent': get_user_agent()} } if timeout := model_settings.get('timeout'): if isinstance(timeout, int | float): http_options['timeout'] = int(1000 * timeout) else: raise UserError('Google does not support setting ModelSettings.timeout to a httpx.Timeout') config = GenerateContentConfigDict( http_options=http_options, system_instruction=system_instruction, temperature=model_settings.get('temperature'), top_p=model_settings.get('top_p'), max_output_tokens=model_settings.get('max_tokens'), stop_sequences=model_settings.get('stop_sequences'), presence_penalty=model_settings.get('presence_penalty'), frequency_penalty=model_settings.get('frequency_penalty'), seed=model_settings.get('seed'), safety_settings=model_settings.get('google_safety_settings'), thinking_config=model_settings.get('google_thinking_config'), labels=model_settings.get('google_labels'), media_resolution=model_settings.get('google_video_resolution'), cached_content=model_settings.get('google_cached_content'), tools=cast(ToolListUnionDict, tools), tool_config=tool_config, response_mime_type=response_mime_type, response_schema=response_schema, response_modalities=modalities, ) return contents, config def _process_response(self, response: GenerateContentResponse) -> ModelResponse: if not response.candidates: raise UnexpectedModelBehavior('Expected at least one candidate in Gemini response') # pragma: no cover candidate = response.candidates[0] if candidate.content is None or candidate.content.parts is None: if candidate.finish_reason == 'SAFETY': raise UnexpectedModelBehavior('Safety settings triggered', str(response)) else: raise UnexpectedModelBehavior( 'Content field missing from Gemini response', str(response) ) # pragma: no cover parts = candidate.content.parts or [] vendor_id = response.response_id vendor_details: dict[str, Any] | None = None finish_reason: FinishReason | None = None if raw_finish_reason := candidate.finish_reason: # pragma: no branch vendor_details = {'finish_reason': raw_finish_reason.value} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) usage = _metadata_as_usage(response) return _process_response_from_parts( parts, candidate.grounding_metadata, response.model_version or self._model_name, self._provider.name, usage, vendor_id=vendor_id, vendor_details=vendor_details, finish_reason=finish_reason, ) async def _process_streamed_response( self, response: AsyncIterator[GenerateContentResponse], model_request_parameters: ModelRequestParameters ) -> StreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior('Streamed response ended without content or tool 
calls') # pragma: no cover return GeminiStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.model_version or self._model_name, _response=peekable_response, _timestamp=first_chunk.create_time or _utils.now_utc(), _provider_name=self._provider.name, ) async def _map_messages(self, messages: list[ModelMessage]) -> tuple[ContentDict | None, list[ContentUnionDict]]: contents: list[ContentUnionDict] = [] system_parts: list[PartDict] = [] for m in messages: if isinstance(m, ModelRequest): message_parts: list[PartDict] = [] for part in m.parts: if isinstance(part, SystemPromptPart): system_parts.append({'text': part.content}) elif isinstance(part, UserPromptPart): message_parts.extend(await self._map_user_prompt(part)) elif isinstance(part, ToolReturnPart): message_parts.append( { 'function_response': { 'name': part.tool_name, 'response': part.model_response_object(), 'id': part.tool_call_id, } } ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: message_parts.append({'text': part.model_response()}) # pragma: no cover else: message_parts.append( { 'function_response': { 'name': part.tool_name, 'response': {'call_error': part.model_response()}, 'id': part.tool_call_id, } } ) else: assert_never(part) # Google GenAI requires at least one part in the message. if not message_parts: message_parts = [{'text': ''}] contents.append({'role': 'user', 'parts': message_parts}) elif isinstance(m, ModelResponse): contents.append(_content_model_response(m, self.system)) else: assert_never(m) if instructions := self._get_instructions(messages): system_parts.insert(0, {'text': instructions}) system_instruction = ContentDict(role='user', parts=system_parts) if system_parts else None return system_instruction, contents async def _map_user_prompt(self, part: UserPromptPart) -> list[PartDict]: if isinstance(part.content, str): return [{'text': part.content}] else: content: list[PartDict] = [] for item in part.content: if isinstance(item, str): content.append({'text': item}) elif isinstance(item, BinaryContent): inline_data_dict: BlobDict = {'data': item.data, 'mime_type': item.media_type} part_dict: PartDict = {'inline_data': inline_data_dict} if item.vendor_metadata: part_dict['video_metadata'] = cast(VideoMetadataDict, item.vendor_metadata) content.append(part_dict) elif isinstance(item, VideoUrl) and item.is_youtube: file_data_dict: FileDataDict = {'file_uri': item.url, 'mime_type': item.media_type} part_dict: PartDict = {'file_data': file_data_dict} if item.vendor_metadata: # pragma: no branch part_dict['video_metadata'] = cast(VideoMetadataDict, item.vendor_metadata) content.append(part_dict) elif isinstance(item, FileUrl): if item.force_download or ( # google-gla does not support passing file urls directly, except for youtube videos # (see above) and files uploaded to the file API (which cannot be downloaded anyway) self.system == 'google-gla' and not item.url.startswith(r'https://generativelanguage.googleapis.com/v1beta/files') ): downloaded_item = await download_item(item, data_format='bytes') inline_data: BlobDict = { 'data': downloaded_item['data'], 'mime_type': downloaded_item['data_type'], } content.append({'inline_data': inline_data}) else: file_data_dict: FileDataDict = {'file_uri': item.url, 'mime_type': item.media_type} content.append({'file_data': file_data_dict}) # pragma: lax no cover else: assert_never(item) return content def _map_response_schema(self, o: OutputObjectDefinition) -> dict[str, Any]: response_schema = o.json_schema.copy() if 
o.name: response_schema['title'] = o.name if o.description: response_schema['description'] = o.description return response_schema ``` #### __init__ ```python __init__( model_name: GoogleModelName, *, provider: ( Literal["google-gla", "google-vertex"] | Provider[Client] ) = "google-gla", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Gemini model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `GoogleModelName` | The name of the model to use. | *required* | | `provider` | `Literal['google-gla', 'google-vertex'] | Provider[Client]` | The provider to use for authentication and API access. Can be either the string 'google-gla' or 'google-vertex' or an instance of Provider[httpx.AsyncClient]. If not provided, a new provider will be created using the other parameters. | `'google-gla'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | The model settings to use. Defaults to None. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python def __init__( self, model_name: GoogleModelName, *, provider: Literal['google-gla', 'google-vertex'] | Provider[Client] = 'google-gla', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Gemini model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'google-gla' or 'google-vertex' or an instance of `Provider[httpx.AsyncClient]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: The model settings to use. Defaults to None. """ self._model_name = model_name if isinstance(provider, str): provider = GoogleProvider(vertexai=provider == 'google-vertex') self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: GoogleModelName ``` The model name. #### system ```python system: str ``` The model provider. ### GeminiStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for the Gemini model. 
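`GeminiStreamedResponse` is not constructed directly; it is produced by `GoogleModel.request_stream` when a run is streamed. A minimal streaming sketch, assuming Gemini credentials are configured and using an illustrative model name and prompt:

```python
import asyncio

from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel

agent = Agent(GoogleModel('gemini-2.5-flash'), instructions='Be concise.')


async def main():
    # `run_stream` drives `GoogleModel.request_stream`, which yields a
    # `GeminiStreamedResponse` behind the scenes.
    async with agent.run_stream('Explain what a vector database is.') as response:
        async for text in response.stream_text():
            print(text)


asyncio.run(main())
```

`stream_text()` yields the accumulated text as chunks arrive; thinking, tool-call, and file parts are handled by the event iterator shown in the source below.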
Source code in `pydantic_ai_slim/pydantic_ai/models/google.py` ```python @dataclass class GeminiStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for the Gemini model.""" _model_name: GoogleModelName _response: AsyncIterator[GenerateContentResponse] _timestamp: datetime _provider_name: str async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 code_execution_tool_call_id: str | None = None async for chunk in self._response: self._usage = _metadata_as_usage(chunk) if not chunk.candidates: continue # pragma: no cover candidate = chunk.candidates[0] if chunk.response_id: # pragma: no branch self.provider_response_id = chunk.response_id if raw_finish_reason := candidate.finish_reason: self.provider_details = {'finish_reason': raw_finish_reason.value} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) # Google streams the grounding metadata (including the web search queries and results) # _after_ the text that was generated using it, so it would show up out of order in the stream, # and cause issues with the logic that doesn't consider text ahead of built-in tool calls as output. # If that gets fixed (or we have a workaround), we can uncomment this: # web_search_call, web_search_return = _map_grounding_metadata( # candidate.grounding_metadata, self.provider_name # ) # if web_search_call and web_search_return: # yield self._parts_manager.handle_part(vendor_part_id=uuid4(), part=web_search_call) # yield self._parts_manager.handle_part( # vendor_part_id=uuid4(), part=web_search_return # ) if candidate.content is None or candidate.content.parts is None: if candidate.finish_reason == 'STOP': # pragma: no cover # Normal completion - skip this chunk continue elif candidate.finish_reason == 'SAFETY': # pragma: no cover raise UnexpectedModelBehavior('Safety settings triggered', str(chunk)) else: # pragma: no cover raise UnexpectedModelBehavior('Content field missing from streaming Gemini response', str(chunk)) parts = candidate.content.parts if not parts: continue # pragma: no cover for part in parts: if part.thought_signature: signature = base64.b64encode(part.thought_signature).decode('utf-8') yield self._parts_manager.handle_thinking_delta( vendor_part_id='thinking', signature=signature, provider_name=self.provider_name, ) if part.text is not None: if part.thought: yield self._parts_manager.handle_thinking_delta(vendor_part_id='thinking', content=part.text) else: maybe_event = self._parts_manager.handle_text_delta(vendor_part_id='content', content=part.text) if maybe_event is not None: # pragma: no branch yield maybe_event elif part.function_call: maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=uuid4(), tool_name=part.function_call.name, args=part.function_call.args, tool_call_id=part.function_call.id, ) if maybe_event is not None: # pragma: no branch yield maybe_event elif part.inline_data is not None: data = part.inline_data.data mime_type = part.inline_data.mime_type assert data and mime_type, 'Inline data must have data and mime type' content = BinaryContent(data=data, media_type=mime_type) yield self._parts_manager.handle_part( vendor_part_id=uuid4(), part=FilePart(content=BinaryContent.narrow_type(content)), ) elif part.executable_code is not None: code_execution_tool_call_id = _utils.generate_tool_call_id() yield self._parts_manager.handle_part( vendor_part_id=uuid4(), part=_map_executable_code( part.executable_code, self.provider_name, code_execution_tool_call_id ), ) elif 
part.code_execution_result is not None: assert code_execution_tool_call_id is not None yield self._parts_manager.handle_part( vendor_part_id=uuid4(), part=_map_code_execution_result( part.code_execution_result, self.provider_name, code_execution_tool_call_id ), ) else: assert part.function_response is not None, f'Unexpected part: {part}' # pragma: no cover @property def model_name(self) -> GoogleModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: GoogleModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.groq` ## Setup For details on how to set up authentication with this model, see [model configuration for Groq](../../../models/groq/). ### ProductionGroqModelNames ```python ProductionGroqModelNames = Literal[ "distil-whisper-large-v3-en", "gemma2-9b-it", "llama-3.3-70b-versatile", "llama-3.1-8b-instant", "llama-guard-3-8b", "llama3-70b-8192", "llama3-8b-8192", "whisper-large-v3", "whisper-large-v3-turbo", ] ``` Production Groq models from . ### PreviewGroqModelNames ```python PreviewGroqModelNames = Literal[ "playai-tts", "playai-tts-arabic", "qwen-qwq-32b", "mistral-saba-24b", "qwen-2.5-coder-32b", "qwen-2.5-32b", "deepseek-r1-distill-qwen-32b", "deepseek-r1-distill-llama-70b", "llama-3.3-70b-specdec", "llama-3.2-1b-preview", "llama-3.2-3b-preview", "llama-3.2-11b-vision-preview", "llama-3.2-90b-vision-preview", "moonshotai/kimi-k2-instruct", ] ``` Preview Groq models from . ### GroqModelName ```python GroqModelName = ( str | ProductionGroqModelNames | PreviewGroqModelNames ) ``` Possible Groq model names. Since Groq supports a variety of models and the list changes frequencly, we explicitly list the named models as of 2025-03-31 but allow any name in the type hints. See for an up to date date list of models and more details. ### GroqModelSettings Bases: `ModelSettings` Settings used for a Groq model request. Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python class GroqModelSettings(ModelSettings, total=False): """Settings used for a Groq model request.""" # ALL FIELDS MUST BE `groq_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. groq_reasoning_format: Literal['hidden', 'raw', 'parsed'] """The format of the reasoning output. See [the Groq docs](https://console.groq.com/docs/reasoning#reasoning-format) for more details. """ ``` #### groq_reasoning_format ```python groq_reasoning_format: Literal['hidden', 'raw', 'parsed'] ``` The format of the reasoning output. See [the Groq docs](https://console.groq.com/docs/reasoning#reasoning-format) for more details. ### GroqModel Bases: `Model` A model that uses the Groq API. Internally, this uses the [Groq Python client](https://github.com/groq/groq-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python @dataclass(init=False) class GroqModel(Model): """A model that uses the Groq API. Internally, this uses the [Groq Python client](https://github.com/groq/groq-python) to interact with the API. 
Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncGroq = field(repr=False) _model_name: GroqModelName = field(repr=False) _provider: Provider[AsyncGroq] = field(repr=False) def __init__( self, model_name: GroqModelName, *, provider: Literal['groq'] | Provider[AsyncGroq] = 'groq', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Groq model. Args: model_name: The name of the Groq model to use. List of model names available [here](https://console.groq.com/docs/models). provider: The provider to use for authentication and API access. Can be either the string 'groq' or an instance of `Provider[AsyncGroq]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return str(self.client.base_url) @property def model_name(self) -> GroqModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) try: response = await self._completions_create( messages, False, cast(GroqModelSettings, model_settings or {}), model_request_parameters ) except ModelHTTPError as e: if isinstance(e.body, dict): # pragma: no branch # The Groq SDK tries to be helpful by raising an exception when generated tool arguments don't match the schema, # but we'd rather handle it ourselves so we can tell the model to retry the tool call. 
try: error = _GroqToolUseFailedError.model_validate(e.body) # pyright: ignore[reportUnknownMemberType] tool_call_part = ToolCallPart( tool_name=error.error.failed_generation.name, args=error.error.failed_generation.arguments, ) return ModelResponse( parts=[tool_call_part], model_name=e.model_name, timestamp=_utils.now_utc(), provider_name=self._provider.name, finish_reason='error', ) except ValidationError: pass raise model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, True, cast(GroqModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: GroqModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[chat.ChatCompletionChunk]: pass @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: GroqModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion: pass async def _completions_create( self, messages: list[ModelMessage], stream: bool, model_settings: GroqModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion | AsyncStream[chat.ChatCompletionChunk]: tools = self._get_tools(model_request_parameters) tools += self._get_builtin_tools(model_request_parameters) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output: tool_choice = 'required' else: tool_choice = 'auto' groq_messages = self._map_messages(messages) response_format: chat.completion_create_params.ResponseFormat | None = None if model_request_parameters.output_mode == 'native': output_object = model_request_parameters.output_object assert output_object is not None response_format = self._map_json_schema(output_object) elif ( model_request_parameters.output_mode == 'prompted' and not tools and self.profile.supports_json_object_output ): # pragma: no branch response_format = {'type': 'json_object'} try: extra_headers = model_settings.get('extra_headers', {}) extra_headers.setdefault('User-Agent', get_user_agent()) return await self.client.chat.completions.create( model=self._model_name, messages=groq_messages, n=1, parallel_tool_calls=model_settings.get('parallel_tool_calls', NOT_GIVEN), tools=tools or NOT_GIVEN, tool_choice=tool_choice or NOT_GIVEN, stop=model_settings.get('stop_sequences', NOT_GIVEN), stream=stream, response_format=response_format or NOT_GIVEN, max_tokens=model_settings.get('max_tokens', NOT_GIVEN), temperature=model_settings.get('temperature', NOT_GIVEN), top_p=model_settings.get('top_p', NOT_GIVEN), timeout=model_settings.get('timeout', NOT_GIVEN), seed=model_settings.get('seed', NOT_GIVEN), presence_penalty=model_settings.get('presence_penalty', NOT_GIVEN), reasoning_format=model_settings.get('groq_reasoning_format', NOT_GIVEN), frequency_penalty=model_settings.get('frequency_penalty', NOT_GIVEN), 
logit_bias=model_settings.get('logit_bias', NOT_GIVEN), extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover def _process_response(self, response: chat.ChatCompletion) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" timestamp = number_to_datetime(response.created) choice = response.choices[0] items: list[ModelResponsePart] = [] if choice.message.reasoning is not None: # NOTE: The `reasoning` field is only present if `groq_reasoning_format` is set to `parsed`. items.append(ThinkingPart(content=choice.message.reasoning)) if choice.message.executed_tools: for tool in choice.message.executed_tools: call_part, return_part = _map_executed_tool(tool, self.system) if call_part and return_part: # pragma: no branch items.append(call_part) items.append(return_part) if choice.message.content is not None: # NOTE: The `` tag is only present if `groq_reasoning_format` is set to `raw`. items.extend(split_content_into_text_and_thinking(choice.message.content, self.profile.thinking_tags)) if choice.message.tool_calls is not None: for c in choice.message.tool_calls: items.append(ToolCallPart(tool_name=c.function.name, args=c.function.arguments, tool_call_id=c.id)) raw_finish_reason = choice.finish_reason provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, timestamp=timestamp, provider_response_id=response.id, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncStream[chat.ChatCompletionChunk], model_request_parameters: ModelRequestParameters ) -> GroqStreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) return GroqStreamedResponse( model_request_parameters=model_request_parameters, _response=peekable_response, _model_name=first_chunk.model, _model_profile=self.profile, _timestamp=number_to_datetime(first_chunk.created), _provider_name=self._provider.name, ) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[chat.ChatCompletionToolParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _get_builtin_tools( self, model_request_parameters: ModelRequestParameters ) -> list[chat.ChatCompletionToolParam]: tools: list[chat.ChatCompletionToolParam] = [] for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): if not GroqModelProfile.from_profile(self.profile).groq_always_has_web_search_builtin_tool: raise UserError('`WebSearchTool` is not supported by Groq') # pragma: no cover else: raise UserError( f'`{tool.__class__.__name__}` is not supported by `GroqModel`. If it should be, please file an issue.' 
) return tools def _map_messages(self, messages: list[ModelMessage]) -> list[chat.ChatCompletionMessageParam]: """Just maps a `pydantic_ai.Message` to a `groq.types.ChatCompletionMessageParam`.""" groq_messages: list[chat.ChatCompletionMessageParam] = [] for message in messages: if isinstance(message, ModelRequest): groq_messages.extend(self._map_user_message(message)) elif isinstance(message, ModelResponse): texts: list[str] = [] tool_calls: list[chat.ChatCompletionMessageToolCallParam] = [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) elif isinstance(item, ThinkingPart): start_tag, end_tag = self.profile.thinking_tags texts.append('\n'.join([start_tag, item.content, end_tag])) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # These are not currently sent back pass elif isinstance(item, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. pass else: assert_never(item) message_param = chat.ChatCompletionAssistantMessageParam(role='assistant') if texts: # Note: model responses from this model should only have one text item, so the following # shouldn't merge multiple texts into one unless you switch models between runs: message_param['content'] = '\n\n'.join(texts) if tool_calls: message_param['tool_calls'] = tool_calls groq_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages): groq_messages.insert(0, chat.ChatCompletionSystemMessageParam(role='system', content=instructions)) return groq_messages @staticmethod def _map_tool_call(t: ToolCallPart) -> chat.ChatCompletionMessageToolCallParam: return chat.ChatCompletionMessageToolCallParam( id=_guard_tool_call_id(t=t), type='function', function={'name': t.tool_name, 'arguments': t.args_as_json_str()}, ) @staticmethod def _map_tool_definition(f: ToolDefinition) -> chat.ChatCompletionToolParam: return { 'type': 'function', 'function': { 'name': f.name, 'description': f.description or '', 'parameters': f.parameters_json_schema, }, } def _map_json_schema(self, o: OutputObjectDefinition) -> chat.completion_create_params.ResponseFormat: response_format_param: chat.completion_create_params.ResponseFormatResponseFormatJsonSchema = { 'type': 'json_schema', 'json_schema': { 'name': o.name or DEFAULT_OUTPUT_TOOL_NAME, 'schema': o.json_schema, 'strict': o.strict, }, } if o.description: # pragma: no branch response_format_param['json_schema']['description'] = o.description return response_format_param @classmethod def _map_user_message(cls, message: ModelRequest) -> Iterable[chat.ChatCompletionMessageParam]: for part in message.parts: if isinstance(part, SystemPromptPart): yield chat.ChatCompletionSystemMessageParam(role='system', content=part.content) elif isinstance(part, UserPromptPart): yield cls._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): # pragma: no branch if part.tool_name is None: yield chat.ChatCompletionUserMessageParam( # pragma: no cover role='user', content=part.model_response() ) else: yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response(), ) @staticmethod def _map_user_prompt(part: UserPromptPart) 
-> chat.ChatCompletionUserMessageParam: content: str | list[chat.ChatCompletionContentPartParam] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(chat.ChatCompletionContentPartTextParam(text=item, type='text')) elif isinstance(item, ImageUrl): image_url = ImageURL(url=item.url) content.append(chat.ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) elif isinstance(item, BinaryContent): if item.is_image: image_url = ImageURL(url=item.data_uri) content.append(chat.ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) else: raise RuntimeError('Only images are supported for binary content in Groq.') elif isinstance(item, DocumentUrl): # pragma: no cover raise RuntimeError('DocumentUrl is not supported in Groq.') else: # pragma: no cover raise RuntimeError(f'Unsupported content type: {type(item)}') return chat.ChatCompletionUserMessageParam(role='user', content=content) ``` #### __init__ ```python __init__( model_name: GroqModelName, *, provider: ( Literal["groq"] | Provider[AsyncGroq] ) = "groq", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Groq model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `GroqModelName` | The name of the Groq model to use. List of model names available here. | *required* | | `provider` | `Literal['groq'] | Provider[AsyncGroq]` | The provider to use for authentication and API access. Can be either the string 'groq' or an instance of Provider[AsyncGroq]. If not provided, a new provider will be created using the other parameters. | `'groq'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python def __init__( self, model_name: GroqModelName, *, provider: Literal['groq'] | Provider[AsyncGroq] = 'groq', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Groq model. Args: model_name: The name of the Groq model to use. List of model names available [here](https://console.groq.com/docs/models). provider: The provider to use for authentication and API access. Can be either the string 'groq' or an instance of `Provider[AsyncGroq]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: GroqModelName ``` The model name. #### system ```python system: str ``` The model provider. ### GroqStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Groq models. 
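As with `GeminiStreamedResponse` above, `GroqStreamedResponse` is created for you when a run is streamed rather than instantiated directly. A minimal sketch, assuming `GROQ_API_KEY` is set and using an illustrative model name and prompt:

```python
import asyncio

from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel

agent = Agent(GroqModel('llama-3.3-70b-versatile'), instructions='Be concise.')


async def main():
    # `run_stream` drives `GroqModel.request_stream`, which yields a
    # `GroqStreamedResponse` under the hood.
    async with agent.run_stream('Name three uses of Redis.') as response:
        # `delta=True` yields only the newly received text for each chunk.
        async for chunk in response.stream_text(delta=True):
            print(chunk, end='', flush=True)


asyncio.run(main())
```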
Source code in `pydantic_ai_slim/pydantic_ai/models/groq.py` ```python @dataclass class GroqStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Groq models.""" _model_name: GroqModelName _model_profile: ModelProfile _response: AsyncIterable[chat.ChatCompletionChunk] _timestamp: datetime _provider_name: str async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: # noqa: C901 try: executed_tool_call_id: str | None = None async for chunk in self._response: self._usage += _map_usage(chunk) if chunk.id: # pragma: no branch self.provider_response_id = chunk.id try: choice = chunk.choices[0] except IndexError: continue if raw_finish_reason := choice.finish_reason: self.provider_details = {'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) if choice.delta.reasoning is not None: # NOTE: The `reasoning` field is only present if `groq_reasoning_format` is set to `parsed`. yield self._parts_manager.handle_thinking_delta( vendor_part_id='reasoning', content=choice.delta.reasoning ) if choice.delta.executed_tools: for tool in choice.delta.executed_tools: call_part, return_part = _map_executed_tool( tool, self.provider_name, streaming=True, tool_call_id=executed_tool_call_id ) if call_part: executed_tool_call_id = call_part.tool_call_id yield self._parts_manager.handle_part( vendor_part_id=f'executed_tools-{tool.index}-call', part=call_part ) if return_part: executed_tool_call_id = None yield self._parts_manager.handle_part( vendor_part_id=f'executed_tools-{tool.index}-return', part=return_part ) # Handle the text part of the response content = choice.delta.content if content is not None: maybe_event = self._parts_manager.handle_text_delta( vendor_part_id='content', content=content, thinking_tags=self._model_profile.thinking_tags, ignore_leading_whitespace=self._model_profile.ignore_streamed_leading_whitespace, ) if maybe_event is not None: # pragma: no branch yield maybe_event # Handle the tool calls for dtc in choice.delta.tool_calls or []: maybe_event = self._parts_manager.handle_tool_call_delta( vendor_part_id=dtc.index, tool_name=dtc.function and dtc.function.name, args=dtc.function and dtc.function.arguments, tool_call_id=dtc.id, ) if maybe_event is not None: yield maybe_event except APIError as e: if isinstance(e.body, dict): # pragma: no branch # The Groq SDK tries to be helpful by raising an exception when generated tool arguments don't match the schema, # but we'd rather handle it ourselves so we can tell the model to retry the tool call try: error = _GroqToolUseFailedInnerError.model_validate(e.body) # pyright: ignore[reportUnknownMemberType] yield self._parts_manager.handle_tool_call_part( vendor_part_id='tool_use_failed', tool_name=error.failed_generation.name, args=error.failed_generation.arguments, ) return except ValidationError as e: # pragma: no cover pass raise # pragma: no cover @property def model_name(self) -> GroqModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: GroqModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. 
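Rounding off this module, here is a minimal sketch of `GroqModelSettings` in use. It assumes `GROQ_API_KEY` is set; the model name, prompt, and the choice of `'parsed'` for `groq_reasoning_format` are illustrative.

```python
from pydantic_ai import Agent
from pydantic_ai.models.groq import GroqModel, GroqModelSettings

# With `groq_reasoning_format='parsed'`, Groq returns the reasoning in a separate
# field, which this model surfaces as thinking parts rather than inline tags.
agent = Agent(
    GroqModel('deepseek-r1-distill-llama-70b'),
    model_settings=GroqModelSettings(groq_reasoning_format='parsed'),
)
result = agent.run_sync('What is the largest prime below 50?')
print(result.output)
```

The handling of the parsed reasoning field can be seen in `_process_response` in the source above.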
# `pydantic_ai.models.huggingface` ## Setup For details on how to set up authentication with this model, see [model configuration for Hugging Face](../../../models/huggingface/). ### HuggingFaceModelSettings Bases: `ModelSettings` Settings used for a Hugging Face model request. Source code in `pydantic_ai_slim/pydantic_ai/models/huggingface.py` ```python class HuggingFaceModelSettings(ModelSettings, total=False): """Settings used for a Hugging Face model request.""" ``` ### HuggingFaceModel Bases: `Model` A model that uses Hugging Face Inference Providers. Internally, this uses the [HF Python client](https://github.com/huggingface/huggingface_hub) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/huggingface.py` ```python @dataclass(init=False) class HuggingFaceModel(Model): """A model that uses Hugging Face Inference Providers. Internally, this uses the [HF Python client](https://github.com/huggingface/huggingface_hub) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. """ client: AsyncInferenceClient = field(repr=False) _model_name: str = field(repr=False) _provider: Provider[AsyncInferenceClient] = field(repr=False) def __init__( self, model_name: str, *, provider: Literal['huggingface'] | Provider[AsyncInferenceClient] = 'huggingface', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Hugging Face model. Args: model_name: The name of the Model to use. You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending). provider: The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of `Provider[AsyncInferenceClient]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def model_name(self) -> HuggingFaceModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The system / model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, False, cast(HuggingFaceModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, True, cast(HuggingFaceModelSettings, model_settings or {}), model_request_parameters ) yield await self._process_streamed_response(response, model_request_parameters) @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: HuggingFaceModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncIterable[ChatCompletionStreamOutput]: ... @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: HuggingFaceModelSettings, model_request_parameters: ModelRequestParameters, ) -> ChatCompletionOutput: ... 
async def _completions_create( self, messages: list[ModelMessage], stream: bool, model_settings: HuggingFaceModelSettings, model_request_parameters: ModelRequestParameters, ) -> ChatCompletionOutput | AsyncIterable[ChatCompletionStreamOutput]: tools = self._get_tools(model_request_parameters) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output: tool_choice = 'required' else: tool_choice = 'auto' if model_request_parameters.builtin_tools: raise UserError('HuggingFace does not support built-in tools') hf_messages = await self._map_messages(messages) try: return await self.client.chat.completions.create( # type: ignore model=self._model_name, messages=hf_messages, # type: ignore tools=tools, tool_choice=tool_choice or None, stream=stream, stop=model_settings.get('stop_sequences', None), temperature=model_settings.get('temperature', None), top_p=model_settings.get('top_p', None), seed=model_settings.get('seed', None), presence_penalty=model_settings.get('presence_penalty', None), frequency_penalty=model_settings.get('frequency_penalty', None), logit_bias=model_settings.get('logit_bias', None), # type: ignore logprobs=model_settings.get('logprobs', None), top_logprobs=model_settings.get('top_logprobs', None), extra_body=model_settings.get('extra_body'), # type: ignore ) except aiohttp.ClientResponseError as e: raise ModelHTTPError( status_code=e.status, model_name=self.model_name, body=e.response_error_payload, # type: ignore ) from e except HfHubHTTPError as e: raise ModelHTTPError( status_code=e.response.status_code, model_name=self.model_name, body=e.response.content, ) from e def _process_response(self, response: ChatCompletionOutput) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" if response.created: timestamp = datetime.fromtimestamp(response.created, tz=timezone.utc) else: timestamp = _now_utc() choice = response.choices[0] content = choice.message.content tool_calls = choice.message.tool_calls items: list[ModelResponsePart] = [] if content is not None: items.extend(split_content_into_text_and_thinking(content, self.profile.thinking_tags)) if tool_calls is not None: for c in tool_calls: items.append(ToolCallPart(c.function.name, c.function.arguments, tool_call_id=c.id)) raw_finish_reason = choice.finish_reason provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(cast(TextGenerationOutputFinishReason, raw_finish_reason), None) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, timestamp=timestamp, provider_response_id=response.id, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncIterable[ChatCompletionStreamOutput], model_request_parameters: ModelRequestParameters ) -> StreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) return HuggingFaceStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.model, _model_profile=self.profile, _response=peekable_response, _timestamp=datetime.fromtimestamp(first_chunk.created, tz=timezone.utc), 
_provider_name=self._provider.name, ) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[ChatCompletionInputTool]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] async def _map_messages( self, messages: list[ModelMessage] ) -> list[ChatCompletionInputMessage | ChatCompletionOutputMessage]: """Just maps a `pydantic_ai.Message` to a `huggingface_hub.ChatCompletionInputMessage`.""" hf_messages: list[ChatCompletionInputMessage | ChatCompletionOutputMessage] = [] for message in messages: if isinstance(message, ModelRequest): async for item in self._map_user_message(message): hf_messages.append(item) elif isinstance(message, ModelResponse): texts: list[str] = [] tool_calls: list[ChatCompletionInputToolCall] = [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) elif isinstance(item, ThinkingPart): start_tag, end_tag = self.profile.thinking_tags texts.append('\n'.join([start_tag, item.content, end_tag])) elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # This is currently never returned from huggingface pass elif isinstance(item, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. pass else: assert_never(item) message_param = ChatCompletionInputMessage(role='assistant') # type: ignore if texts: # Note: model responses from this model should only have one text item, so the following # shouldn't merge multiple texts into one unless you switch models between runs: message_param['content'] = '\n\n'.join(texts) if tool_calls: message_param['tool_calls'] = tool_calls hf_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages): hf_messages.insert(0, ChatCompletionInputMessage(content=instructions, role='system')) # type: ignore return hf_messages @staticmethod def _map_tool_call(t: ToolCallPart) -> ChatCompletionInputToolCall: return ChatCompletionInputToolCall.parse_obj_as_instance( # type: ignore { 'id': _guard_tool_call_id(t=t), 'type': 'function', 'function': { 'name': t.tool_name, 'arguments': t.args_as_json_str(), }, } ) @staticmethod def _map_tool_definition(f: ToolDefinition) -> ChatCompletionInputTool: tool_param: ChatCompletionInputTool = ChatCompletionInputTool.parse_obj_as_instance( # type: ignore { 'type': 'function', 'function': { 'name': f.name, 'description': f.description, 'parameters': f.parameters_json_schema, }, } ) return tool_param async def _map_user_message( self, message: ModelRequest ) -> AsyncIterable[ChatCompletionInputMessage | ChatCompletionOutputMessage]: for part in message.parts: if isinstance(part, SystemPromptPart): yield ChatCompletionInputMessage.parse_obj_as_instance({'role': 'system', 'content': part.content}) # type: ignore elif isinstance(part, UserPromptPart): yield await self._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield ChatCompletionOutputMessage.parse_obj_as_instance( # type: ignore { 'role': 'tool', 'tool_call_id': _guard_tool_call_id(t=part), 'content': part.model_response_str(), } ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield ChatCompletionInputMessage.parse_obj_as_instance( # type: ignore {'role': 'user', 'content': part.model_response()} ) else: yield ChatCompletionInputMessage.parse_obj_as_instance( # type: ignore { 'role': 'tool', 'tool_call_id': 
_guard_tool_call_id(t=part), 'content': part.model_response(), } ) else: assert_never(part) @staticmethod async def _map_user_prompt(part: UserPromptPart) -> ChatCompletionInputMessage: content: str | list[ChatCompletionInputMessage] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(ChatCompletionInputMessageChunk(type='text', text=item)) # type: ignore elif isinstance(item, ImageUrl): url = ChatCompletionInputURL(url=item.url) # type: ignore content.append(ChatCompletionInputMessageChunk(type='image_url', image_url=url)) # type: ignore elif isinstance(item, BinaryContent): if item.is_image: url = ChatCompletionInputURL(url=item.data_uri) # type: ignore content.append(ChatCompletionInputMessageChunk(type='image_url', image_url=url)) # type: ignore else: # pragma: no cover raise RuntimeError(f'Unsupported binary content type: {item.media_type}') elif isinstance(item, AudioUrl): raise NotImplementedError('AudioUrl is not supported for Hugging Face') elif isinstance(item, DocumentUrl): raise NotImplementedError('DocumentUrl is not supported for Hugging Face') elif isinstance(item, VideoUrl): raise NotImplementedError('VideoUrl is not supported for Hugging Face') else: assert_never(item) return ChatCompletionInputMessage(role='user', content=content) # type: ignore ``` #### __init__ ```python __init__( model_name: str, *, provider: ( Literal["huggingface"] | Provider[AsyncInferenceClient] ) = "huggingface", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize a Hugging Face model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `str` | The name of the Model to use. You can browse available models here. | *required* | | `provider` | `Literal['huggingface'] | Provider[AsyncInferenceClient]` | The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of Provider[AsyncInferenceClient]. If not provided, the other parameters will be used. | `'huggingface'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/huggingface.py` ```python def __init__( self, model_name: str, *, provider: Literal['huggingface'] | Provider[AsyncInferenceClient] = 'huggingface', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize a Hugging Face model. Args: model_name: The name of the Model to use. You can browse available models [here](https://huggingface.co/models?pipeline_tag=text-generation&inference_provider=all&sort=trending). provider: The provider to use for Hugging Face Inference Providers. Can be either the string 'huggingface' or an instance of `Provider[AsyncInferenceClient]`. If not provided, the other parameters will be used. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Model-specific settings that will be used as defaults for this model. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: HuggingFaceModelName ``` The model name. #### system ```python system: str ``` The system / model provider. # pydantic_ai.models.instrumented ### instrument_model ```python instrument_model( model: Model, instrument: InstrumentationSettings | bool ) -> Model ``` Instrument a model with OpenTelemetry/logfire. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def instrument_model(model: Model, instrument: InstrumentationSettings | bool) -> Model: """Instrument a model with OpenTelemetry/logfire.""" if instrument and not isinstance(model, InstrumentedModel): if instrument is True: instrument = InstrumentationSettings() model = InstrumentedModel(model, instrument) return model ``` ### InstrumentationSettings Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - Agent.instrument_all() - InstrumentedModel See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python @dataclass(init=False) class InstrumentationSettings: """Options for instrumenting models and agents with OpenTelemetry. Used in: - `Agent(instrument=...)` - [`Agent.instrument_all()`][pydantic_ai.agent.Agent.instrument_all] - [`InstrumentedModel`][pydantic_ai.models.instrumented.InstrumentedModel] See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. """ tracer: Tracer = field(repr=False) event_logger: EventLogger = field(repr=False) event_mode: Literal['attributes', 'logs'] = 'attributes' include_binary_content: bool = True include_content: bool = True version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal['attributes', 'logs'] = 'attributes', event_logger_provider: EventLoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `event_logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. 
- `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. event_logger_provider: The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling `logfire.configure()` sets the global event logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. """ from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() event_logger_provider = event_logger_provider or get_event_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.event_logger = event_logger_provider.get_event_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) self.cost_histogram = self.meter.create_histogram( 'operation.cost', unit='{USD}', description='Monetary cost', ) def messages_to_otel_events(self, messages: list[ModelMessage]) -> list[Event]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. Returns: A list of OpenTelemetry events. 
""" events: list[Event] = [] instructions = InstrumentedModel._get_instructions(messages) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( Event( 'gen_ai.system.message', body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[Event] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events def messages_to_otel_messages(self, messages: list[ModelMessage]) -> list[_otel_messages.ChatMessage]: result: list[_otel_messages.ChatMessage] = [] for message in messages: if isinstance(message, ModelRequest): for is_system, group in itertools.groupby(message.parts, key=lambda p: isinstance(p, SystemPromptPart)): message_parts: list[_otel_messages.MessagePart] = [] for part in group: if hasattr(part, 'otel_message_parts'): message_parts.extend(part.otel_message_parts(self)) result.append( _otel_messages.ChatMessage(role='system' if is_system else 'user', parts=message_parts) ) elif isinstance(message, ModelResponse): # pragma: no branch otel_message = _otel_messages.OutputMessage(role='assistant', parts=message.otel_message_parts(self)) if message.finish_reason is not None: otel_message['finish_reason'] = message.finish_reason result.append(otel_message) return result def handle_messages(self, input_messages: list[ModelMessage], response: ModelResponse, system: str, span: Span): if self.version == 1: events = self.messages_to_otel_events(input_messages) for event in self.messages_to_otel_events([response]): events.append( Event( 'gen_ai.choice', body={ 'index': 0, 'message': event.body, }, ) ) for event in events: event.attributes = { GEN_AI_SYSTEM_ATTRIBUTE: system, **(event.attributes or {}), } self._emit_events(span, events) else: output_messages = self.messages_to_otel_messages([response]) assert len(output_messages) == 1 output_message = output_messages[0] instructions = InstrumentedModel._get_instructions(input_messages) # pyright: ignore [reportPrivateUsage] system_instructions_attributes = self.system_instructions_attributes(instructions) attributes: dict[str, AttributeValue] = { 'gen_ai.input.messages': json.dumps(self.messages_to_otel_messages(input_messages)), 'gen_ai.output.messages': json.dumps([output_message]), **system_instructions_attributes, 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { 'gen_ai.input.messages': {'type': 'array'}, 'gen_ai.output.messages': {'type': 'array'}, **( {'gen_ai.system_instructions': {'type': 'array'}} if system_instructions_attributes else {} ), 'model_request_parameters': {'type': 'object'}, }, } ), } span.set_attributes(attributes) def system_instructions_attributes(self, instructions: str | None) -> dict[str, str]: if instructions and self.include_content: return { 'gen_ai.system_instructions': json.dumps([_otel_messages.TextPart(type='text', content=instructions)]), } return {} def _emit_events(self, span: Span, events: list[Event]) -> None: if self.event_mode == 'logs': for event in events: self.event_logger.emit(event) else: attr_name = 'events' 
span.set_attributes( { attr_name: json.dumps([InstrumentedModel.event_to_dict(event) for event in events]), 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': { attr_name: {'type': 'array'}, 'model_request_parameters': {'type': 'object'}, }, } ), } ) def record_metrics( self, response: ModelResponse, price_calculation: PriceCalculation | None, attributes: dict[str, AttributeValue], ): for typ in ['input', 'output']: if not (tokens := getattr(response.usage, f'{typ}_tokens', 0)): # pragma: no cover continue token_attributes = {**attributes, 'gen_ai.token.type': typ} self.tokens_histogram.record(tokens, token_attributes) if price_calculation: cost = float(getattr(price_calculation, f'{typ}_price')) self.cost_histogram.record(cost, token_attributes) ``` #### __init__ ```python __init__( *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[ 1, 2, 3 ] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal[ "attributes", "logs" ] = "attributes", event_logger_provider: EventLoggerProvider | None = None ) ``` Create instrumentation options. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `tracer_provider` | `TracerProvider | None` | The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling logfire.configure() sets the global tracer provider, so most users don't need this. | `None` | | `meter_provider` | `MeterProvider | None` | The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling logfire.configure() sets the global meter provider, so most users don't need this. | `None` | | `include_binary_content` | `bool` | Whether to include binary content in the instrumentation events. | `True` | | `include_content` | `bool` | Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. | `True` | | `version` | `Literal[1, 2, 3]` | Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters event_mode and event_logger_provider are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - gen_ai.system_instructions for instructions passed to the agent. - gen_ai.input.messages and gen_ai.output.messages on model request spans. - pydantic_ai.all_messages on agent run spans. | `DEFAULT_INSTRUMENTATION_VERSION` | | `event_mode` | `Literal['attributes', 'logs']` | The mode for emitting events in version 1. If 'attributes', events are attached to the span as attributes. If 'logs', events are emitted as OpenTelemetry log-based events. | `'attributes'` | | `event_logger_provider` | `EventLoggerProvider | None` | The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling logfire.configure() sets the global event logger provider, so most users don't need this. This is only used if event_mode='logs' and version=1. 
| `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def __init__( self, *, tracer_provider: TracerProvider | None = None, meter_provider: MeterProvider | None = None, include_binary_content: bool = True, include_content: bool = True, version: Literal[1, 2, 3] = DEFAULT_INSTRUMENTATION_VERSION, event_mode: Literal['attributes', 'logs'] = 'attributes', event_logger_provider: EventLoggerProvider | None = None, ): """Create instrumentation options. Args: tracer_provider: The OpenTelemetry tracer provider to use. If not provided, the global tracer provider is used. Calling `logfire.configure()` sets the global tracer provider, so most users don't need this. meter_provider: The OpenTelemetry meter provider to use. If not provided, the global meter provider is used. Calling `logfire.configure()` sets the global meter provider, so most users don't need this. include_binary_content: Whether to include binary content in the instrumentation events. include_content: Whether to include prompts, completions, and tool call arguments and responses in the instrumentation events. version: Version of the data format. This is unrelated to the Pydantic AI package version. Version 1 is based on the legacy event-based OpenTelemetry GenAI spec and will be removed in a future release. The parameters `event_mode` and `event_logger_provider` are only relevant for version 1. Version 2 uses the newer OpenTelemetry GenAI spec and stores messages in the following attributes: - `gen_ai.system_instructions` for instructions passed to the agent. - `gen_ai.input.messages` and `gen_ai.output.messages` on model request spans. - `pydantic_ai.all_messages` on agent run spans. event_mode: The mode for emitting events in version 1. If `'attributes'`, events are attached to the span as attributes. If `'logs'`, events are emitted as OpenTelemetry log-based events. event_logger_provider: The OpenTelemetry event logger provider to use. If not provided, the global event logger provider is used. Calling `logfire.configure()` sets the global event logger provider, so most users don't need this. This is only used if `event_mode='logs'` and `version=1`. 
""" from pydantic_ai import __version__ tracer_provider = tracer_provider or get_tracer_provider() meter_provider = meter_provider or get_meter_provider() event_logger_provider = event_logger_provider or get_event_logger_provider() scope_name = 'pydantic-ai' self.tracer = tracer_provider.get_tracer(scope_name, __version__) self.meter = meter_provider.get_meter(scope_name, __version__) self.event_logger = event_logger_provider.get_event_logger(scope_name, __version__) self.event_mode = event_mode self.include_binary_content = include_binary_content self.include_content = include_content if event_mode == 'logs' and version != 1: warnings.warn( 'event_mode is only relevant for version=1 which is deprecated and will be removed in a future release.', stacklevel=2, ) version = 1 self.version = version # As specified in the OpenTelemetry GenAI metrics spec: # https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-metrics/#metric-gen_aiclienttokenusage tokens_histogram_kwargs = dict( name='gen_ai.client.token.usage', unit='{token}', description='Measures number of input and output tokens used', ) try: self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, explicit_bucket_boundaries_advisory=TOKEN_HISTOGRAM_BOUNDARIES, ) except TypeError: # pragma: lax no cover # Older OTel/logfire versions don't support explicit_bucket_boundaries_advisory self.tokens_histogram = self.meter.create_histogram( **tokens_histogram_kwargs, # pyright: ignore ) self.cost_histogram = self.meter.create_histogram( 'operation.cost', unit='{USD}', description='Monetary cost', ) ``` #### messages_to_otel_events ```python messages_to_otel_events( messages: list[ModelMessage], ) -> list[Event] ``` Convert a list of model messages to OpenTelemetry events. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `messages` | `list[ModelMessage]` | The messages to convert. | *required* | Returns: | Type | Description | | --- | --- | | `list[Event]` | A list of OpenTelemetry events. | Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python def messages_to_otel_events(self, messages: list[ModelMessage]) -> list[Event]: """Convert a list of model messages to OpenTelemetry events. Args: messages: The messages to convert. Returns: A list of OpenTelemetry events. """ events: list[Event] = [] instructions = InstrumentedModel._get_instructions(messages) # pyright: ignore [reportPrivateUsage] if instructions is not None: events.append( Event( 'gen_ai.system.message', body={**({'content': instructions} if self.include_content else {}), 'role': 'system'}, ) ) for message_index, message in enumerate(messages): message_events: list[Event] = [] if isinstance(message, ModelRequest): for part in message.parts: if hasattr(part, 'otel_event'): message_events.append(part.otel_event(self)) elif isinstance(message, ModelResponse): # pragma: no branch message_events = message.otel_events(self) for event in message_events: event.attributes = { 'gen_ai.message.index': message_index, **(event.attributes or {}), } events.extend(message_events) for event in events: event.body = InstrumentedModel.serialize_any(event.body) return events ``` ### InstrumentedModel Bases: `WrapperModel` Model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. 
Source code in `pydantic_ai_slim/pydantic_ai/models/instrumented.py` ```python @dataclass(init=False) class InstrumentedModel(WrapperModel): """Model which wraps another model so that requests are instrumented with OpenTelemetry. See the [Debugging and Monitoring guide](https://ai.pydantic.dev/logfire/) for more info. """ instrumentation_settings: InstrumentationSettings """Instrumentation settings for this model.""" def __init__( self, wrapped: Model | KnownModelName, options: InstrumentationSettings | None = None, ) -> None: super().__init__(wrapped) self.instrumentation_settings = options or InstrumentationSettings() async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: prepared_settings, prepared_parameters = self.wrapped.prepare_request( model_settings, model_request_parameters, ) with self._instrument(messages, prepared_settings, prepared_parameters) as finish: response = await self.wrapped.request(messages, model_settings, model_request_parameters) finish(response) return response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: prepared_settings, prepared_parameters = self.wrapped.prepare_request( model_settings, model_request_parameters, ) with self._instrument(messages, prepared_settings, prepared_parameters) as finish: response_stream: StreamedResponse | None = None try: async with self.wrapped.request_stream( messages, model_settings, model_request_parameters, run_context ) as response_stream: yield response_stream finally: if response_stream: # pragma: no branch finish(response_stream.get()) @contextmanager def _instrument( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> Iterator[Callable[[ModelResponse], None]]: operation = 'chat' span_name = f'{operation} {self.model_name}' # TODO Missing attributes: # - error.type: unclear if we should do something here or just always rely on span exceptions # - gen_ai.request.stop_sequences/top_k: model_settings doesn't include these attributes: dict[str, AttributeValue] = { 'gen_ai.operation.name': operation, **self.model_attributes(self.wrapped), 'model_request_parameters': json.dumps(InstrumentedModel.serialize_any(model_request_parameters)), 'logfire.json_schema': json.dumps( { 'type': 'object', 'properties': {'model_request_parameters': {'type': 'object'}}, } ), } if model_settings: for key in MODEL_SETTING_ATTRIBUTES: if isinstance(value := model_settings.get(key), float | int): attributes[f'gen_ai.request.{key}'] = value record_metrics: Callable[[], None] | None = None try: with self.instrumentation_settings.tracer.start_as_current_span(span_name, attributes=attributes) as span: def finish(response: ModelResponse): # FallbackModel updates these span attributes. 
attributes.update(getattr(span, 'attributes', {})) request_model = attributes[GEN_AI_REQUEST_MODEL_ATTRIBUTE] system = cast(str, attributes[GEN_AI_SYSTEM_ATTRIBUTE]) response_model = response.model_name or request_model price_calculation = None def _record_metrics(): metric_attributes = { GEN_AI_SYSTEM_ATTRIBUTE: system, 'gen_ai.operation.name': operation, 'gen_ai.request.model': request_model, 'gen_ai.response.model': response_model, } self.instrumentation_settings.record_metrics(response, price_calculation, metric_attributes) nonlocal record_metrics record_metrics = _record_metrics if not span.is_recording(): return self.instrumentation_settings.handle_messages(messages, response, system, span) attributes_to_set = { **response.usage.opentelemetry_attributes(), 'gen_ai.response.model': response_model, } try: price_calculation = response.cost() except LookupError: # The cost of this provider/model is unknown, which is common. pass except Exception as e: warnings.warn( f'Failed to get cost from response: {type(e).__name__}: {e}', CostCalculationFailedWarning ) else: attributes_to_set['operation.cost'] = float(price_calculation.total_price) if response.provider_response_id is not None: attributes_to_set['gen_ai.response.id'] = response.provider_response_id if response.finish_reason is not None: attributes_to_set['gen_ai.response.finish_reasons'] = [response.finish_reason] span.set_attributes(attributes_to_set) span.update_name(f'{operation} {request_model}') yield finish finally: if record_metrics: # We only want to record metrics after the span is finished, # to prevent them from being redundantly recorded in the span itself by logfire. record_metrics() @staticmethod def model_attributes(model: Model): attributes: dict[str, AttributeValue] = { GEN_AI_SYSTEM_ATTRIBUTE: model.system, GEN_AI_REQUEST_MODEL_ATTRIBUTE: model.model_name, } if base_url := model.base_url: try: parsed = urlparse(base_url) except Exception: # pragma: no cover pass else: if parsed.hostname: # pragma: no branch attributes['server.address'] = parsed.hostname if parsed.port: # pragma: no branch attributes['server.port'] = parsed.port return attributes @staticmethod def event_to_dict(event: Event) -> dict[str, Any]: if not event.body: body = {} # pragma: no cover elif isinstance(event.body, Mapping): body = event.body # type: ignore else: body = {'body': event.body} return {**body, **(event.attributes or {})} @staticmethod def serialize_any(value: Any) -> str: try: return ANY_ADAPTER.dump_python(value, mode='json') except Exception: try: return str(value) except Exception as e: return f'Unable to serialize: {e}' ``` #### instrumentation_settings ```python instrumentation_settings: InstrumentationSettings = ( options or InstrumentationSettings() ) ``` Instrumentation settings for this model. # pydantic_ai.models.mcp_sampling ### MCPSamplingModelSettings Bases: `ModelSettings` Settings used for an MCP Sampling model request. Source code in `pydantic_ai_slim/pydantic_ai/models/mcp_sampling.py` ```python class MCPSamplingModelSettings(ModelSettings, total=False): """Settings used for an MCP Sampling model request.""" # ALL FIELDS MUST BE `mcp_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. mcp_model_preferences: ModelPreferences """Model preferences to use for MCP Sampling.""" ``` #### mcp_model_preferences ```python mcp_model_preferences: ModelPreferences ``` Model preferences to use for MCP Sampling. ### MCPSamplingModel Bases: `Model` A model that uses MCP Sampling. 
[MCP Sampling](https://modelcontextprotocol.io/docs/concepts/sampling) allows an MCP server to make requests to a model by calling back to the MCP client that connected to it. Source code in `pydantic_ai_slim/pydantic_ai/models/mcp_sampling.py` ```python @dataclass class MCPSamplingModel(Model): """A model that uses MCP Sampling. [MCP Sampling](https://modelcontextprotocol.io/docs/concepts/sampling) allows an MCP server to make requests to a model by calling back to the MCP client that connected to it. """ session: ServerSession """The MCP server session to use for sampling.""" _: KW_ONLY default_max_tokens: int = 16_384 """Default max tokens to use if not set in [`ModelSettings`][pydantic_ai.settings.ModelSettings.max_tokens]. Max tokens is a required parameter for MCP Sampling, but optional on [`ModelSettings`][pydantic_ai.settings.ModelSettings], so this value is used as fallback. """ async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: system_prompt, sampling_messages = _mcp.map_from_pai_messages(messages) model_settings, _ = self.prepare_request(model_settings, model_request_parameters) model_settings = cast(MCPSamplingModelSettings, model_settings or {}) result = await self.session.create_message( sampling_messages, max_tokens=model_settings.get('max_tokens', self.default_max_tokens), system_prompt=system_prompt, temperature=model_settings.get('temperature'), model_preferences=model_settings.get('mcp_model_preferences'), stop_sequences=model_settings.get('stop_sequences'), ) if result.role == 'assistant': return ModelResponse( parts=[_mcp.map_from_sampling_content(result.content)], model_name=result.model, ) else: raise exceptions.UnexpectedModelBehavior( f'Unexpected result from MCP sampling, expected "assistant" role, got {result.role}.' ) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: raise NotImplementedError('MCP Sampling does not support streaming') yield @property def model_name(self) -> str: """The model name. Since the model name isn't known until the request is made, this property always returns `'mcp-sampling'`. """ return 'mcp-sampling' @property def system(self) -> str: """The system / model provider, returns `'MCP'`.""" return 'MCP' ``` #### session ```python session: ServerSession ``` The MCP server session to use for sampling. #### default_max_tokens ```python default_max_tokens: int = 16384 ``` Default max tokens to use if not set in ModelSettings. Max tokens is a required parameter for MCP Sampling, but optional on ModelSettings, so this value is used as fallback. #### model_name ```python model_name: str ``` The model name. Since the model name isn't known until the request is made, this property always returns `'mcp-sampling'`. #### system ```python system: str ``` The system / model provider, returns `'MCP'`. # `pydantic_ai.models.mistral` ## Setup For details on how to set up authentication with this model, see [model configuration for Mistral](../../../models/mistral/). ### LatestMistralModelNames ```python LatestMistralModelNames = Literal[ "mistral-large-latest", "mistral-small-latest", "codestral-latest", "mistral-moderation-latest", ] ``` Latest Mistral models. 
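As a quick illustration of how these names are used (the chosen model is illustrative and a `MISTRAL_API_KEY` environment variable is assumed), any of the aliases listed above can be passed straight to `MistralModel`; arbitrary date-stamped names are accepted too, as the next entry explains:

```python
from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel

# 'mistral-small-latest' is one of the latest aliases listed above; any other
# Mistral model name string is also accepted.
agent = Agent(MistralModel('mistral-small-latest'), instructions='Be concise.')
result = agent.run_sync('Say hello in French.')
print(result.output)
```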
### MistralModelName ```python MistralModelName = str | LatestMistralModelNames ``` Possible Mistral model names. Since Mistral supports a variety of date-stamped models, we explicitly list the most popular models but allow any name in the type hints. See [the Mistral docs](https://docs.mistral.ai/getting-started/models/models_overview/) for a full list. ### MistralModelSettings Bases: `ModelSettings` Settings used for a Mistral model request. Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python class MistralModelSettings(ModelSettings, total=False): """Settings used for a Mistral model request.""" ``` ### MistralModel Bases: `Model` A model that uses Mistral. Internally, this uses the [Mistral Python client](https://github.com/mistralai/client-python) to interact with the API. [API Documentation](https://docs.mistral.ai/) Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ````python @dataclass(init=False) class MistralModel(Model): """A model that uses Mistral. Internally, this uses the [Mistral Python client](https://github.com/mistralai/client-python) to interact with the API. [API Documentation](https://docs.mistral.ai/) """ client: Mistral = field(repr=False) json_mode_schema_prompt: str _model_name: MistralModelName = field(repr=False) _provider: Provider[Mistral] = field(repr=False) def __init__( self, model_name: MistralModelName, *, provider: Literal['mistral'] | Provider[Mistral] = 'mistral', profile: ModelProfileSpec | None = None, json_mode_schema_prompt: str = """Answer in JSON Object, respect the format:\n```\n{schema}\n```\n""", settings: ModelSettings | None = None, ): """Initialize a Mistral model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'mistral' or an instance of `Provider[Mistral]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. json_mode_schema_prompt: The prompt to show when the model expects a JSON object as input. settings: Model-specific settings that will be used as defaults for this model.
""" self._model_name = model_name self.json_mode_schema_prompt = json_mode_schema_prompt if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def base_url(self) -> str: return self._provider.base_url @property def model_name(self) -> MistralModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a non-streaming request to the model from Pydantic AI call.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming request to the model from Pydantic AI call.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._stream_completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) async def _completions_create( self, messages: list[ModelMessage], model_settings: MistralModelSettings, model_request_parameters: ModelRequestParameters, ) -> MistralChatCompletionResponse: """Make a non-streaming request to the model.""" # TODO(Marcelo): We need to replace the current MistralAI client to use the beta client. # See https://docs.mistral.ai/agents/connectors/websearch/ to support web search. if model_request_parameters.builtin_tools: raise UserError('Mistral does not support built-in tools') try: response = await self.client.chat.complete_async( model=str(self._model_name), messages=self._map_messages(messages), n=1, tools=self._map_function_and_output_tools_definition(model_request_parameters) or UNSET, tool_choice=self._get_tool_choice(model_request_parameters), stream=False, max_tokens=model_settings.get('max_tokens', UNSET), temperature=model_settings.get('temperature', UNSET), top_p=model_settings.get('top_p', 1), timeout_ms=self._get_timeout_ms(model_settings.get('timeout')), random_seed=model_settings.get('seed', UNSET), stop=model_settings.get('stop_sequences', None), http_headers={'User-Agent': get_user_agent()}, ) except SDKError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover assert response, 'A unexpected empty response from Mistral.' 
return response async def _stream_completions_create( self, messages: list[ModelMessage], model_settings: MistralModelSettings, model_request_parameters: ModelRequestParameters, ) -> MistralEventStreamAsync[MistralCompletionEvent]: """Create a streaming completion request to the Mistral model.""" response: MistralEventStreamAsync[MistralCompletionEvent] | None mistral_messages = self._map_messages(messages) # TODO(Marcelo): We need to replace the current MistralAI client to use the beta client. # See https://docs.mistral.ai/agents/connectors/websearch/ to support web search. if model_request_parameters.builtin_tools: raise UserError('Mistral does not support built-in tools') if model_request_parameters.function_tools: # Function Calling response = await self.client.chat.stream_async( model=str(self._model_name), messages=mistral_messages, n=1, tools=self._map_function_and_output_tools_definition(model_request_parameters) or UNSET, tool_choice=self._get_tool_choice(model_request_parameters), temperature=model_settings.get('temperature', UNSET), top_p=model_settings.get('top_p', 1), max_tokens=model_settings.get('max_tokens', UNSET), timeout_ms=self._get_timeout_ms(model_settings.get('timeout')), presence_penalty=model_settings.get('presence_penalty'), frequency_penalty=model_settings.get('frequency_penalty'), stop=model_settings.get('stop_sequences', None), http_headers={'User-Agent': get_user_agent()}, ) elif model_request_parameters.output_tools: # TODO: Port to native "manual JSON" mode # Json Mode parameters_json_schemas = [tool.parameters_json_schema for tool in model_request_parameters.output_tools] user_output_format_message = self._generate_user_output_format(parameters_json_schemas) mistral_messages.append(user_output_format_message) response = await self.client.chat.stream_async( model=str(self._model_name), messages=mistral_messages, response_format={ 'type': 'json_object' }, # TODO: Should be able to use json_schema now: https://docs.mistral.ai/capabilities/structured-output/custom_structured_output/, https://github.com/mistralai/client-python/blob/bc4adf335968c8a272e1ab7da8461c9943d8e701/src/mistralai/extra/utils/response_format.py#L9 stream=True, http_headers={'User-Agent': get_user_agent()}, ) else: # Stream Mode response = await self.client.chat.stream_async( model=str(self._model_name), messages=mistral_messages, stream=True, http_headers={'User-Agent': get_user_agent()}, ) assert response, 'A unexpected empty response from Mistral.' return response def _get_tool_choice(self, model_request_parameters: ModelRequestParameters) -> MistralToolChoiceEnum | None: """Get tool choice for the model. - "auto": Default mode. Model decides if it uses the tool or not. - "any": Select any tool. - "none": Prevents tool use. - "required": Forces tool use. """ if not model_request_parameters.function_tools and not model_request_parameters.output_tools: return None elif not model_request_parameters.allow_text_output: return 'required' else: return 'auto' def _map_function_and_output_tools_definition( self, model_request_parameters: ModelRequestParameters ) -> list[MistralTool] | None: """Map function and output tools to MistralTool format. Returns None if both function_tools and output_tools are empty. 
""" tools = [ MistralTool( function=MistralFunction( name=r.name, parameters=r.parameters_json_schema, description=r.description or '' ) ) for r in model_request_parameters.tool_defs.values() ] return tools if tools else None def _process_response(self, response: MistralChatCompletionResponse) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" assert response.choices, 'Unexpected empty response choice.' if response.created: timestamp = number_to_datetime(response.created) else: timestamp = _now_utc() choice = response.choices[0] content = choice.message.content tool_calls = choice.message.tool_calls parts: list[ModelResponsePart] = [] text, thinking = _map_content(content) for thought in thinking: parts.append(ThinkingPart(content=thought)) if text: parts.append(TextPart(content=text)) if isinstance(tool_calls, list): for tool_call in tool_calls: tool = self._map_mistral_to_pydantic_tool_call(tool_call=tool_call) parts.append(tool) raw_finish_reason = choice.finish_reason provider_details = {'finish_reason': raw_finish_reason} finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=parts, usage=_map_usage(response), model_name=response.model, timestamp=timestamp, provider_response_id=response.id, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: MistralEventStreamAsync[MistralCompletionEvent], model_request_parameters: ModelRequestParameters, ) -> StreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) if first_chunk.data.created: timestamp = number_to_datetime(first_chunk.data.created) else: timestamp = _now_utc() return MistralStreamedResponse( model_request_parameters=model_request_parameters, _response=peekable_response, _model_name=first_chunk.data.model, _timestamp=timestamp, _provider_name=self._provider.name, ) @staticmethod def _map_mistral_to_pydantic_tool_call(tool_call: MistralToolCall) -> ToolCallPart: """Maps a MistralToolCall to a ToolCall.""" tool_call_id = tool_call.id or _generate_tool_call_id() func_call = tool_call.function return ToolCallPart(func_call.name, func_call.arguments, tool_call_id) @staticmethod def _map_tool_call(t: ToolCallPart) -> MistralToolCall: """Maps a pydantic-ai ToolCall to a MistralToolCall.""" return MistralToolCall( id=_utils.guard_tool_call_id(t=t), type='function', function=MistralFunctionCall(name=t.tool_name, arguments=t.args or {}), ) def _generate_user_output_format(self, schemas: list[dict[str, Any]]) -> MistralUserMessage: """Get a message with an example of the expected output format.""" examples: list[dict[str, Any]] = [] for schema in schemas: typed_dict_definition: dict[str, Any] = {} for key, value in schema.get('properties', {}).items(): typed_dict_definition[key] = self._get_python_type(value) examples.append(typed_dict_definition) example_schema = examples[0] if len(examples) == 1 else examples return MistralUserMessage(content=self.json_mode_schema_prompt.format(schema=example_schema)) @classmethod def _get_python_type(cls, value: dict[str, Any]) -> str: """Return a string representation of the Python type for a single JSON schema property. 
This function handles recursion for nested arrays/objects and `anyOf`. """ # 1) Handle anyOf first, because it's a different schema structure if any_of := value.get('anyOf'): # Simplistic approach: pick the first option in anyOf # (In reality, you'd possibly want to merge or union types) return f'Optional[{cls._get_python_type(any_of[0])}]' # 2) If we have a top-level "type" field value_type = value.get('type') if not value_type: # No explicit type; fallback return 'Any' # 3) Direct simple type mapping (string, integer, float, bool, None) if value_type in SIMPLE_JSON_TYPE_MAPPING and value_type != 'array' and value_type != 'object': return SIMPLE_JSON_TYPE_MAPPING[value_type] # 4) Array: Recursively get the item type if value_type == 'array': items = value.get('items', {}) return f'list[{cls._get_python_type(items)}]' # 5) Object: Check for additionalProperties if value_type == 'object': additional_properties = value.get('additionalProperties', {}) if isinstance(additional_properties, bool): return 'bool' # pragma: lax no cover additional_properties_type = additional_properties.get('type') if ( additional_properties_type in SIMPLE_JSON_TYPE_MAPPING and additional_properties_type != 'array' and additional_properties_type != 'object' ): # dict[str, bool/int/float/etc...] return f'dict[str, {SIMPLE_JSON_TYPE_MAPPING[additional_properties_type]}]' elif additional_properties_type == 'array': array_items = additional_properties.get('items', {}) return f'dict[str, list[{cls._get_python_type(array_items)}]]' elif additional_properties_type == 'object': # nested dictionary of unknown shape return 'dict[str, dict[str, Any]]' else: # If no additionalProperties type or something else, default to a generic dict return 'dict[str, Any]' # 6) Fallback return 'Any' @staticmethod def _get_timeout_ms(timeout: Timeout | float | None) -> int | None: """Convert a timeout to milliseconds.""" if timeout is None: return None if isinstance(timeout, float): # pragma: no cover return int(1000 * timeout) raise NotImplementedError('Timeout object is not yet supported for MistralModel.') def _map_user_message(self, message: ModelRequest) -> Iterable[MistralMessages]: for part in message.parts: if isinstance(part, SystemPromptPart): yield MistralSystemMessage(content=part.content) elif isinstance(part, UserPromptPart): yield self._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield MistralToolMessage( tool_call_id=part.tool_call_id, content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield MistralUserMessage(content=part.model_response()) # pragma: no cover else: yield MistralToolMessage( tool_call_id=part.tool_call_id, content=part.model_response(), ) else: assert_never(part) def _map_messages(self, messages: list[ModelMessage]) -> list[MistralMessages]: """Just maps a `pydantic_ai.Message` to a `MistralMessage`.""" mistral_messages: list[MistralMessages] = [] for message in messages: if isinstance(message, ModelRequest): mistral_messages.extend(self._map_user_message(message)) elif isinstance(message, ModelResponse): content_chunks: list[MistralContentChunk] = [] thinking_chunks: list[MistralTextChunk | MistralReferenceChunk] = [] tool_calls: list[MistralToolCall] = [] for part in message.parts: if isinstance(part, TextPart): content_chunks.append(MistralTextChunk(text=part.content)) elif isinstance(part, ThinkingPart): thinking_chunks.append(MistralTextChunk(text=part.content)) elif isinstance(part, ToolCallPart): 
tool_calls.append(self._map_tool_call(part)) elif isinstance(part, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # This is currently never returned from mistral pass elif isinstance(part, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. pass else: assert_never(part) if thinking_chunks: content_chunks.insert(0, MistralThinkChunk(thinking=thinking_chunks)) mistral_messages.append(MistralAssistantMessage(content=content_chunks, tool_calls=tool_calls)) else: assert_never(message) if instructions := self._get_instructions(messages): mistral_messages.insert(0, MistralSystemMessage(content=instructions)) # Post-process messages to insert fake assistant message after tool message if followed by user message # to work around `Unexpected role 'user' after role 'tool'` error. processed_messages: list[MistralMessages] = [] for i, current_message in enumerate(mistral_messages): processed_messages.append(current_message) if isinstance(current_message, MistralToolMessage) and i + 1 < len(mistral_messages): next_message = mistral_messages[i + 1] if isinstance(next_message, MistralUserMessage): # Insert a dummy assistant message processed_messages.append(MistralAssistantMessage(content=[MistralTextChunk(text='OK')])) return processed_messages def _map_user_prompt(self, part: UserPromptPart) -> MistralUserMessage: content: str | list[MistralContentChunk] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(MistralTextChunk(text=item)) elif isinstance(item, ImageUrl): content.append(MistralImageURLChunk(image_url=MistralImageURL(url=item.url))) elif isinstance(item, BinaryContent): if item.is_image: image_url = MistralImageURL(url=item.data_uri) content.append(MistralImageURLChunk(image_url=image_url, type='image_url')) elif item.media_type == 'application/pdf': content.append(MistralDocumentURLChunk(document_url=item.data_uri, type='document_url')) else: raise RuntimeError('BinaryContent other than image or PDF is not supported in Mistral.') elif isinstance(item, DocumentUrl): if item.media_type == 'application/pdf': content.append(MistralDocumentURLChunk(document_url=item.url, type='document_url')) else: raise RuntimeError('DocumentUrl other than PDF is not supported in Mistral.') elif isinstance(item, VideoUrl): raise RuntimeError('VideoUrl is not supported in Mistral.') else: # pragma: no cover raise RuntimeError(f'Unsupported content type: {type(item)}') return MistralUserMessage(content=content) ```` #### __init__ ````python __init__( model_name: MistralModelName, *, provider: ( Literal["mistral"] | Provider[Mistral] ) = "mistral", profile: ModelProfileSpec | None = None, json_mode_schema_prompt: str = "Answer in JSON Object, respect the format:\n```\n{schema}\n```\n", settings: ModelSettings | None = None ) ```` Initialize a Mistral model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `MistralModelName` | The name of the model to use. | *required* | | `provider` | `Literal['mistral'] | Provider[Mistral]` | The provider to use for authentication and API access. Can be either the string 'mistral' or an instance of Provider[Mistral]. If not provided, a new provider will be created using the other parameters. | `'mistral'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. 
| `None` | | `json_mode_schema_prompt` | `str` | The prompt to show when the model expects a JSON object as input. | ```` 'Answer in JSON Object, respect the format:\n```\n{schema}\n```\n' ```` | | `settings` | `ModelSettings | None` | Model-specific settings that will be used as defaults for this model. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ````python def __init__( self, model_name: MistralModelName, *, provider: Literal['mistral'] | Provider[Mistral] = 'mistral', profile: ModelProfileSpec | None = None, json_mode_schema_prompt: str = """Answer in JSON Object, respect the format:\n```\n{schema}\n```\n""", settings: ModelSettings | None = None, ): """Initialize a Mistral model. Args: model_name: The name of the model to use. provider: The provider to use for authentication and API access. Can be either the string 'mistral' or an instance of `Provider[Mistral]`. If not provided, a new provider will be created using the other parameters. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. json_mode_schema_prompt: The prompt to show when the model expects a JSON object as input. settings: Model-specific settings that will be used as defaults for this model. """ self._model_name = model_name self.json_mode_schema_prompt = json_mode_schema_prompt if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ```` #### model_name ```python model_name: MistralModelName ``` The model name. #### system ```python system: str ``` The model provider. #### request ```python request( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse ``` Make a non-streaming request to the model from Pydantic AI call. Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: """Make a non-streaming request to the model from Pydantic AI call.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response ``` #### request_stream ```python request_stream( messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse] ``` Make a streaming request to the model from Pydantic AI call. 
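In practice you rarely call `request_stream` directly; streaming normally goes through an `Agent`, which drives this method under the hood. Here's a minimal sketch (assuming the `mistral` optional dependency is installed and a Mistral API key is available, e.g. via the `MISTRAL_API_KEY` environment variable; the model name is illustrative):

```python
import asyncio

from pydantic_ai import Agent
from pydantic_ai.models.mistral import MistralModel

# Construct the model explicitly; the string form 'mistral:mistral-large-latest' also works.
model = MistralModel('mistral-large-latest')
agent = Agent(model, instructions='Be concise.')


async def main():
    # `Agent.run_stream` ends up calling `MistralModel.request_stream`.
    async with agent.run_stream('What is the capital of France?') as response:
        async for text in response.stream_text():
            print(text)


asyncio.run(main())
```

The non-streaming `request` method is exercised the same way via `agent.run()` or `agent.run_sync()`.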
Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: """Make a streaming request to the model from Pydantic AI call.""" check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._stream_completions_create( messages, cast(MistralModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) ``` ### MistralStreamedResponse Bases: `StreamedResponse` Implementation of `StreamedResponse` for Mistral models. Source code in `pydantic_ai_slim/pydantic_ai/models/mistral.py` ```python @dataclass class MistralStreamedResponse(StreamedResponse): """Implementation of `StreamedResponse` for Mistral models.""" _model_name: MistralModelName _response: AsyncIterable[MistralCompletionEvent] _timestamp: datetime _provider_name: str _delta_content: str = field(default='', init=False) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: chunk: MistralCompletionEvent async for chunk in self._response: self._usage += _map_usage(chunk.data) if chunk.data.id: # pragma: no branch self.provider_response_id = chunk.data.id try: choice = chunk.data.choices[0] except IndexError: continue if raw_finish_reason := choice.finish_reason: self.provider_details = {'finish_reason': raw_finish_reason} self.finish_reason = _FINISH_REASON_MAP.get(raw_finish_reason) # Handle the text part of the response content = choice.delta.content text, thinking = _map_content(content) for thought in thinking: self._parts_manager.handle_thinking_delta(vendor_part_id='thinking', content=thought) if text: # Attempt to produce an output tool call from the received text output_tools = {c.name: c for c in self.model_request_parameters.output_tools} if output_tools: self._delta_content += text # TODO: Port to native "manual JSON" mode maybe_tool_call_part = self._try_get_output_tool_from_text(self._delta_content, output_tools) if maybe_tool_call_part: yield self._parts_manager.handle_tool_call_part( vendor_part_id='output', tool_name=maybe_tool_call_part.tool_name, args=maybe_tool_call_part.args_as_dict(), tool_call_id=maybe_tool_call_part.tool_call_id, ) else: maybe_event = self._parts_manager.handle_text_delta(vendor_part_id='content', content=text) if maybe_event is not None: # pragma: no branch yield maybe_event # Handle the explicit tool calls for index, dtc in enumerate(choice.delta.tool_calls or []): # It seems that mistral just sends full tool calls, so we just use them directly, rather than building yield self._parts_manager.handle_tool_call_part( vendor_part_id=index, tool_name=dtc.function.name, args=dtc.function.arguments, tool_call_id=dtc.id ) @property def model_name(self) -> MistralModelName: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp @staticmethod def _try_get_output_tool_from_text(text: str, output_tools: dict[str, ToolDefinition]) -> ToolCallPart | None: output_json: dict[str, Any] | None = 
pydantic_core.from_json(text, allow_partial='trailing-strings') if output_json: for output_tool in output_tools.values(): # NOTE: Additional verification to prevent JSON validation to crash # Ensures required parameters in the JSON schema are respected, especially for stream-based return types. # Example with BaseModel and required fields. if not MistralStreamedResponse._validate_required_json_schema( output_json, output_tool.parameters_json_schema ): continue # The following part_id will be thrown away return ToolCallPart(tool_name=output_tool.name, args=output_json) @staticmethod def _validate_required_json_schema(json_dict: dict[str, Any], json_schema: dict[str, Any]) -> bool: """Validate that all required parameters in the JSON schema are present in the JSON dictionary.""" required_params = json_schema.get('required', []) properties = json_schema.get('properties', {}) for param in required_params: if param not in json_dict: return False param_schema = properties.get(param, {}) param_type = param_schema.get('type') param_items_type = param_schema.get('items', {}).get('type') if param_type == 'array' and param_items_type: if not isinstance(json_dict[param], list): return False for item in json_dict[param]: if not isinstance(item, VALID_JSON_TYPE_MAPPING[param_items_type]): return False elif param_type and not isinstance(json_dict[param], VALID_JSON_TYPE_MAPPING[param_type]): return False if isinstance(json_dict[param], dict) and 'properties' in param_schema: nested_schema = param_schema if not MistralStreamedResponse._validate_required_json_schema(json_dict[param], nested_schema): return False return True ``` #### model_name ```python model_name: MistralModelName ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # `pydantic_ai.models.openai` ## Setup For details on how to set up authentication with this model, see [model configuration for OpenAI](../../../models/openai/). ### OpenAIModelName ```python OpenAIModelName = str | AllModels ``` Possible OpenAI model names. Since OpenAI supports a variety of date-stamped models, we explicitly list the latest models but allow any name in the type hints. See [the OpenAI docs](https://platform.openai.com/docs/models) for a full list. Using this more broad type for the model name instead of the ChatModel definition allows this model to be used more easily with other model types (ie, Ollama, Deepseek). ### OpenAIChatModelSettings Bases: `ModelSettings` Settings used for an OpenAI model request. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python class OpenAIChatModelSettings(ModelSettings, total=False): """Settings used for an OpenAI model request.""" # ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. openai_reasoning_effort: ReasoningEffort """Constrains effort on reasoning for [reasoning models](https://platform.openai.com/docs/guides/reasoning). Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. """ openai_logprobs: bool """Include log probabilities in the response.""" openai_top_logprobs: int """Include log probabilities of the top n tokens in the response.""" openai_user: str """A unique identifier representing the end-user, which can help OpenAI monitor and detect abuse. 
See [OpenAI's safety best practices](https://platform.openai.com/docs/guides/safety-best-practices#end-user-ids) for more details. """ openai_service_tier: Literal['auto', 'default', 'flex', 'priority'] """The service tier to use for the model request. Currently supported values are `auto`, `default`, `flex`, and `priority`. For more information, see [OpenAI's service tiers documentation](https://platform.openai.com/docs/api-reference/chat/object#chat/object-service_tier). """ openai_prediction: ChatCompletionPredictionContentParam """Enables [predictive outputs](https://platform.openai.com/docs/guides/predicted-outputs). This feature is currently only supported for some OpenAI models. """ ``` #### openai_reasoning_effort ```python openai_reasoning_effort: ReasoningEffort ``` Constrains effort on reasoning for [reasoning models](https://platform.openai.com/docs/guides/reasoning). Currently supported values are `low`, `medium`, and `high`. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response. #### openai_logprobs ```python openai_logprobs: bool ``` Include log probabilities in the response. #### openai_top_logprobs ```python openai_top_logprobs: int ``` Include log probabilities of the top n tokens in the response. #### openai_user ```python openai_user: str ``` A unique identifier representing the end-user, which can help OpenAI monitor and detect abuse. See [OpenAI's safety best practices](https://platform.openai.com/docs/guides/safety-best-practices#end-user-ids) for more details. #### openai_service_tier ```python openai_service_tier: Literal[ "auto", "default", "flex", "priority" ] ``` The service tier to use for the model request. Currently supported values are `auto`, `default`, `flex`, and `priority`. For more information, see [OpenAI's service tiers documentation](https://platform.openai.com/docs/api-reference/chat/object#chat/object-service_tier). #### openai_prediction ```python openai_prediction: ChatCompletionPredictionContentParam ``` Enables [predictive outputs](https://platform.openai.com/docs/guides/predicted-outputs). This feature is currently only supported for some OpenAI models. ### OpenAIModelSettings Bases: `OpenAIChatModelSettings` Deprecated Use `OpenAIChatModelSettings` instead. Deprecated alias for `OpenAIChatModelSettings`. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @deprecated('Use `OpenAIChatModelSettings` instead.') class OpenAIModelSettings(OpenAIChatModelSettings, total=False): """Deprecated alias for `OpenAIChatModelSettings`.""" ``` ### OpenAIResponsesModelSettings Bases: `OpenAIChatModelSettings` Settings used for an OpenAI Responses model request. ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python class OpenAIResponsesModelSettings(OpenAIChatModelSettings, total=False): """Settings used for an OpenAI Responses model request. ALL FIELDS MUST BE `openai_` PREFIXED SO YOU CAN MERGE THEM WITH OTHER MODELS. """ openai_builtin_tools: Sequence[FileSearchToolParam | WebSearchToolParam | ComputerToolParam] """The provided OpenAI built-in tools to use. See [OpenAI's built-in tools](https://platform.openai.com/docs/guides/tools?api-mode=responses) for more details. 
""" openai_reasoning_generate_summary: Literal['detailed', 'concise'] """Deprecated alias for `openai_reasoning_summary`.""" openai_reasoning_summary: Literal['detailed', 'concise'] """A summary of the reasoning performed by the model. This can be useful for debugging and understanding the model's reasoning process. One of `concise` or `detailed`. Check the [OpenAI Reasoning documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries) for more details. """ openai_send_reasoning_ids: bool """Whether to send the unique IDs of reasoning, text, and function call parts from the message history to the model. Enabled by default for reasoning models. This can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](../../message-history.md#processing-message-history). In that case, you'll want to disable this. """ openai_truncation: Literal['disabled', 'auto'] """The truncation strategy to use for the model response. It can be either: - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. """ openai_text_verbosity: Literal['low', 'medium', 'high'] """Constrains the verbosity of the model's text response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are `low`, `medium`, and `high`. """ openai_previous_response_id: Literal['auto'] | str """The ID of a previous response from the model to use as the starting point for a continued conversation. When set to `'auto'`, the request automatically uses the most recent `provider_response_id` from the message history and omits earlier messages. This enables the model to use server-side conversation state and faithfully reference previous reasoning. See the [OpenAI Responses API documentation](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) for more information. """ openai_include_code_execution_outputs: bool """Whether to include the code execution results in the response. Corresponds to the `code_interpreter_call.outputs` value of the `include` parameter in the Responses API. """ openai_include_web_search_sources: bool """Whether to include the web search results in the response. Corresponds to the `web_search_call.action.sources` value of the `include` parameter in the Responses API. """ ``` #### openai_builtin_tools ```python openai_builtin_tools: Sequence[ FileSearchToolParam | WebSearchToolParam | ComputerToolParam ] ``` The provided OpenAI built-in tools to use. See [OpenAI's built-in tools](https://platform.openai.com/docs/guides/tools?api-mode=responses) for more details. #### openai_reasoning_generate_summary ```python openai_reasoning_generate_summary: Literal[ "detailed", "concise" ] ``` Deprecated alias for `openai_reasoning_summary`. #### openai_reasoning_summary ```python openai_reasoning_summary: Literal['detailed', 'concise'] ``` A summary of the reasoning performed by the model. 
This can be useful for debugging and understanding the model's reasoning process. One of `concise` or `detailed`. Check the [OpenAI Reasoning documentation](https://platform.openai.com/docs/guides/reasoning?api-mode=responses#reasoning-summaries) for more details. #### openai_send_reasoning_ids ```python openai_send_reasoning_ids: bool ``` Whether to send the unique IDs of reasoning, text, and function call parts from the message history to the model. Enabled by default for reasoning models. This can result in errors like `"Item 'rs_123' of type 'reasoning' was provided without its required following item."` if the message history you're sending does not match exactly what was received from the Responses API in a previous response, for example if you're using a [history processor](../../../message-history/#processing-message-history). In that case, you'll want to disable this. #### openai_truncation ```python openai_truncation: Literal['disabled', 'auto'] ``` The truncation strategy to use for the model response. It can be either: - `disabled` (default): If a model response will exceed the context window size for a model, the request will fail with a 400 error. - `auto`: If the context of this response and previous ones exceeds the model's context window size, the model will truncate the response to fit the context window by dropping input items in the middle of the conversation. #### openai_text_verbosity ```python openai_text_verbosity: Literal['low', 'medium', 'high'] ``` Constrains the verbosity of the model's text response. Lower values will result in more concise responses, while higher values will result in more verbose responses. Currently supported values are `low`, `medium`, and `high`. #### openai_previous_response_id ```python openai_previous_response_id: Literal['auto'] | str ``` The ID of a previous response from the model to use as the starting point for a continued conversation. When set to `'auto'`, the request automatically uses the most recent `provider_response_id` from the message history and omits earlier messages. This enables the model to use server-side conversation state and faithfully reference previous reasoning. See the [OpenAI Responses API documentation](https://platform.openai.com/docs/guides/reasoning#keeping-reasoning-items-in-context) for more information. #### openai_include_code_execution_outputs ```python openai_include_code_execution_outputs: bool ``` Whether to include the code execution results in the response. Corresponds to the `code_interpreter_call.outputs` value of the `include` parameter in the Responses API. #### openai_include_web_search_sources ```python openai_include_web_search_sources: bool ``` Whether to include the web search results in the response. Corresponds to the `web_search_call.action.sources` value of the `include` parameter in the Responses API. ### OpenAIChatModel Bases: `Model` A model that uses the OpenAI API. Internally, this uses the [OpenAI Python client](https://github.com/openai/openai-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @dataclass(init=False) class OpenAIChatModel(Model): """A model that uses the OpenAI API. Internally, this uses the [OpenAI Python client](https://github.com/openai/openai-python) to interact with the API. Apart from `__init__`, all methods are private or match those of the base class. 
""" client: AsyncOpenAI = field(repr=False) _model_name: OpenAIModelName = field(repr=False) _provider: Provider[AsyncOpenAI] = field(repr=False) @overload def __init__( self, model_name: OpenAIModelName, *, provider: Literal[ 'azure', 'deepseek', 'cerebras', 'fireworks', 'github', 'grok', 'heroku', 'moonshotai', 'ollama', 'openai', 'openai-chat', 'openrouter', 'together', 'vercel', 'litellm', 'nebius', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ) -> None: ... @deprecated('Set the `system_prompt_role` in the `OpenAIModelProfile` instead.') @overload def __init__( self, model_name: OpenAIModelName, *, provider: Literal[ 'azure', 'deepseek', 'cerebras', 'fireworks', 'github', 'grok', 'heroku', 'moonshotai', 'ollama', 'openai', 'openai-chat', 'openrouter', 'together', 'vercel', 'litellm', 'nebius', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, system_prompt_role: OpenAISystemPromptRole | None = None, settings: ModelSettings | None = None, ) -> None: ... def __init__( self, model_name: OpenAIModelName, *, provider: Literal[ 'azure', 'deepseek', 'cerebras', 'fireworks', 'github', 'grok', 'heroku', 'moonshotai', 'ollama', 'openai', 'openai-chat', 'openrouter', 'together', 'vercel', 'litellm', 'nebius', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, system_prompt_role: OpenAISystemPromptRole | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI model. Args: model_name: The name of the OpenAI model to use. List of model names available [here](https://github.com/openai/openai-python/blob/v1.54.3/src/openai/types/chat_model.py#L7) (Unfortunately, despite being ask to do so, OpenAI do not provide `.inv` files for their API). provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. system_prompt_role: The role to use for the system prompt message. If not provided, defaults to `'system'`. In the future, this may be inferred from the model name. settings: Default model settings for this model instance. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) if system_prompt_role is not None: self.profile = OpenAIModelProfile(openai_system_prompt_role=system_prompt_role).update(self.profile) @property def base_url(self) -> str: return str(self.client.base_url) @property def model_name(self) -> OpenAIModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name @property @deprecated('Set the `system_prompt_role` in the `OpenAIModelProfile` instead.') def system_prompt_role(self) -> OpenAISystemPromptRole | None: return OpenAIModelProfile.from_profile(self.profile).openai_system_prompt_role async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, False, cast(OpenAIChatModelSettings, model_settings or {}), model_request_parameters ) model_response = self._process_response(response) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._completions_create( messages, True, cast(OpenAIChatModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[True], model_settings: OpenAIChatModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[ChatCompletionChunk]: ... @overload async def _completions_create( self, messages: list[ModelMessage], stream: Literal[False], model_settings: OpenAIChatModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion: ... 
async def _completions_create( self, messages: list[ModelMessage], stream: bool, model_settings: OpenAIChatModelSettings, model_request_parameters: ModelRequestParameters, ) -> chat.ChatCompletion | AsyncStream[ChatCompletionChunk]: tools = self._get_tools(model_request_parameters) web_search_options = self._get_web_search_options(model_request_parameters) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif ( not model_request_parameters.allow_text_output and OpenAIModelProfile.from_profile(self.profile).openai_supports_tool_choice_required ): tool_choice = 'required' else: tool_choice = 'auto' openai_messages = await self._map_messages(messages) response_format: chat.completion_create_params.ResponseFormat | None = None if model_request_parameters.output_mode == 'native': output_object = model_request_parameters.output_object assert output_object is not None response_format = self._map_json_schema(output_object) elif ( model_request_parameters.output_mode == 'prompted' and self.profile.supports_json_object_output ): # pragma: no branch response_format = {'type': 'json_object'} unsupported_model_settings = OpenAIModelProfile.from_profile(self.profile).openai_unsupported_model_settings for setting in unsupported_model_settings: model_settings.pop(setting, None) try: extra_headers = model_settings.get('extra_headers', {}) extra_headers.setdefault('User-Agent', get_user_agent()) return await self.client.chat.completions.create( model=self._model_name, messages=openai_messages, parallel_tool_calls=model_settings.get('parallel_tool_calls', NOT_GIVEN), tools=tools or NOT_GIVEN, tool_choice=tool_choice or NOT_GIVEN, stream=stream, stream_options={'include_usage': True} if stream else NOT_GIVEN, stop=model_settings.get('stop_sequences', NOT_GIVEN), max_completion_tokens=model_settings.get('max_tokens', NOT_GIVEN), timeout=model_settings.get('timeout', NOT_GIVEN), response_format=response_format or NOT_GIVEN, seed=model_settings.get('seed', NOT_GIVEN), reasoning_effort=model_settings.get('openai_reasoning_effort', NOT_GIVEN), user=model_settings.get('openai_user', NOT_GIVEN), web_search_options=web_search_options or NOT_GIVEN, service_tier=model_settings.get('openai_service_tier', NOT_GIVEN), prediction=model_settings.get('openai_prediction', NOT_GIVEN), temperature=model_settings.get('temperature', NOT_GIVEN), top_p=model_settings.get('top_p', NOT_GIVEN), presence_penalty=model_settings.get('presence_penalty', NOT_GIVEN), frequency_penalty=model_settings.get('frequency_penalty', NOT_GIVEN), logit_bias=model_settings.get('logit_bias', NOT_GIVEN), logprobs=model_settings.get('openai_logprobs', NOT_GIVEN), top_logprobs=model_settings.get('openai_top_logprobs', NOT_GIVEN), extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover def _process_response(self, response: chat.ChatCompletion | str) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" # Although the OpenAI SDK claims to return a Pydantic model (`ChatCompletion`) from the chat completions function: # * it hasn't actually performed validation (presumably they're creating the model with `model_construct` or something?!) # * if the endpoint returns plain text, the return type is a string # Thus we validate it fully here. 
if not isinstance(response, chat.ChatCompletion): raise UnexpectedModelBehavior('Invalid response from OpenAI chat completions endpoint, expected JSON data') if response.created: timestamp = number_to_datetime(response.created) else: timestamp = _now_utc() response.created = int(timestamp.timestamp()) # Workaround for local Ollama which sometimes returns a `None` finish reason. if response.choices and (choice := response.choices[0]) and choice.finish_reason is None: # pyright: ignore[reportUnnecessaryComparison] choice.finish_reason = 'stop' try: response = chat.ChatCompletion.model_validate(response.model_dump()) except ValidationError as e: raise UnexpectedModelBehavior(f'Invalid response from OpenAI chat completions endpoint: {e}') from e choice = response.choices[0] items: list[ModelResponsePart] = [] # The `reasoning_content` field is only present in DeepSeek models. # https://api-docs.deepseek.com/guides/reasoning_model if reasoning_content := getattr(choice.message, 'reasoning_content', None): items.append(ThinkingPart(id='reasoning_content', content=reasoning_content, provider_name=self.system)) # The `reasoning` field is only present in gpt-oss via Ollama and OpenRouter. # - https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot#chat-completions-api # - https://openrouter.ai/docs/use-cases/reasoning-tokens#basic-usage-with-reasoning-tokens if reasoning := getattr(choice.message, 'reasoning', None): items.append(ThinkingPart(id='reasoning', content=reasoning, provider_name=self.system)) # NOTE: We don't currently handle OpenRouter `reasoning_details`: # - https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks # If you need this, please file an issue. vendor_details: dict[str, Any] = {} # Add logprobs to vendor_details if available if choice.logprobs is not None and choice.logprobs.content: # Convert logprobs to a serializable format vendor_details['logprobs'] = [ { 'token': lp.token, 'bytes': lp.bytes, 'logprob': lp.logprob, 'top_logprobs': [ {'token': tlp.token, 'bytes': tlp.bytes, 'logprob': tlp.logprob} for tlp in lp.top_logprobs ], } for lp in choice.logprobs.content ] if choice.message.content is not None: items.extend( (replace(part, id='content', provider_name=self.system) if isinstance(part, ThinkingPart) else part) for part in split_content_into_text_and_thinking(choice.message.content, self.profile.thinking_tags) ) if choice.message.tool_calls is not None: for c in choice.message.tool_calls: if isinstance(c, ChatCompletionMessageFunctionToolCall): part = ToolCallPart(c.function.name, c.function.arguments, tool_call_id=c.id) elif isinstance(c, ChatCompletionMessageCustomToolCall): # pragma: no cover # NOTE: Custom tool calls are not supported. # See for more details. 
raise RuntimeError('Custom tool calls are not supported') else: assert_never(c) part.tool_call_id = _guard_tool_call_id(part) items.append(part) raw_finish_reason = choice.finish_reason vendor_details['finish_reason'] = raw_finish_reason finish_reason = _CHAT_FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, timestamp=timestamp, provider_details=vendor_details or None, provider_response_id=response.id, provider_name=self._provider.name, finish_reason=finish_reason, ) async def _process_streamed_response( self, response: AsyncStream[ChatCompletionChunk], model_request_parameters: ModelRequestParameters ) -> OpenAIStreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): raise UnexpectedModelBehavior( # pragma: no cover 'Streamed response ended without content or tool calls' ) # When using Azure OpenAI and a content filter is enabled, the first chunk will contain a `''` model name, # so we set it from a later chunk in `OpenAIChatStreamedResponse`. model_name = first_chunk.model or self._model_name return OpenAIStreamedResponse( model_request_parameters=model_request_parameters, _model_name=model_name, _model_profile=self.profile, _response=peekable_response, _timestamp=number_to_datetime(first_chunk.created), _provider_name=self._provider.name, ) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[chat.ChatCompletionToolParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _get_web_search_options(self, model_request_parameters: ModelRequestParameters) -> WebSearchOptions | None: for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): # pragma: no branch if not OpenAIModelProfile.from_profile(self.profile).openai_chat_supports_web_search: raise UserError( f'WebSearchTool is not supported with `OpenAIChatModel` and model {self.model_name!r}. ' f'Please use `OpenAIResponsesModel` instead.' ) if tool.user_location: return WebSearchOptions( search_context_size=tool.search_context_size, user_location=WebSearchOptionsUserLocation( type='approximate', approximate=WebSearchOptionsUserLocationApproximate(**tool.user_location), ), ) return WebSearchOptions(search_context_size=tool.search_context_size) else: raise UserError( f'`{tool.__class__.__name__}` is not supported by `OpenAIChatModel`. If it should be, please file an issue.' ) async def _map_messages(self, messages: list[ModelMessage]) -> list[chat.ChatCompletionMessageParam]: """Just maps a `pydantic_ai.Message` to a `openai.types.ChatCompletionMessageParam`.""" openai_messages: list[chat.ChatCompletionMessageParam] = [] for message in messages: if isinstance(message, ModelRequest): async for item in self._map_user_message(message): openai_messages.append(item) elif isinstance(message, ModelResponse): texts: list[str] = [] tool_calls: list[ChatCompletionMessageFunctionToolCallParam] = [] for item in message.parts: if isinstance(item, TextPart): texts.append(item.content) elif isinstance(item, ThinkingPart): # NOTE: DeepSeek `reasoning_content` field should NOT be sent back per https://api-docs.deepseek.com/guides/reasoning_model, # but we currently just send it in `` tags anyway as we don't want DeepSeek-specific checks here. # If you need this changed, please file an issue. 
start_tag, end_tag = self.profile.thinking_tags texts.append('\n'.join([start_tag, item.content, end_tag])) elif isinstance(item, ToolCallPart): tool_calls.append(self._map_tool_call(item)) # OpenAI doesn't return built-in tool calls elif isinstance(item, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover pass elif isinstance(item, FilePart): # pragma: no cover # Files generated by models are not sent back to models that don't themselves generate files. pass else: assert_never(item) message_param = chat.ChatCompletionAssistantMessageParam(role='assistant') if texts: # Note: model responses from this model should only have one text item, so the following # shouldn't merge multiple texts into one unless you switch models between runs: message_param['content'] = '\n\n'.join(texts) else: message_param['content'] = None if tool_calls: message_param['tool_calls'] = tool_calls openai_messages.append(message_param) else: assert_never(message) if instructions := self._get_instructions(messages): openai_messages.insert(0, chat.ChatCompletionSystemMessageParam(content=instructions, role='system')) return openai_messages @staticmethod def _map_tool_call(t: ToolCallPart) -> ChatCompletionMessageFunctionToolCallParam: return ChatCompletionMessageFunctionToolCallParam( id=_guard_tool_call_id(t=t), type='function', function={'name': t.tool_name, 'arguments': t.args_as_json_str()}, ) def _map_json_schema(self, o: OutputObjectDefinition) -> chat.completion_create_params.ResponseFormat: response_format_param: chat.completion_create_params.ResponseFormatJSONSchema = { # pyright: ignore[reportPrivateImportUsage] 'type': 'json_schema', 'json_schema': {'name': o.name or DEFAULT_OUTPUT_TOOL_NAME, 'schema': o.json_schema}, } if o.description: response_format_param['json_schema']['description'] = o.description if OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition: # pragma: no branch response_format_param['json_schema']['strict'] = o.strict return response_format_param def _map_tool_definition(self, f: ToolDefinition) -> chat.ChatCompletionToolParam: tool_param: chat.ChatCompletionToolParam = { 'type': 'function', 'function': { 'name': f.name, 'description': f.description or '', 'parameters': f.parameters_json_schema, }, } if f.strict and OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition: tool_param['function']['strict'] = f.strict return tool_param async def _map_user_message(self, message: ModelRequest) -> AsyncIterable[chat.ChatCompletionMessageParam]: for part in message.parts: if isinstance(part, SystemPromptPart): system_prompt_role = OpenAIModelProfile.from_profile(self.profile).openai_system_prompt_role if system_prompt_role == 'developer': yield chat.ChatCompletionDeveloperMessageParam(role='developer', content=part.content) elif system_prompt_role == 'user': yield chat.ChatCompletionUserMessageParam(role='user', content=part.content) else: yield chat.ChatCompletionSystemMessageParam(role='system', content=part.content) elif isinstance(part, UserPromptPart): yield await self._map_user_prompt(part) elif isinstance(part, ToolReturnPart): yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), content=part.model_response_str(), ) elif isinstance(part, RetryPromptPart): if part.tool_name is None: yield chat.ChatCompletionUserMessageParam(role='user', content=part.model_response()) else: yield chat.ChatCompletionToolMessageParam( role='tool', tool_call_id=_guard_tool_call_id(t=part), 
content=part.model_response(), ) else: assert_never(part) async def _map_user_prompt(self, part: UserPromptPart) -> chat.ChatCompletionUserMessageParam: # noqa: C901 content: str | list[ChatCompletionContentPartParam] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(ChatCompletionContentPartTextParam(text=item, type='text')) elif isinstance(item, ImageUrl): image_url: ImageURL = {'url': item.url} if metadata := item.vendor_metadata: image_url['detail'] = metadata.get('detail', 'auto') if item.force_download: image_content = await download_item(item, data_format='base64_uri', type_format='extension') image_url['url'] = image_content['data'] content.append(ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) elif isinstance(item, BinaryContent): if self._is_text_like_media_type(item.media_type): # Inline text-like binary content as a text block content.append( self._inline_text_file_part( item.data.decode('utf-8'), media_type=item.media_type, identifier=item.identifier, ) ) elif item.is_image: image_url = ImageURL(url=item.data_uri) if metadata := item.vendor_metadata: image_url['detail'] = metadata.get('detail', 'auto') content.append(ChatCompletionContentPartImageParam(image_url=image_url, type='image_url')) elif item.is_audio: assert item.format in ('wav', 'mp3') audio = InputAudio(data=base64.b64encode(item.data).decode('utf-8'), format=item.format) content.append(ChatCompletionContentPartInputAudioParam(input_audio=audio, type='input_audio')) elif item.is_document: content.append( File( file=FileFile( file_data=item.data_uri, filename=f'filename.{item.format}', ), type='file', ) ) else: # pragma: no cover raise RuntimeError(f'Unsupported binary content type: {item.media_type}') elif isinstance(item, AudioUrl): downloaded_item = await download_item(item, data_format='base64', type_format='extension') assert downloaded_item['data_type'] in ( 'wav', 'mp3', ), f'Unsupported audio format: {downloaded_item["data_type"]}' audio = InputAudio(data=downloaded_item['data'], format=downloaded_item['data_type']) content.append(ChatCompletionContentPartInputAudioParam(input_audio=audio, type='input_audio')) elif isinstance(item, DocumentUrl): if self._is_text_like_media_type(item.media_type): downloaded_text = await download_item(item, data_format='text') content.append( self._inline_text_file_part( downloaded_text['data'], media_type=item.media_type, identifier=item.identifier, ) ) else: downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension') content.append( File( file=FileFile( file_data=downloaded_item['data'], filename=f'filename.{downloaded_item["data_type"]}', ), type='file', ) ) elif isinstance(item, VideoUrl): # pragma: no cover raise NotImplementedError('VideoUrl is not supported for OpenAI') else: assert_never(item) return chat.ChatCompletionUserMessageParam(role='user', content=content) @staticmethod def _is_text_like_media_type(media_type: str) -> bool: return ( media_type.startswith('text/') or media_type == 'application/json' or media_type.endswith('+json') or media_type == 'application/xml' or media_type.endswith('+xml') or media_type in ('application/x-yaml', 'application/yaml') ) @staticmethod def _inline_text_file_part(text: str, *, media_type: str, identifier: str) -> ChatCompletionContentPartTextParam: text = '\n'.join( [ f'-----BEGIN FILE id="{identifier}" type="{media_type}"-----', text, f'-----END FILE 
id="{identifier}"-----', ] ) return ChatCompletionContentPartTextParam(text=text, type='text') ``` #### __init__ ```python __init__( model_name: OpenAIModelName, *, provider: ( Literal[ "azure", "deepseek", "cerebras", "fireworks", "github", "grok", "heroku", "moonshotai", "ollama", "openai", "openai-chat", "openrouter", "together", "vercel", "litellm", "nebius", ] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( model_name: OpenAIModelName, *, provider: ( Literal[ "azure", "deepseek", "cerebras", "fireworks", "github", "grok", "heroku", "moonshotai", "ollama", "openai", "openai-chat", "openrouter", "together", "vercel", "litellm", "nebius", ] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, system_prompt_role: ( OpenAISystemPromptRole | None ) = None, settings: ModelSettings | None = None ) -> None ``` ```python __init__( model_name: OpenAIModelName, *, provider: ( Literal[ "azure", "deepseek", "cerebras", "fireworks", "github", "grok", "heroku", "moonshotai", "ollama", "openai", "openai-chat", "openrouter", "together", "vercel", "litellm", "nebius", ] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, system_prompt_role: ( OpenAISystemPromptRole | None ) = None, settings: ModelSettings | None = None ) ``` Initialize an OpenAI model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `OpenAIModelName` | The name of the OpenAI model to use. List of model names available here (Unfortunately, despite being ask to do so, OpenAI do not provide .inv files for their API). | *required* | | `provider` | `Literal['azure', 'deepseek', 'cerebras', 'fireworks', 'github', 'grok', 'heroku', 'moonshotai', 'ollama', 'openai', 'openai-chat', 'openrouter', 'together', 'vercel', 'litellm', 'nebius'] | Provider[AsyncOpenAI]` | The provider to use. Defaults to 'openai'. | `'openai'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `system_prompt_role` | `OpenAISystemPromptRole | None` | The role to use for the system prompt message. If not provided, defaults to 'system'. In the future, this may be inferred from the model name. | `None` | | `settings` | `ModelSettings | None` | Default model settings for this model instance. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python def __init__( self, model_name: OpenAIModelName, *, provider: Literal[ 'azure', 'deepseek', 'cerebras', 'fireworks', 'github', 'grok', 'heroku', 'moonshotai', 'ollama', 'openai', 'openai-chat', 'openrouter', 'together', 'vercel', 'litellm', 'nebius', ] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, system_prompt_role: OpenAISystemPromptRole | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI model. Args: model_name: The name of the OpenAI model to use. List of model names available [here](https://github.com/openai/openai-python/blob/v1.54.3/src/openai/types/chat_model.py#L7) (Unfortunately, despite being ask to do so, OpenAI do not provide `.inv` files for their API). provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. system_prompt_role: The role to use for the system prompt message. If not provided, defaults to `'system'`. 
In the future, this may be inferred from the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) if system_prompt_role is not None: self.profile = OpenAIModelProfile(openai_system_prompt_role=system_prompt_role).update(self.profile) ``` #### model_name ```python model_name: OpenAIModelName ``` The model name. #### system ```python system: str ``` The model provider. ### OpenAIModel Bases: `OpenAIChatModel` Deprecated `OpenAIModel` was renamed to `OpenAIChatModel` to clearly distinguish it from `OpenAIResponsesModel` which uses OpenAI's newer Responses API. Use that unless you're using an OpenAI Chat Completions-compatible API, or require a feature that the Responses API doesn't support yet like audio. Deprecated alias for `OpenAIChatModel`. Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @deprecated( '`OpenAIModel` was renamed to `OpenAIChatModel` to clearly distinguish it from `OpenAIResponsesModel` which ' "uses OpenAI's newer Responses API. Use that unless you're using an OpenAI Chat Completions-compatible API, or " "require a feature that the Responses API doesn't support yet like audio." ) @dataclass(init=False) class OpenAIModel(OpenAIChatModel): """Deprecated alias for `OpenAIChatModel`.""" ``` ### OpenAIResponsesModel Bases: `Model` A model that uses the OpenAI Responses API. The [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses) is the new API for OpenAI models. If you are interested in the differences between the Responses API and the Chat Completions API, see the [OpenAI API docs](https://platform.openai.com/docs/guides/responses-vs-chat-completions). Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python @dataclass(init=False) class OpenAIResponsesModel(Model): """A model that uses the OpenAI Responses API. The [OpenAI Responses API](https://platform.openai.com/docs/api-reference/responses) is the new API for OpenAI models. If you are interested in the differences between the Responses API and the Chat Completions API, see the [OpenAI API docs](https://platform.openai.com/docs/guides/responses-vs-chat-completions). """ client: AsyncOpenAI = field(repr=False) _model_name: OpenAIModelName = field(repr=False) _provider: Provider[AsyncOpenAI] = field(repr=False) def __init__( self, model_name: OpenAIModelName, *, provider: Literal['openai', 'deepseek', 'azure', 'openrouter', 'grok', 'fireworks', 'together', 'nebius'] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI Responses model. Args: model_name: The name of the OpenAI model to use. provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Default model settings for this model instance. 
""" self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) @property def model_name(self) -> OpenAIModelName: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._provider.name async def request( self, messages: list[ModelRequest | ModelResponse], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._responses_create( messages, False, cast(OpenAIResponsesModelSettings, model_settings or {}), model_request_parameters ) return self._process_response(response, model_request_parameters) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: check_allow_model_requests() model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) response = await self._responses_create( messages, True, cast(OpenAIResponsesModelSettings, model_settings or {}), model_request_parameters ) async with response: yield await self._process_streamed_response(response, model_request_parameters) def _process_response( # noqa: C901 self, response: responses.Response, model_request_parameters: ModelRequestParameters ) -> ModelResponse: """Process a non-streamed response, and prepare a message to return.""" timestamp = number_to_datetime(response.created_at) items: list[ModelResponsePart] = [] for item in response.output: if isinstance(item, responses.ResponseReasoningItem): signature = item.encrypted_content if item.summary: for summary in item.summary: # We use the same id for all summaries so that we can merge them on the round trip. items.append( ThinkingPart( content=summary.text, id=item.id, signature=signature, provider_name=self.system if signature else None, ) ) # We only need to store the signature once. signature = None elif signature: items.append( ThinkingPart( content='', id=item.id, signature=signature, provider_name=self.system, ) ) # NOTE: We don't currently handle the raw CoT from gpt-oss `reasoning_text`: https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot # If you need this, please file an issue. 
elif isinstance(item, responses.ResponseOutputMessage): for content in item.content: if isinstance(content, responses.ResponseOutputText): # pragma: no branch items.append(TextPart(content.text, id=item.id)) elif isinstance(item, responses.ResponseFunctionToolCall): items.append( ToolCallPart( item.name, item.arguments, tool_call_id=item.call_id, id=item.id, ) ) elif isinstance(item, responses.ResponseCodeInterpreterToolCall): call_part, return_part, file_parts = _map_code_interpreter_tool_call(item, self.system) items.append(call_part) if file_parts: items.extend(file_parts) items.append(return_part) elif isinstance(item, responses.ResponseFunctionWebSearch): call_part, return_part = _map_web_search_tool_call(item, self.system) items.append(call_part) items.append(return_part) elif isinstance(item, responses.response_output_item.ImageGenerationCall): call_part, return_part, file_part = _map_image_generation_tool_call(item, self.system) items.append(call_part) if file_part: # pragma: no branch items.append(file_part) items.append(return_part) elif isinstance(item, responses.ResponseComputerToolCall): # pragma: no cover # Pydantic AI doesn't yet support the ComputerUse built-in tool pass elif isinstance(item, responses.ResponseCustomToolCall): # pragma: no cover # Support is being implemented in https://github.com/pydantic/pydantic-ai/pull/2572 pass elif isinstance(item, responses.response_output_item.LocalShellCall): # pragma: no cover # Pydantic AI doesn't yet support the `codex-mini-latest` LocalShell built-in tool pass elif isinstance(item, responses.ResponseFileSearchToolCall): # pragma: no cover # Pydantic AI doesn't yet support the FileSearch built-in tool pass elif isinstance( # pragma: no cover item, responses.response_output_item.McpCall | responses.response_output_item.McpListTools | responses.response_output_item.McpApprovalRequest, ): # Pydantic AI supports MCP natively pass finish_reason: FinishReason | None = None provider_details: dict[str, Any] | None = None raw_finish_reason = details.reason if (details := response.incomplete_details) else response.status if raw_finish_reason: provider_details = {'finish_reason': raw_finish_reason} finish_reason = _RESPONSES_FINISH_REASON_MAP.get(raw_finish_reason) return ModelResponse( parts=items, usage=_map_usage(response), model_name=response.model, provider_response_id=response.id, timestamp=timestamp, provider_name=self._provider.name, finish_reason=finish_reason, provider_details=provider_details, ) async def _process_streamed_response( self, response: AsyncStream[responses.ResponseStreamEvent], model_request_parameters: ModelRequestParameters, ) -> OpenAIResponsesStreamedResponse: """Process a streamed response, and prepare a streaming response to return.""" peekable_response = _utils.PeekableAsyncStream(response) first_chunk = await peekable_response.peek() if isinstance(first_chunk, _utils.Unset): # pragma: no cover raise UnexpectedModelBehavior('Streamed response ended without content or tool calls') assert isinstance(first_chunk, responses.ResponseCreatedEvent) return OpenAIResponsesStreamedResponse( model_request_parameters=model_request_parameters, _model_name=first_chunk.response.model, _response=peekable_response, _timestamp=number_to_datetime(first_chunk.response.created_at), _provider_name=self._provider.name, ) @overload async def _responses_create( self, messages: list[ModelRequest | ModelResponse], stream: Literal[False], model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) 
-> responses.Response: ... @overload async def _responses_create( self, messages: list[ModelRequest | ModelResponse], stream: Literal[True], model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> AsyncStream[responses.ResponseStreamEvent]: ... async def _responses_create( self, messages: list[ModelRequest | ModelResponse], stream: bool, model_settings: OpenAIResponsesModelSettings, model_request_parameters: ModelRequestParameters, ) -> responses.Response | AsyncStream[responses.ResponseStreamEvent]: tools = ( self._get_builtin_tools(model_request_parameters) + list(model_settings.get('openai_builtin_tools', [])) + self._get_tools(model_request_parameters) ) if not tools: tool_choice: Literal['none', 'required', 'auto'] | None = None elif not model_request_parameters.allow_text_output: tool_choice = 'required' else: tool_choice = 'auto' previous_response_id = model_settings.get('openai_previous_response_id') if previous_response_id == 'auto': previous_response_id, messages = self._get_previous_response_id_and_new_messages(messages) instructions, openai_messages = await self._map_messages(messages, model_settings) reasoning = self._get_reasoning(model_settings) text: responses.ResponseTextConfigParam | None = None if model_request_parameters.output_mode == 'native': output_object = model_request_parameters.output_object assert output_object is not None text = {'format': self._map_json_schema(output_object)} elif ( model_request_parameters.output_mode == 'prompted' and self.profile.supports_json_object_output ): # pragma: no branch text = {'format': {'type': 'json_object'}} # Without this trick, we'd hit this error: # > Response input messages must contain the word 'json' in some form to use 'text.format' of type 'json_object'. # Apparently they're only checking input messages for "JSON", not instructions. 
assert isinstance(instructions, str) openai_messages.insert(0, responses.EasyInputMessageParam(role='system', content=instructions)) instructions = NOT_GIVEN if verbosity := model_settings.get('openai_text_verbosity'): text = text or {} text['verbosity'] = verbosity profile = OpenAIModelProfile.from_profile(self.profile) unsupported_model_settings = profile.openai_unsupported_model_settings for setting in unsupported_model_settings: model_settings.pop(setting, None) include: list[responses.ResponseIncludable] = [] if profile.openai_supports_encrypted_reasoning_content: include.append('reasoning.encrypted_content') if model_settings.get('openai_include_code_execution_outputs'): include.append('code_interpreter_call.outputs') if model_settings.get('openai_include_web_search_sources'): include.append('web_search_call.action.sources') # pyright: ignore[reportArgumentType] try: extra_headers = model_settings.get('extra_headers', {}) extra_headers.setdefault('User-Agent', get_user_agent()) return await self.client.responses.create( input=openai_messages, model=self._model_name, instructions=instructions, parallel_tool_calls=model_settings.get('parallel_tool_calls', NOT_GIVEN), tools=tools or NOT_GIVEN, tool_choice=tool_choice or NOT_GIVEN, max_output_tokens=model_settings.get('max_tokens', NOT_GIVEN), stream=stream, temperature=model_settings.get('temperature', NOT_GIVEN), top_p=model_settings.get('top_p', NOT_GIVEN), truncation=model_settings.get('openai_truncation', NOT_GIVEN), timeout=model_settings.get('timeout', NOT_GIVEN), service_tier=model_settings.get('openai_service_tier', NOT_GIVEN), previous_response_id=previous_response_id or NOT_GIVEN, reasoning=reasoning, user=model_settings.get('openai_user', NOT_GIVEN), text=text or NOT_GIVEN, include=include or NOT_GIVEN, extra_headers=extra_headers, extra_body=model_settings.get('extra_body'), ) except APIStatusError as e: if (status_code := e.status_code) >= 400: raise ModelHTTPError(status_code=status_code, model_name=self.model_name, body=e.body) from e raise # pragma: lax no cover def _get_reasoning(self, model_settings: OpenAIResponsesModelSettings) -> Reasoning | NotGiven: reasoning_effort = model_settings.get('openai_reasoning_effort', None) reasoning_summary = model_settings.get('openai_reasoning_summary', None) reasoning_generate_summary = model_settings.get('openai_reasoning_generate_summary', None) if reasoning_summary and reasoning_generate_summary: # pragma: no cover raise ValueError('`openai_reasoning_summary` and `openai_reasoning_generate_summary` cannot both be set.') if reasoning_generate_summary is not None: # pragma: no cover warnings.warn( '`openai_reasoning_generate_summary` is deprecated, use `openai_reasoning_summary` instead', DeprecationWarning, ) reasoning_summary = reasoning_generate_summary if reasoning_effort is None and reasoning_summary is None: return NOT_GIVEN return Reasoning(effort=reasoning_effort, summary=reasoning_summary) def _get_tools(self, model_request_parameters: ModelRequestParameters) -> list[responses.FunctionToolParam]: return [self._map_tool_definition(r) for r in model_request_parameters.tool_defs.values()] def _get_builtin_tools(self, model_request_parameters: ModelRequestParameters) -> list[responses.ToolParam]: tools: list[responses.ToolParam] = [] has_image_generating_tool = False for tool in model_request_parameters.builtin_tools: if isinstance(tool, WebSearchTool): web_search_tool = responses.WebSearchToolParam( type='web_search', search_context_size=tool.search_context_size ) if 
tool.user_location: web_search_tool['user_location'] = responses.web_search_tool_param.UserLocation( type='approximate', **tool.user_location ) tools.append(web_search_tool) elif isinstance(tool, CodeExecutionTool): has_image_generating_tool = True tools.append({'type': 'code_interpreter', 'container': {'type': 'auto'}}) elif isinstance(tool, ImageGenerationTool): # pragma: no branch has_image_generating_tool = True tools.append( responses.tool_param.ImageGeneration( type='image_generation', background=tool.background, input_fidelity=tool.input_fidelity, moderation=tool.moderation, output_compression=tool.output_compression, output_format=tool.output_format or 'png', partial_images=tool.partial_images, quality=tool.quality, size=tool.size, ) ) else: raise UserError( # pragma: no cover f'`{tool.__class__.__name__}` is not supported by `OpenAIResponsesModel`. If it should be, please file an issue.' ) if model_request_parameters.allow_image_output and not has_image_generating_tool: tools.append({'type': 'image_generation'}) return tools def _map_tool_definition(self, f: ToolDefinition) -> responses.FunctionToolParam: return { 'name': f.name, 'parameters': f.parameters_json_schema, 'type': 'function', 'description': f.description, 'strict': bool( f.strict and OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition ), } def _get_previous_response_id_and_new_messages( self, messages: list[ModelMessage] ) -> tuple[str | None, list[ModelMessage]]: # When `openai_previous_response_id` is set to 'auto', the most recent # `provider_response_id` from the message history is selected and all # earlier messages are omitted. This allows the OpenAI SDK to reuse # server-side history for efficiency. The returned tuple contains the # `previous_response_id` (if found) and the trimmed list of messages. 
previous_response_id = None trimmed_messages: list[ModelMessage] = [] for m in reversed(messages): if isinstance(m, ModelResponse) and m.provider_name == self.system: previous_response_id = m.provider_response_id break else: trimmed_messages.append(m) if previous_response_id and trimmed_messages: return previous_response_id, list(reversed(trimmed_messages)) else: return None, messages async def _map_messages( # noqa: C901 self, messages: list[ModelMessage], model_settings: OpenAIResponsesModelSettings ) -> tuple[str | NotGiven, list[responses.ResponseInputItemParam]]: """Just maps a `pydantic_ai.Message` to a `openai.types.responses.ResponseInputParam`.""" profile = OpenAIModelProfile.from_profile(self.profile) send_item_ids = model_settings.get( 'openai_send_reasoning_ids', profile.openai_supports_encrypted_reasoning_content ) openai_messages: list[responses.ResponseInputItemParam] = [] for message in messages: if isinstance(message, ModelRequest): for part in message.parts: if isinstance(part, SystemPromptPart): openai_messages.append(responses.EasyInputMessageParam(role='system', content=part.content)) elif isinstance(part, UserPromptPart): openai_messages.append(await self._map_user_prompt(part)) elif isinstance(part, ToolReturnPart): call_id = _guard_tool_call_id(t=part) call_id, _ = _split_combined_tool_call_id(call_id) item = FunctionCallOutput( type='function_call_output', call_id=call_id, output=part.model_response_str(), ) openai_messages.append(item) elif isinstance(part, RetryPromptPart): if part.tool_name is None: openai_messages.append( Message(role='user', content=[{'type': 'input_text', 'text': part.model_response()}]) ) else: call_id = _guard_tool_call_id(t=part) call_id, _ = _split_combined_tool_call_id(call_id) item = FunctionCallOutput( type='function_call_output', call_id=call_id, output=part.model_response(), ) openai_messages.append(item) else: assert_never(part) elif isinstance(message, ModelResponse): send_item_ids = send_item_ids and message.provider_name == self.system message_item: responses.ResponseOutputMessageParam | None = None reasoning_item: responses.ResponseReasoningItemParam | None = None web_search_item: responses.ResponseFunctionWebSearchParam | None = None code_interpreter_item: responses.ResponseCodeInterpreterToolCallParam | None = None for item in message.parts: if isinstance(item, TextPart): if item.id and send_item_ids: if message_item is None or message_item['id'] != item.id: # pragma: no branch message_item = responses.ResponseOutputMessageParam( role='assistant', id=item.id, content=[], type='message', status='completed', ) openai_messages.append(message_item) message_item['content'] = [ *message_item['content'], responses.ResponseOutputTextParam( text=item.content, type='output_text', annotations=[] ), ] else: openai_messages.append( responses.EasyInputMessageParam(role='assistant', content=item.content) ) elif isinstance(item, ToolCallPart): call_id = _guard_tool_call_id(t=item) call_id, id = _split_combined_tool_call_id(call_id) id = id or item.id param = responses.ResponseFunctionToolCallParam( name=item.tool_name, arguments=item.args_as_json_str(), call_id=call_id, type='function_call', ) if id and send_item_ids: # pragma: no branch param['id'] = id openai_messages.append(param) elif isinstance(item, BuiltinToolCallPart): if item.provider_name == self.system and send_item_ids: if ( item.tool_name == CodeExecutionTool.kind and item.tool_call_id and (args := item.args_as_dict()) and (container_id := args.get('container_id')) ): 
code_interpreter_item = responses.ResponseCodeInterpreterToolCallParam( id=item.tool_call_id, code=args.get('code'), container_id=container_id, outputs=None, # These can be read server-side status='completed', type='code_interpreter_call', ) openai_messages.append(code_interpreter_item) elif ( item.tool_name == WebSearchTool.kind and item.tool_call_id and (args := item.args_as_dict()) ): web_search_item = responses.ResponseFunctionWebSearchParam( id=item.tool_call_id, action=cast(responses.response_function_web_search_param.Action, args), status='completed', type='web_search_call', ) openai_messages.append(web_search_item) elif item.tool_name == ImageGenerationTool.kind and item.tool_call_id: # pragma: no branch # The cast is necessary because of https://github.com/openai/openai-python/issues/2648 image_generation_item = cast( responses.response_input_item_param.ImageGenerationCall, { 'id': item.tool_call_id, 'type': 'image_generation_call', }, ) openai_messages.append(image_generation_item) elif isinstance(item, BuiltinToolReturnPart): if item.provider_name == self.system and send_item_ids: if ( item.tool_name == CodeExecutionTool.kind and code_interpreter_item is not None and isinstance(item.content, dict) and (content := cast(dict[str, Any], item.content)) # pyright: ignore[reportUnknownMemberType] and (status := content.get('status')) ): code_interpreter_item['status'] = status elif ( item.tool_name == WebSearchTool.kind and web_search_item is not None and isinstance(item.content, dict) # pyright: ignore[reportUnknownMemberType] and (content := cast(dict[str, Any], item.content)) # pyright: ignore[reportUnknownMemberType] and (status := content.get('status')) ): web_search_item['status'] = status elif item.tool_name == ImageGenerationTool.kind: # pragma: no branch # Image generation result does not need to be sent back, just the `id` off of `BuiltinToolCallPart`. pass elif isinstance(item, FilePart): # This was generated by the `ImageGenerationTool` or `CodeExecutionTool`, # and does not need to be sent back separately from the corresponding `BuiltinToolReturnPart`. # If `send_item_ids` is false, we won't send the `BuiltinToolReturnPart`, but OpenAI does not have a type for files from the assistant. 
pass elif isinstance(item, ThinkingPart): if item.id and send_item_ids: signature: str | None = None if ( item.signature and item.provider_name == self.system and profile.openai_supports_encrypted_reasoning_content ): signature = item.signature if (reasoning_item is None or reasoning_item['id'] != item.id) and ( signature or item.content ): # pragma: no branch reasoning_item = responses.ResponseReasoningItemParam( id=item.id, summary=[], encrypted_content=signature, type='reasoning', ) openai_messages.append(reasoning_item) if item.content: # The check above guarantees that `reasoning_item` is not None assert reasoning_item is not None reasoning_item['summary'] = [ *reasoning_item['summary'], Summary(text=item.content, type='summary_text'), ] else: start_tag, end_tag = profile.thinking_tags openai_messages.append( responses.EasyInputMessageParam( role='assistant', content='\n'.join([start_tag, item.content, end_tag]) ) ) else: assert_never(item) else: assert_never(message) instructions = self._get_instructions(messages) or NOT_GIVEN return instructions, openai_messages def _map_json_schema(self, o: OutputObjectDefinition) -> responses.ResponseFormatTextJSONSchemaConfigParam: response_format_param: responses.ResponseFormatTextJSONSchemaConfigParam = { 'type': 'json_schema', 'name': o.name or DEFAULT_OUTPUT_TOOL_NAME, 'schema': o.json_schema, } if o.description: response_format_param['description'] = o.description if OpenAIModelProfile.from_profile(self.profile).openai_supports_strict_tool_definition: # pragma: no branch response_format_param['strict'] = o.strict return response_format_param @staticmethod async def _map_user_prompt(part: UserPromptPart) -> responses.EasyInputMessageParam: content: str | list[responses.ResponseInputContentParam] if isinstance(part.content, str): content = part.content else: content = [] for item in part.content: if isinstance(item, str): content.append(responses.ResponseInputTextParam(text=item, type='input_text')) elif isinstance(item, BinaryContent): if item.is_image: detail: Literal['auto', 'low', 'high'] = 'auto' if metadata := item.vendor_metadata: detail = cast( Literal['auto', 'low', 'high'], metadata.get('detail', 'auto'), ) content.append( responses.ResponseInputImageParam( image_url=item.data_uri, type='input_image', detail=detail, ) ) elif item.is_document: content.append( responses.ResponseInputFileParam( type='input_file', file_data=item.data_uri, # NOTE: Type wise it's not necessary to include the filename, but it's required by the # API itself. If we add empty string, the server sends a 500 error - which OpenAI needs # to fix. In any case, we add a placeholder name. 
filename=f'filename.{item.format}', ) ) elif item.is_audio: raise NotImplementedError('Audio as binary content is not supported for OpenAI Responses API.') else: # pragma: no cover raise RuntimeError(f'Unsupported binary content type: {item.media_type}') elif isinstance(item, ImageUrl): detail: Literal['auto', 'low', 'high'] = 'auto' image_url = item.url if metadata := item.vendor_metadata: detail = cast(Literal['auto', 'low', 'high'], metadata.get('detail', 'auto')) if item.force_download: downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension') image_url = downloaded_item['data'] content.append( responses.ResponseInputImageParam( image_url=image_url, type='input_image', detail=detail, ) ) elif isinstance(item, AudioUrl): # pragma: no cover downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension') content.append( responses.ResponseInputFileParam( type='input_file', file_data=downloaded_item['data'], filename=f'filename.{downloaded_item["data_type"]}', ) ) elif isinstance(item, DocumentUrl): downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension') content.append( responses.ResponseInputFileParam( type='input_file', file_data=downloaded_item['data'], filename=f'filename.{downloaded_item["data_type"]}', ) ) elif isinstance(item, VideoUrl): # pragma: no cover raise NotImplementedError('VideoUrl is not supported for OpenAI.') else: assert_never(item) return responses.EasyInputMessageParam(role='user', content=content) ``` #### __init__ ```python __init__( model_name: OpenAIModelName, *, provider: ( Literal[ "openai", "deepseek", "azure", "openrouter", "grok", "fireworks", "together", "nebius", ] | Provider[AsyncOpenAI] ) = "openai", profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize an OpenAI Responses model. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `model_name` | `OpenAIModelName` | The name of the OpenAI model to use. | *required* | | `provider` | `Literal['openai', 'deepseek', 'azure', 'openrouter', 'grok', 'fireworks', 'together', 'nebius'] | Provider[AsyncOpenAI]` | The provider to use. Defaults to 'openai'. | `'openai'` | | `profile` | `ModelProfileSpec | None` | The model profile to use. Defaults to a profile picked by the provider based on the model name. | `None` | | `settings` | `ModelSettings | None` | Default model settings for this model instance. | `None` | Source code in `pydantic_ai_slim/pydantic_ai/models/openai.py` ```python def __init__( self, model_name: OpenAIModelName, *, provider: Literal['openai', 'deepseek', 'azure', 'openrouter', 'grok', 'fireworks', 'together', 'nebius'] | Provider[AsyncOpenAI] = 'openai', profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize an OpenAI Responses model. Args: model_name: The name of the OpenAI model to use. provider: The provider to use. Defaults to `'openai'`. profile: The model profile to use. Defaults to a profile picked by the provider based on the model name. settings: Default model settings for this model instance. """ self._model_name = model_name if isinstance(provider, str): provider = infer_provider(provider) self._provider = provider self.client = provider.client super().__init__(settings=settings, profile=profile or provider.model_profile) ``` #### model_name ```python model_name: OpenAIModelName ``` The model name. #### system ```python system: str ``` The model provider. 
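Tying the reference above together, here's a minimal sketch (our own illustrative example, not taken from the official docs) of passing an `OpenAIResponsesModel` plus `OpenAIResponsesModelSettings` to an `Agent`. It assumes the default `'openai'` provider with an `OPENAI_API_KEY` set in the environment, and uses only settings keys that appear in `_responses_create` and `_get_reasoning` above:

openai_responses_example.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel, OpenAIResponsesModelSettings

# Assumes the default 'openai' provider and an OPENAI_API_KEY environment variable.
model = OpenAIResponsesModel('gpt-5')

# OpenAIResponsesModelSettings is a TypedDict, so only the keys you need are passed.
settings = OpenAIResponsesModelSettings(
    openai_reasoning_effort='low',  # forwarded to the Responses API via _get_reasoning()
    openai_reasoning_summary='detailed',  # request a reasoning summary alongside the answer
)

agent = Agent(model, model_settings=settings, instructions='Be concise.')
result = agent.run_sync('What is the capital of France?')
print(result.output)
```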
# `pydantic_ai.models.test` Utility model for quickly testing apps built with Pydantic AI. Here's a minimal example: test_model_usage.py ```py from pydantic_ai import Agent from pydantic_ai.models.test import TestModel my_agent = Agent('openai:gpt-4o', system_prompt='...') async def test_my_agent(): """Unit test for my_agent, to be run by pytest.""" m = TestModel() with my_agent.override(model=m): result = await my_agent.run('Testing my agent...') assert result.output == 'success (no tool calls)' assert m.last_model_request_parameters.function_tools == [] ``` See [Unit testing with `TestModel`](../../../testing/#unit-testing-with-testmodel) for detailed documentation. ### TestModel Bases: `Model` A model specifically for testing purposes. This will (by default) call all tools in the agent, then return a tool response if possible, otherwise a plain response. How useful this model is will vary significantly. Apart from `__init__` derived by the `dataclass` decorator, all methods are private or match those of the base class. Source code in `pydantic_ai_slim/pydantic_ai/models/test.py` ```python @dataclass(init=False) class TestModel(Model): """A model specifically for testing purposes. This will (by default) call all tools in the agent, then return a tool response if possible, otherwise a plain response. How useful this model is will vary significantly. Apart from `__init__` derived by the `dataclass` decorator, all methods are private or match those of the base class. """ # NOTE: Avoid test discovery by pytest. __test__ = False call_tools: list[str] | Literal['all'] = 'all' """List of tools to call. If `'all'`, all tools will be called.""" custom_output_text: str | None = None """If set, this text is returned as the final output.""" custom_output_args: Any | None = None """If set, these args will be passed to the output tool.""" seed: int = 0 """Seed for generating random data.""" last_model_request_parameters: ModelRequestParameters | None = field(default=None, init=False) """The last ModelRequestParameters passed to the model in a request. The ModelRequestParameters contains information about the function and output tools available during request handling. This is set when a request is made, so will reflect the function tools from the last step of the last run. 
""" _model_name: str = field(default='test', repr=False) _system: str = field(default='test', repr=False) def __init__( self, *, call_tools: list[str] | Literal['all'] = 'all', custom_output_text: str | None = None, custom_output_args: Any | None = None, seed: int = 0, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize TestModel with optional settings and profile.""" self.call_tools = call_tools self.custom_output_text = custom_output_text self.custom_output_args = custom_output_args self.seed = seed self.last_model_request_parameters = None self._model_name = 'test' self._system = 'test' super().__init__(settings=settings, profile=profile) async def request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) self.last_model_request_parameters = model_request_parameters model_response = self._request(messages, model_settings, model_request_parameters) model_response.usage = _estimate_usage([*messages, model_response]) return model_response @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: model_settings, model_request_parameters = self.prepare_request( model_settings, model_request_parameters, ) self.last_model_request_parameters = model_request_parameters model_response = self._request(messages, model_settings, model_request_parameters) yield TestStreamedResponse( model_request_parameters=model_request_parameters, _model_name=self._model_name, _structured_response=model_response, _messages=messages, _provider_name=self._system, ) @property def model_name(self) -> str: """The model name.""" return self._model_name @property def system(self) -> str: """The model provider.""" return self._system def gen_tool_args(self, tool_def: ToolDefinition) -> Any: return _JsonSchemaTestData(tool_def.parameters_json_schema, self.seed).generate() def _get_tool_calls(self, model_request_parameters: ModelRequestParameters) -> list[tuple[str, ToolDefinition]]: if self.call_tools == 'all': return [(r.name, r) for r in model_request_parameters.function_tools] else: function_tools_lookup = {t.name: t for t in model_request_parameters.function_tools} tools_to_call = (function_tools_lookup[name] for name in self.call_tools) return [(r.name, r) for r in tools_to_call] def _get_output(self, model_request_parameters: ModelRequestParameters) -> _WrappedTextOutput | _WrappedToolOutput: if self.custom_output_text is not None: assert model_request_parameters.output_mode != 'tool', ( 'Plain response not allowed, but `custom_output_text` is set.' ) assert self.custom_output_args is None, 'Cannot set both `custom_output_text` and `custom_output_args`.' return _WrappedTextOutput(self.custom_output_text) elif self.custom_output_args is not None: assert model_request_parameters.output_tools is not None, ( 'No output tools provided, but `custom_output_args` is set.' 
) output_tool = model_request_parameters.output_tools[0] if k := output_tool.outer_typed_dict_key: return _WrappedToolOutput({k: self.custom_output_args}) else: return _WrappedToolOutput(self.custom_output_args) elif model_request_parameters.allow_text_output: return _WrappedTextOutput(None) elif model_request_parameters.output_tools: return _WrappedToolOutput(None) else: return _WrappedTextOutput(None) def _request( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> ModelResponse: if model_request_parameters.builtin_tools: raise UserError('TestModel does not support built-in tools') tool_calls = self._get_tool_calls(model_request_parameters) output_wrapper = self._get_output(model_request_parameters) output_tools = model_request_parameters.output_tools # if there are tools, the first thing we want to do is call all of them if tool_calls and not any(isinstance(m, ModelResponse) for m in messages): return ModelResponse( parts=[ ToolCallPart(name, self.gen_tool_args(args), tool_call_id=f'pyd_ai_tool_call_id__{name}') for name, args in tool_calls ], model_name=self._model_name, ) if messages: # pragma: no branch last_message = messages[-1] assert isinstance(last_message, ModelRequest), 'Expected last message to be a `ModelRequest`.' # check if there are any retry prompts, if so retry them new_retry_names = {p.tool_name for p in last_message.parts if isinstance(p, RetryPromptPart)} if new_retry_names: # Handle retries for both function tools and output tools # Check function tools first retry_parts: list[ModelResponsePart] = [ ToolCallPart(name, self.gen_tool_args(args)) for name, args in tool_calls if name in new_retry_names ] # Check output tools if output_tools: retry_parts.extend( [ ToolCallPart( tool.name, output_wrapper.value if isinstance(output_wrapper, _WrappedToolOutput) and output_wrapper.value is not None else self.gen_tool_args(tool), tool_call_id=f'pyd_ai_tool_call_id__{tool.name}', ) for tool in output_tools if tool.name in new_retry_names ] ) return ModelResponse(parts=retry_parts, model_name=self._model_name) if isinstance(output_wrapper, _WrappedTextOutput): if (response_text := output_wrapper.value) is None: # build up details of tool responses output: dict[str, Any] = {} for message in messages: if isinstance(message, ModelRequest): for part in message.parts: if isinstance(part, ToolReturnPart): output[part.tool_name] = part.content if output: return ModelResponse( parts=[TextPart(pydantic_core.to_json(output).decode())], model_name=self._model_name ) else: return ModelResponse(parts=[TextPart('success (no tool calls)')], model_name=self._model_name) else: return ModelResponse(parts=[TextPart(response_text)], model_name=self._model_name) else: assert output_tools, 'No output tools provided' custom_output_args = output_wrapper.value output_tool = output_tools[self.seed % len(output_tools)] if custom_output_args is not None: return ModelResponse( parts=[ ToolCallPart( output_tool.name, custom_output_args, tool_call_id=f'pyd_ai_tool_call_id__{output_tool.name}', ) ], model_name=self._model_name, ) else: response_args = self.gen_tool_args(output_tool) return ModelResponse( parts=[ ToolCallPart( output_tool.name, response_args, tool_call_id=f'pyd_ai_tool_call_id__{output_tool.name}', ) ], model_name=self._model_name, ) ``` #### __init__ ```python __init__( *, call_tools: list[str] | Literal["all"] = "all", custom_output_text: str | None = None, custom_output_args: Any | None = None, seed: int = 0, 
profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None ) ``` Initialize TestModel with optional settings and profile. Source code in `pydantic_ai_slim/pydantic_ai/models/test.py` ```python def __init__( self, *, call_tools: list[str] | Literal['all'] = 'all', custom_output_text: str | None = None, custom_output_args: Any | None = None, seed: int = 0, profile: ModelProfileSpec | None = None, settings: ModelSettings | None = None, ): """Initialize TestModel with optional settings and profile.""" self.call_tools = call_tools self.custom_output_text = custom_output_text self.custom_output_args = custom_output_args self.seed = seed self.last_model_request_parameters = None self._model_name = 'test' self._system = 'test' super().__init__(settings=settings, profile=profile) ``` #### call_tools ```python call_tools: list[str] | Literal['all'] = call_tools ``` List of tools to call. If `'all'`, all tools will be called. #### custom_output_text ```python custom_output_text: str | None = custom_output_text ``` If set, this text is returned as the final output. #### custom_output_args ```python custom_output_args: Any | None = custom_output_args ``` If set, these args will be passed to the output tool. #### seed ```python seed: int = seed ``` Seed for generating random data. #### last_model_request_parameters ```python last_model_request_parameters: ( ModelRequestParameters | None ) = None ``` The last ModelRequestParameters passed to the model in a request. The ModelRequestParameters contains information about the function and output tools available during request handling. This is set when a request is made, so will reflect the function tools from the last step of the last run. #### model_name ```python model_name: str ``` The model name. #### system ```python system: str ``` The model provider. ### TestStreamedResponse Bases: `StreamedResponse` A structured response that streams test data. 
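You normally don't construct `TestStreamedResponse` directly; it's what `TestModel` yields from `request_stream()` when an agent run is streamed under `override()`. Here's a minimal sketch of that flow (our own illustrative example; the test name and assertion are ours), with the source listing after it:

test_model_streaming.py

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent('openai:gpt-4o', system_prompt='...')

async def test_my_agent_streaming():
    """TestModel streams its canned text word by word through TestStreamedResponse."""
    with agent.override(model=TestModel()):
        async with agent.run_stream('Testing streaming...') as result:
            # stream_text() yields progressively longer snapshots of the text output
            chunks = [text async for text in result.stream_text()]
    assert chunks[-1] == 'success (no tool calls)'
```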
Source code in `pydantic_ai_slim/pydantic_ai/models/test.py` ```python @dataclass class TestStreamedResponse(StreamedResponse): """A structured response that streams test data.""" _model_name: str _structured_response: ModelResponse _messages: InitVar[Iterable[ModelMessage]] _provider_name: str _timestamp: datetime = field(default_factory=_utils.now_utc, init=False) def __post_init__(self, _messages: Iterable[ModelMessage]): self._usage = _estimate_usage(_messages) async def _get_event_iterator(self) -> AsyncIterator[ModelResponseStreamEvent]: for i, part in enumerate(self._structured_response.parts): if isinstance(part, TextPart): text = part.content *words, last_word = text.split(' ') words = [f'{word} ' for word in words] words.append(last_word) if len(words) == 1 and len(text) > 2: mid = len(text) // 2 words = [text[:mid], text[mid:]] self._usage += _get_string_usage('') maybe_event = self._parts_manager.handle_text_delta(vendor_part_id=i, content='') if maybe_event is not None: # pragma: no branch yield maybe_event for word in words: self._usage += _get_string_usage(word) maybe_event = self._parts_manager.handle_text_delta(vendor_part_id=i, content=word) if maybe_event is not None: # pragma: no branch yield maybe_event elif isinstance(part, ToolCallPart): yield self._parts_manager.handle_tool_call_part( vendor_part_id=i, tool_name=part.tool_name, args=part.args, tool_call_id=part.tool_call_id ) elif isinstance(part, BuiltinToolCallPart | BuiltinToolReturnPart): # pragma: no cover # NOTE: These parts are not generated by TestModel, but we need to handle them for type checking assert False, f'Unexpected part type in TestModel: {type(part).__name__}' elif isinstance(part, ThinkingPart): # pragma: no cover # NOTE: There's no way to reach this part of the code, since we don't generate ThinkingPart on TestModel. assert False, "This should be unreachable — we don't generate ThinkingPart on TestModel." elif isinstance(part, FilePart): # pragma: no cover # NOTE: There's no way to reach this part of the code, since we don't generate FilePart on TestModel. assert False, "This should be unreachable — we don't generate FilePart on TestModel." else: assert_never(part) @property def model_name(self) -> str: """Get the model name of the response.""" return self._model_name @property def provider_name(self) -> str: """Get the provider name.""" return self._provider_name @property def timestamp(self) -> datetime: """Get the timestamp of the response.""" return self._timestamp ``` #### model_name ```python model_name: str ``` Get the model name of the response. #### provider_name ```python provider_name: str ``` Get the provider name. #### timestamp ```python timestamp: datetime ``` Get the timestamp of the response. # pydantic_ai.models.wrapper ### WrapperModel Bases: `Model` Model which wraps another model. Does nothing on its own, used as a base class. Source code in `pydantic_ai_slim/pydantic_ai/models/wrapper.py` ```python @dataclass(init=False) class WrapperModel(Model): """Model which wraps another model. Does nothing on its own, used as a base class. 
""" wrapped: Model """The underlying model being wrapped.""" def __init__(self, wrapped: Model | KnownModelName): super().__init__() self.wrapped = infer_model(wrapped) async def request(self, *args: Any, **kwargs: Any) -> ModelResponse: return await self.wrapped.request(*args, **kwargs) @asynccontextmanager async def request_stream( self, messages: list[ModelMessage], model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, run_context: RunContext[Any] | None = None, ) -> AsyncIterator[StreamedResponse]: async with self.wrapped.request_stream( messages, model_settings, model_request_parameters, run_context ) as response_stream: yield response_stream def customize_request_parameters(self, model_request_parameters: ModelRequestParameters) -> ModelRequestParameters: return self.wrapped.customize_request_parameters(model_request_parameters) def prepare_request( self, model_settings: ModelSettings | None, model_request_parameters: ModelRequestParameters, ) -> tuple[ModelSettings | None, ModelRequestParameters]: return self.wrapped.prepare_request(model_settings, model_request_parameters) @property def model_name(self) -> str: return self.wrapped.model_name @property def system(self) -> str: return self.wrapped.system @cached_property def profile(self) -> ModelProfile: return self.wrapped.profile @property def settings(self) -> ModelSettings | None: """Get the settings from the wrapped model.""" return self.wrapped.settings def __getattr__(self, item: str): return getattr(self.wrapped, item) ``` #### wrapped ```python wrapped: Model = infer_model(wrapped) ``` The underlying model being wrapped. #### settings ```python settings: ModelSettings | None ``` Get the settings from the wrapped model. # `pydantic_evals.dataset` Dataset management for pydantic evals. This module provides functionality for creating, loading, saving, and evaluating datasets of test cases. Each case must have inputs, and can optionally have a name, expected output, metadata, and case-specific evaluators. Datasets can be loaded from and saved to YAML or JSON files, and can be evaluated against a task function to produce an evaluation report. ### Case Bases: `Generic[InputsT, OutputT, MetadataT]` A single row of a Dataset. Each case represents a single test scenario with inputs to test. A case may optionally specify a name, expected outputs to compare against, and arbitrary metadata. Cases can also have their own specific evaluators which are run in addition to dataset-level evaluators. Example: ```python from pydantic_evals import Case case = Case( name='Simple addition', inputs={'a': 1, 'b': 2}, expected_output=3, metadata={'description': 'Tests basic addition'}, ) ``` Source code in `pydantic_evals/pydantic_evals/dataset.py` ````python @dataclass(init=False) class Case(Generic[InputsT, OutputT, MetadataT]): """A single row of a [`Dataset`][pydantic_evals.Dataset]. Each case represents a single test scenario with inputs to test. A case may optionally specify a name, expected outputs to compare against, and arbitrary metadata. Cases can also have their own specific evaluators which are run in addition to dataset-level evaluators. Example: ```python from pydantic_evals import Case case = Case( name='Simple addition', inputs={'a': 1, 'b': 2}, expected_output=3, metadata={'description': 'Tests basic addition'}, ) ``` """ name: str | None """Name of the case. This is used to identify the case in the report and can be used to filter cases.""" inputs: InputsT """Inputs to the task. 
This is the input to the task that will be evaluated.""" metadata: MetadataT | None = None """Metadata to be used in the evaluation. This can be used to provide additional information about the case to the evaluators. """ expected_output: OutputT | None = None """Expected output of the task. This is the expected output of the task that will be evaluated.""" evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = field(default_factory=list) """Evaluators to be used just on this case.""" def __init__( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ): """Initialize a new test case. Args: name: Optional name for the case. If not provided, a generic name will be assigned when added to a dataset. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: Optional expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators. """ # Note: `evaluators` must be a tuple instead of Sequence due to misbehavior with pyright's generic parameter # inference if it has type `Sequence` self.name = name self.inputs = inputs self.metadata = metadata self.expected_output = expected_output self.evaluators = list(evaluators) ```` #### __init__ ```python __init__( *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[ Evaluator[InputsT, OutputT, MetadataT], ... ] = () ) ``` Initialize a new test case. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | None` | Optional name for the case. If not provided, a generic name will be assigned when added to a dataset. | `None` | | `inputs` | `InputsT` | The inputs to the task being evaluated. | *required* | | `metadata` | `MetadataT | None` | Optional metadata for the case, which can be used by evaluators. | `None` | | `expected_output` | `OutputT | None` | Optional expected output of the task, used for comparison in evaluators. | `None` | | `evaluators` | `tuple[Evaluator[InputsT, OutputT, MetadataT], ...]` | Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def __init__( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ): """Initialize a new test case. Args: name: Optional name for the case. If not provided, a generic name will be assigned when added to a dataset. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: Optional expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case. These are in addition to any dataset-level evaluators. """ # Note: `evaluators` must be a tuple instead of Sequence due to misbehavior with pyright's generic parameter # inference if it has type `Sequence` self.name = name self.inputs = inputs self.metadata = metadata self.expected_output = expected_output self.evaluators = list(evaluators) ``` #### name ```python name: str | None = name ``` Name of the case. 
This is used to identify the case in the report and can be used to filter cases. #### inputs ```python inputs: InputsT = inputs ``` Inputs to the task. This is the input to the task that will be evaluated. #### metadata ```python metadata: MetadataT | None = metadata ``` Metadata to be used in the evaluation. This can be used to provide additional information about the case to the evaluators. #### expected_output ```python expected_output: OutputT | None = expected_output ``` Expected output of the task. This is the expected output of the task that will be evaluated. #### evaluators ```python evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = ( list(evaluators) ) ``` Evaluators to be used just on this case. ### Dataset Bases: `BaseModel`, `Generic[InputsT, OutputT, MetadataT]` A dataset of test cases. Datasets allow you to organize a collection of test cases and evaluate them against a task function. They can be loaded from and saved to YAML or JSON files, and can have dataset-level evaluators that apply to all cases. Example: ```python # Create a dataset with two test cases from dataclasses import dataclass from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output dataset = Dataset( cases=[ Case(name='test1', inputs={'text': 'Hello'}, expected_output='HELLO'), Case(name='test2', inputs={'text': 'World'}, expected_output='WORLD'), ], evaluators=[ExactMatch()], ) # Evaluate the dataset against a task function async def uppercase(inputs: dict) -> str: return inputs['text'].upper() async def main(): report = await dataset.evaluate(uppercase) report.print() ''' Evaluation Summary: uppercase ┏━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ test1 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ test2 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └──────────┴────────────┴──────────┘ ''' ``` Source code in `pydantic_evals/pydantic_evals/dataset.py` ````python class Dataset(BaseModel, Generic[InputsT, OutputT, MetadataT], extra='forbid', arbitrary_types_allowed=True): """A dataset of test [cases][pydantic_evals.Case]. Datasets allow you to organize a collection of test cases and evaluate them against a task function. They can be loaded from and saved to YAML or JSON files, and can have dataset-level evaluators that apply to all cases. 
Example: ```python # Create a dataset with two test cases from dataclasses import dataclass from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output dataset = Dataset( cases=[ Case(name='test1', inputs={'text': 'Hello'}, expected_output='HELLO'), Case(name='test2', inputs={'text': 'World'}, expected_output='WORLD'), ], evaluators=[ExactMatch()], ) # Evaluate the dataset against a task function async def uppercase(inputs: dict) -> str: return inputs['text'].upper() async def main(): report = await dataset.evaluate(uppercase) report.print() ''' Evaluation Summary: uppercase ┏━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ test1 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ test2 │ ✔ │ 10ms │ ├──────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └──────────┴────────────┴──────────┘ ''' ``` """ name: str | None = None """Optional name of the dataset.""" cases: list[Case[InputsT, OutputT, MetadataT]] """List of test cases in the dataset.""" evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = [] """List of evaluators to be used on all cases in the dataset.""" def __init__( self, *, name: str | None = None, cases: Sequence[Case[InputsT, OutputT, MetadataT]], evaluators: Sequence[Evaluator[InputsT, OutputT, MetadataT]] = (), ): """Initialize a new dataset with test cases and optional evaluators. Args: name: Optional name for the dataset. cases: Sequence of test cases to include in the dataset. evaluators: Optional sequence of evaluators to apply to all cases in the dataset. """ case_names = set[str]() for case in cases: if case.name is None: continue if case.name in case_names: raise ValueError(f'Duplicate case name: {case.name!r}') case_names.add(case.name) super().__init__( name=name, cases=cases, evaluators=list(evaluators), ) # TODO in v2: Make everything not required keyword-only async def evaluate( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, *, task_name: str | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. name: The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task_name will be used; if that is not specified, the name of the task function is used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to `True`. retry_task: Optional retry configuration for the task execution. retry_evaluators: Optional retry configuration for evaluator execution. task_name: Optional override to the name of the task being executed, otherwise the name of the task function will be used. Returns: A report containing the results of the evaluation. 
""" task_name = task_name or get_unwrapped_function_name(task) name = name or task_name total_cases = len(self.cases) progress_bar = Progress() if progress else None limiter = anyio.Semaphore(max_concurrency) if max_concurrency is not None else AsyncExitStack() with ( logfire_span( 'evaluate {name}', name=name, task_name=task_name, dataset_name=self.name, n_cases=len(self.cases), **{'gen_ai.operation.name': 'experiment'}, # pyright: ignore[reportArgumentType] ) as eval_span, progress_bar or nullcontext(), ): task_id = progress_bar.add_task(f'Evaluating {task_name}', total=total_cases) if progress_bar else None async def _handle_case(case: Case[InputsT, OutputT, MetadataT], report_case_name: str): async with limiter: result = await _run_task_and_evaluators( task, case, report_case_name, self.evaluators, retry_task, retry_evaluators ) if progress_bar and task_id is not None: # pragma: no branch progress_bar.update(task_id, advance=1) return result if (context := eval_span.context) is None: # pragma: no cover trace_id = None span_id = None else: trace_id = f'{context.trace_id:032x}' span_id = f'{context.span_id:016x}' cases_and_failures = await task_group_gather( [ lambda case=case, i=i: _handle_case(case, case.name or f'Case {i}') for i, case in enumerate(self.cases, 1) ] ) cases: list[ReportCase] = [] failures: list[ReportCaseFailure] = [] for item in cases_and_failures: if isinstance(item, ReportCase): cases.append(item) else: failures.append(item) report = EvaluationReport( name=name, cases=cases, failures=failures, span_id=span_id, trace_id=trace_id, ) if (averages := report.averages()) is not None and averages.assertions is not None: eval_span.set_attribute('assertion_pass_rate', averages.assertions) return report def evaluate_sync( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This is a synchronous wrapper around [`evaluate`][pydantic_evals.Dataset.evaluate] provided for convenience. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. name: The name of the task being evaluated, this is used to identify the task in the report. If omitted, the name of the task function will be used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to True. retry_task: Optional retry configuration for the task execution. retry_evaluators: Optional retry configuration for evaluator execution. Returns: A report containing the results of the evaluation. """ return get_event_loop().run_until_complete( self.evaluate( task, task_name=name, max_concurrency=max_concurrency, progress=progress, retry_task=retry_task, retry_evaluators=retry_evaluators, ) ) def add_case( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ) -> None: """Adds a case to the dataset. This is a convenience method for creating a [`Case`][pydantic_evals.Case] and adding it to the dataset. Args: name: Optional name for the case. 
If not provided, a generic name will be assigned. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: The expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case, in addition to dataset-level evaluators. """ if name in {case.name for case in self.cases}: raise ValueError(f'Duplicate case name: {name!r}') case = Case[InputsT, OutputT, MetadataT]( name=name, inputs=inputs, metadata=metadata, expected_output=expected_output, evaluators=evaluators, ) self.cases.append(case) def add_evaluator( self, evaluator: Evaluator[InputsT, OutputT, MetadataT], specific_case: str | None = None, ) -> None: """Adds an evaluator to the dataset or a specific case. Args: evaluator: The evaluator to add. specific_case: If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset. Raises: ValueError: If `specific_case` is provided but no case with that name exists in the dataset. """ if specific_case is None: self.evaluators.append(evaluator) else: # If this is too slow, we could try to add a case lookup dict. # Note that if we do that, we'd need to make the cases list private to prevent modification. added = False for case in self.cases: if case.name == specific_case: case.evaluators.append(evaluator) added = True if not added: raise ValueError(f'Case {specific_case!r} not found in the dataset') @classmethod @functools.cache def _params(cls) -> tuple[type[InputsT], type[OutputT], type[MetadataT]]: """Get the type parameters for the Dataset class. Returns: A tuple of (InputsT, OutputT, MetadataT) types. """ for c in cls.__mro__: metadata = getattr(c, '__pydantic_generic_metadata__', {}) if len(args := (metadata.get('args', ()) or getattr(c, '__args__', ()))) == 3: # pragma: no branch return args else: # pragma: no cover warnings.warn( f'Could not determine the generic parameters for {cls}; using `Any` for each.' f' You should explicitly set the generic parameters via `Dataset[MyInputs, MyOutput, MyMetadata]`' f' when serializing or deserializing.', UserWarning, ) return Any, Any, Any # type: ignore @classmethod def from_file( cls, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Load a dataset from a file. Args: path: Path to the file to load. fmt: Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. Returns: A new Dataset instance loaded from the file. Raises: ValidationError: If the file cannot be parsed as a valid dataset. ValueError: If the format cannot be inferred from the file extension. 
""" path = Path(path) fmt = cls._infer_fmt(path, fmt) raw = Path(path).read_text() try: return cls.from_text(raw, fmt=fmt, custom_evaluator_types=custom_evaluator_types, default_name=path.stem) except ValidationError as e: # pragma: no cover raise ValueError(f'{path} contains data that does not match the schema for {cls.__name__}:\n{e}.') from e @classmethod def from_text( cls, contents: str, fmt: Literal['yaml', 'json'] = 'yaml', custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), *, default_name: str | None = None, ) -> Self: """Load a dataset from a string. Args: contents: The string content to parse. fmt: Format of the content. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. default_name: Default name of the dataset, to be used if not specified in the serialized contents. Returns: A new Dataset instance parsed from the string. Raises: ValidationError: If the content cannot be parsed as a valid dataset. """ if fmt == 'yaml': loaded = yaml.safe_load(contents) return cls.from_dict(loaded, custom_evaluator_types, default_name=default_name) else: dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate_json(contents) return cls._from_dataset_model(dataset_model, custom_evaluator_types, default_name) @classmethod def from_dict( cls, data: dict[str, Any], custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), *, default_name: str | None = None, ) -> Self: """Load a dataset from a dictionary. Args: data: Dictionary representation of the dataset. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. default_name: Default name of the dataset, to be used if not specified in the data. Returns: A new Dataset instance created from the dictionary. Raises: ValidationError: If the dictionary cannot be converted to a valid dataset. """ dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate(data) return cls._from_dataset_model(dataset_model, custom_evaluator_types, default_name) @classmethod def _from_dataset_model( cls, dataset_model: _DatasetModel[InputsT, OutputT, MetadataT], custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), default_name: str | None = None, ) -> Self: """Create a Dataset from a _DatasetModel. Args: dataset_model: The _DatasetModel to convert. custom_evaluator_types: Custom evaluator classes to register for deserialization. default_name: Default name of the dataset, to be used if the value is `None` in the provided model. Returns: A new Dataset instance created from the _DatasetModel. 
""" registry = _get_registry(custom_evaluator_types) cases: list[Case[InputsT, OutputT, MetadataT]] = [] errors: list[ValueError] = [] dataset_evaluators: list[Evaluator] = [] for spec in dataset_model.evaluators: try: dataset_evaluator = _load_evaluator_from_registry(registry, None, spec) except ValueError as e: errors.append(e) continue dataset_evaluators.append(dataset_evaluator) for row in dataset_model.cases: evaluators: list[Evaluator] = [] for spec in row.evaluators: try: evaluator = _load_evaluator_from_registry(registry, row.name, spec) except ValueError as e: errors.append(e) continue evaluators.append(evaluator) row = Case[InputsT, OutputT, MetadataT]( name=row.name, inputs=row.inputs, metadata=row.metadata, expected_output=row.expected_output, ) row.evaluators = evaluators cases.append(row) if errors: raise ExceptionGroup(f'{len(errors)} error(s) loading evaluators from registry', errors[:3]) result = cls(name=dataset_model.name, cases=cases) if result.name is None: result.name = default_name result.evaluators = dataset_evaluators return result def to_file( self, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, schema_path: Path | str | None = DEFAULT_SCHEMA_PATH_TEMPLATE, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ): """Save the dataset to a file. Args: path: Path to save the dataset to. fmt: Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. schema_path: Path to save the JSON schema to. If None, no schema will be saved. Can be a string template with {stem} which will be replaced with the dataset filename stem. custom_evaluator_types: Custom evaluator classes to include in the schema. """ path = Path(path) fmt = self._infer_fmt(path, fmt) schema_ref: str | None = None if schema_path is not None: # pragma: no branch if isinstance(schema_path, str): # pragma: no branch schema_path = Path(schema_path.format(stem=path.stem)) if not schema_path.is_absolute(): schema_ref = str(schema_path) schema_path = path.parent / schema_path elif schema_path.is_relative_to(path): # pragma: no cover schema_ref = str(_get_relative_path_reference(schema_path, path)) else: # pragma: no cover schema_ref = str(schema_path) self._save_schema(schema_path, custom_evaluator_types) context: dict[str, Any] = {'use_short_form': True} if fmt == 'yaml': dumped_data = self.model_dump(mode='json', by_alias=True, context=context) content = yaml.dump(dumped_data, sort_keys=False) if schema_ref: # pragma: no branch yaml_language_server_line = f'{_YAML_SCHEMA_LINE_PREFIX}{schema_ref}' content = f'{yaml_language_server_line}\n{content}' path.write_text(content) else: context['$schema'] = schema_ref json_data = self.model_dump_json(indent=2, by_alias=True, context=context) path.write_text(json_data + '\n') @classmethod def model_json_schema_with_evaluators( cls, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> dict[str, Any]: """Generate a JSON schema for this dataset type, including evaluator details. This is useful for generating a schema that can be used to validate YAML-format dataset files. Args: custom_evaluator_types: Custom evaluator classes to include in the schema. Returns: A dictionary representing the JSON schema. 
""" # Note: this function could maybe be simplified now that Evaluators are always dataclasses registry = _get_registry(custom_evaluator_types) evaluator_schema_types: list[Any] = [] for name, evaluator_class in registry.items(): type_hints = _typing_extra.get_function_type_hints(evaluator_class) type_hints.pop('return', None) required_type_hints: dict[str, Any] = {} for p in inspect.signature(evaluator_class).parameters.values(): type_hints.setdefault(p.name, Any) if p.default is not p.empty: type_hints[p.name] = NotRequired[type_hints[p.name]] else: required_type_hints[p.name] = type_hints[p.name] def _make_typed_dict(cls_name_prefix: str, fields: dict[str, Any]) -> Any: td = TypedDict(f'{cls_name_prefix}_{name}', fields) # pyright: ignore[reportArgumentType] config = ConfigDict(extra='forbid', arbitrary_types_allowed=True) # TODO: Replace with pydantic.with_config once pydantic 2.11 is the min supported version td.__pydantic_config__ = config # pyright: ignore[reportAttributeAccessIssue] return td # Shortest form: just the call name if len(type_hints) == 0 or not required_type_hints: evaluator_schema_types.append(Literal[name]) # Short form: can be called with only one parameter if len(type_hints) == 1: [type_hint_type] = type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) elif len(required_type_hints) == 1: # pragma: no branch [type_hint_type] = required_type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) # Long form: multiple parameters, possibly required if len(type_hints) > 1: params_td = _make_typed_dict('evaluator_params', type_hints) evaluator_schema_types.append(_make_typed_dict('evaluator', {name: params_td})) in_type, out_type, meta_type = cls._params() # Note: we shadow the `Case` and `Dataset` class names here to generate a clean JSON schema class Case(BaseModel, extra='forbid'): # pyright: ignore[reportUnusedClass] # this _is_ used below, but pyright doesn't seem to notice.. name: str | None = None inputs: in_type # pyright: ignore[reportInvalidTypeForm] metadata: meta_type | None = None # pyright: ignore[reportInvalidTypeForm,reportUnknownVariableType] expected_output: out_type | None = None # pyright: ignore[reportInvalidTypeForm,reportUnknownVariableType] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa UP007 class Dataset(BaseModel, extra='forbid'): name: str | None = None cases: list[Case] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa UP007 json_schema = Dataset.model_json_schema() # See `_add_json_schema` below, since `$schema` is added to the JSON, it has to be supported in the JSON json_schema['properties']['$schema'] = {'type': 'string'} return json_schema @classmethod def _save_schema( cls, path: Path | str, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = () ): """Save the JSON schema for this dataset type to a file. Args: path: Path to save the schema to. custom_evaluator_types: Custom evaluator classes to include in the schema. 
""" path = Path(path) json_schema = cls.model_json_schema_with_evaluators(custom_evaluator_types) schema_content = to_json(json_schema, indent=2).decode() + '\n' if not path.exists() or path.read_text() != schema_content: # pragma: no branch path.write_text(schema_content) @classmethod @functools.cache def _serialization_type(cls) -> type[_DatasetModel[InputsT, OutputT, MetadataT]]: """Get the serialization type for this dataset class. Returns: A _DatasetModel type with the same generic parameters as this Dataset class. """ input_type, output_type, metadata_type = cls._params() return _DatasetModel[input_type, output_type, metadata_type] @classmethod def _infer_fmt(cls, path: Path, fmt: Literal['yaml', 'json'] | None) -> Literal['yaml', 'json']: """Infer the format to use for a file based on its extension. Args: path: The path to infer the format for. fmt: The explicitly provided format, if any. Returns: The inferred format ('yaml' or 'json'). Raises: ValueError: If the format cannot be inferred from the file extension. """ if fmt is not None: return fmt suffix = path.suffix.lower() if suffix in {'.yaml', '.yml'}: return 'yaml' elif suffix == '.json': return 'json' raise ValueError( f'Could not infer format for filename {path.name!r}. Use the `fmt` argument to specify the format.' ) @model_serializer(mode='wrap') def _add_json_schema(self, nxt: SerializerFunctionWrapHandler, info: SerializationInfo) -> dict[str, Any]: """Add the JSON schema path to the serialized output. See for context, that seems to be the nearest there is to a spec for this. """ context = cast(dict[str, Any] | None, info.context) if isinstance(context, dict) and (schema := context.get('$schema')): return {'$schema': schema} | nxt(self) else: return nxt(self) ```` #### name ```python name: str | None = None ``` Optional name of the dataset. #### cases ```python cases: list[Case[InputsT, OutputT, MetadataT]] ``` List of test cases in the dataset. #### evaluators ```python evaluators: list[Evaluator[InputsT, OutputT, MetadataT]] = ( [] ) ``` List of evaluators to be used on all cases in the dataset. #### __init__ ```python __init__( *, name: str | None = None, cases: Sequence[Case[InputsT, OutputT, MetadataT]], evaluators: Sequence[ Evaluator[InputsT, OutputT, MetadataT] ] = () ) ``` Initialize a new dataset with test cases and optional evaluators. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | None` | Optional name for the dataset. | `None` | | `cases` | `Sequence[Case[InputsT, OutputT, MetadataT]]` | Sequence of test cases to include in the dataset. | *required* | | `evaluators` | `Sequence[Evaluator[InputsT, OutputT, MetadataT]]` | Optional sequence of evaluators to apply to all cases in the dataset. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def __init__( self, *, name: str | None = None, cases: Sequence[Case[InputsT, OutputT, MetadataT]], evaluators: Sequence[Evaluator[InputsT, OutputT, MetadataT]] = (), ): """Initialize a new dataset with test cases and optional evaluators. Args: name: Optional name for the dataset. cases: Sequence of test cases to include in the dataset. evaluators: Optional sequence of evaluators to apply to all cases in the dataset. 
""" case_names = set[str]() for case in cases: if case.name is None: continue if case.name in case_names: raise ValueError(f'Duplicate case name: {case.name!r}') case_names.add(case.name) super().__init__( name=name, cases=cases, evaluators=list(evaluators), ) ``` #### evaluate ```python evaluate( task: ( Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT] ), name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, *, task_name: str | None = None ) -> EvaluationReport[InputsT, OutputT, MetadataT] ``` Evaluates the test cases in the dataset using the given task. This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `task` | `Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT]` | The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. | *required* | | `name` | `str | None` | The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task_name will be used; if that is not specified, the name of the task function is used. | `None` | | `max_concurrency` | `int | None` | The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. | `None` | | `progress` | `bool` | Whether to show a progress bar for the evaluation. Defaults to True. | `True` | | `retry_task` | `RetryConfig | None` | Optional retry configuration for the task execution. | `None` | | `retry_evaluators` | `RetryConfig | None` | Optional retry configuration for evaluator execution. | `None` | | `task_name` | `str | None` | Optional override to the name of the task being executed, otherwise the name of the task function will be used. | `None` | Returns: | Type | Description | | --- | --- | | `EvaluationReport[InputsT, OutputT, MetadataT]` | A report containing the results of the evaluation. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python async def evaluate( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, *, task_name: str | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This method runs the task on each case in the dataset, applies evaluators, and collects results into a report. Cases are run concurrently, limited by `max_concurrency` if specified. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. name: The name of the experiment being run, this is used to identify the experiment in the report. If omitted, the task_name will be used; if that is not specified, the name of the task function is used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to `True`. retry_task: Optional retry configuration for the task execution. retry_evaluators: Optional retry configuration for evaluator execution. 
task_name: Optional override to the name of the task being executed, otherwise the name of the task function will be used. Returns: A report containing the results of the evaluation. """ task_name = task_name or get_unwrapped_function_name(task) name = name or task_name total_cases = len(self.cases) progress_bar = Progress() if progress else None limiter = anyio.Semaphore(max_concurrency) if max_concurrency is not None else AsyncExitStack() with ( logfire_span( 'evaluate {name}', name=name, task_name=task_name, dataset_name=self.name, n_cases=len(self.cases), **{'gen_ai.operation.name': 'experiment'}, # pyright: ignore[reportArgumentType] ) as eval_span, progress_bar or nullcontext(), ): task_id = progress_bar.add_task(f'Evaluating {task_name}', total=total_cases) if progress_bar else None async def _handle_case(case: Case[InputsT, OutputT, MetadataT], report_case_name: str): async with limiter: result = await _run_task_and_evaluators( task, case, report_case_name, self.evaluators, retry_task, retry_evaluators ) if progress_bar and task_id is not None: # pragma: no branch progress_bar.update(task_id, advance=1) return result if (context := eval_span.context) is None: # pragma: no cover trace_id = None span_id = None else: trace_id = f'{context.trace_id:032x}' span_id = f'{context.span_id:016x}' cases_and_failures = await task_group_gather( [ lambda case=case, i=i: _handle_case(case, case.name or f'Case {i}') for i, case in enumerate(self.cases, 1) ] ) cases: list[ReportCase] = [] failures: list[ReportCaseFailure] = [] for item in cases_and_failures: if isinstance(item, ReportCase): cases.append(item) else: failures.append(item) report = EvaluationReport( name=name, cases=cases, failures=failures, span_id=span_id, trace_id=trace_id, ) if (averages := report.averages()) is not None and averages.assertions is not None: eval_span.set_attribute('assertion_pass_rate', averages.assertions) return report ``` #### evaluate_sync ```python evaluate_sync( task: ( Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT] ), name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT] ``` Evaluates the test cases in the dataset using the given task. This is a synchronous wrapper around evaluate provided for convenience. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `task` | `Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT]` | The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. | *required* | | `name` | `str | None` | The name of the task being evaluated, this is used to identify the task in the report. If omitted, the name of the task function will be used. | `None` | | `max_concurrency` | `int | None` | The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. | `None` | | `progress` | `bool` | Whether to show a progress bar for the evaluation. Defaults to True. | `True` | | `retry_task` | `RetryConfig | None` | Optional retry configuration for the task execution. | `None` | | `retry_evaluators` | `RetryConfig | None` | Optional retry configuration for evaluator execution. | `None` | Returns: | Type | Description | | --- | --- | | `EvaluationReport[InputsT, OutputT, MetadataT]` | A report containing the results of the evaluation. 
| Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def evaluate_sync( self, task: Callable[[InputsT], Awaitable[OutputT]] | Callable[[InputsT], OutputT], name: str | None = None, max_concurrency: int | None = None, progress: bool = True, retry_task: RetryConfig | None = None, retry_evaluators: RetryConfig | None = None, ) -> EvaluationReport[InputsT, OutputT, MetadataT]: """Evaluates the test cases in the dataset using the given task. This is a synchronous wrapper around [`evaluate`][pydantic_evals.Dataset.evaluate] provided for convenience. Args: task: The task to evaluate. This should be a callable that takes the inputs of the case and returns the output. name: The name of the task being evaluated, this is used to identify the task in the report. If omitted, the name of the task function will be used. max_concurrency: The maximum number of concurrent evaluations of the task to allow. If None, all cases will be evaluated concurrently. progress: Whether to show a progress bar for the evaluation. Defaults to True. retry_task: Optional retry configuration for the task execution. retry_evaluators: Optional retry configuration for evaluator execution. Returns: A report containing the results of the evaluation. """ return get_event_loop().run_until_complete( self.evaluate( task, task_name=name, max_concurrency=max_concurrency, progress=progress, retry_task=retry_task, retry_evaluators=retry_evaluators, ) ) ``` #### add_case ```python add_case( *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[ Evaluator[InputsT, OutputT, MetadataT], ... ] = () ) -> None ``` Adds a case to the dataset. This is a convenience method for creating a Case and adding it to the dataset. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str | None` | Optional name for the case. If not provided, a generic name will be assigned. | `None` | | `inputs` | `InputsT` | The inputs to the task being evaluated. | *required* | | `metadata` | `MetadataT | None` | Optional metadata for the case, which can be used by evaluators. | `None` | | `expected_output` | `OutputT | None` | The expected output of the task, used for comparison in evaluators. | `None` | | `evaluators` | `tuple[Evaluator[InputsT, OutputT, MetadataT], ...]` | Tuple of evaluators specific to this case, in addition to dataset-level evaluators. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def add_case( self, *, name: str | None = None, inputs: InputsT, metadata: MetadataT | None = None, expected_output: OutputT | None = None, evaluators: tuple[Evaluator[InputsT, OutputT, MetadataT], ...] = (), ) -> None: """Adds a case to the dataset. This is a convenience method for creating a [`Case`][pydantic_evals.Case] and adding it to the dataset. Args: name: Optional name for the case. If not provided, a generic name will be assigned. inputs: The inputs to the task being evaluated. metadata: Optional metadata for the case, which can be used by evaluators. expected_output: The expected output of the task, used for comparison in evaluators. evaluators: Tuple of evaluators specific to this case, in addition to dataset-level evaluators. 
""" if name in {case.name for case in self.cases}: raise ValueError(f'Duplicate case name: {name!r}') case = Case[InputsT, OutputT, MetadataT]( name=name, inputs=inputs, metadata=metadata, expected_output=expected_output, evaluators=evaluators, ) self.cases.append(case) ``` #### add_evaluator ```python add_evaluator( evaluator: Evaluator[InputsT, OutputT, MetadataT], specific_case: str | None = None, ) -> None ``` Adds an evaluator to the dataset or a specific case. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `evaluator` | `Evaluator[InputsT, OutputT, MetadataT]` | The evaluator to add. | *required* | | `specific_case` | `str | None` | If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset. | `None` | Raises: | Type | Description | | --- | --- | | `ValueError` | If specific_case is provided but no case with that name exists in the dataset. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def add_evaluator( self, evaluator: Evaluator[InputsT, OutputT, MetadataT], specific_case: str | None = None, ) -> None: """Adds an evaluator to the dataset or a specific case. Args: evaluator: The evaluator to add. specific_case: If provided, the evaluator will only be added to the case with this name. If None, the evaluator will be added to all cases in the dataset. Raises: ValueError: If `specific_case` is provided but no case with that name exists in the dataset. """ if specific_case is None: self.evaluators.append(evaluator) else: # If this is too slow, we could try to add a case lookup dict. # Note that if we do that, we'd need to make the cases list private to prevent modification. added = False for case in self.cases: if case.name == specific_case: case.evaluators.append(evaluator) added = True if not added: raise ValueError(f'Case {specific_case!r} not found in the dataset') ``` #### from_file ```python from_file( path: Path | str, fmt: Literal["yaml", "json"] | None = None, custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) -> Self ``` Load a dataset from a file. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `path` | `Path | str` | Path to the file to load. | *required* | | `fmt` | `Literal['yaml', 'json'] | None` | Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. | `None` | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. | `()` | Returns: | Type | Description | | --- | --- | | `Self` | A new Dataset instance loaded from the file. | Raises: | Type | Description | | --- | --- | | `ValidationError` | If the file cannot be parsed as a valid dataset. | | `ValueError` | If the format cannot be inferred from the file extension. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def from_file( cls, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> Self: """Load a dataset from a file. Args: path: Path to the file to load. fmt: Format of the file. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. 
These are additional evaluators beyond the default ones. Returns: A new Dataset instance loaded from the file. Raises: ValidationError: If the file cannot be parsed as a valid dataset. ValueError: If the format cannot be inferred from the file extension. """ path = Path(path) fmt = cls._infer_fmt(path, fmt) raw = Path(path).read_text() try: return cls.from_text(raw, fmt=fmt, custom_evaluator_types=custom_evaluator_types, default_name=path.stem) except ValidationError as e: # pragma: no cover raise ValueError(f'{path} contains data that does not match the schema for {cls.__name__}:\n{e}.') from e ``` #### from_text ```python from_text( contents: str, fmt: Literal["yaml", "json"] = "yaml", custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), *, default_name: str | None = None ) -> Self ``` Load a dataset from a string. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `contents` | `str` | The string content to parse. | *required* | | `fmt` | `Literal['yaml', 'json']` | Format of the content. Must be either 'yaml' or 'json'. | `'yaml'` | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. | `()` | | `default_name` | `str | None` | Default name of the dataset, to be used if not specified in the serialized contents. | `None` | Returns: | Type | Description | | --- | --- | | `Self` | A new Dataset instance parsed from the string. | Raises: | Type | Description | | --- | --- | | `ValidationError` | If the content cannot be parsed as a valid dataset. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def from_text( cls, contents: str, fmt: Literal['yaml', 'json'] = 'yaml', custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), *, default_name: str | None = None, ) -> Self: """Load a dataset from a string. Args: contents: The string content to parse. fmt: Format of the content. Must be either 'yaml' or 'json'. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. default_name: Default name of the dataset, to be used if not specified in the serialized contents. Returns: A new Dataset instance parsed from the string. Raises: ValidationError: If the content cannot be parsed as a valid dataset. """ if fmt == 'yaml': loaded = yaml.safe_load(contents) return cls.from_dict(loaded, custom_evaluator_types, default_name=default_name) else: dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate_json(contents) return cls._from_dataset_model(dataset_model, custom_evaluator_types, default_name) ``` #### from_dict ```python from_dict( data: dict[str, Any], custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), *, default_name: str | None = None ) -> Self ``` Load a dataset from a dictionary. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `data` | `dict[str, Any]` | Dictionary representation of the dataset. | *required* | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. | `()` | | `default_name` | `str | None` | Default name of the dataset, to be used if not specified in the data. 
| `None` | Returns: | Type | Description | | --- | --- | | `Self` | A new Dataset instance created from the dictionary. | Raises: | Type | Description | | --- | --- | | `ValidationError` | If the dictionary cannot be converted to a valid dataset. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def from_dict( cls, data: dict[str, Any], custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), *, default_name: str | None = None, ) -> Self: """Load a dataset from a dictionary. Args: data: Dictionary representation of the dataset. custom_evaluator_types: Custom evaluator classes to use when deserializing the dataset. These are additional evaluators beyond the default ones. default_name: Default name of the dataset, to be used if not specified in the data. Returns: A new Dataset instance created from the dictionary. Raises: ValidationError: If the dictionary cannot be converted to a valid dataset. """ dataset_model_type = cls._serialization_type() dataset_model = dataset_model_type.model_validate(data) return cls._from_dataset_model(dataset_model, custom_evaluator_types, default_name) ``` #### to_file ```python to_file( path: Path | str, fmt: Literal["yaml", "json"] | None = None, schema_path: ( Path | str | None ) = DEFAULT_SCHEMA_PATH_TEMPLATE, custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) ``` Save the dataset to a file. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `path` | `Path | str` | Path to save the dataset to. | *required* | | `fmt` | `Literal['yaml', 'json'] | None` | Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. | `None` | | `schema_path` | `Path | str | None` | Path to save the JSON schema to. If None, no schema will be saved. Can be a string template with {stem} which will be replaced with the dataset filename stem. | `DEFAULT_SCHEMA_PATH_TEMPLATE` | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to include in the schema. | `()` | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def to_file( self, path: Path | str, fmt: Literal['yaml', 'json'] | None = None, schema_path: Path | str | None = DEFAULT_SCHEMA_PATH_TEMPLATE, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ): """Save the dataset to a file. Args: path: Path to save the dataset to. fmt: Format to use. If None, the format will be inferred from the file extension. Must be either 'yaml' or 'json'. schema_path: Path to save the JSON schema to. If None, no schema will be saved. Can be a string template with {stem} which will be replaced with the dataset filename stem. custom_evaluator_types: Custom evaluator classes to include in the schema. 
""" path = Path(path) fmt = self._infer_fmt(path, fmt) schema_ref: str | None = None if schema_path is not None: # pragma: no branch if isinstance(schema_path, str): # pragma: no branch schema_path = Path(schema_path.format(stem=path.stem)) if not schema_path.is_absolute(): schema_ref = str(schema_path) schema_path = path.parent / schema_path elif schema_path.is_relative_to(path): # pragma: no cover schema_ref = str(_get_relative_path_reference(schema_path, path)) else: # pragma: no cover schema_ref = str(schema_path) self._save_schema(schema_path, custom_evaluator_types) context: dict[str, Any] = {'use_short_form': True} if fmt == 'yaml': dumped_data = self.model_dump(mode='json', by_alias=True, context=context) content = yaml.dump(dumped_data, sort_keys=False) if schema_ref: # pragma: no branch yaml_language_server_line = f'{_YAML_SCHEMA_LINE_PREFIX}{schema_ref}' content = f'{yaml_language_server_line}\n{content}' path.write_text(content) else: context['$schema'] = schema_ref json_data = self.model_dump_json(indent=2, by_alias=True, context=context) path.write_text(json_data + '\n') ``` #### model_json_schema_with_evaluators ```python model_json_schema_with_evaluators( custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), ) -> dict[str, Any] ``` Generate a JSON schema for this dataset type, including evaluator details. This is useful for generating a schema that can be used to validate YAML-format dataset files. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Custom evaluator classes to include in the schema. | `()` | Returns: | Type | Description | | --- | --- | | `dict[str, Any]` | A dictionary representing the JSON schema. | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python @classmethod def model_json_schema_with_evaluators( cls, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), ) -> dict[str, Any]: """Generate a JSON schema for this dataset type, including evaluator details. This is useful for generating a schema that can be used to validate YAML-format dataset files. Args: custom_evaluator_types: Custom evaluator classes to include in the schema. Returns: A dictionary representing the JSON schema. 
""" # Note: this function could maybe be simplified now that Evaluators are always dataclasses registry = _get_registry(custom_evaluator_types) evaluator_schema_types: list[Any] = [] for name, evaluator_class in registry.items(): type_hints = _typing_extra.get_function_type_hints(evaluator_class) type_hints.pop('return', None) required_type_hints: dict[str, Any] = {} for p in inspect.signature(evaluator_class).parameters.values(): type_hints.setdefault(p.name, Any) if p.default is not p.empty: type_hints[p.name] = NotRequired[type_hints[p.name]] else: required_type_hints[p.name] = type_hints[p.name] def _make_typed_dict(cls_name_prefix: str, fields: dict[str, Any]) -> Any: td = TypedDict(f'{cls_name_prefix}_{name}', fields) # pyright: ignore[reportArgumentType] config = ConfigDict(extra='forbid', arbitrary_types_allowed=True) # TODO: Replace with pydantic.with_config once pydantic 2.11 is the min supported version td.__pydantic_config__ = config # pyright: ignore[reportAttributeAccessIssue] return td # Shortest form: just the call name if len(type_hints) == 0 or not required_type_hints: evaluator_schema_types.append(Literal[name]) # Short form: can be called with only one parameter if len(type_hints) == 1: [type_hint_type] = type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) elif len(required_type_hints) == 1: # pragma: no branch [type_hint_type] = required_type_hints.values() evaluator_schema_types.append(_make_typed_dict('short_evaluator', {name: type_hint_type})) # Long form: multiple parameters, possibly required if len(type_hints) > 1: params_td = _make_typed_dict('evaluator_params', type_hints) evaluator_schema_types.append(_make_typed_dict('evaluator', {name: params_td})) in_type, out_type, meta_type = cls._params() # Note: we shadow the `Case` and `Dataset` class names here to generate a clean JSON schema class Case(BaseModel, extra='forbid'): # pyright: ignore[reportUnusedClass] # this _is_ used below, but pyright doesn't seem to notice.. name: str | None = None inputs: in_type # pyright: ignore[reportInvalidTypeForm] metadata: meta_type | None = None # pyright: ignore[reportInvalidTypeForm,reportUnknownVariableType] expected_output: out_type | None = None # pyright: ignore[reportInvalidTypeForm,reportUnknownVariableType] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa UP007 class Dataset(BaseModel, extra='forbid'): name: str | None = None cases: list[Case] if evaluator_schema_types: # pragma: no branch evaluators: list[Union[tuple(evaluator_schema_types)]] = [] # pyright: ignore # noqa UP007 json_schema = Dataset.model_json_schema() # See `_add_json_schema` below, since `$schema` is added to the JSON, it has to be supported in the JSON json_schema['properties']['$schema'] = {'type': 'string'} return json_schema ``` ### set_eval_attribute ```python set_eval_attribute(name: str, value: Any) -> None ``` Set an attribute on the current task run. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str` | The name of the attribute. | *required* | | `value` | `Any` | The value of the attribute. | *required* | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def set_eval_attribute(name: str, value: Any) -> None: """Set an attribute on the current task run. Args: name: The name of the attribute. value: The value of the attribute. 
""" current_case = _CURRENT_TASK_RUN.get() if current_case is not None: # pragma: no branch current_case.record_attribute(name, value) ``` ### increment_eval_metric ```python increment_eval_metric( name: str, amount: int | float ) -> None ``` Increment a metric on the current task run. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str` | The name of the metric. | *required* | | `amount` | `int | float` | The amount to increment by. | *required* | Source code in `pydantic_evals/pydantic_evals/dataset.py` ```python def increment_eval_metric(name: str, amount: int | float) -> None: """Increment a metric on the current task run. Args: name: The name of the metric. amount: The amount to increment by. """ current_case = _CURRENT_TASK_RUN.get() if current_case is not None: # pragma: no branch current_case.increment_metric(name, amount) ``` # `pydantic_evals.evaluators` ### Contains Bases: `Evaluator[object, object, object]` Check if the output contains the expected output. For strings, checks if expected_output is a substring of output. For lists/tuples, checks if expected_output is in output. For dicts, checks if all key-value pairs in expected_output are in output. Note: case_sensitive only applies when both the value and output are strings. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class Contains(Evaluator[object, object, object]): """Check if the output contains the expected output. For strings, checks if expected_output is a substring of output. For lists/tuples, checks if expected_output is in output. For dicts, checks if all key-value pairs in expected_output are in output. Note: case_sensitive only applies when both the value and output are strings. 
""" value: Any case_sensitive: bool = True as_strings: bool = False evaluation_name: str | None = field(default=None) def evaluate( self, ctx: EvaluatorContext[object, object, object], ) -> EvaluationReason: # Convert objects to strings if requested failure_reason: str | None = None as_strings = self.as_strings or (isinstance(self.value, str) and isinstance(ctx.output, str)) if as_strings: output_str = str(ctx.output) expected_str = str(self.value) if not self.case_sensitive: output_str = output_str.lower() expected_str = expected_str.lower() failure_reason: str | None = None if expected_str not in output_str: output_trunc = _truncated_repr(output_str, max_length=100) expected_trunc = _truncated_repr(expected_str, max_length=100) failure_reason = f'Output string {output_trunc} does not contain expected string {expected_trunc}' return EvaluationReason(value=failure_reason is None, reason=failure_reason) try: # Handle different collection types if isinstance(ctx.output, dict): if isinstance(self.value, dict): # Cast to Any to avoid type checking issues output_dict = cast(dict[Any, Any], ctx.output) # pyright: ignore[reportUnknownMemberType] expected_dict = cast(dict[Any, Any], self.value) # pyright: ignore[reportUnknownMemberType] for k in expected_dict: if k not in output_dict: k_trunc = _truncated_repr(k, max_length=30) failure_reason = f'Output dictionary does not contain expected key {k_trunc}' break elif output_dict[k] != expected_dict[k]: k_trunc = _truncated_repr(k, max_length=30) output_v_trunc = _truncated_repr(output_dict[k], max_length=100) expected_v_trunc = _truncated_repr(expected_dict[k], max_length=100) failure_reason = f'Output dictionary has different value for key {k_trunc}: {output_v_trunc} != {expected_v_trunc}' break else: if self.value not in ctx.output: # pyright: ignore[reportUnknownMemberType] output_trunc = _truncated_repr(ctx.output, max_length=200) # pyright: ignore[reportUnknownMemberType] failure_reason = f'Output {output_trunc} does not contain provided value as a key' elif self.value not in ctx.output: # pyright: ignore[reportOperatorIssue] # will be handled by except block output_trunc = _truncated_repr(ctx.output, max_length=200) failure_reason = f'Output {output_trunc} does not contain provided value' except (TypeError, ValueError) as e: failure_reason = f'Containment check failed: {e}' return EvaluationReason(value=failure_reason is None, reason=failure_reason) ``` ### Equals Bases: `Evaluator[object, object, object]` Check if the output exactly equals the provided value. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class Equals(Evaluator[object, object, object]): """Check if the output exactly equals the provided value.""" value: Any evaluation_name: str | None = field(default=None) def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> bool: return ctx.output == self.value ``` ### EqualsExpected Bases: `Evaluator[object, object, object]` Check if the output exactly equals the expected output. 
Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class EqualsExpected(Evaluator[object, object, object]): """Check if the output exactly equals the expected output.""" evaluation_name: str | None = field(default=None) def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> bool | dict[str, bool]: if ctx.expected_output is None: return {} # Only compare if expected output is provided return ctx.output == ctx.expected_output ``` ### HasMatchingSpan Bases: `Evaluator[object, object, object]` Check if the span tree contains a span that matches the specified query. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class HasMatchingSpan(Evaluator[object, object, object]): """Check if the span tree contains a span that matches the specified query.""" query: SpanQuery evaluation_name: str | None = field(default=None) def evaluate( self, ctx: EvaluatorContext[object, object, object], ) -> bool: return ctx.span_tree.any(self.query) ``` ### IsInstance Bases: `Evaluator[object, object, object]` Check if the output is an instance of a type with the given name. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class IsInstance(Evaluator[object, object, object]): """Check if the output is an instance of a type with the given name.""" type_name: str evaluation_name: str | None = field(default=None) def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> EvaluationReason: output = ctx.output for cls in type(output).__mro__: if cls.__name__ == self.type_name or cls.__qualname__ == self.type_name: return EvaluationReason(value=True) reason = f'output is of type {type(output).__name__}' if type(output).__qualname__ != type(output).__name__: reason += f' (qualname: {type(output).__qualname__})' return EvaluationReason(value=False, reason=reason) ``` ### LLMJudge Bases: `Evaluator[object, object, object]` Judge whether the output of a language model meets the criteria of a provided rubric. If you do not specify a model, it uses the default model for judging. This starts as 'openai:gpt-4o', but can be overridden by calling set_default_judge_model. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class LLMJudge(Evaluator[object, object, object]): """Judge whether the output of a language model meets the criteria of a provided rubric. If you do not specify a model, it uses the default model for judging. This starts as 'openai:gpt-4o', but can be overridden by calling [`set_default_judge_model`][pydantic_evals.evaluators.llm_as_a_judge.set_default_judge_model]. 
""" rubric: str model: models.Model | models.KnownModelName | None = None include_input: bool = False include_expected_output: bool = False model_settings: ModelSettings | None = None score: OutputConfig | Literal[False] = False assertion: OutputConfig | Literal[False] = field(default_factory=lambda: OutputConfig(include_reason=True)) async def evaluate( self, ctx: EvaluatorContext[object, object, object], ) -> EvaluatorOutput: if self.include_input: if self.include_expected_output: from .llm_as_a_judge import judge_input_output_expected grading_output = await judge_input_output_expected( ctx.inputs, ctx.output, ctx.expected_output, self.rubric, self.model, self.model_settings ) else: from .llm_as_a_judge import judge_input_output grading_output = await judge_input_output( ctx.inputs, ctx.output, self.rubric, self.model, self.model_settings ) else: if self.include_expected_output: from .llm_as_a_judge import judge_output_expected grading_output = await judge_output_expected( ctx.output, ctx.expected_output, self.rubric, self.model, self.model_settings ) else: from .llm_as_a_judge import judge_output grading_output = await judge_output(ctx.output, self.rubric, self.model, self.model_settings) output: dict[str, EvaluationScalar | EvaluationReason] = {} include_both = self.score is not False and self.assertion is not False evaluation_name = self.get_default_evaluation_name() if self.score is not False: default_name = f'{evaluation_name}_score' if include_both else evaluation_name _update_combined_output(output, grading_output.score, grading_output.reason, self.score, default_name) if self.assertion is not False: default_name = f'{evaluation_name}_pass' if include_both else evaluation_name _update_combined_output(output, grading_output.pass_, grading_output.reason, self.assertion, default_name) return output def build_serialization_arguments(self): result = super().build_serialization_arguments() # always serialize the model as a string when present; use its name if it's a KnownModelName if (model := result.get('model')) and isinstance(model, models.Model): # pragma: no branch result['model'] = f'{model.system}:{model.model_name}' # Note: this may lead to confusion if you try to serialize-then-deserialize with a custom model. # I expect that is rare enough to be worth not solving yet, but common enough that we probably will want to # solve it eventually. I'm imagining some kind of model registry, but don't want to work out the details yet. return result ``` ### MaxDuration Bases: `Evaluator[object, object, object]` Check if the execution time is under the specified maximum. Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python @dataclass(repr=False) class MaxDuration(Evaluator[object, object, object]): """Check if the execution time is under the specified maximum.""" seconds: float | timedelta def evaluate(self, ctx: EvaluatorContext[object, object, object]) -> bool: duration = timedelta(seconds=ctx.duration) seconds = self.seconds if not isinstance(seconds, timedelta): seconds = timedelta(seconds=seconds) return duration <= seconds ``` ### OutputConfig Bases: `TypedDict` Configuration for the score and assertion outputs of the LLMJudge evaluator. 
Source code in `pydantic_evals/pydantic_evals/evaluators/common.py` ```python class OutputConfig(TypedDict, total=False): """Configuration for the score and assertion outputs of the LLMJudge evaluator.""" evaluation_name: str include_reason: bool ``` ### EvaluatorContext Bases: `Generic[InputsT, OutputT, MetadataT]` Context for evaluating a task execution. An instance of this class is the sole input to all Evaluators. It contains all the information needed to evaluate the task execution, including inputs, outputs, metadata, and telemetry data. Evaluators use this context to access the task inputs, actual output, expected output, and other information when evaluating the result of the task execution. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: # Use the context to access task inputs, outputs, and expected outputs return ctx.output == ctx.expected_output ``` Source code in `pydantic_evals/pydantic_evals/evaluators/context.py` ````python @dataclass(kw_only=True) class EvaluatorContext(Generic[InputsT, OutputT, MetadataT]): """Context for evaluating a task execution. An instance of this class is the sole input to all Evaluators. It contains all the information needed to evaluate the task execution, including inputs, outputs, metadata, and telemetry data. Evaluators use this context to access the task inputs, actual output, expected output, and other information when evaluating the result of the task execution. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: # Use the context to access task inputs, outputs, and expected outputs return ctx.output == ctx.expected_output ``` """ name: str | None """The name of the case.""" inputs: InputsT """The inputs provided to the task for this case.""" metadata: MetadataT | None """Metadata associated with the case, if provided. May be None if no metadata was specified.""" expected_output: OutputT | None """The expected output for the case, if provided. May be None if no expected output was specified.""" output: OutputT """The actual output produced by the task for this case.""" duration: float """The duration of the task run for this case.""" _span_tree: SpanTree | SpanTreeRecordingError = field(repr=False) """The span tree for the task run for this case. This will be `None` if `logfire.configure` has not been called. """ attributes: dict[str, Any] """Attributes associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.set_eval_attribute` in any code executed during the evaluation task.""" metrics: dict[str, int | float] """Metrics associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.increment_eval_metric` in any code executed during the evaluation task.""" @property def span_tree(self) -> SpanTree: """Get the `SpanTree` for this task execution. The span tree is a graph where each node corresponds to an OpenTelemetry span recorded during the task execution, including timing information and any custom spans created during execution. Returns: The span tree for the task execution. Raises: SpanTreeRecordingError: If spans were not captured during execution of the task, e.g. due to not having the necessary dependencies installed. 
""" if isinstance(self._span_tree, SpanTreeRecordingError): # In this case, there was a reason we couldn't record the SpanTree. We raise that now raise self._span_tree return self._span_tree ```` #### name ```python name: str | None ``` The name of the case. #### inputs ```python inputs: InputsT ``` The inputs provided to the task for this case. #### metadata ```python metadata: MetadataT | None ``` Metadata associated with the case, if provided. May be None if no metadata was specified. #### expected_output ```python expected_output: OutputT | None ``` The expected output for the case, if provided. May be None if no expected output was specified. #### output ```python output: OutputT ``` The actual output produced by the task for this case. #### duration ```python duration: float ``` The duration of the task run for this case. #### attributes ```python attributes: dict[str, Any] ``` Attributes associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.set_eval_attribute` in any code executed during the evaluation task. #### metrics ```python metrics: dict[str, int | float] ``` Metrics associated with the task run for this case. These can be set by calling `pydantic_evals.dataset.increment_eval_metric` in any code executed during the evaluation task. #### span_tree ```python span_tree: SpanTree ``` Get the `SpanTree` for this task execution. The span tree is a graph where each node corresponds to an OpenTelemetry span recorded during the task execution, including timing information and any custom spans created during execution. Returns: | Type | Description | | --- | --- | | `SpanTree` | The span tree for the task execution. | Raises: | Type | Description | | --- | --- | | `SpanTreeRecordingError` | If spans were not captured during execution of the task, e.g. due to not having the necessary dependencies installed. | ### EvaluationReason The result of running an evaluator with an optional explanation. Contains a scalar value and an optional "reason" explaining the value. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `value` | `EvaluationScalar` | The scalar result of the evaluation (boolean, integer, float, or string). | *required* | | `reason` | `str | None` | An optional explanation of the evaluation result. | `None` | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @dataclass class EvaluationReason: """The result of running an evaluator with an optional explanation. Contains a scalar value and an optional "reason" explaining the value. Args: value: The scalar result of the evaluation (boolean, integer, float, or string). reason: An optional explanation of the evaluation result. """ value: EvaluationScalar reason: str | None = None ``` ### EvaluationResult Bases: `Generic[EvaluationScalarT]` The details of an individual evaluation result. Contains the name, value, reason, and source evaluator for a single evaluation. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `name` | `str` | The name of the evaluation. | *required* | | `value` | `EvaluationScalarT` | The scalar result of the evaluation. | *required* | | `reason` | `str | None` | An optional explanation of the evaluation result. | *required* | | `source` | `EvaluatorSpec` | The spec of the evaluator that produced this result. 
| *required* | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @dataclass class EvaluationResult(Generic[EvaluationScalarT]): """The details of an individual evaluation result. Contains the name, value, reason, and source evaluator for a single evaluation. Args: name: The name of the evaluation. value: The scalar result of the evaluation. reason: An optional explanation of the evaluation result. source: The spec of the evaluator that produced this result. """ name: str value: EvaluationScalarT reason: str | None source: EvaluatorSpec def downcast(self, *value_types: type[T]) -> EvaluationResult[T] | None: """Attempt to downcast this result to a more specific type. Args: *value_types: The types to check the value against. Returns: A downcast version of this result if the value is an instance of one of the given types, otherwise None. """ # Check if value matches any of the target types, handling bool as a special case for value_type in value_types: if isinstance(self.value, value_type): # Only match bool with explicit bool type if isinstance(self.value, bool) and value_type is not bool: continue return cast(EvaluationResult[T], self) return None ``` #### downcast ```python downcast( *value_types: type[T], ) -> EvaluationResult[T] | None ``` Attempt to downcast this result to a more specific type. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `*value_types` | `type[T]` | The types to check the value against. | `()` | Returns: | Type | Description | | --- | --- | | `EvaluationResult[T] | None` | A downcast version of this result if the value is an instance of one of the given types, | | `EvaluationResult[T] | None` | otherwise None. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def downcast(self, *value_types: type[T]) -> EvaluationResult[T] | None: """Attempt to downcast this result to a more specific type. Args: *value_types: The types to check the value against. Returns: A downcast version of this result if the value is an instance of one of the given types, otherwise None. """ # Check if value matches any of the target types, handling bool as a special case for value_type in value_types: if isinstance(self.value, value_type): # Only match bool with explicit bool type if isinstance(self.value, bool) and value_type is not bool: continue return cast(EvaluationResult[T], self) return None ``` ### Evaluator Bases: `Generic[InputsT, OutputT, MetadataT]` Base class for all evaluators. Evaluators can assess the performance of a task in a variety of ways, as a function of the EvaluatorContext. Subclasses must implement the `evaluate` method. Note it can be defined with either `def` or `async def`. Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output ``` Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ````python @dataclass(repr=False) class Evaluator(Generic[InputsT, OutputT, MetadataT], metaclass=_StrictABCMeta): """Base class for all evaluators. Evaluators can assess the performance of a task in a variety of ways, as a function of the EvaluatorContext. Subclasses must implement the `evaluate` method. Note it can be defined with either `def` or `async def`. 
Example: ```python from dataclasses import dataclass from pydantic_evals.evaluators import Evaluator, EvaluatorContext @dataclass class ExactMatch(Evaluator): def evaluate(self, ctx: EvaluatorContext) -> bool: return ctx.output == ctx.expected_output ``` """ __pydantic_config__ = ConfigDict(arbitrary_types_allowed=True) @classmethod def get_serialization_name(cls) -> str: """Return the 'name' of this Evaluator to use during serialization. Returns: The name of the Evaluator, which is typically the class name. """ return cls.__name__ @classmethod @deprecated('`name` has been renamed, use `get_serialization_name` instead.') def name(cls) -> str: """`name` has been renamed, use `get_serialization_name` instead.""" return cls.get_serialization_name() def get_default_evaluation_name(self) -> str: """Return the default name to use in reports for the output of this evaluator. By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used. This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information. Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results. """ evaluation_name = getattr(self, 'evaluation_name', None) if isinstance(evaluation_name, str): # If the evaluator has an attribute `name` of type string, use that return evaluation_name return self.get_serialization_name() @abstractmethod def evaluate( self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT] ) -> EvaluatorOutput | Awaitable[EvaluatorOutput]: # pragma: no cover """Evaluate the task output in the given context. This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable[EvaluatorOutput]. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. Can be returned either synchronously or as an awaitable for asynchronous evaluation. """ raise NotImplementedError('You must implement `evaluate`.') def evaluate_sync(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator synchronously, handling both sync and async implementations. This method ensures synchronous execution by running any async evaluate implementation to completion using run_until_complete. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ output = self.evaluate(ctx) if inspect.iscoroutine(output): # pragma: no cover return get_event_loop().run_until_complete(output) else: return cast(EvaluatorOutput, output) async def evaluate_async(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator asynchronously, handling both sync and async implementations. This method ensures asynchronous execution by properly awaiting any async evaluate implementation. For synchronous implementations, it returns the result directly. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. 
Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ # Note: If self.evaluate is synchronous, but you need to prevent this from blocking, override this method with: # return await anyio.to_thread.run_sync(self.evaluate, ctx) output = self.evaluate(ctx) if inspect.iscoroutine(output): return await output else: return cast(EvaluatorOutput, output) @model_serializer(mode='plain') def serialize(self, info: SerializationInfo) -> Any: """Serialize this Evaluator to a JSON-serializable form. Returns: A JSON-serializable representation of this evaluator as an EvaluatorSpec. """ return to_jsonable_python( self.as_spec(), context=info.context, serialize_unknown=True, ) def as_spec(self) -> EvaluatorSpec: raw_arguments = self.build_serialization_arguments() arguments: None | tuple[Any,] | dict[str, Any] if len(raw_arguments) == 0: arguments = None elif len(raw_arguments) == 1: arguments = (next(iter(raw_arguments.values())),) else: arguments = raw_arguments return EvaluatorSpec(name=self.get_serialization_name(), arguments=arguments) def build_serialization_arguments(self) -> dict[str, Any]: """Build the arguments for serialization. Evaluators are serialized for inclusion as the "source" in an `EvaluationResult`. If you want to modify how the evaluator is serialized for that or other purposes, you can override this method. Returns: A dictionary of arguments to be used during serialization. """ raw_arguments: dict[str, Any] = {} for field in fields(self): value = getattr(self, field.name) # always exclude defaults: if field.default is not MISSING: if value == field.default: continue if field.default_factory is not MISSING: if value == field.default_factory(): # pragma: no branch continue raw_arguments[field.name] = value return raw_arguments __repr__ = _utils.dataclasses_no_defaults_repr ```` #### get_serialization_name ```python get_serialization_name() -> str ``` Return the 'name' of this Evaluator to use during serialization. Returns: | Type | Description | | --- | --- | | `str` | The name of the Evaluator, which is typically the class name. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @classmethod def get_serialization_name(cls) -> str: """Return the 'name' of this Evaluator to use during serialization. Returns: The name of the Evaluator, which is typically the class name. """ return cls.__name__ ``` #### name ```python name() -> str ``` Deprecated `name` has been renamed, use `get_serialization_name` instead. `name` has been renamed, use `get_serialization_name` instead. Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @classmethod @deprecated('`name` has been renamed, use `get_serialization_name` instead.') def name(cls) -> str: """`name` has been renamed, use `get_serialization_name` instead.""" return cls.get_serialization_name() ``` #### get_default_evaluation_name ```python get_default_evaluation_name() -> str ``` Return the default name to use in reports for the output of this evaluator. By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used. This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information. 
Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results. Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def get_default_evaluation_name(self) -> str: """Return the default name to use in reports for the output of this evaluator. By default, if the evaluator has an attribute called `evaluation_name` of type string, that will be used. Otherwise, the serialization name of the evaluator (which is usually the class name) will be used. This can be overridden to get a more descriptive name in evaluation reports, e.g. using instance information. Note that evaluators that return a mapping of results will always use the keys of that mapping as the names of the associated evaluation results. """ evaluation_name = getattr(self, 'evaluation_name', None) if isinstance(evaluation_name, str): # If the evaluator has an attribute `name` of type string, use that return evaluation_name return self.get_serialization_name() ``` #### evaluate ```python evaluate( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorOutput | Awaitable[EvaluatorOutput] ``` Evaluate the task output in the given context. This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable[EvaluatorOutput]. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `ctx` | `EvaluatorContext[InputsT, OutputT, MetadataT]` | The context containing the inputs, outputs, and metadata for evaluation. | *required* | Returns: | Type | Description | | --- | --- | | `EvaluatorOutput | Awaitable[EvaluatorOutput]` | The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping | | `EvaluatorOutput | Awaitable[EvaluatorOutput]` | of evaluation names to either of those. Can be returned either synchronously or as an | | `EvaluatorOutput | Awaitable[EvaluatorOutput]` | awaitable for asynchronous evaluation. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @abstractmethod def evaluate( self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT] ) -> EvaluatorOutput | Awaitable[EvaluatorOutput]: # pragma: no cover """Evaluate the task output in the given context. This is the main evaluation method that subclasses must implement. It can be either synchronous or asynchronous, returning either an EvaluatorOutput directly or an Awaitable[EvaluatorOutput]. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. Can be returned either synchronously or as an awaitable for asynchronous evaluation. """ raise NotImplementedError('You must implement `evaluate`.') ``` #### evaluate_sync ```python evaluate_sync( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorOutput ``` Run the evaluator synchronously, handling both sync and async implementations. This method ensures synchronous execution by running any async evaluate implementation to completion using run_until_complete. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `ctx` | `EvaluatorContext[InputsT, OutputT, MetadataT]` | The context containing the inputs, outputs, and metadata for evaluation. 
| *required* | Returns: | Type | Description | | --- | --- | | `EvaluatorOutput` | The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping | | `EvaluatorOutput` | of evaluation names to either of those. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def evaluate_sync(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator synchronously, handling both sync and async implementations. This method ensures synchronous execution by running any async evaluate implementation to completion using run_until_complete. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ output = self.evaluate(ctx) if inspect.iscoroutine(output): # pragma: no cover return get_event_loop().run_until_complete(output) else: return cast(EvaluatorOutput, output) ``` #### evaluate_async ```python evaluate_async( ctx: EvaluatorContext[InputsT, OutputT, MetadataT], ) -> EvaluatorOutput ``` Run the evaluator asynchronously, handling both sync and async implementations. This method ensures asynchronous execution by properly awaiting any async evaluate implementation. For synchronous implementations, it returns the result directly. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `ctx` | `EvaluatorContext[InputsT, OutputT, MetadataT]` | The context containing the inputs, outputs, and metadata for evaluation. | *required* | Returns: | Type | Description | | --- | --- | | `EvaluatorOutput` | The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping | | `EvaluatorOutput` | of evaluation names to either of those. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python async def evaluate_async(self, ctx: EvaluatorContext[InputsT, OutputT, MetadataT]) -> EvaluatorOutput: """Run the evaluator asynchronously, handling both sync and async implementations. This method ensures asynchronous execution by properly awaiting any async evaluate implementation. For synchronous implementations, it returns the result directly. Args: ctx: The context containing the inputs, outputs, and metadata for evaluation. Returns: The evaluation result, which can be a scalar value, an EvaluationReason, or a mapping of evaluation names to either of those. """ # Note: If self.evaluate is synchronous, but you need to prevent this from blocking, override this method with: # return await anyio.to_thread.run_sync(self.evaluate, ctx) output = self.evaluate(ctx) if inspect.iscoroutine(output): return await output else: return cast(EvaluatorOutput, output) ``` #### serialize ```python serialize(info: SerializationInfo) -> Any ``` Serialize this Evaluator to a JSON-serializable form. Returns: | Type | Description | | --- | --- | | `Any` | A JSON-serializable representation of this evaluator as an EvaluatorSpec. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @model_serializer(mode='plain') def serialize(self, info: SerializationInfo) -> Any: """Serialize this Evaluator to a JSON-serializable form. Returns: A JSON-serializable representation of this evaluator as an EvaluatorSpec. 
""" return to_jsonable_python( self.as_spec(), context=info.context, serialize_unknown=True, ) ``` #### build_serialization_arguments ```python build_serialization_arguments() -> dict[str, Any] ``` Build the arguments for serialization. Evaluators are serialized for inclusion as the "source" in an `EvaluationResult`. If you want to modify how the evaluator is serialized for that or other purposes, you can override this method. Returns: | Type | Description | | --- | --- | | `dict[str, Any]` | A dictionary of arguments to be used during serialization. | Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python def build_serialization_arguments(self) -> dict[str, Any]: """Build the arguments for serialization. Evaluators are serialized for inclusion as the "source" in an `EvaluationResult`. If you want to modify how the evaluator is serialized for that or other purposes, you can override this method. Returns: A dictionary of arguments to be used during serialization. """ raw_arguments: dict[str, Any] = {} for field in fields(self): value = getattr(self, field.name) # always exclude defaults: if field.default is not MISSING: if value == field.default: continue if field.default_factory is not MISSING: if value == field.default_factory(): # pragma: no branch continue raw_arguments[field.name] = value return raw_arguments ``` ### EvaluatorFailure Represents a failure raised during the execution of an evaluator. Source code in `pydantic_evals/pydantic_evals/evaluators/evaluator.py` ```python @dataclass class EvaluatorFailure: """Represents a failure raised during the execution of an evaluator.""" name: str error_message: str error_stacktrace: str source: EvaluatorSpec ``` ### EvaluatorOutput ```python EvaluatorOutput = ( EvaluationScalar | EvaluationReason | Mapping[str, EvaluationScalar | EvaluationReason] ) ``` Type for the output of an evaluator, which can be a scalar, an EvaluationReason, or a mapping of names to either. ### EvaluatorSpec Bases: `BaseModel` The specification of an evaluator to be run. This class is used to represent evaluators in a serializable format, supporting various short forms for convenience when defining evaluators in YAML or JSON dataset files. In particular, each of the following forms is supported for specifying an evaluator with name `MyEvaluator`: * `'MyEvaluator'` - Just the (string) name of the Evaluator subclass is used if its `__init__` takes no arguments * `{'MyEvaluator': first_arg}` - A single argument is passed as the first positional argument to `MyEvaluator.__init__` * `{'MyEvaluator': {k1: v1, k2: v2}}` - Multiple kwargs are passed to `MyEvaluator.__init__` Source code in `pydantic_evals/pydantic_evals/evaluators/spec.py` ```python class EvaluatorSpec(BaseModel): """The specification of an evaluator to be run. This class is used to represent evaluators in a serializable format, supporting various short forms for convenience when defining evaluators in YAML or JSON dataset files. 
In particular, each of the following forms is supported for specifying an evaluator with name `MyEvaluator`: * `'MyEvaluator'` - Just the (string) name of the Evaluator subclass is used if its `__init__` takes no arguments * `{'MyEvaluator': first_arg}` - A single argument is passed as the first positional argument to `MyEvaluator.__init__` * `{'MyEvaluator': {k1: v1, k2: v2}}` - Multiple kwargs are passed to `MyEvaluator.__init__` """ name: str """The name of the evaluator class; should be the value returned by `EvaluatorClass.get_serialization_name()`""" arguments: None | tuple[Any] | dict[str, Any] """The arguments to pass to the evaluator's constructor. Can be None (no arguments), a tuple (a single positional argument), or a dict (keyword arguments). """ @property def args(self) -> tuple[Any, ...]: """Get the positional arguments for the evaluator. Returns: A tuple of positional arguments if arguments is a tuple, otherwise an empty tuple. """ if isinstance(self.arguments, tuple): return self.arguments return () @property def kwargs(self) -> dict[str, Any]: """Get the keyword arguments for the evaluator. Returns: A dictionary of keyword arguments if arguments is a dict, otherwise an empty dict. """ if isinstance(self.arguments, dict): return self.arguments return {} @model_validator(mode='wrap') @classmethod def deserialize(cls, value: Any, handler: ModelWrapValidatorHandler[EvaluatorSpec]) -> EvaluatorSpec: """Deserialize an EvaluatorSpec from various formats. This validator handles the various short forms of evaluator specifications, converting them to a consistent EvaluatorSpec instance. Args: value: The value to deserialize. handler: The validator handler. Returns: The deserialized EvaluatorSpec. Raises: ValidationError: If the value cannot be deserialized. """ try: result = handler(value) return result except ValidationError as exc: try: deserialized = _SerializedEvaluatorSpec.model_validate(value) except ValidationError: raise exc # raise the original error return deserialized.to_evaluator_spec() @model_serializer(mode='wrap') def serialize(self, handler: SerializerFunctionWrapHandler, info: SerializationInfo) -> Any: """Serialize using the appropriate short-form if possible. Returns: The serialized evaluator specification, using the shortest form possible: - Just the name if there are no arguments - {name: first_arg} if there's a single positional argument - {name: {kwargs}} if there are multiple (keyword) arguments """ if isinstance(info.context, dict) and info.context.get('use_short_form'): # pyright: ignore[reportUnknownMemberType] if self.arguments is None: return self.name elif isinstance(self.arguments, tuple): return {self.name: self.arguments[0]} else: return {self.name: self.arguments} else: return handler(self) ``` #### name ```python name: str ``` The name of the evaluator class; should be the value returned by `EvaluatorClass.get_serialization_name()` #### arguments ```python arguments: None | tuple[Any] | dict[str, Any] ``` The arguments to pass to the evaluator's constructor. Can be None (no arguments), a tuple (a single positional argument), or a dict (keyword arguments). #### args ```python args: tuple[Any, ...] ``` Get the positional arguments for the evaluator. Returns: | Type | Description | | --- | --- | | `tuple[Any, ...]` | A tuple of positional arguments if arguments is a tuple, otherwise an empty tuple. | #### kwargs ```python kwargs: dict[str, Any] ``` Get the keyword arguments for the evaluator. 
Returns: | Type | Description | | --- | --- | | `dict[str, Any]` | A dictionary of keyword arguments if arguments is a dict, otherwise an empty dict. | #### deserialize ```python deserialize( value: Any, handler: ModelWrapValidatorHandler[EvaluatorSpec], ) -> EvaluatorSpec ``` Deserialize an EvaluatorSpec from various formats. This validator handles the various short forms of evaluator specifications, converting them to a consistent EvaluatorSpec instance. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `value` | `Any` | The value to deserialize. | *required* | | `handler` | `ModelWrapValidatorHandler[EvaluatorSpec]` | The validator handler. | *required* | Returns: | Type | Description | | --- | --- | | `EvaluatorSpec` | The deserialized EvaluatorSpec. | Raises: | Type | Description | | --- | --- | | `ValidationError` | If the value cannot be deserialized. | Source code in `pydantic_evals/pydantic_evals/evaluators/spec.py` ```python @model_validator(mode='wrap') @classmethod def deserialize(cls, value: Any, handler: ModelWrapValidatorHandler[EvaluatorSpec]) -> EvaluatorSpec: """Deserialize an EvaluatorSpec from various formats. This validator handles the various short forms of evaluator specifications, converting them to a consistent EvaluatorSpec instance. Args: value: The value to deserialize. handler: The validator handler. Returns: The deserialized EvaluatorSpec. Raises: ValidationError: If the value cannot be deserialized. """ try: result = handler(value) return result except ValidationError as exc: try: deserialized = _SerializedEvaluatorSpec.model_validate(value) except ValidationError: raise exc # raise the original error return deserialized.to_evaluator_spec() ``` #### serialize ```python serialize( handler: SerializerFunctionWrapHandler, info: SerializationInfo, ) -> Any ``` Serialize using the appropriate short-form if possible. Returns: | Type | Description | | --- | --- | | `Any` | The serialized evaluator specification, using the shortest form possible: | | `Any` | Just the name if there are no arguments | | `Any` | {name: first_arg} if there's a single positional argument | | `Any` | {name: {kwargs}} if there are multiple (keyword) arguments | Source code in `pydantic_evals/pydantic_evals/evaluators/spec.py` ```python @model_serializer(mode='wrap') def serialize(self, handler: SerializerFunctionWrapHandler, info: SerializationInfo) -> Any: """Serialize using the appropriate short-form if possible. Returns: The serialized evaluator specification, using the shortest form possible: - Just the name if there are no arguments - {name: first_arg} if there's a single positional argument - {name: {kwargs}} if there are multiple (keyword) arguments """ if isinstance(info.context, dict) and info.context.get('use_short_form'): # pyright: ignore[reportUnknownMemberType] if self.arguments is None: return self.name elif isinstance(self.arguments, tuple): return {self.name: self.arguments[0]} else: return {self.name: self.arguments} else: return handler(self) ``` ### GradingOutput Bases: `BaseModel` The output of a grading operation. 
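For orientation, here is a minimal, hypothetical sketch of producing a `GradingOutput` with the `judge_output` function documented below; the output text and rubric are invented, and running it calls the default judge model, so model credentials are required:

```python
import asyncio

from pydantic_evals.evaluators.llm_as_a_judge import judge_output


async def main():
    # Both argument values below are made-up examples.
    grading = await judge_output(
        output='Paris is the capital of France.',
        rubric='The answer names the correct capital city.',
    )
    # GradingOutput fields: reason, pass_ (serialized as 'pass'), and score.
    print(grading.pass_, grading.score, grading.reason)


asyncio.run(main())
```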
Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python class GradingOutput(BaseModel, populate_by_name=True): """The output of a grading operation.""" reason: str pass_: bool = Field(validation_alias='pass', serialization_alias='pass') score: float ``` ### judge_output ```python judge_output( output: Any, rubric: str, model: Model | KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_output( output: Any, rubric: str, model: models.Model | models.KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. """ user_prompt = _build_prompt(output=output, rubric=rubric) return ( await _judge_output_agent.run(user_prompt, model=model or _default_model, model_settings=model_settings) ).output ``` ### judge_input_output ```python judge_input_output( inputs: Any, output: Any, rubric: str, model: Model | KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_input_output( inputs: Any, output: Any, rubric: str, model: models.Model | models.KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. """ user_prompt = _build_prompt(inputs=inputs, output=output, rubric=rubric) return ( await _judge_input_output_agent.run(user_prompt, model=model or _default_model, model_settings=model_settings) ).output ``` ### judge_input_output_expected ```python judge_input_output_expected( inputs: Any, output: Any, expected_output: Any, rubric: str, model: Model | KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_input_output_expected( inputs: Any, output: Any, expected_output: Any, rubric: str, model: models.Model | models.KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on the inputs and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. 
""" user_prompt = _build_prompt(inputs=inputs, output=output, rubric=rubric, expected_output=expected_output) return ( await _judge_input_output_expected_agent.run( user_prompt, model=model or _default_model, model_settings=model_settings ) ).output ``` ### judge_output_expected ```python judge_output_expected( output: Any, expected_output: Any, rubric: str, model: Model | KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput ``` Judge the output of a model based on the expected output, output, and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python async def judge_output_expected( output: Any, expected_output: Any, rubric: str, model: models.Model | models.KnownModelName | None = None, model_settings: ModelSettings | None = None, ) -> GradingOutput: """Judge the output of a model based on the expected output, output, and a rubric. If the model is not specified, a default model is used. The default model starts as 'openai:gpt-4o', but this can be changed using the `set_default_judge_model` function. """ user_prompt = _build_prompt(output=output, rubric=rubric, expected_output=expected_output) return ( await _judge_output_expected_agent.run( user_prompt, model=model or _default_model, model_settings=model_settings ) ).output ``` ### set_default_judge_model ```python set_default_judge_model( model: Model | KnownModelName, ) -> None ``` Set the default model used for judging. This model is used if `None` is passed to the `model` argument of `judge_output` and `judge_input_output`. Source code in `pydantic_evals/pydantic_evals/evaluators/llm_as_a_judge.py` ```python def set_default_judge_model(model: models.Model | models.KnownModelName) -> None: # pragma: no cover """Set the default model used for judging. This model is used if `None` is passed to the `model` argument of `judge_output` and `judge_input_output`. """ global _default_model _default_model = model ``` # `pydantic_evals.generation` Utilities for generating example datasets for pydantic_evals. This module provides functions for generating sample datasets for testing and examples, using LLMs to create realistic test data with proper structure. ### generate_dataset ```python generate_dataset( *, dataset_type: type[ Dataset[InputsT, OutputT, MetadataT] ], path: Path | str | None = None, custom_evaluator_types: Sequence[ type[Evaluator[InputsT, OutputT, MetadataT]] ] = (), model: Model | KnownModelName = "openai:gpt-4o", n_examples: int = 3, extra_instructions: str | None = None ) -> Dataset[InputsT, OutputT, MetadataT] ``` Use an LLM to generate a dataset of test cases, each consisting of input, expected output, and metadata. This function creates a properly structured dataset with the specified input, output, and metadata types. It uses an LLM to attempt to generate realistic test cases that conform to the types' schemas. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `path` | `Path | str | None` | Optional path to save the generated dataset. If provided, the dataset will be saved to this location. | `None` | | `dataset_type` | `type[Dataset[InputsT, OutputT, MetadataT]]` | The type of dataset to generate, with the desired input, output, and metadata types. 
| *required* | | `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Optional sequence of custom evaluator classes to include in the schema. | `()` | | `model` | `Model | KnownModelName` | The Pydantic AI model to use for generation. Defaults to 'gpt-4o'. | `'openai:gpt-4o'` | | `n_examples` | `int` | Number of examples to generate. Defaults to 3. | `3` | | `extra_instructions` | `str | None` | Optional additional instructions to provide to the LLM. | `None` | Returns: | Type | Description | | --- | --- | | `Dataset[InputsT, OutputT, MetadataT]` | A properly structured Dataset object with generated test cases. | Raises: | Type | Description | | --- | --- | | `ValidationError` | If the LLM's response cannot be parsed as a valid dataset. | Source code in `pydantic_evals/pydantic_evals/generation.py` ```python async def generate_dataset( *, dataset_type: type[Dataset[InputsT, OutputT, MetadataT]], path: Path | str | None = None, custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (), model: models.Model | models.KnownModelName = 'openai:gpt-4o', n_examples: int = 3, extra_instructions: str | None = None, ) -> Dataset[InputsT, OutputT, MetadataT]: """Use an LLM to generate a dataset of test cases, each consisting of input, expected output, and metadata. This function creates a properly structured dataset with the specified input, output, and metadata types. It uses an LLM to attempt to generate realistic test cases that conform to the types' schemas. Args: path: Optional path to save the generated dataset. If provided, the dataset will be saved to this location. dataset_type: The type of dataset to generate, with the desired input, output, and metadata types. custom_evaluator_types: Optional sequence of custom evaluator classes to include in the schema. model: The Pydantic AI model to use for generation. Defaults to 'gpt-4o'. n_examples: Number of examples to generate. Defaults to 3. extra_instructions: Optional additional instructions to provide to the LLM. Returns: A properly structured Dataset object with generated test cases. Raises: ValidationError: If the LLM's response cannot be parsed as a valid dataset. """ output_schema = dataset_type.model_json_schema_with_evaluators(custom_evaluator_types) # TODO: Use `output_type=StructuredDict(output_schema)` (and `from_dict` below) once https://github.com/pydantic/pydantic/issues/12145 # is fixed and `StructuredDict` no longer needs to use `InlineDefsJsonSchemaTransformer`. agent = Agent( model, system_prompt=( f'Generate an object that is in compliance with this JSON schema:\n{output_schema}\n\n' f'Include {n_examples} example cases.' ' You must not include any characters in your response before the opening { of the JSON object, or after the closing }.' ), output_type=str, retries=1, ) result = await agent.run(extra_instructions or 'Please generate the object.') try: result = dataset_type.from_text(result.output, fmt='json', custom_evaluator_types=custom_evaluator_types) except ValidationError as e: # pragma: no cover print(f'Raw response from model:\n{result.output}') raise e if path is not None: result.to_file(path, custom_evaluator_types=custom_evaluator_types) # pragma: no cover return result ``` # `pydantic_evals.otel` ### SpanNode A node in the span tree; provides references to parents/children for easy traversal and queries. 
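As a hypothetical sketch of how such a node is typically queried (for example via `EvaluatorContext.span_tree` inside an evaluator), the span name fragment and duration threshold below are invented:

```python
from pydantic_evals.otel.span_tree import SpanNode, SpanQuery


def has_slow_chat_span(node: SpanNode) -> bool:
    # A SpanQuery is a plain dict; all conditions are combined with AND logic.
    query: SpanQuery = {
        'name_contains': 'chat',  # invented name fragment
        'min_duration': 1.5,  # seconds; a timedelta is also accepted
    }
    # any_descendant walks the subtree in DFS order and short-circuits on the first match.
    return node.any_descendant(query)
```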
Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python @dataclass(repr=False, kw_only=True) class SpanNode: """A node in the span tree; provides references to parents/children for easy traversal and queries.""" name: str trace_id: int span_id: int parent_span_id: int | None start_timestamp: datetime end_timestamp: datetime attributes: dict[str, AttributeValue] @property def duration(self) -> timedelta: """Return the span's duration as a timedelta, or None if start/end not set.""" return self.end_timestamp - self.start_timestamp @property def children(self) -> list[SpanNode]: return list(self.children_by_id.values()) @property def descendants(self) -> list[SpanNode]: """Return all descendants of this node in DFS order.""" return self.find_descendants(lambda _: True) @property def ancestors(self) -> list[SpanNode]: """Return all ancestors of this node.""" return self.find_ancestors(lambda _: True) @property def node_key(self) -> str: return f'{self.trace_id:032x}:{self.span_id:016x}' @property def parent_node_key(self) -> str | None: return None if self.parent_span_id is None else f'{self.trace_id:032x}:{self.parent_span_id:016x}' # ------------------------------------------------------------------------- # Construction # ------------------------------------------------------------------------- def __post_init__(self): self.parent: SpanNode | None = None self.children_by_id: dict[str, SpanNode] = {} @staticmethod def from_readable_span(span: ReadableSpan) -> SpanNode: assert span.context is not None, 'Span has no context' assert span.start_time is not None, 'Span has no start time' assert span.end_time is not None, 'Span has no end time' return SpanNode( name=span.name, trace_id=span.context.trace_id, span_id=span.context.span_id, parent_span_id=span.parent.span_id if span.parent else None, start_timestamp=datetime.fromtimestamp(span.start_time / 1e9, tz=timezone.utc), end_timestamp=datetime.fromtimestamp(span.end_time / 1e9, tz=timezone.utc), attributes=dict(span.attributes or {}), ) def add_child(self, child: SpanNode) -> None: """Attach a child node to this node's list of children.""" assert child.trace_id == self.trace_id, f"traces don't match: {child.trace_id:032x} != {self.trace_id:032x}" assert child.parent_span_id == self.span_id, ( f'parent span mismatch: {child.parent_span_id:016x} != {self.span_id:016x}' ) self.children_by_id[child.node_key] = child child.parent = self # ------------------------------------------------------------------------- # Child queries # ------------------------------------------------------------------------- def find_children(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Return all immediate children that satisfy the given predicate.""" return list(self._filter_children(predicate)) def first_child(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Return the first immediate child that satisfies the given predicate, or None if none match.""" return next(self._filter_children(predicate), None) def any_child(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if there is at least one child that satisfies the predicate.""" return self.first_child(predicate) is not None def _filter_children(self, predicate: SpanQuery | SpanPredicate) -> Iterator[SpanNode]: return (child for child in self.children if child.matches(predicate)) # ------------------------------------------------------------------------- # Descendant queries (DFS) # 
------------------------------------------------------------------------- def find_descendants( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all descendant nodes that satisfy the given predicate in DFS order.""" return list(self._filter_descendants(predicate, stop_recursing_when)) def first_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """DFS: Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match.""" return next(self._filter_descendants(predicate, stop_recursing_when), None) def any_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns `True` if there is at least one descendant that satisfies the predicate.""" return self.first_descendant(predicate, stop_recursing_when) is not None def _filter_descendants( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None ) -> Iterator[SpanNode]: stack = list(self.children) while stack: node = stack.pop() if node.matches(predicate): yield node if stop_recursing_when is not None and node.matches(stop_recursing_when): continue stack.extend(node.children) # ------------------------------------------------------------------------- # Ancestor queries (DFS "up" the chain) # ------------------------------------------------------------------------- def find_ancestors( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all ancestors that satisfy the given predicate.""" return list(self._filter_ancestors(predicate, stop_recursing_when)) def first_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """Return the closest ancestor that satisfies the given predicate, or `None` if none match.""" return next(self._filter_ancestors(predicate, stop_recursing_when), None) def any_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns True if any ancestor satisfies the predicate.""" return self.first_ancestor(predicate, stop_recursing_when) is not None def _filter_ancestors( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None ) -> Iterator[SpanNode]: node = self.parent while node: if node.matches(predicate): yield node if stop_recursing_when is not None and node.matches(stop_recursing_when): break node = node.parent # ------------------------------------------------------------------------- # Query matching # ------------------------------------------------------------------------- def matches(self, query: SpanQuery | SpanPredicate) -> bool: """Check if the span node matches the query conditions or predicate.""" if callable(query): return query(self) return self._matches_query(query) def _matches_query(self, query: SpanQuery) -> bool: # noqa C901 """Check if the span matches the query conditions.""" # Logical combinations if or_ := query.get('or_'): if len(query) > 1: raise ValueError("Cannot combine 'or_' conditions with other conditions at the same level") return any(self._matches_query(q) for q in or_) if not_ := query.get('not_'): if self._matches_query(not_): return False if and_ := query.get('and_'): results = 
[self._matches_query(q) for q in and_] if not all(results): return False # At this point, all existing ANDs and no existing ORs have passed, so it comes down to this condition # Name conditions if (name_equals := query.get('name_equals')) and self.name != name_equals: return False if (name_contains := query.get('name_contains')) and name_contains not in self.name: return False if (name_matches_regex := query.get('name_matches_regex')) and not re.match(name_matches_regex, self.name): return False # Attribute conditions if (has_attributes := query.get('has_attributes')) and not all( self.attributes.get(key) == value for key, value in has_attributes.items() ): return False if (has_attributes_keys := query.get('has_attribute_keys')) and not all( key in self.attributes for key in has_attributes_keys ): return False # Timing conditions if (min_duration := query.get('min_duration')) is not None: if not isinstance(min_duration, timedelta): min_duration = timedelta(seconds=min_duration) if self.duration < min_duration: return False if (max_duration := query.get('max_duration')) is not None: if not isinstance(max_duration, timedelta): max_duration = timedelta(seconds=max_duration) if self.duration > max_duration: return False # Children conditions if (min_child_count := query.get('min_child_count')) and len(self.children) < min_child_count: return False if (max_child_count := query.get('max_child_count')) and len(self.children) > max_child_count: return False if (some_child_has := query.get('some_child_has')) and not any( child._matches_query(some_child_has) for child in self.children ): return False if (all_children_have := query.get('all_children_have')) and not all( child._matches_query(all_children_have) for child in self.children ): return False if (no_child_has := query.get('no_child_has')) and any( child._matches_query(no_child_has) for child in self.children ): return False # Descendant conditions # The following local functions with cache decorators are used to avoid repeatedly evaluating these properties @cache def descendants(): return self.descendants @cache def pruned_descendants(): stop_recursing_when = query.get('stop_recursing_when') return ( self._filter_descendants(lambda _: True, stop_recursing_when) if stop_recursing_when else descendants() ) if (min_descendant_count := query.get('min_descendant_count')) and len(descendants()) < min_descendant_count: return False if (max_descendant_count := query.get('max_descendant_count')) and len(descendants()) > max_descendant_count: return False if (some_descendant_has := query.get('some_descendant_has')) and not any( descendant._matches_query(some_descendant_has) for descendant in pruned_descendants() ): return False if (all_descendants_have := query.get('all_descendants_have')) and not all( descendant._matches_query(all_descendants_have) for descendant in pruned_descendants() ): return False if (no_descendant_has := query.get('no_descendant_has')) and any( descendant._matches_query(no_descendant_has) for descendant in pruned_descendants() ): return False # Ancestor conditions # The following local functions with cache decorators are used to avoid repeatedly evaluating these properties @cache def ancestors(): return self.ancestors @cache def pruned_ancestors(): stop_recursing_when = query.get('stop_recursing_when') return self._filter_ancestors(lambda _: True, stop_recursing_when) if stop_recursing_when else ancestors() if (min_depth := query.get('min_depth')) and len(ancestors()) < min_depth: return False if (max_depth := 
query.get('max_depth')) and len(ancestors()) > max_depth: return False if (some_ancestor_has := query.get('some_ancestor_has')) and not any( ancestor._matches_query(some_ancestor_has) for ancestor in pruned_ancestors() ): return False if (all_ancestors_have := query.get('all_ancestors_have')) and not all( ancestor._matches_query(all_ancestors_have) for ancestor in pruned_ancestors() ): return False if (no_ancestor_has := query.get('no_ancestor_has')) and any( ancestor._matches_query(no_ancestor_has) for ancestor in pruned_ancestors() ): return False return True # ------------------------------------------------------------------------- # String representation # ------------------------------------------------------------------------- def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the node. Optionally includes children, trace_id, span_id, start_timestamp, and duration. """ first_line_parts = [f'<SpanNode name={self.name!r}'] if include_trace_id: first_line_parts.append(f"trace_id='{self.trace_id:032x}'") if include_span_id: first_line_parts.append(f"span_id='{self.span_id:016x}'") if include_start_timestamp: first_line_parts.append(f'start_timestamp={self.start_timestamp.isoformat()!r}') if include_duration: first_line_parts.append(f"duration='{self.duration}'") extra_lines: list[str] = [] if include_children and self.children: first_line_parts.append('>') for child in self.children: extra_lines.append( indent( child.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) ) extra_lines.append('</SpanNode>') else: if self.children: first_line_parts.append('children=...') first_line_parts.append('/>') return '\n'.join([' '.join(first_line_parts), *extra_lines]) def __str__(self) -> str: if self.children: return f"<SpanNode name={self.name!r} span_id='{self.span_id:016x}'>...</SpanNode>" else: return f"<SpanNode name={self.name!r} span_id='{self.span_id:016x}' />" def __repr__(self) -> str: return self.repr_xml() ``` #### duration ```python duration: timedelta ``` Return the span's duration as a timedelta, or None if start/end not set. #### descendants ```python descendants: list[SpanNode] ``` Return all descendants of this node in DFS order. #### ancestors ```python ancestors: list[SpanNode] ``` Return all ancestors of this node. #### add_child ```python add_child(child: SpanNode) -> None ``` Attach a child node to this node's list of children. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def add_child(self, child: SpanNode) -> None: """Attach a child node to this node's list of children.""" assert child.trace_id == self.trace_id, f"traces don't match: {child.trace_id:032x} != {self.trace_id:032x}" assert child.parent_span_id == self.span_id, ( f'parent span mismatch: {child.parent_span_id:016x} != {self.span_id:016x}' ) self.children_by_id[child.node_key] = child child.parent = self ``` #### find_children ```python find_children( predicate: SpanQuery | SpanPredicate, ) -> list[SpanNode] ``` Return all immediate children that satisfy the given predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find_children(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Return all immediate children that satisfy the given predicate.""" return list(self._filter_children(predicate)) ``` #### first_child ```python first_child( predicate: SpanQuery | SpanPredicate, ) -> SpanNode | None ``` Return the first immediate child that satisfies the given predicate, or None if none match.
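To illustrate the two kinds of predicate these child helpers accept, here is a hypothetical sketch; the attribute key and name fragment are invented:

```python
from pydantic_evals.otel.span_tree import SpanNode, SpanQuery

# Invented query, purely for illustration.
TOOL_CALL: SpanQuery = {'name_contains': 'tool', 'has_attribute_keys': ['gen_ai.tool.name']}


def first_failed_child(node: SpanNode) -> SpanNode | None:
    # A callable predicate receives each child SpanNode and returns a bool.
    return node.first_child(lambda child: child.attributes.get('error') is True)


def first_tool_call_child(node: SpanNode) -> SpanNode | None:
    # The same helper also accepts a declarative SpanQuery dict.
    return node.first_child(TOOL_CALL)
```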
Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first_child(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Return the first immediate child that satisfies the given predicate, or None if none match.""" return next(self._filter_children(predicate), None) ``` #### any_child ```python any_child(predicate: SpanQuery | SpanPredicate) -> bool ``` Returns True if there is at least one child that satisfies the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any_child(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if there is at least one child that satisfies the predicate.""" return self.first_child(predicate) is not None ``` #### find_descendants ```python find_descendants( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> list[SpanNode] ``` Return all descendant nodes that satisfy the given predicate in DFS order. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find_descendants( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all descendant nodes that satisfy the given predicate in DFS order.""" return list(self._filter_descendants(predicate, stop_recursing_when)) ``` #### first_descendant ```python first_descendant( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> SpanNode | None ``` DFS: Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """DFS: Return the first descendant (in DFS order) that satisfies the given predicate, or `None` if none match.""" return next(self._filter_descendants(predicate, stop_recursing_when), None) ``` #### any_descendant ```python any_descendant( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> bool ``` Returns `True` if there is at least one descendant that satisfies the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any_descendant( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns `True` if there is at least one descendant that satisfies the predicate.""" return self.first_descendant(predicate, stop_recursing_when) is not None ``` #### find_ancestors ```python find_ancestors( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> list[SpanNode] ``` Return all ancestors that satisfy the given predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find_ancestors( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> list[SpanNode]: """Return all ancestors that satisfy the given predicate.""" return list(self._filter_ancestors(predicate, stop_recursing_when)) ``` #### first_ancestor ```python first_ancestor( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> SpanNode | None ``` Return the closest ancestor that satisfies the given predicate, or `None` if none match. 
Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def first_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> SpanNode | None: """Return the closest ancestor that satisfies the given predicate, or `None` if none match.""" return next(self._filter_ancestors(predicate, stop_recursing_when), None) ``` #### any_ancestor ```python any_ancestor( predicate: SpanQuery | SpanPredicate, stop_recursing_when: ( SpanQuery | SpanPredicate | None ) = None, ) -> bool ``` Returns True if any ancestor satisfies the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any_ancestor( self, predicate: SpanQuery | SpanPredicate, stop_recursing_when: SpanQuery | SpanPredicate | None = None ) -> bool: """Returns True if any ancestor satisfies the predicate.""" return self.first_ancestor(predicate, stop_recursing_when) is not None ``` #### matches ```python matches(query: SpanQuery | SpanPredicate) -> bool ``` Check if the span node matches the query conditions or predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def matches(self, query: SpanQuery | SpanPredicate) -> bool: """Check if the span node matches the query conditions or predicate.""" if callable(query): return query(self) return self._matches_query(query) ``` #### repr_xml ```python repr_xml( include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str ``` Return an XML-like string representation of the node. Optionally includes children, trace_id, span_id, start_timestamp, and duration. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the node. Optionally includes children, trace_id, span_id, start_timestamp, and duration. """ first_line_parts = [f'<SpanNode name={self.name!r}'] if include_trace_id: first_line_parts.append(f"trace_id='{self.trace_id:032x}'") if include_span_id: first_line_parts.append(f"span_id='{self.span_id:016x}'") if include_start_timestamp: first_line_parts.append(f'start_timestamp={self.start_timestamp.isoformat()!r}') if include_duration: first_line_parts.append(f"duration='{self.duration}'") extra_lines: list[str] = [] if include_children and self.children: first_line_parts.append('>') for child in self.children: extra_lines.append( indent( child.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) ) extra_lines.append('</SpanNode>') else: if self.children: first_line_parts.append('children=...') first_line_parts.append('/>') return '\n'.join([' '.join(first_line_parts), *extra_lines]) ``` ### SpanQuery Bases: `TypedDict` A serializable query for filtering SpanNodes based on various conditions. All fields are optional and combined with AND logic by default. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python class SpanQuery(TypedDict, total=False): """A serializable query for filtering SpanNodes based on various conditions. All fields are optional and combined with AND logic by default. """ # These fields are ordered to match the implementation of SpanNode.matches_query for easy review.
# * Individual span conditions come first because these are generally the cheapest to evaluate # * Logical combinations come next because they may just be combinations of individual span conditions # * Related-span conditions come last because they may require the most work to evaluate # Individual span conditions ## Name conditions name_equals: str name_contains: str name_matches_regex: str # regex pattern ## Attribute conditions has_attributes: dict[str, Any] has_attribute_keys: list[str] ## Timing conditions min_duration: timedelta | float max_duration: timedelta | float # Logical combinations of conditions not_: SpanQuery and_: list[SpanQuery] or_: list[SpanQuery] # Child conditions min_child_count: int max_child_count: int some_child_has: SpanQuery all_children_have: SpanQuery no_child_has: SpanQuery # Recursive conditions stop_recursing_when: SpanQuery """If present, stop recursing through ancestors or descendants at nodes that match this condition.""" ## Descendant conditions min_descendant_count: int max_descendant_count: int some_descendant_has: SpanQuery all_descendants_have: SpanQuery no_descendant_has: SpanQuery ## Ancestor conditions min_depth: int # depth is equivalent to ancestor count; roots have depth 0 max_depth: int some_ancestor_has: SpanQuery all_ancestors_have: SpanQuery no_ancestor_has: SpanQuery ``` #### stop_recursing_when ```python stop_recursing_when: SpanQuery ``` If present, stop recursing through ancestors or descendants at nodes that match this condition. ### SpanTree A container that builds a hierarchy of SpanNode objects from a list of finished spans. You can then search or iterate the tree to make your assertions (using DFS for traversal). Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python @dataclass(repr=False, kw_only=True) class SpanTree: """A container that builds a hierarchy of SpanNode objects from a list of finished spans. You can then search or iterate the tree to make your assertions (using DFS for traversal). """ roots: list[SpanNode] = field(default_factory=list) nodes_by_id: dict[str, SpanNode] = field(default_factory=dict) # ------------------------------------------------------------------------- # Construction # ------------------------------------------------------------------------- def __post_init__(self): self._rebuild_tree() def add_spans(self, spans: list[SpanNode]) -> None: """Add a list of spans to the tree, rebuilding the tree structure.""" for span in spans: self.nodes_by_id[span.node_key] = span self._rebuild_tree() def add_readable_spans(self, readable_spans: list[ReadableSpan]): self.add_spans([SpanNode.from_readable_span(span) for span in readable_spans]) def _rebuild_tree(self): # Ensure spans are ordered by start_timestamp so that roots and children end up in the right order nodes = list(self.nodes_by_id.values()) nodes.sort(key=lambda node: node.start_timestamp or datetime.min) self.nodes_by_id = {node.node_key: node for node in nodes} # Build the parent/child relationships for node in self.nodes_by_id.values(): parent_node_key = node.parent_node_key if parent_node_key is not None: parent_node = self.nodes_by_id.get(parent_node_key) if parent_node is not None: parent_node.add_child(node) # Determine the roots # A node is a "root" if its parent is None or if its parent's span_id is not in the current set of spans. 
self.roots = [] for node in self.nodes_by_id.values(): parent_node_key = node.parent_node_key if parent_node_key is None or parent_node_key not in self.nodes_by_id: self.roots.append(node) # ------------------------------------------------------------------------- # Node filtering and iteration # ------------------------------------------------------------------------- def find(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order.""" return list(self._filter(predicate)) def first(self, predicate: SpanQuery | SpanPredicate) -> SpanNode | None: """Find the first node that matches a predicate, scanning from each root in DFS order. Returns `None` if not found.""" return next(self._filter(predicate), None) def any(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if any node in the tree matches the predicate.""" return self.first(predicate) is not None def _filter(self, predicate: SpanQuery | SpanPredicate) -> Iterator[SpanNode]: for node in self: if node.matches(predicate): yield node def __iter__(self) -> Iterator[SpanNode]: """Return an iterator over all nodes in the tree.""" return iter(self.nodes_by_id.values()) # ------------------------------------------------------------------------- # String representation # ------------------------------------------------------------------------- def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the tree, optionally including children, trace_id, span_id, duration, and timestamps.""" if not self.roots: return '<SpanTree />' repr_parts = [ '<SpanTree>', *[ indent( root.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) for root in self.roots ], '</SpanTree>', ] return '\n'.join(repr_parts) def __str__(self): return f'<SpanTree num_roots={len(self.roots)} total_spans={len(self.nodes_by_id)} />' def __repr__(self): return self.repr_xml() ``` #### add_spans ```python add_spans(spans: list[SpanNode]) -> None ``` Add a list of spans to the tree, rebuilding the tree structure. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def add_spans(self, spans: list[SpanNode]) -> None: """Add a list of spans to the tree, rebuilding the tree structure.""" for span in spans: self.nodes_by_id[span.node_key] = span self._rebuild_tree() ``` #### find ```python find( predicate: SpanQuery | SpanPredicate, ) -> list[SpanNode] ``` Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def find(self, predicate: SpanQuery | SpanPredicate) -> list[SpanNode]: """Find all nodes in the entire tree that match the predicate, scanning from each root in DFS order.""" return list(self._filter(predicate)) ``` #### first ```python first( predicate: SpanQuery | SpanPredicate, ) -> SpanNode | None ``` Find the first node that matches a predicate, scanning from each root in DFS order.
Returns `None` if not found.""" return next(self._filter(predicate), None) ``` #### any ```python any(predicate: SpanQuery | SpanPredicate) -> bool ``` Returns True if any node in the tree matches the predicate. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def any(self, predicate: SpanQuery | SpanPredicate) -> bool: """Returns True if any node in the tree matches the predicate.""" return self.first(predicate) is not None ``` #### __iter__ ```python __iter__() -> Iterator[SpanNode] ``` Return an iterator over all nodes in the tree. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def __iter__(self) -> Iterator[SpanNode]: """Return an iterator over all nodes in the tree.""" return iter(self.nodes_by_id.values()) ``` #### repr_xml ```python repr_xml( include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str ``` Return an XML-like string representation of the tree, optionally including children, trace_id, span_id, duration, and timestamps. Source code in `pydantic_evals/pydantic_evals/otel/span_tree.py` ```python def repr_xml( self, include_children: bool = True, include_trace_id: bool = False, include_span_id: bool = False, include_start_timestamp: bool = False, include_duration: bool = False, ) -> str: """Return an XML-like string representation of the tree, optionally including children, trace_id, span_id, duration, and timestamps.""" if not self.roots: return '' repr_parts = [ '', *[ indent( root.repr_xml( include_children=include_children, include_trace_id=include_trace_id, include_span_id=include_span_id, include_start_timestamp=include_start_timestamp, include_duration=include_duration, ), ' ', ) for root in self.roots ], '', ] return '\n'.join(repr_parts) ``` # `pydantic_evals.reporting` ### ReportCase Bases: `Generic[InputsT, OutputT, MetadataT]` A single case in an evaluation report. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class ReportCase(Generic[InputsT, OutputT, MetadataT]): """A single case in an evaluation report.""" name: str """The name of the [case][pydantic_evals.Case].""" inputs: InputsT """The inputs to the task, from [`Case.inputs`][pydantic_evals.Case.inputs].""" metadata: MetadataT | None """Any metadata associated with the case, from [`Case.metadata`][pydantic_evals.Case.metadata].""" expected_output: OutputT | None """The expected output of the task, from [`Case.expected_output`][pydantic_evals.Case.expected_output].""" output: OutputT """The output of the task execution.""" metrics: dict[str, float | int] attributes: dict[str, Any] scores: dict[str, EvaluationResult[int | float]] labels: dict[str, EvaluationResult[str]] assertions: dict[str, EvaluationResult[bool]] task_duration: float total_duration: float # includes evaluator execution time trace_id: str | None = None """The trace ID of the case span.""" span_id: str | None = None """The span ID of the case span.""" evaluator_failures: list[EvaluatorFailure] = field(default_factory=list) ``` #### name ```python name: str ``` The name of the case. #### inputs ```python inputs: InputsT ``` The inputs to the task, from Case.inputs. #### metadata ```python metadata: MetadataT | None ``` Any metadata associated with the case, from Case.metadata. #### expected_output ```python expected_output: OutputT | None ``` The expected output of the task, from Case.expected_output. 
#### output ```python output: OutputT ``` The output of the task execution. #### trace_id ```python trace_id: str | None = None ``` The trace ID of the case span. #### span_id ```python span_id: str | None = None ``` The span ID of the case span. ### ReportCaseFailure Bases: `Generic[InputsT, OutputT, MetadataT]` A single case in an evaluation report that failed due to an error during task execution. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class ReportCaseFailure(Generic[InputsT, OutputT, MetadataT]): """A single case in an evaluation report that failed due to an error during task execution.""" name: str """The name of the [case][pydantic_evals.Case].""" inputs: InputsT """The inputs to the task, from [`Case.inputs`][pydantic_evals.Case.inputs].""" metadata: MetadataT | None """Any metadata associated with the case, from [`Case.metadata`][pydantic_evals.Case.metadata].""" expected_output: OutputT | None """The expected output of the task, from [`Case.expected_output`][pydantic_evals.Case.expected_output].""" error_message: str """The message of the exception that caused the failure.""" error_stacktrace: str """The stacktrace of the exception that caused the failure.""" trace_id: str | None = None """The trace ID of the case span.""" span_id: str | None = None """The span ID of the case span.""" ``` #### name ```python name: str ``` The name of the case. #### inputs ```python inputs: InputsT ``` The inputs to the task, from Case.inputs. #### metadata ```python metadata: MetadataT | None ``` Any metadata associated with the case, from Case.metadata. #### expected_output ```python expected_output: OutputT | None ``` The expected output of the task, from Case.expected_output. #### error_message ```python error_message: str ``` The message of the exception that caused the failure. #### error_stacktrace ```python error_stacktrace: str ``` The stacktrace of the exception that caused the failure. #### trace_id ```python trace_id: str | None = None ``` The trace ID of the case span. #### span_id ```python span_id: str | None = None ``` The span ID of the case span. ### ReportCaseAggregate Bases: `BaseModel` A synthetic case that summarizes a set of cases. 
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python class ReportCaseAggregate(BaseModel): """A synthetic case that summarizes a set of cases.""" name: str scores: dict[str, float | int] labels: dict[str, dict[str, float]] metrics: dict[str, float | int] assertions: float | None task_duration: float total_duration: float @staticmethod def average(cases: list[ReportCase]) -> ReportCaseAggregate: """Produce a synthetic "summary" case by averaging quantitative attributes.""" num_cases = len(cases) if num_cases == 0: return ReportCaseAggregate( name='Averages', scores={}, labels={}, metrics={}, assertions=None, task_duration=0.0, total_duration=0.0, ) def _scores_averages(scores_by_name: list[dict[str, int | float | bool]]) -> dict[str, float]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, float] = defaultdict(float) for sbn in scores_by_name: for name, score in sbn.items(): counts_by_name[name] += 1 sums_by_name[name] += score return {name: sums_by_name[name] / counts_by_name[name] for name in sums_by_name} def _labels_averages(labels_by_name: list[dict[str, str]]) -> dict[str, dict[str, float]]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float)) for lbn in labels_by_name: for name, label in lbn.items(): counts_by_name[name] += 1 sums_by_name[name][label] += 1 return { name: {value: count / counts_by_name[name] for value, count in sums_by_name[name].items()} for name in sums_by_name } average_task_duration = sum(case.task_duration for case in cases) / num_cases average_total_duration = sum(case.total_duration for case in cases) / num_cases # average_assertions: dict[str, float] = _scores_averages([{k: v.value for k, v in case.scores.items()} for case in cases]) average_scores: dict[str, float] = _scores_averages( [{k: v.value for k, v in case.scores.items()} for case in cases] ) average_labels: dict[str, dict[str, float]] = _labels_averages( [{k: v.value for k, v in case.labels.items()} for case in cases] ) average_metrics: dict[str, float] = _scores_averages([case.metrics for case in cases]) average_assertions: float | None = None n_assertions = sum(len(case.assertions) for case in cases) if n_assertions > 0: n_passing = sum(1 for case in cases for assertion in case.assertions.values() if assertion.value) average_assertions = n_passing / n_assertions return ReportCaseAggregate( name='Averages', scores=average_scores, labels=average_labels, metrics=average_metrics, assertions=average_assertions, task_duration=average_task_duration, total_duration=average_total_duration, ) ``` #### average ```python average(cases: list[ReportCase]) -> ReportCaseAggregate ``` Produce a synthetic "summary" case by averaging quantitative attributes. 
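Before the source, here is a minimal sketch of computing the aggregate by hand; `report` is assumed to be an `EvaluationReport` produced by an earlier evaluation run (in practice you would usually call `report.averages()`, which wraps this method).

```python
# Minimal sketch (assumes `report` is an existing EvaluationReport).
from pydantic_evals.reporting import ReportCaseAggregate

aggregate = ReportCaseAggregate.average(report.cases)
print(aggregate.name)           # 'Averages'
print(aggregate.scores)         # mean value per score name
print(aggregate.assertions)     # fraction of passing assertions, or None if there were none
print(aggregate.task_duration)  # mean task duration across cases
```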
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @staticmethod def average(cases: list[ReportCase]) -> ReportCaseAggregate: """Produce a synthetic "summary" case by averaging quantitative attributes.""" num_cases = len(cases) if num_cases == 0: return ReportCaseAggregate( name='Averages', scores={}, labels={}, metrics={}, assertions=None, task_duration=0.0, total_duration=0.0, ) def _scores_averages(scores_by_name: list[dict[str, int | float | bool]]) -> dict[str, float]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, float] = defaultdict(float) for sbn in scores_by_name: for name, score in sbn.items(): counts_by_name[name] += 1 sums_by_name[name] += score return {name: sums_by_name[name] / counts_by_name[name] for name in sums_by_name} def _labels_averages(labels_by_name: list[dict[str, str]]) -> dict[str, dict[str, float]]: counts_by_name: dict[str, int] = defaultdict(int) sums_by_name: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float)) for lbn in labels_by_name: for name, label in lbn.items(): counts_by_name[name] += 1 sums_by_name[name][label] += 1 return { name: {value: count / counts_by_name[name] for value, count in sums_by_name[name].items()} for name in sums_by_name } average_task_duration = sum(case.task_duration for case in cases) / num_cases average_total_duration = sum(case.total_duration for case in cases) / num_cases # average_assertions: dict[str, float] = _scores_averages([{k: v.value for k, v in case.scores.items()} for case in cases]) average_scores: dict[str, float] = _scores_averages( [{k: v.value for k, v in case.scores.items()} for case in cases] ) average_labels: dict[str, dict[str, float]] = _labels_averages( [{k: v.value for k, v in case.labels.items()} for case in cases] ) average_metrics: dict[str, float] = _scores_averages([case.metrics for case in cases]) average_assertions: float | None = None n_assertions = sum(len(case.assertions) for case in cases) if n_assertions > 0: n_passing = sum(1 for case in cases for assertion in case.assertions.values() if assertion.value) average_assertions = n_passing / n_assertions return ReportCaseAggregate( name='Averages', scores=average_scores, labels=average_labels, metrics=average_metrics, assertions=average_assertions, task_duration=average_task_duration, total_duration=average_total_duration, ) ``` ### EvaluationReport Bases: `Generic[InputsT, OutputT, MetadataT]` A report of the results of evaluating a model on a set of cases. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class EvaluationReport(Generic[InputsT, OutputT, MetadataT]): """A report of the results of evaluating a model on a set of cases.""" name: str """The name of the report.""" cases: list[ReportCase[InputsT, OutputT, MetadataT]] """The cases in the report.""" failures: list[ReportCaseFailure[InputsT, OutputT, MetadataT]] = field(default_factory=list) """The failures in the report. 
These are cases where task execution raised an exception.""" trace_id: str | None = None """The trace ID of the evaluation.""" span_id: str | None = None """The span ID of the evaluation.""" def averages(self) -> ReportCaseAggregate | None: if self.cases: return ReportCaseAggregate.average(self.cases) return None def render( self, width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> str: # pragma: no cover """Render this report to a nicely-formatted string, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. """ io_file = StringIO() console = Console(width=width, file=io_file) self.print( width=width, baseline=baseline, console=console, include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_errors=include_errors, include_error_stacktrace=include_error_stacktrace, include_evaluator_failures=include_evaluator_failures, input_config=input_config, metadata_config=metadata_config, output_config=output_config, score_configs=score_configs, label_configs=label_configs, metric_configs=metric_configs, duration_config=duration_config, include_reasons=include_reasons, ) Console(file=io_file) return io_file.getvalue() def print( self, width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, console: Console | None = None, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> None: # pragma: no cover """Print this report to the console, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. 
""" if console is None: console = Console(width=width) table = self.console_table( baseline=baseline, include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_evaluator_failures=include_evaluator_failures, input_config=input_config, metadata_config=metadata_config, output_config=output_config, score_configs=score_configs, label_configs=label_configs, metric_configs=metric_configs, duration_config=duration_config, include_reasons=include_reasons, ) console.print(table) if include_errors and self.failures: failures_table = self.failures_table( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_error_message=True, include_error_stacktrace=include_error_stacktrace, input_config=input_config, metadata_config=metadata_config, ) console.print(failures_table, style='red') def console_table( self, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> Table: """Return a table containing the data from this report, or the diff between this report and a baseline report. Optionally include input and output details. 
""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_error_message=False, include_error_stacktrace=False, include_evaluator_failures=include_evaluator_failures, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=output_config or _DEFAULT_VALUE_CONFIG, score_configs=score_configs or {}, label_configs=label_configs or {}, metric_configs=metric_configs or {}, duration_config=duration_config or _DEFAULT_DURATION_CONFIG, include_reasons=include_reasons, ) if baseline is None: return renderer.build_table(self) else: # pragma: no cover return renderer.build_diff_table(self, baseline) def failures_table( self, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_error_message: bool = True, include_error_stacktrace: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, ) -> Table: """Return a table containing the failures in this report.""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=False, include_durations=False, include_total_duration=False, include_removed_cases=False, include_averages=False, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=_DEFAULT_VALUE_CONFIG, score_configs={}, label_configs={}, metric_configs={}, duration_config=_DEFAULT_DURATION_CONFIG, include_reasons=False, include_error_message=include_error_message, include_error_stacktrace=include_error_stacktrace, include_evaluator_failures=False, # Not applicable for failures table ) return renderer.build_failures_table(self) def __str__(self) -> str: # pragma: lax no cover """Return a string representation of the report.""" return self.render() ``` #### name ```python name: str ``` The name of the report. #### cases ```python cases: list[ReportCase[InputsT, OutputT, MetadataT]] ``` The cases in the report. #### failures ```python failures: list[ ReportCaseFailure[InputsT, OutputT, MetadataT] ] = field(default_factory=list) ``` The failures in the report. These are cases where task execution raised an exception. #### trace_id ```python trace_id: str | None = None ``` The trace ID of the evaluation. #### span_id ```python span_id: str | None = None ``` The span ID of the evaluation. 
#### render ```python render( width: int | None = None, baseline: ( EvaluationReport[InputsT, OutputT, MetadataT] | None ) = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: ( dict[str, RenderNumberConfig] | None ) = None, label_configs: ( dict[str, RenderValueConfig] | None ) = None, metric_configs: ( dict[str, RenderNumberConfig] | None ) = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False ) -> str ``` Render this report to a nicely-formatted string, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def render( self, width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> str: # pragma: no cover """Render this report to a nicely-formatted string, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. 
""" io_file = StringIO() console = Console(width=width, file=io_file) self.print( width=width, baseline=baseline, console=console, include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_errors=include_errors, include_error_stacktrace=include_error_stacktrace, include_evaluator_failures=include_evaluator_failures, input_config=input_config, metadata_config=metadata_config, output_config=output_config, score_configs=score_configs, label_configs=label_configs, metric_configs=metric_configs, duration_config=duration_config, include_reasons=include_reasons, ) Console(file=io_file) return io_file.getvalue() ``` #### print ```python print( width: int | None = None, baseline: ( EvaluationReport[InputsT, OutputT, MetadataT] | None ) = None, *, console: Console | None = None, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: ( dict[str, RenderNumberConfig] | None ) = None, label_configs: ( dict[str, RenderValueConfig] | None ) = None, metric_configs: ( dict[str, RenderNumberConfig] | None ) = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False ) -> None ``` Print this report to the console, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def print( self, width: int | None = None, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, console: Console | None = None, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_errors: bool = True, include_error_stacktrace: bool = False, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> None: # pragma: no cover """Print this report to the console, optionally comparing it to a baseline report. If you want more control over the output, use `console_table` instead and pass it to `rich.Console.print`. 
""" if console is None: console = Console(width=width) table = self.console_table( baseline=baseline, include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_evaluator_failures=include_evaluator_failures, input_config=input_config, metadata_config=metadata_config, output_config=output_config, score_configs=score_configs, label_configs=label_configs, metric_configs=metric_configs, duration_config=duration_config, include_reasons=include_reasons, ) console.print(table) if include_errors and self.failures: failures_table = self.failures_table( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_error_message=True, include_error_stacktrace=include_error_stacktrace, input_config=input_config, metadata_config=metadata_config, ) console.print(failures_table, style='red') ``` #### console_table ```python console_table( baseline: ( EvaluationReport[InputsT, OutputT, MetadataT] | None ) = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: ( dict[str, RenderNumberConfig] | None ) = None, label_configs: ( dict[str, RenderValueConfig] | None ) = None, metric_configs: ( dict[str, RenderNumberConfig] | None ) = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False ) -> Table ``` Return a table containing the data from this report, or the diff between this report and a baseline report. Optionally include input and output details. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def console_table( self, baseline: EvaluationReport[InputsT, OutputT, MetadataT] | None = None, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_output: bool = False, include_durations: bool = True, include_total_duration: bool = False, include_removed_cases: bool = False, include_averages: bool = True, include_evaluator_failures: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, output_config: RenderValueConfig | None = None, score_configs: dict[str, RenderNumberConfig] | None = None, label_configs: dict[str, RenderValueConfig] | None = None, metric_configs: dict[str, RenderNumberConfig] | None = None, duration_config: RenderNumberConfig | None = None, include_reasons: bool = False, ) -> Table: """Return a table containing the data from this report, or the diff between this report and a baseline report. Optionally include input and output details. 
""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=include_output, include_durations=include_durations, include_total_duration=include_total_duration, include_removed_cases=include_removed_cases, include_averages=include_averages, include_error_message=False, include_error_stacktrace=False, include_evaluator_failures=include_evaluator_failures, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=output_config or _DEFAULT_VALUE_CONFIG, score_configs=score_configs or {}, label_configs=label_configs or {}, metric_configs=metric_configs or {}, duration_config=duration_config or _DEFAULT_DURATION_CONFIG, include_reasons=include_reasons, ) if baseline is None: return renderer.build_table(self) else: # pragma: no cover return renderer.build_diff_table(self, baseline) ``` #### failures_table ```python failures_table( *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_error_message: bool = True, include_error_stacktrace: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None ) -> Table ``` Return a table containing the failures in this report. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def failures_table( self, *, include_input: bool = False, include_metadata: bool = False, include_expected_output: bool = False, include_error_message: bool = True, include_error_stacktrace: bool = True, input_config: RenderValueConfig | None = None, metadata_config: RenderValueConfig | None = None, ) -> Table: """Return a table containing the failures in this report.""" renderer = EvaluationRenderer( include_input=include_input, include_metadata=include_metadata, include_expected_output=include_expected_output, include_output=False, include_durations=False, include_total_duration=False, include_removed_cases=False, include_averages=False, input_config={**_DEFAULT_VALUE_CONFIG, **(input_config or {})}, metadata_config={**_DEFAULT_VALUE_CONFIG, **(metadata_config or {})}, output_config=_DEFAULT_VALUE_CONFIG, score_configs={}, label_configs={}, metric_configs={}, duration_config=_DEFAULT_DURATION_CONFIG, include_reasons=False, include_error_message=include_error_message, include_error_stacktrace=include_error_stacktrace, include_evaluator_failures=False, # Not applicable for failures table ) return renderer.build_failures_table(self) ``` #### __str__ ```python __str__() -> str ``` Return a string representation of the report. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python def __str__(self) -> str: # pragma: lax no cover """Return a string representation of the report.""" return self.render() ``` ### RenderValueConfig Bases: `TypedDict` A configuration for rendering a values in an Evaluation report. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python class RenderValueConfig(TypedDict, total=False): """A configuration for rendering a values in an Evaluation report.""" value_formatter: str | Callable[[Any], str] diff_checker: Callable[[Any, Any], bool] | None diff_formatter: Callable[[Any, Any], str | None] | None diff_style: str ``` ### RenderNumberConfig Bases: `TypedDict` A configuration for rendering a particular score or metric in an Evaluation report. 
See the implementation of `_RenderNumber` for more clarity on how these parameters affect the rendering. Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python class RenderNumberConfig(TypedDict, total=False): """A configuration for rendering a particular score or metric in an Evaluation report. See the implementation of `_RenderNumber` for more clarity on how these parameters affect the rendering. """ value_formatter: str | Callable[[float | int], str] """The logic to use for formatting values. * If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures. * You can also use a custom string format spec, e.g. '{:.3f}' * You can also use a custom function, e.g. lambda x: f'{x:.3f}' """ diff_formatter: str | Callable[[float | int, float | int], str | None] | None """The logic to use for formatting details about the diff. The strings produced by the value_formatter will always be included in the reports, but the diff_formatter is used to produce additional text about the difference between the old and new values, such as the absolute or relative difference. * If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures, and will include the percentage change. * You can also use a custom string format spec, e.g. '{:+.3f}' * You can also use a custom function, e.g. lambda x: f'{x:+.3f}'. If this function returns None, no extra diff text will be added. * You can also use None to never generate extra diff text. """ diff_atol: float """The absolute tolerance for considering a difference "significant". A difference is "significant" if `abs(new - old) < self.diff_atol + self.diff_rtol * abs(old)`. If a difference is not significant, it will not have the diff styles applied. Note that we still show both the rendered before and after values in the diff any time they differ, even if the difference is not significant. (If the rendered values are exactly the same, we only show the value once.) If not provided, use 1e-6. """ diff_rtol: float """The relative tolerance for considering a difference "significant". See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use 0.001 if all values are ints, otherwise 0.05. """ diff_increase_style: str """The style to apply to diffed values that have a significant increase. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use green for scores and red for metrics. You can also use arbitrary `rich` styles, such as "bold red". """ diff_decrease_style: str """The style to apply to diffed values that have significant decrease. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use red for scores and green for metrics. You can also use arbitrary `rich` styles, such as "bold red". """ ``` #### value_formatter ```python value_formatter: str | Callable[[float | int], str] ``` The logic to use for formatting values. - If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures. - You can also use a custom string format spec, e.g. '{:.3f}' - You can also use a custom function, e.g. 
lambda x: f'{x:.3f}' #### diff_formatter ```python diff_formatter: ( str | Callable[[float | int, float | int], str | None] | None ) ``` The logic to use for formatting details about the diff. The strings produced by the value_formatter will always be included in the reports, but the diff_formatter is used to produce additional text about the difference between the old and new values, such as the absolute or relative difference. - If not provided, format as ints if all values are ints, otherwise at least one decimal place and at least four significant figures, and will include the percentage change. - You can also use a custom string format spec, e.g. '{:+.3f}' - You can also use a custom function, e.g. lambda x: f'{x:+.3f}'. If this function returns None, no extra diff text will be added. - You can also use None to never generate extra diff text. #### diff_atol ```python diff_atol: float ``` The absolute tolerance for considering a difference "significant". A difference is "significant" if `abs(new - old) < self.diff_atol + self.diff_rtol * abs(old)`. If a difference is not significant, it will not have the diff styles applied. Note that we still show both the rendered before and after values in the diff any time they differ, even if the difference is not significant. (If the rendered values are exactly the same, we only show the value once.) If not provided, use 1e-6. #### diff_rtol ```python diff_rtol: float ``` The relative tolerance for considering a difference "significant". See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use 0.001 if all values are ints, otherwise 0.05. #### diff_increase_style ```python diff_increase_style: str ``` The style to apply to diffed values that have a significant increase. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use green for scores and red for metrics. You can also use arbitrary `rich` styles, such as "bold red". #### diff_decrease_style ```python diff_decrease_style: str ``` The style to apply to diffed values that have significant decrease. See the description of `diff_atol` for more details about what makes a difference "significant". If not provided, use red for scores and green for metrics. You can also use arbitrary `rich` styles, such as "bold red". ### EvaluationRenderer A class for rendering an EvalReport or the diff between two EvalReports. 
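These configs are plain `TypedDict`s; you normally pass them to `EvaluationReport.print` or `console_table` via `score_configs` / `label_configs`, and the `EvaluationRenderer` below merges them with the defaults. A minimal sketch, assuming `report` is an existing `EvaluationReport` with a score named 'accuracy' and a label named 'category':

```python
# Minimal sketch (illustrative score/label names; `report` assumed to exist).
report.print(
    score_configs={
        'accuracy': {
            'value_formatter': '{:.2f}',
            'diff_increase_style': 'bold green',
            'diff_decrease_style': 'bold red',
        },
    },
    label_configs={
        'category': {'diff_style': 'magenta'},
    },
)
```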
Source code in `pydantic_evals/pydantic_evals/reporting/__init__.py` ```python @dataclass(kw_only=True) class EvaluationRenderer: """A class for rendering an EvalReport or the diff between two EvalReports.""" # Columns to include include_input: bool include_metadata: bool include_expected_output: bool include_output: bool include_durations: bool include_total_duration: bool # Rows to include include_removed_cases: bool include_averages: bool input_config: RenderValueConfig metadata_config: RenderValueConfig output_config: RenderValueConfig score_configs: dict[str, RenderNumberConfig] label_configs: dict[str, RenderValueConfig] metric_configs: dict[str, RenderNumberConfig] duration_config: RenderNumberConfig # Data to include include_reasons: bool # only applies to reports, not to diffs include_error_message: bool include_error_stacktrace: bool include_evaluator_failures: bool def include_scores(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.scores for case in self._all_cases(report, baseline)) def include_labels(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.labels for case in self._all_cases(report, baseline)) def include_metrics(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.metrics for case in self._all_cases(report, baseline)) def include_assertions(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return any(case.assertions for case in self._all_cases(report, baseline)) def include_evaluator_failures_column(self, report: EvaluationReport, baseline: EvaluationReport | None = None): return self.include_evaluator_failures and any( case.evaluator_failures for case in self._all_cases(report, baseline) ) def _all_cases(self, report: EvaluationReport, baseline: EvaluationReport | None) -> list[ReportCase]: if not baseline: return report.cases else: return report.cases + self._baseline_cases_to_include(report, baseline) def _baseline_cases_to_include(self, report: EvaluationReport, baseline: EvaluationReport) -> list[ReportCase]: if self.include_removed_cases: return baseline.cases report_case_names = {case.name for case in report.cases} return [case for case in baseline.cases if case.name in report_case_names] def _get_case_renderer( self, report: EvaluationReport, baseline: EvaluationReport | None = None ) -> ReportCaseRenderer: input_renderer = _ValueRenderer.from_config(self.input_config) metadata_renderer = _ValueRenderer.from_config(self.metadata_config) output_renderer = _ValueRenderer.from_config(self.output_config) score_renderers = self._infer_score_renderers(report, baseline) label_renderers = self._infer_label_renderers(report, baseline) metric_renderers = self._infer_metric_renderers(report, baseline) duration_renderer = _NumberRenderer.infer_from_config( self.duration_config, 'duration', [x.task_duration for x in self._all_cases(report, baseline)] ) return ReportCaseRenderer( include_input=self.include_input, include_metadata=self.include_metadata, include_expected_output=self.include_expected_output, include_output=self.include_output, include_scores=self.include_scores(report, baseline), include_labels=self.include_labels(report, baseline), include_metrics=self.include_metrics(report, baseline), include_assertions=self.include_assertions(report, baseline), include_reasons=self.include_reasons, include_durations=self.include_durations, include_total_duration=self.include_total_duration, 
include_error_message=self.include_error_message, include_error_stacktrace=self.include_error_stacktrace, include_evaluator_failures=self.include_evaluator_failures_column(report, baseline), input_renderer=input_renderer, metadata_renderer=metadata_renderer, output_renderer=output_renderer, score_renderers=score_renderers, label_renderers=label_renderers, metric_renderers=metric_renderers, duration_renderer=duration_renderer, ) def build_table(self, report: EvaluationReport) -> Table: case_renderer = self._get_case_renderer(report) table = case_renderer.build_base_table(f'Evaluation Summary: {report.name}') for case in report.cases: table.add_row(*case_renderer.build_row(case)) if self.include_averages: # pragma: no branch average = report.averages() if average: # pragma: no branch table.add_row(*case_renderer.build_aggregate_row(average)) return table def build_diff_table(self, report: EvaluationReport, baseline: EvaluationReport) -> Table: report_cases = report.cases baseline_cases = self._baseline_cases_to_include(report, baseline) report_cases_by_id = {case.name: case for case in report_cases} baseline_cases_by_id = {case.name: case for case in baseline_cases} diff_cases: list[tuple[ReportCase, ReportCase]] = [] removed_cases: list[ReportCase] = [] added_cases: list[ReportCase] = [] for case_id in sorted(set(baseline_cases_by_id.keys()) | set(report_cases_by_id.keys())): maybe_baseline_case = baseline_cases_by_id.get(case_id) maybe_report_case = report_cases_by_id.get(case_id) if maybe_baseline_case and maybe_report_case: diff_cases.append((maybe_baseline_case, maybe_report_case)) elif maybe_baseline_case: removed_cases.append(maybe_baseline_case) elif maybe_report_case: added_cases.append(maybe_report_case) else: # pragma: no cover assert False, 'This should be unreachable' case_renderer = self._get_case_renderer(report, baseline) diff_name = baseline.name if baseline.name == report.name else f'{baseline.name} → {report.name}' table = case_renderer.build_base_table(f'Evaluation Diff: {diff_name}') for baseline_case, new_case in diff_cases: table.add_row(*case_renderer.build_diff_row(new_case, baseline_case)) for case in added_cases: row = case_renderer.build_row(case) row[0] = f'[green]+ Added Case[/]\n{row[0]}' table.add_row(*row) for case in removed_cases: row = case_renderer.build_row(case) row[0] = f'[red]- Removed Case[/]\n{row[0]}' table.add_row(*row) if self.include_averages: # pragma: no branch report_average = ReportCaseAggregate.average(report_cases) baseline_average = ReportCaseAggregate.average(baseline_cases) table.add_row(*case_renderer.build_diff_aggregate_row(report_average, baseline_average)) return table def build_failures_table(self, report: EvaluationReport) -> Table: case_renderer = self._get_case_renderer(report) table = case_renderer.build_failures_table('Case Failures') for case in report.failures: table.add_row(*case_renderer.build_failure_row(case)) return table def _infer_score_renderers( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> dict[str, _NumberRenderer]: all_cases = self._all_cases(report, baseline) values_by_name: dict[str, list[float | int]] = {} for case in all_cases: for k, score in case.scores.items(): values_by_name.setdefault(k, []).append(score.value) all_renderers: dict[str, _NumberRenderer] = {} for name, values in values_by_name.items(): merged_config = _DEFAULT_NUMBER_CONFIG.copy() merged_config.update(self.score_configs.get(name, {})) all_renderers[name] = _NumberRenderer.infer_from_config(merged_config, 'score', 
values) return all_renderers def _infer_label_renderers( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> dict[str, _ValueRenderer]: all_cases = self._all_cases(report, baseline) all_names: set[str] = set() for case in all_cases: for k in case.labels: all_names.add(k) all_renderers: dict[str, _ValueRenderer] = {} for name in all_names: merged_config = _DEFAULT_VALUE_CONFIG.copy() merged_config.update(self.label_configs.get(name, {})) all_renderers[name] = _ValueRenderer.from_config(merged_config) return all_renderers def _infer_metric_renderers( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> dict[str, _NumberRenderer]: all_cases = self._all_cases(report, baseline) values_by_name: dict[str, list[float | int]] = {} for case in all_cases: for k, v in case.metrics.items(): values_by_name.setdefault(k, []).append(v) all_renderers: dict[str, _NumberRenderer] = {} for name, values in values_by_name.items(): merged_config = _DEFAULT_NUMBER_CONFIG.copy() merged_config.update(self.metric_configs.get(name, {})) all_renderers[name] = _NumberRenderer.infer_from_config(merged_config, 'metric', values) return all_renderers def _infer_duration_renderer( self, report: EvaluationReport, baseline: EvaluationReport | None ) -> _NumberRenderer: # pragma: no cover all_cases = self._all_cases(report, baseline) all_durations = [x.task_duration for x in all_cases] if self.include_total_duration: all_durations += [x.total_duration for x in all_cases] return _NumberRenderer.infer_from_config(self.duration_config, 'duration', all_durations) ``` # `pydantic_graph.exceptions` ### GraphSetupError Bases: `TypeError` Error caused by an incorrectly configured graph. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphSetupError(TypeError): """Error caused by an incorrectly configured graph.""" message: str """Description of the mistake.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` Description of the mistake. ### GraphRuntimeError Bases: `RuntimeError` Error caused by an issue during graph execution. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphRuntimeError(RuntimeError): """Error caused by an issue during graph execution.""" message: str """The error message.""" def __init__(self, message: str): self.message = message super().__init__(message) ``` #### message ```python message: str = message ``` The error message. ### GraphNodeStatusError Bases: `GraphRuntimeError` Error caused by trying to run a node that already has status `'running'`, `'success'`, or `'error'`. Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python class GraphNodeStatusError(GraphRuntimeError): """Error caused by trying to run a node that already has status `'running'`, `'success'`, or `'error'`.""" def __init__(self, actual_status: 'SnapshotStatus'): self.actual_status = actual_status super().__init__(f"Incorrect snapshot status {actual_status!r}, must be 'created' or 'pending'.") @classmethod def check(cls, status: 'SnapshotStatus') -> None: """Check if the status is valid.""" if status not in {'created', 'pending'}: raise cls(status) ``` #### check ```python check(status: SnapshotStatus) -> None ``` Check if the status is valid. 
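A minimal sketch of how this guard behaves at runtime (statuses passed as plain strings for illustration):

```python
from pydantic_graph.exceptions import GraphNodeStatusError

GraphNodeStatusError.check('created')  # ok: node has not started running yet
GraphNodeStatusError.check('pending')  # ok
try:
    GraphNodeStatusError.check('running')  # already running -> not allowed
except GraphNodeStatusError as exc:
    print(exc.message)
    #> Incorrect snapshot status 'running', must be 'created' or 'pending'.
```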
Source code in `pydantic_graph/pydantic_graph/exceptions.py` ```python @classmethod def check(cls, status: 'SnapshotStatus') -> None: """Check if the status is valid.""" if status not in {'created', 'pending'}: raise cls(status) ``` # `pydantic_graph` ### Graph Bases: `Generic[StateT, DepsT, RunEndT]` Definition of a graph. In `pydantic-graph`, a graph is a collection of nodes that can be run in sequence. The nodes define their outgoing edges — e.g. which nodes may be run next, and thereby the structure of the graph. Here's a very simple example of a graph which increments a number by 1, but makes sure the number is never 42 at the end. never_42.py ```py from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MyState: number: int @dataclass class Increment(BaseNode[MyState]): async def run(self, ctx: GraphRunContext) -> Check42: ctx.state.number += 1 return Check42() @dataclass class Check42(BaseNode[MyState, None, int]): async def run(self, ctx: GraphRunContext) -> Increment | End[int]: if ctx.state.number == 42: return Increment() else: return End(ctx.state.number) never_42_graph = Graph(nodes=(Increment, Check42)) ``` *(This example is complete, it can be run "as is")* See run For an example of running graph, and mermaid_code for an example of generating a mermaid diagram from the graph. Source code in `pydantic_graph/pydantic_graph/graph.py` ````python @dataclass(init=False) class Graph(Generic[StateT, DepsT, RunEndT]): """Definition of a graph. In `pydantic-graph`, a graph is a collection of nodes that can be run in sequence. The nodes define their outgoing edges — e.g. which nodes may be run next, and thereby the structure of the graph. Here's a very simple example of a graph which increments a number by 1, but makes sure the number is never 42 at the end. ```py {title="never_42.py" noqa="I001"} from __future__ import annotations from dataclasses import dataclass from pydantic_graph import BaseNode, End, Graph, GraphRunContext @dataclass class MyState: number: int @dataclass class Increment(BaseNode[MyState]): async def run(self, ctx: GraphRunContext) -> Check42: ctx.state.number += 1 return Check42() @dataclass class Check42(BaseNode[MyState, None, int]): async def run(self, ctx: GraphRunContext) -> Increment | End[int]: if ctx.state.number == 42: return Increment() else: return End(ctx.state.number) never_42_graph = Graph(nodes=(Increment, Check42)) ``` _(This example is complete, it can be run "as is")_ See [`run`][pydantic_graph.graph.Graph.run] For an example of running graph, and [`mermaid_code`][pydantic_graph.graph.Graph.mermaid_code] for an example of generating a mermaid diagram from the graph. """ name: str | None node_defs: dict[str, NodeDef[StateT, DepsT, RunEndT]] _state_type: type[StateT] | _utils.Unset = field(repr=False) _run_end_type: type[RunEndT] | _utils.Unset = field(repr=False) auto_instrument: bool = field(repr=False) def __init__( self, *, nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]], name: str | None = None, state_type: type[StateT] | _utils.Unset = _utils.UNSET, run_end_type: type[RunEndT] | _utils.Unset = _utils.UNSET, auto_instrument: bool = True, ): """Create a graph from a sequence of nodes. Args: nodes: The nodes which make up the graph, nodes need to be unique and all be generic in the same state type. name: Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. 
state_type: The type of the state for the graph, this can generally be inferred from `nodes`. run_end_type: The type of the result of running the graph, this can generally be inferred from `nodes`. auto_instrument: Whether to create a span for the graph run and the execution of each node's run method. """ self.name = name self._state_type = state_type self._run_end_type = run_end_type self.auto_instrument = auto_instrument parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) self.node_defs = {} for node in nodes: self._register_node(node, parent_namespace) self._validate_edges() async def run( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Run the graph from a starting node until it ends. Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. Returns: A `GraphRunResult` containing information about the run, including its final result. Here's an example of running the graph from [above][pydantic_graph.graph.Graph]: ```py {title="run_never_42.py" noqa="I001" requires="never_42.py"} from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=2) state = MyState(41) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=43) ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) async with self.iter( start_node, state=state, deps=deps, persistence=persistence, infer_name=False ) as graph_run: async for _node in graph_run: pass result = graph_run.result assert result is not None, 'GraphRun should have a result' return result def run_sync( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Synchronously run the graph. This is a convenience method that wraps [`self.run`][pydantic_graph.Graph.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. Returns: The result type from ending the run and the history of the run. 
""" if infer_name and self.name is None: # pragma: no branch self._infer_name(inspect.currentframe()) return _utils.get_event_loop().run_until_complete( self.run(start_node, state=state, deps=deps, persistence=persistence, infer_name=False) ) @asynccontextmanager async def iter( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager which can be used to iterate over the graph's nodes as they are executed. This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds. The `GraphRun` can also be used to manually drive the graph execution by calling [`GraphRun.next`][pydantic_graph.graph.GraphRun.next]. The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed. For more details, see the API documentation of [`GraphRun`][pydantic_graph.graph.GraphRun]. Args: start_node: the first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) if persistence is None: persistence = SimpleStatePersistence() persistence.set_graph_types(self) with ExitStack() as stack: entered_span: AbstractSpan | None = None if span is None: if self.auto_instrument: entered_span = stack.enter_context(logfire_span('run graph {graph.name}', graph=self)) else: entered_span = stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=start_node, persistence=persistence, state=state, deps=deps, traceparent=traceparent, ) @asynccontextmanager async def iter_from_persistence( self, persistence: BaseStatePersistence[StateT, RunEndT], *, deps: DepsT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object. This method has similar functionality to [`iter`][pydantic_graph.graph.Graph.iter], but instead of passing the node to run, it will restore the node and state from state persistence. Args: persistence: The state persistence interface to use. deps: The dependencies of the graph. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. 
""" if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) persistence.set_graph_types(self) snapshot = await persistence.load_next() if snapshot is None: raise exceptions.GraphRuntimeError('Unable to restore snapshot from state persistence.') snapshot.node.set_snapshot_id(snapshot.id) if self.auto_instrument and span is None: # pragma: no branch span = logfire_span('run graph {graph.name}', graph=self) with ExitStack() as stack: entered_span = None if span is None else stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=snapshot.node, persistence=persistence, state=snapshot.state, deps=deps, snapshot_id=snapshot.id, traceparent=traceparent, ) async def initialize( self, node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], *, state: StateT = None, infer_name: bool = True, ) -> None: """Initialize a new graph run in persistence without running it. This is useful if you want to set up a graph run to be run later, e.g. via [`iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence]. Args: node: The node to run first. persistence: State persistence interface. state: The start state of the graph. infer_name: Whether to infer the graph name from the calling frame. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) persistence.set_graph_types(self) await persistence.snapshot_node(state, node) def mermaid_code( self, *, start_node: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, title: str | None | typing_extensions.Literal[False] = None, edge_labels: bool = True, notes: bool = True, highlighted_nodes: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, highlight_css: str = mermaid.DEFAULT_HIGHLIGHT_CSS, infer_name: bool = True, direction: mermaid.StateDiagramDirection | None = None, ) -> str: """Generate a diagram representing the graph as [mermaid](https://mermaid.js.org/) diagram. This method calls [`pydantic_graph.mermaid.generate_code`][pydantic_graph.mermaid.generate_code]. Args: start_node: The node or nodes which can start the graph. title: The title of the diagram, use `False` to not include a title. edge_labels: Whether to include edge labels. notes: Whether to include notes on each node. highlighted_nodes: Optional node or nodes to highlight. highlight_css: The CSS to use for highlighting nodes. infer_name: Whether to infer the graph name from the calling frame. direction: The direction of flow. Returns: The mermaid code for the graph, which can then be rendered as a diagram. 
Here's an example of generating a diagram for the graph from [above][pydantic_graph.graph.Graph]: ```py {title="mermaid_never_42.py" requires="never_42.py"} from never_42 import Increment, never_42_graph print(never_42_graph.mermaid_code(start_node=Increment)) ''' --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ''' ``` The rendered diagram will look like this: ```mermaid --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if title is None and self.name: title = self.name return mermaid.generate_code( self, start_node=start_node, highlighted_nodes=highlighted_nodes, highlight_css=highlight_css, title=title or None, edge_labels=edge_labels, notes=notes, direction=direction, ) def mermaid_image( self, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> bytes: """Generate a diagram representing the graph as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.request_image`. Returns: The image bytes. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name return mermaid.request_image(self, **kwargs) def mermaid_save( self, path: Path | str, /, *, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> None: """Generate a diagram representing the graph and save it as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: path: The path to save the image to. infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.save_image`. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name mermaid.save_image(path, self, **kwargs) def get_nodes(self) -> Sequence[type[BaseNode[StateT, DepsT, RunEndT]]]: """Get the nodes in the graph.""" return [node_def.node for node_def in self.node_defs.values()] @cached_property def inferred_types(self) -> tuple[type[StateT], type[RunEndT]]: # Get the types of the state and run end from the graph. 
if _utils.is_set(self._state_type) and _utils.is_set(self._run_end_type): return self._state_type, self._run_end_type state_type = self._state_type run_end_type = self._run_end_type for node_def in self.node_defs.values(): for base in typing_extensions.get_original_bases(node_def.node): if typing_extensions.get_origin(base) is BaseNode: args = typing_extensions.get_args(base) if not _utils.is_set(state_type) and args: state_type = args[0] if not _utils.is_set(run_end_type) and len(args) == 3: t = args[2] if not typing_objects.is_never(t): run_end_type = t if _utils.is_set(state_type) and _utils.is_set(run_end_type): return state_type, run_end_type # pyright: ignore[reportReturnType] # break the inner (bases) loop break if not _utils.is_set(state_type): # pragma: no branch # state defaults to None, so use that if we can't infer it state_type = None if not _utils.is_set(run_end_type): # this happens if a graph has no return nodes, use None so any downstream errors are clear run_end_type = None return state_type, run_end_type # pyright: ignore[reportReturnType] def _register_node( self, node: type[BaseNode[StateT, DepsT, RunEndT]], parent_namespace: dict[str, Any] | None, ) -> None: node_id = node.get_node_id() if existing_node := self.node_defs.get(node_id): raise exceptions.GraphSetupError( f'Node ID `{node_id}` is not unique — found on {existing_node.node} and {node}' ) else: self.node_defs[node_id] = node.get_node_def(parent_namespace) def _validate_edges(self): known_node_ids = self.node_defs.keys() bad_edges: dict[str, list[str]] = {} for node_id, node_def in self.node_defs.items(): for edge in node_def.next_node_edges.keys(): if edge not in known_node_ids: bad_edges.setdefault(edge, []).append(f'`{node_id}`') if bad_edges: bad_edges_list = [f'`{k}` is referenced by {_utils.comma_and(v)}' for k, v in bad_edges.items()] if len(bad_edges_list) == 1: raise exceptions.GraphSetupError(f'{bad_edges_list[0]} but not included in the graph.') else: b = '\n'.join(f' {be}' for be in bad_edges_list) raise exceptions.GraphSetupError( f'Nodes are referenced in the graph but not included in the graph:\n{b}' ) def _infer_name(self, function_frame: types.FrameType | None) -> None: """Infer the agent name from the call frame. Usage should be `self._infer_name(inspect.currentframe())`. Copied from `Agent`. """ assert self.name is None, 'Name already set' if function_frame is not None and (parent_frame := function_frame.f_back): # pragma: no branch for name, item in parent_frame.f_locals.items(): if item is self: self.name = name return if parent_frame.f_locals != parent_frame.f_globals: # pragma: no branch # if we couldn't find the agent in locals and globals are a different dict, try globals for name, item in parent_frame.f_globals.items(): # pragma: no branch if item is self: self.name = name return ```` #### __init__ ```python __init__( *, nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]], name: str | None = None, state_type: type[StateT] | Unset = UNSET, run_end_type: type[RunEndT] | Unset = UNSET, auto_instrument: bool = True ) ``` Create a graph from a sequence of nodes. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `nodes` | `Sequence[type[BaseNode[StateT, DepsT, RunEndT]]]` | The nodes which make up the graph, nodes need to be unique and all be generic in the same state type. | *required* | | `name` | `str | None` | Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. 
| `None` | | `state_type` | `type[StateT] | Unset` | The type of the state for the graph, this can generally be inferred from nodes. | `UNSET` | | `run_end_type` | `type[RunEndT] | Unset` | The type of the result of running the graph, this can generally be inferred from nodes. | `UNSET` | | `auto_instrument` | `bool` | Whether to create a span for the graph run and the execution of each node's run method. | `True` | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def __init__( self, *, nodes: Sequence[type[BaseNode[StateT, DepsT, RunEndT]]], name: str | None = None, state_type: type[StateT] | _utils.Unset = _utils.UNSET, run_end_type: type[RunEndT] | _utils.Unset = _utils.UNSET, auto_instrument: bool = True, ): """Create a graph from a sequence of nodes. Args: nodes: The nodes which make up the graph, nodes need to be unique and all be generic in the same state type. name: Optional name for the graph, if not provided the name will be inferred from the calling frame on the first call to a graph method. state_type: The type of the state for the graph, this can generally be inferred from `nodes`. run_end_type: The type of the result of running the graph, this can generally be inferred from `nodes`. auto_instrument: Whether to create a span for the graph run and the execution of each node's run method. """ self.name = name self._state_type = state_type self._run_end_type = run_end_type self.auto_instrument = auto_instrument parent_namespace = _utils.get_parent_namespace(inspect.currentframe()) self.node_defs = {} for node in nodes: self._register_node(node, parent_namespace) self._validate_edges() ``` #### run ```python run( start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: ( BaseStatePersistence[StateT, RunEndT] | None ) = None, infer_name: bool = True ) -> GraphRunResult[StateT, RunEndT] ``` Run the graph from a starting node until it ends. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. | *required* | | `state` | `StateT` | The initial state of the graph. | `None` | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `persistence` | `BaseStatePersistence[StateT, RunEndT] | None` | State persistence interface, defaults to SimpleStatePersistence if None. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: | Type | Description | | --- | --- | | `GraphRunResult[StateT, RunEndT]` | A GraphRunResult containing information about the run, including its final result. | Here's an example of running the graph from above: run_never_42.py ```py from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=2) state = MyState(41) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=43) ``` Source code in `pydantic_graph/pydantic_graph/graph.py` ````python async def run( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Run the graph from a starting node until it ends. 
Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. Returns: A `GraphRunResult` containing information about the run, including its final result. Here's an example of running the graph from [above][pydantic_graph.graph.Graph]: ```py {title="run_never_42.py" noqa="I001" requires="never_42.py"} from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=2) state = MyState(41) await never_42_graph.run(Increment(), state=state) print(state) #> MyState(number=43) ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) async with self.iter( start_node, state=state, deps=deps, persistence=persistence, infer_name=False ) as graph_run: async for _node in graph_run: pass result = graph_run.result assert result is not None, 'GraphRun should have a result' return result ```` #### run_sync ```python run_sync( start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: ( BaseStatePersistence[StateT, RunEndT] | None ) = None, infer_name: bool = True ) -> GraphRunResult[StateT, RunEndT] ``` Synchronously run the graph. This is a convenience method that wraps self.run with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. | *required* | | `state` | `StateT` | The initial state of the graph. | `None` | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `persistence` | `BaseStatePersistence[StateT, RunEndT] | None` | State persistence interface, defaults to SimpleStatePersistence if None. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: | Type | Description | | --- | --- | | `GraphRunResult[StateT, RunEndT]` | The result type from ending the run and the history of the run. | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def run_sync( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, infer_name: bool = True, ) -> GraphRunResult[StateT, RunEndT]: """Synchronously run the graph. This is a convenience method that wraps [`self.run`][pydantic_graph.Graph.run] with `loop.run_until_complete(...)`. You therefore can't use this method inside async code or if there's an active event loop. Args: start_node: the first node to run, since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. infer_name: Whether to infer the graph name from the calling frame. 
Returns: The result type from ending the run and the history of the run. """ if infer_name and self.name is None: # pragma: no branch self._infer_name(inspect.currentframe()) return _utils.get_event_loop().run_until_complete( self.run(start_node, state=state, deps=deps, persistence=persistence, infer_name=False) ) ``` #### iter ```python iter( start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: ( BaseStatePersistence[StateT, RunEndT] | None ) = None, span: ( AbstractContextManager[AbstractSpan] | None ) = None, infer_name: bool = True ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]] ``` A contextmanager which can be used to iterate over the graph's nodes as they are executed. This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds. The `GraphRun` can also be used to manually drive the graph execution by calling GraphRun.next. The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed. For more details, see the API documentation of GraphRun. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | the first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. | *required* | | `state` | `StateT` | The initial state of the graph. | `None` | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `persistence` | `BaseStatePersistence[StateT, RunEndT] | None` | State persistence interface, defaults to SimpleStatePersistence if None. | `None` | | `span` | `AbstractContextManager[AbstractSpan] | None` | The span to use for the graph run. If not provided, a new span will be created. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: A GraphRun that can be async iterated over to drive the graph to completion. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python @asynccontextmanager async def iter( self, start_node: BaseNode[StateT, DepsT, RunEndT], *, state: StateT = None, deps: DepsT = None, persistence: BaseStatePersistence[StateT, RunEndT] | None = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager which can be used to iterate over the graph's nodes as they are executed. This method returns a `GraphRun` object which can be used to async-iterate over the nodes of this `Graph` as they are executed. This is the API to use if you want to record or interact with the nodes as the graph execution unfolds. The `GraphRun` can also be used to manually drive the graph execution by calling [`GraphRun.next`][pydantic_graph.graph.GraphRun.next]. The `GraphRun` provides access to the full run history, state, deps, and the final result of the run once it has completed. For more details, see the API documentation of [`GraphRun`][pydantic_graph.graph.GraphRun]. Args: start_node: the first node to run. Since the graph definition doesn't define the entry point in the graph, you need to provide the starting node. state: The initial state of the graph. deps: The dependencies of the graph. 
persistence: State persistence interface, defaults to [`SimpleStatePersistence`][pydantic_graph.SimpleStatePersistence] if `None`. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. """ if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) if persistence is None: persistence = SimpleStatePersistence() persistence.set_graph_types(self) with ExitStack() as stack: entered_span: AbstractSpan | None = None if span is None: if self.auto_instrument: entered_span = stack.enter_context(logfire_span('run graph {graph.name}', graph=self)) else: entered_span = stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=start_node, persistence=persistence, state=state, deps=deps, traceparent=traceparent, ) ``` #### iter_from_persistence ```python iter_from_persistence( persistence: BaseStatePersistence[StateT, RunEndT], *, deps: DepsT = None, span: ( AbstractContextManager[AbstractSpan] | None ) = None, infer_name: bool = True ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]] ``` A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object. This method has similar functionality to iter, but instead of passing the node to run, it will restore the node and state from state persistence. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `persistence` | `BaseStatePersistence[StateT, RunEndT]` | The state persistence interface to use. | *required* | | `deps` | `DepsT` | The dependencies of the graph. | `None` | | `span` | `AbstractContextManager[AbstractSpan] | None` | The span to use for the graph run. If not provided, a new span will be created. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Returns: A GraphRun that can be async iterated over to drive the graph to completion. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python @asynccontextmanager async def iter_from_persistence( self, persistence: BaseStatePersistence[StateT, RunEndT], *, deps: DepsT = None, span: AbstractContextManager[AbstractSpan] | None = None, infer_name: bool = True, ) -> AsyncIterator[GraphRun[StateT, DepsT, RunEndT]]: """A contextmanager to iterate over the graph's nodes as they are executed, created from a persistence object. This method has similar functionality to [`iter`][pydantic_graph.graph.Graph.iter], but instead of passing the node to run, it will restore the node and state from state persistence. Args: persistence: The state persistence interface to use. deps: The dependencies of the graph. span: The span to use for the graph run. If not provided, a new span will be created. infer_name: Whether to infer the graph name from the calling frame. Returns: A GraphRun that can be async iterated over to drive the graph to completion. 
""" if infer_name and self.name is None: # f_back because `asynccontextmanager` adds one frame if frame := inspect.currentframe(): # pragma: no branch self._infer_name(frame.f_back) persistence.set_graph_types(self) snapshot = await persistence.load_next() if snapshot is None: raise exceptions.GraphRuntimeError('Unable to restore snapshot from state persistence.') snapshot.node.set_snapshot_id(snapshot.id) if self.auto_instrument and span is None: # pragma: no branch span = logfire_span('run graph {graph.name}', graph=self) with ExitStack() as stack: entered_span = None if span is None else stack.enter_context(span) traceparent = None if entered_span is None else get_traceparent(entered_span) yield GraphRun[StateT, DepsT, RunEndT]( graph=self, start_node=snapshot.node, persistence=persistence, state=snapshot.state, deps=deps, snapshot_id=snapshot.id, traceparent=traceparent, ) ``` #### initialize ```python initialize( node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], *, state: StateT = None, infer_name: bool = True ) -> None ``` Initialize a new graph run in persistence without running it. This is useful if you want to set up a graph run to be run later, e.g. via iter_from_persistence. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `node` | `BaseNode[StateT, DepsT, RunEndT]` | The node to run first. | *required* | | `persistence` | `BaseStatePersistence[StateT, RunEndT]` | State persistence interface. | *required* | | `state` | `StateT` | The start state of the graph. | `None` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python async def initialize( self, node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], *, state: StateT = None, infer_name: bool = True, ) -> None: """Initialize a new graph run in persistence without running it. This is useful if you want to set up a graph run to be run later, e.g. via [`iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence]. Args: node: The node to run first. persistence: State persistence interface. state: The start state of the graph. infer_name: Whether to infer the graph name from the calling frame. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) persistence.set_graph_types(self) await persistence.snapshot_node(state, node) ``` #### mermaid_code ```python mermaid_code( *, start_node: ( Sequence[NodeIdent] | NodeIdent | None ) = None, title: str | None | Literal[False] = None, edge_labels: bool = True, notes: bool = True, highlighted_nodes: ( Sequence[NodeIdent] | NodeIdent | None ) = None, highlight_css: str = DEFAULT_HIGHLIGHT_CSS, infer_name: bool = True, direction: StateDiagramDirection | None = None ) -> str ``` Generate a diagram representing the graph as [mermaid](https://mermaid.js.org/) diagram. This method calls pydantic_graph.mermaid.generate_code. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `start_node` | `Sequence[NodeIdent] | NodeIdent | None` | The node or nodes which can start the graph. | `None` | | `title` | `str | None | Literal[False]` | The title of the diagram, use False to not include a title. | `None` | | `edge_labels` | `bool` | Whether to include edge labels. | `True` | | `notes` | `bool` | Whether to include notes on each node. 
| `True` | | `highlighted_nodes` | `Sequence[NodeIdent] | NodeIdent | None` | Optional node or nodes to highlight. | `None` | | `highlight_css` | `str` | The CSS to use for highlighting nodes. | `DEFAULT_HIGHLIGHT_CSS` | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | | `direction` | `StateDiagramDirection | None` | The direction of flow. | `None` | Returns: | Type | Description | | --- | --- | | `str` | The mermaid code for the graph, which can then be rendered as a diagram. | Here's an example of generating a diagram for the graph from above: mermaid_never_42.py ```py from never_42 import Increment, never_42_graph print(never_42_graph.mermaid_code(start_node=Increment)) ''' --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ''' ``` The rendered diagram will look like this: ``` --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ``` Source code in `pydantic_graph/pydantic_graph/graph.py` ````python def mermaid_code( self, *, start_node: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, title: str | None | typing_extensions.Literal[False] = None, edge_labels: bool = True, notes: bool = True, highlighted_nodes: Sequence[mermaid.NodeIdent] | mermaid.NodeIdent | None = None, highlight_css: str = mermaid.DEFAULT_HIGHLIGHT_CSS, infer_name: bool = True, direction: mermaid.StateDiagramDirection | None = None, ) -> str: """Generate a diagram representing the graph as [mermaid](https://mermaid.js.org/) diagram. This method calls [`pydantic_graph.mermaid.generate_code`][pydantic_graph.mermaid.generate_code]. Args: start_node: The node or nodes which can start the graph. title: The title of the diagram, use `False` to not include a title. edge_labels: Whether to include edge labels. notes: Whether to include notes on each node. highlighted_nodes: Optional node or nodes to highlight. highlight_css: The CSS to use for highlighting nodes. infer_name: Whether to infer the graph name from the calling frame. direction: The direction of flow. Returns: The mermaid code for the graph, which can then be rendered as a diagram. Here's an example of generating a diagram for the graph from [above][pydantic_graph.graph.Graph]: ```py {title="mermaid_never_42.py" requires="never_42.py"} from never_42 import Increment, never_42_graph print(never_42_graph.mermaid_code(start_node=Increment)) ''' --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ''' ``` The rendered diagram will look like this: ```mermaid --- title: never_42_graph --- stateDiagram-v2 [*] --> Increment Increment --> Check42 Check42 --> Increment Check42 --> [*] ``` """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if title is None and self.name: title = self.name return mermaid.generate_code( self, start_node=start_node, highlighted_nodes=highlighted_nodes, highlight_css=highlight_css, title=title or None, edge_labels=edge_labels, notes=notes, direction=direction, ) ```` #### mermaid_image ```python mermaid_image( infer_name: bool = True, **kwargs: Unpack[MermaidConfig] ) -> bytes ``` Generate a diagram representing the graph as an image. The format and diagram can be customized using `kwargs`, see pydantic_graph.mermaid.MermaidConfig. 
Uses external service This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | | `**kwargs` | `Unpack[MermaidConfig]` | Additional arguments to pass to mermaid.request_image. | `{}` | Returns: | Type | Description | | --- | --- | | `bytes` | The image bytes. | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def mermaid_image( self, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> bytes: """Generate a diagram representing the graph as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.request_image`. Returns: The image bytes. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name return mermaid.request_image(self, **kwargs) ``` #### mermaid_save ```python mermaid_save( path: Path | str, /, *, infer_name: bool = True, **kwargs: Unpack[MermaidConfig], ) -> None ``` Generate a diagram representing the graph and save it as an image. The format and diagram can be customized using `kwargs`, see pydantic_graph.mermaid.MermaidConfig. Uses external service This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `path` | `Path | str` | The path to save the image to. | *required* | | `infer_name` | `bool` | Whether to infer the graph name from the calling frame. | `True` | | `**kwargs` | `Unpack[MermaidConfig]` | Additional arguments to pass to mermaid.save_image. | `{}` | Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def mermaid_save( self, path: Path | str, /, *, infer_name: bool = True, **kwargs: typing_extensions.Unpack[mermaid.MermaidConfig] ) -> None: """Generate a diagram representing the graph and save it as an image. The format and diagram can be customized using `kwargs`, see [`pydantic_graph.mermaid.MermaidConfig`][pydantic_graph.mermaid.MermaidConfig]. !!! note "Uses external service" This method makes a request to [mermaid.ink](https://mermaid.ink) to render the image, `mermaid.ink` is a free service not affiliated with Pydantic. Args: path: The path to save the image to. infer_name: Whether to infer the graph name from the calling frame. **kwargs: Additional arguments to pass to `mermaid.save_image`. """ if infer_name and self.name is None: self._infer_name(inspect.currentframe()) if 'title' not in kwargs and self.name: kwargs['title'] = self.name mermaid.save_image(path, self, **kwargs) ``` #### get_nodes ```python get_nodes() -> ( Sequence[type[BaseNode[StateT, DepsT, RunEndT]]] ) ``` Get the nodes in the graph. 
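As a minimal sketch, reusing the `never_42.py` example from above, you can list the registered node classes by their IDs (both `never_42_graph` and `get_node_id` are taken from the examples and source shown earlier on this page):

```py
from never_42 import never_42_graph

# each entry is a node class registered on the graph, e.g. Increment and Check42
for node in never_42_graph.get_nodes():
    print(node.get_node_id())
```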
Source code in `pydantic_graph/pydantic_graph/graph.py` ```python def get_nodes(self) -> Sequence[type[BaseNode[StateT, DepsT, RunEndT]]]: """Get the nodes in the graph.""" return [node_def.node for node_def in self.node_defs.values()] ``` ### GraphRun Bases: `Generic[StateT, DepsT, RunEndT]` A stateful, async-iterable run of a Graph. You typically get a `GraphRun` instance from calling `async with [my_graph.iter(...)][pydantic_graph.graph.Graph.iter] as graph_run:`. That gives you the ability to iterate through nodes as they run, either by `async for` iteration or by repeatedly calling `.next(...)`. Here's an example of iterating over the graph from above: iter_never_42.py ```py from copy import deepcopy from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=1)), (Increment(), MyState(number=1)), (Check42(), MyState(number=2)), (End(data=2), MyState(number=2)), ] ''' state = MyState(41) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=41)), (Increment(), MyState(number=41)), (Check42(), MyState(number=42)), (Increment(), MyState(number=42)), (Check42(), MyState(number=43)), (End(data=43), MyState(number=43)), ] ''' ``` See the GraphRun.next documentation for an example of how to manually drive the graph run. Source code in `pydantic_graph/pydantic_graph/graph.py` ````python class GraphRun(Generic[StateT, DepsT, RunEndT]): """A stateful, async-iterable run of a [`Graph`][pydantic_graph.graph.Graph]. You typically get a `GraphRun` instance from calling `async with [my_graph.iter(...)][pydantic_graph.graph.Graph.iter] as graph_run:`. That gives you the ability to iterate through nodes as they run, either by `async for` iteration or by repeatedly calling `.next(...)`. Here's an example of iterating over the graph from [above][pydantic_graph.graph.Graph]: ```py {title="iter_never_42.py" noqa="I001" requires="never_42.py"} from copy import deepcopy from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(1) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=1)), (Increment(), MyState(number=1)), (Check42(), MyState(number=2)), (End(data=2), MyState(number=2)), ] ''' state = MyState(41) async with never_42_graph.iter(Increment(), state=state) as graph_run: node_states = [(graph_run.next_node, deepcopy(graph_run.state))] async for node in graph_run: node_states.append((node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=41)), (Increment(), MyState(number=41)), (Check42(), MyState(number=42)), (Increment(), MyState(number=42)), (Check42(), MyState(number=43)), (End(data=43), MyState(number=43)), ] ''' ``` See the [`GraphRun.next` documentation][pydantic_graph.graph.GraphRun.next] for an example of how to manually drive the graph run. 
""" def __init__( self, *, graph: Graph[StateT, DepsT, RunEndT], start_node: BaseNode[StateT, DepsT, RunEndT], persistence: BaseStatePersistence[StateT, RunEndT], state: StateT, deps: DepsT, traceparent: str | None, snapshot_id: str | None = None, ): """Create a new run for a given graph, starting at the specified node. Typically, you'll use [`Graph.iter`][pydantic_graph.graph.Graph.iter] rather than calling this directly. Args: graph: The [`Graph`][pydantic_graph.graph.Graph] to run. start_node: The node where execution will begin. persistence: State persistence interface. state: A shared state object or primitive (like a counter, dataclass, etc.) that is available to all nodes via `ctx.state`. deps: Optional dependencies that each node can access via `ctx.deps`, e.g. database connections, configuration, or logging clients. traceparent: The traceparent for the span used for the graph run. snapshot_id: The ID of the snapshot the node came from. """ self.graph = graph self.persistence = persistence self._snapshot_id: str | None = snapshot_id self.state = state self.deps = deps self.__traceparent = traceparent self._next_node: BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] = start_node self._is_started: bool = False @overload def _traceparent(self, *, required: typing_extensions.Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: if self.__traceparent is None and required: # pragma: no cover raise exceptions.GraphRuntimeError('No span was created for this graph run') return self.__traceparent @property def next_node(self) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """The next node that will be run in the graph. This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`. """ return self._next_node @property def result(self) -> GraphRunResult[StateT, RunEndT] | None: """The final result of the graph run if the run is completed, otherwise `None`.""" if not isinstance(self._next_node, End): return None # The GraphRun has not finished running return GraphRunResult[StateT, RunEndT]( self._next_node.data, state=self.state, persistence=self.persistence, traceparent=self._traceparent(required=False), ) async def next( self, node: BaseNode[StateT, DepsT, RunEndT] | None = None ) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Manually drive the graph run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an [`End`][pydantic_graph.nodes.End] node. 
Here's an example of using `next` to drive the graph from [above][pydantic_graph.graph.Graph]: ```py {title="next_never_42.py" noqa="I001" requires="never_42.py"} from copy import deepcopy from pydantic_graph import End from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(48) async with never_42_graph.iter(Increment(), state=state) as graph_run: next_node = graph_run.next_node # start with the first node node_states = [(next_node, deepcopy(graph_run.state))] while not isinstance(next_node, End): if graph_run.state.number == 50: graph_run.state.number = 42 next_node = await graph_run.next(next_node) node_states.append((next_node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=48)), (Check42(), MyState(number=49)), (End(data=49), MyState(number=49)), ] ''' ``` Args: node: The node to run next in the graph. If not specified, uses `self.next_node`, which is initialized to the `start_node` of the run and updated each time a new node is returned. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ if node is None: # This cast is necessary because self._next_node could be an `End`. You'll get a runtime error if that's # the case, but if it is, the only way to get there would be to have tried calling next manually after # the run finished. Either way, maybe it would be better to not do this cast... node = cast(BaseNode[StateT, DepsT, RunEndT], self._next_node) node_snapshot_id = node.get_snapshot_id() else: node_snapshot_id = node.get_snapshot_id() if node_snapshot_id != self._snapshot_id: await self.persistence.snapshot_node_if_new(node_snapshot_id, self.state, node) self._snapshot_id = node_snapshot_id if not isinstance(node, BaseNode): # While technically this is not compatible with the documented method signature, it's an easy mistake to # make, and we should eagerly provide a more helpful error message than you'd get otherwise. raise TypeError(f'`next` must be called with a `BaseNode` instance, got {node!r}.') node_id = node.get_node_id() if node_id not in self.graph.node_defs: raise exceptions.GraphRuntimeError(f'Node `{node}` is not in the graph.') with ExitStack() as stack: if self.graph.auto_instrument: stack.enter_context(logfire_span('run node {node_id}', node_id=node_id, node=node)) async with self.persistence.record_run(node_snapshot_id): ctx = GraphRunContext(state=self.state, deps=self.deps) self._next_node = await node.run(ctx) if isinstance(self._next_node, End): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_end(self.state, self._next_node) elif isinstance(self._next_node, BaseNode): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_node(self.state, self._next_node) else: raise exceptions.GraphRuntimeError( f'Invalid node return type: `{type(self._next_node).__name__}`. Expected `BaseNode` or `End`.' 
            )

        return self._next_node

    def __aiter__(self) -> AsyncIterator[BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]]:
        return self

    async def __anext__(self) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]:
        """Use the last returned node as the input to `Graph.next`."""
        if not self._is_started:
            self._is_started = True
            return self._next_node

        if isinstance(self._next_node, End):
            raise StopAsyncIteration

        return await self.next(self._next_node)

    def __repr__(self) -> str:
        return f'<GraphRun graph={self.graph.name or "[unnamed]"}>'
````

#### __init__

```python
__init__(
    *,
    graph: Graph[StateT, DepsT, RunEndT],
    start_node: BaseNode[StateT, DepsT, RunEndT],
    persistence: BaseStatePersistence[StateT, RunEndT],
    state: StateT,
    deps: DepsT,
    traceparent: str | None,
    snapshot_id: str | None = None
)
```

Create a new run for a given graph, starting at the specified node.

Typically, you'll use Graph.iter rather than calling this directly.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `graph` | `Graph[StateT, DepsT, RunEndT]` | The Graph to run. | *required* |
| `start_node` | `BaseNode[StateT, DepsT, RunEndT]` | The node where execution will begin. | *required* |
| `persistence` | `BaseStatePersistence[StateT, RunEndT]` | State persistence interface. | *required* |
| `state` | `StateT` | A shared state object or primitive (like a counter, dataclass, etc.) that is available to all nodes via ctx.state. | *required* |
| `deps` | `DepsT` | Optional dependencies that each node can access via ctx.deps, e.g. database connections, configuration, or logging clients. | *required* |
| `traceparent` | `str | None` | The traceparent for the span used for the graph run. | *required* |
| `snapshot_id` | `str | None` | The ID of the snapshot the node came from. | `None` |

Source code in `pydantic_graph/pydantic_graph/graph.py`

```python
def __init__(
    self,
    *,
    graph: Graph[StateT, DepsT, RunEndT],
    start_node: BaseNode[StateT, DepsT, RunEndT],
    persistence: BaseStatePersistence[StateT, RunEndT],
    state: StateT,
    deps: DepsT,
    traceparent: str | None,
    snapshot_id: str | None = None,
):
    """Create a new run for a given graph, starting at the specified node.

    Typically, you'll use [`Graph.iter`][pydantic_graph.graph.Graph.iter] rather than calling this directly.

    Args:
        graph: The [`Graph`][pydantic_graph.graph.Graph] to run.
        start_node: The node where execution will begin.
        persistence: State persistence interface.
        state: A shared state object or primitive (like a counter, dataclass, etc.) that is available
            to all nodes via `ctx.state`.
        deps: Optional dependencies that each node can access via `ctx.deps`, e.g. database connections,
            configuration, or logging clients.
        traceparent: The traceparent for the span used for the graph run.
        snapshot_id: The ID of the snapshot the node came from.
    """
    self.graph = graph
    self.persistence = persistence
    self._snapshot_id: str | None = snapshot_id
    self.state = state
    self.deps = deps
    self.__traceparent = traceparent
    self._next_node: BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] = start_node
    self._is_started: bool = False
```

#### next_node

```python
next_node: BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]
```

The next node that will be run in the graph.

This is the next node that will be used during async iteration, or if a node is not passed to `self.next(...)`.

#### result

```python
result: GraphRunResult[StateT, RunEndT] | None
```

The final result of the graph run if the run is completed, otherwise `None`.
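As a minimal sketch, reusing the `never_42.py` example from above, `result` stays `None` while the run is in progress and is populated once iteration reaches an `End` node (the expected output of `2` follows from the earlier `run_never_42.py` example):

```py
from never_42 import Increment, MyState, never_42_graph


async def main():
    state = MyState(1)
    async with never_42_graph.iter(Increment(), state=state) as graph_run:
        assert graph_run.result is None  # the run hasn't finished yet
        async for _node in graph_run:
            pass
        assert graph_run.result is not None  # the run reached an End node
        print(graph_run.result.output)
        #> 2
```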
#### next ```python next( node: BaseNode[StateT, DepsT, RunEndT] | None = None, ) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] ``` Manually drive the graph run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an End node. Here's an example of using `next` to drive the graph from above: next_never_42.py ```py from copy import deepcopy from pydantic_graph import End from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(48) async with never_42_graph.iter(Increment(), state=state) as graph_run: next_node = graph_run.next_node # start with the first node node_states = [(next_node, deepcopy(graph_run.state))] while not isinstance(next_node, End): if graph_run.state.number == 50: graph_run.state.number = 42 next_node = await graph_run.next(next_node) node_states.append((next_node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=48)), (Check42(), MyState(number=49)), (End(data=49), MyState(number=49)), ] ''' ``` Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `node` | `BaseNode[StateT, DepsT, RunEndT] | None` | The node to run next in the graph. If not specified, uses self.next_node, which is initialized to the start_node of the run and updated each time a new node is returned. | `None` | Returns: | Type | Description | | --- | --- | | `BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]` | The next node returned by the graph logic, or an End node if | | `BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]` | the run has completed. | Source code in `pydantic_graph/pydantic_graph/graph.py` ````python async def next( self, node: BaseNode[StateT, DepsT, RunEndT] | None = None ) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Manually drive the graph run by passing in the node you want to run next. This lets you inspect or mutate the node before continuing execution, or skip certain nodes under dynamic conditions. The graph run should stop when you return an [`End`][pydantic_graph.nodes.End] node. Here's an example of using `next` to drive the graph from [above][pydantic_graph.graph.Graph]: ```py {title="next_never_42.py" noqa="I001" requires="never_42.py"} from copy import deepcopy from pydantic_graph import End from never_42 import Increment, MyState, never_42_graph async def main(): state = MyState(48) async with never_42_graph.iter(Increment(), state=state) as graph_run: next_node = graph_run.next_node # start with the first node node_states = [(next_node, deepcopy(graph_run.state))] while not isinstance(next_node, End): if graph_run.state.number == 50: graph_run.state.number = 42 next_node = await graph_run.next(next_node) node_states.append((next_node, deepcopy(graph_run.state))) print(node_states) ''' [ (Increment(), MyState(number=48)), (Check42(), MyState(number=49)), (End(data=49), MyState(number=49)), ] ''' ``` Args: node: The node to run next in the graph. If not specified, uses `self.next_node`, which is initialized to the `start_node` of the run and updated each time a new node is returned. Returns: The next node returned by the graph logic, or an [`End`][pydantic_graph.nodes.End] node if the run has completed. """ if node is None: # This cast is necessary because self._next_node could be an `End`. 
You'll get a runtime error if that's # the case, but if it is, the only way to get there would be to have tried calling next manually after # the run finished. Either way, maybe it would be better to not do this cast... node = cast(BaseNode[StateT, DepsT, RunEndT], self._next_node) node_snapshot_id = node.get_snapshot_id() else: node_snapshot_id = node.get_snapshot_id() if node_snapshot_id != self._snapshot_id: await self.persistence.snapshot_node_if_new(node_snapshot_id, self.state, node) self._snapshot_id = node_snapshot_id if not isinstance(node, BaseNode): # While technically this is not compatible with the documented method signature, it's an easy mistake to # make, and we should eagerly provide a more helpful error message than you'd get otherwise. raise TypeError(f'`next` must be called with a `BaseNode` instance, got {node!r}.') node_id = node.get_node_id() if node_id not in self.graph.node_defs: raise exceptions.GraphRuntimeError(f'Node `{node}` is not in the graph.') with ExitStack() as stack: if self.graph.auto_instrument: stack.enter_context(logfire_span('run node {node_id}', node_id=node_id, node=node)) async with self.persistence.record_run(node_snapshot_id): ctx = GraphRunContext(state=self.state, deps=self.deps) self._next_node = await node.run(ctx) if isinstance(self._next_node, End): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_end(self.state, self._next_node) elif isinstance(self._next_node, BaseNode): self._snapshot_id = self._next_node.get_snapshot_id() await self.persistence.snapshot_node(self.state, self._next_node) else: raise exceptions.GraphRuntimeError( f'Invalid node return type: `{type(self._next_node).__name__}`. Expected `BaseNode` or `End`.' ) return self._next_node ```` #### __anext__ ```python __anext__() -> ( BaseNode[StateT, DepsT, RunEndT] | End[RunEndT] ) ``` Use the last returned node as the input to `Graph.next`. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python async def __anext__(self) -> BaseNode[StateT, DepsT, RunEndT] | End[RunEndT]: """Use the last returned node as the input to `Graph.next`.""" if not self._is_started: self._is_started = True return self._next_node if isinstance(self._next_node, End): raise StopAsyncIteration return await self.next(self._next_node) ``` ### GraphRunResult Bases: `Generic[StateT, RunEndT]` The final result of running a graph. Source code in `pydantic_graph/pydantic_graph/graph.py` ```python @dataclass(init=False) class GraphRunResult(Generic[StateT, RunEndT]): """The final result of running a graph.""" output: RunEndT state: StateT persistence: BaseStatePersistence[StateT, RunEndT] = field(repr=False) def __init__( self, output: RunEndT, state: StateT, persistence: BaseStatePersistence[StateT, RunEndT], traceparent: str | None = None, ): self.output = output self.state = state self.persistence = persistence self.__traceparent = traceparent @overload def _traceparent(self, *, required: typing_extensions.Literal[False]) -> str | None: ... @overload def _traceparent(self) -> str: ... def _traceparent(self, *, required: bool = True) -> str | None: # pragma: no cover if self.__traceparent is None and required: raise exceptions.GraphRuntimeError('No span was created for this graph run.') return self.__traceparent ``` # `pydantic_graph.mermaid` ### DEFAULT_HIGHLIGHT_CSS ```python DEFAULT_HIGHLIGHT_CSS = 'fill:#fdff32' ``` The default CSS to use for highlighting nodes. 
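This is the CSS applied when you highlight nodes without passing `highlight_css` yourself. As a minimal sketch, reusing the `never_42.py` example from above, highlighting a node appends a `classDef`/`class` pair using this value to the generated diagram code:

```py
from never_42 import Increment, never_42_graph

# the generated code ends with lines like:
#   classDef highlighted fill:#fdff32
#   class Increment highlighted
print(never_42_graph.mermaid_code(start_node=Increment, highlighted_nodes=Increment))
```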
### StateDiagramDirection ```python StateDiagramDirection = Literal['TB', 'LR', 'RL', 'BT'] ``` Used to specify the direction of the state diagram generated by mermaid. - `'TB'`: Top to bottom, this is the default for mermaid charts. - `'LR'`: Left to right - `'RL'`: Right to left - `'BT'`: Bottom to top ### generate_code ```python generate_code( graph: Graph[Any, Any, Any], /, *, start_node: ( Sequence[NodeIdent] | NodeIdent | None ) = None, highlighted_nodes: ( Sequence[NodeIdent] | NodeIdent | None ) = None, highlight_css: str = DEFAULT_HIGHLIGHT_CSS, title: str | None = None, edge_labels: bool = True, notes: bool = True, direction: StateDiagramDirection | None, ) -> str ``` Generate [Mermaid state diagram](https://mermaid.js.org/syntax/stateDiagram.html) code for a graph. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `graph` | `Graph[Any, Any, Any]` | The graph to generate the image for. | *required* | | `start_node` | `Sequence[NodeIdent] | NodeIdent | None` | Identifiers of nodes that start the graph. | `None` | | `highlighted_nodes` | `Sequence[NodeIdent] | NodeIdent | None` | Identifiers of nodes to highlight. | `None` | | `highlight_css` | `str` | CSS to use for highlighting nodes. | `DEFAULT_HIGHLIGHT_CSS` | | `title` | `str | None` | The title of the diagram. | `None` | | `edge_labels` | `bool` | Whether to include edge labels in the diagram. | `True` | | `notes` | `bool` | Whether to include notes in the diagram. | `True` | | `direction` | `StateDiagramDirection | None` | The direction of flow. | *required* | Returns: | Type | Description | | --- | --- | | `str` | The Mermaid code for the graph. | Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python def generate_code( # noqa: C901 graph: Graph[Any, Any, Any], /, *, start_node: Sequence[NodeIdent] | NodeIdent | None = None, highlighted_nodes: Sequence[NodeIdent] | NodeIdent | None = None, highlight_css: str = DEFAULT_HIGHLIGHT_CSS, title: str | None = None, edge_labels: bool = True, notes: bool = True, direction: StateDiagramDirection | None, ) -> str: """Generate [Mermaid state diagram](https://mermaid.js.org/syntax/stateDiagram.html) code for a graph. Args: graph: The graph to generate the image for. start_node: Identifiers of nodes that start the graph. highlighted_nodes: Identifiers of nodes to highlight. highlight_css: CSS to use for highlighting nodes. title: The title of the diagram. edge_labels: Whether to include edge labels in the diagram. notes: Whether to include notes in the diagram. direction: The direction of flow. Returns: The Mermaid code for the graph. 
""" start_node_ids = set(_node_ids(start_node or ())) for node_id in start_node_ids: if node_id not in graph.node_defs: raise LookupError(f'Start node "{node_id}" is not in the graph.') lines: list[str] = [] if title: lines = ['---', f'title: {title}', '---'] lines.append('stateDiagram-v2') if direction is not None: lines.append(f' direction {direction}') for node_id, node_def in graph.node_defs.items(): # we use round brackets (rounded box) for nodes other than the start and end if node_id in start_node_ids: lines.append(f' [*] --> {node_id}') if node_def.returns_base_node: for next_node_id in graph.node_defs: lines.append(f' {node_id} --> {next_node_id}') else: for next_node_id, edge in node_def.next_node_edges.items(): line = f' {node_id} --> {next_node_id}' if edge_labels and edge.label: line += f': {edge.label}' lines.append(line) if end_edge := node_def.end_edge: line = f' {node_id} --> [*]' if edge_labels and end_edge.label: line += f': {end_edge.label}' lines.append(line) if notes and node_def.note: lines.append(f' note right of {node_id}') # mermaid doesn't like multiple paragraphs in a note, and shows if so clean_docs = re.sub('\n{2,}', '\n', node_def.note) lines.append(indent(clean_docs, ' ')) lines.append(' end note') if highlighted_nodes: lines.append('') lines.append(f'classDef highlighted {highlight_css}') for node_id in _node_ids(highlighted_nodes): if node_id not in graph.node_defs: raise LookupError(f'Highlighted node "{node_id}" is not in the graph.') lines.append(f'class {node_id} highlighted') return '\n'.join(lines) ``` ### request_image ```python request_image( graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> bytes ``` Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink). Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `graph` | `Graph[Any, Any, Any]` | The graph to generate the image for. | *required* | | `**kwargs` | `Unpack[MermaidConfig]` | Additional parameters to configure mermaid chart generation. | `{}` | Returns: | Type | Description | | --- | --- | | `bytes` | The image data. | Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python def request_image( graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> bytes: """Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink). Args: graph: The graph to generate the image for. **kwargs: Additional parameters to configure mermaid chart generation. Returns: The image data. 
""" code = generate_code( graph, start_node=kwargs.get('start_node'), highlighted_nodes=kwargs.get('highlighted_nodes'), highlight_css=kwargs.get('highlight_css', DEFAULT_HIGHLIGHT_CSS), title=kwargs.get('title'), edge_labels=kwargs.get('edge_labels', True), notes=kwargs.get('notes', True), direction=kwargs.get('direction'), ) code_base64 = base64.b64encode(code.encode()).decode() params: dict[str, str | float] = {} if kwargs.get('image_type') == 'pdf': url = f'https://mermaid.ink/pdf/{code_base64}' if kwargs.get('pdf_fit'): params['fit'] = '' if kwargs.get('pdf_landscape'): params['landscape'] = '' if pdf_paper := kwargs.get('pdf_paper'): params['paper'] = pdf_paper elif kwargs.get('image_type') == 'svg': url = f'https://mermaid.ink/svg/{code_base64}' else: url = f'https://mermaid.ink/img/{code_base64}' if image_type := kwargs.get('image_type'): params['type'] = image_type if background_color := kwargs.get('background_color'): params['bgColor'] = background_color if theme := kwargs.get('theme'): params['theme'] = theme if width := kwargs.get('width'): params['width'] = width if height := kwargs.get('height'): params['height'] = height if scale := kwargs.get('scale'): params['scale'] = scale httpx_client = kwargs.get('httpx_client') or httpx.Client() response = httpx_client.get(url, params=params) if not response.is_success: raise httpx.HTTPStatusError( f'{response.status_code} error generating image:\n{response.text}', request=response.request, response=response, ) return response.content ``` ### save_image ```python save_image( path: Path | str, graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> None ``` Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink) and save it to a local file. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `path` | `Path | str` | The path to save the image to. | *required* | | `graph` | `Graph[Any, Any, Any]` | The graph to generate the image for. | *required* | | `**kwargs` | `Unpack[MermaidConfig]` | Additional parameters to configure mermaid chart generation. | `{}` | Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python def save_image( path: Path | str, graph: Graph[Any, Any, Any], /, **kwargs: Unpack[MermaidConfig], ) -> None: """Generate an image of a Mermaid diagram using [mermaid.ink](https://mermaid.ink) and save it to a local file. Args: path: The path to save the image to. graph: The graph to generate the image for. **kwargs: Additional parameters to configure mermaid chart generation. """ if isinstance(path, str): path = Path(path) if 'image_type' not in kwargs: ext = path.suffix.lower()[1:] # no need to check for .jpeg/.jpg, as it is the default if ext in ('png', 'webp', 'svg', 'pdf'): kwargs['image_type'] = ext image_data = request_image(graph, **kwargs) path.write_bytes(image_data) ``` ### MermaidConfig Bases: `TypedDict` Parameters to configure mermaid chart generation. 
Source code in `pydantic_graph/pydantic_graph/mermaid.py` ```python class MermaidConfig(TypedDict, total=False): """Parameters to configure mermaid chart generation.""" start_node: Sequence[NodeIdent] | NodeIdent """Identifiers of nodes that start the graph.""" highlighted_nodes: Sequence[NodeIdent] | NodeIdent """Identifiers of nodes to highlight.""" highlight_css: str """CSS to use for highlighting nodes.""" title: str | None """The title of the diagram.""" edge_labels: bool """Whether to include edge labels in the diagram.""" notes: bool """Whether to include notes on nodes in the diagram, defaults to true.""" image_type: Literal['jpeg', 'png', 'webp', 'svg', 'pdf'] """The image type to generate. If unspecified, the default behavior is `'jpeg'`.""" pdf_fit: bool """When using image_type='pdf', whether to fit the diagram to the PDF page.""" pdf_landscape: bool """When using image_type='pdf', whether to use landscape orientation for the PDF. This has no effect if using `pdf_fit`. """ pdf_paper: Literal['letter', 'legal', 'tabloid', 'ledger', 'a0', 'a1', 'a2', 'a3', 'a4', 'a5', 'a6'] """When using image_type='pdf', the paper size of the PDF.""" background_color: str """The background color of the diagram. If None, the default transparent background is used. The color value is interpreted as a hexadecimal color code by default (and should not have a leading '#'), but you can also use named colors by prefixing the value with `'!'`. For example, valid choices include `background_color='!white'` or `background_color='FF0000'`. """ theme: Literal['default', 'neutral', 'dark', 'forest'] """The theme of the diagram. Defaults to 'default'.""" width: int """The width of the diagram.""" height: int """The height of the diagram.""" scale: Annotated[float, Ge(1), Le(3)] """The scale of the diagram. The scale must be a number between 1 and 3, and you can only set a scale if one or both of width and height are set. """ httpx_client: httpx.Client """An HTTPX client to use for requests, mostly for testing purposes.""" direction: StateDiagramDirection """The direction of the state diagram.""" ``` #### start_node ```python start_node: Sequence[NodeIdent] | NodeIdent ``` Identifiers of nodes that start the graph. #### highlighted_nodes ```python highlighted_nodes: Sequence[NodeIdent] | NodeIdent ``` Identifiers of nodes to highlight. #### highlight_css ```python highlight_css: str ``` CSS to use for highlighting nodes. #### title ```python title: str | None ``` The title of the diagram. #### edge_labels ```python edge_labels: bool ``` Whether to include edge labels in the diagram. #### notes ```python notes: bool ``` Whether to include notes on nodes in the diagram, defaults to true. #### image_type ```python image_type: Literal['jpeg', 'png', 'webp', 'svg', 'pdf'] ``` The image type to generate. If unspecified, the default behavior is `'jpeg'`. #### pdf_fit ```python pdf_fit: bool ``` When using image_type='pdf', whether to fit the diagram to the PDF page. #### pdf_landscape ```python pdf_landscape: bool ``` When using image_type='pdf', whether to use landscape orientation for the PDF. This has no effect if using `pdf_fit`. #### pdf_paper ```python pdf_paper: Literal[ "letter", "legal", "tabloid", "ledger", "a0", "a1", "a2", "a3", "a4", "a5", "a6", ] ``` When using image_type='pdf', the paper size of the PDF. #### background_color ```python background_color: str ``` The background color of the diagram. If None, the default transparent background is used. 
The color value is interpreted as a hexadecimal color code by default (and should not have a leading '#'), but you can also use named colors by prefixing the value with `'!'`. For example, valid choices include `background_color='!white'` or `background_color='FF0000'`. #### theme ```python theme: Literal['default', 'neutral', 'dark', 'forest'] ``` The theme of the diagram. Defaults to 'default'. #### width ```python width: int ``` The width of the diagram. #### height ```python height: int ``` The height of the diagram. #### scale ```python scale: Annotated[float, Ge(1), Le(3)] ``` The scale of the diagram. The scale must be a number between 1 and 3, and you can only set a scale if one or both of width and height are set. #### httpx_client ```python httpx_client: Client ``` An HTTPX client to use for requests, mostly for testing purposes. #### direction ```python direction: StateDiagramDirection ``` The direction of the state diagram. ### NodeIdent ```python NodeIdent: TypeAlias = ( "type[BaseNode[Any, Any, Any]] | BaseNode[Any, Any, Any] | str" ) ``` A type alias for a node identifier. This can be: - A node instance (instance of a subclass of BaseNode). - A node class (subclass of BaseNode). - A string representing the node ID. # `pydantic_graph.nodes` ### StateT ```python StateT = TypeVar('StateT', default=None) ``` Type variable for the state in a graph. ### GraphRunContext Bases: `Generic[StateT, DepsT]` Context for a graph. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @dataclass(kw_only=True) class GraphRunContext(Generic[StateT, DepsT]): """Context for a graph.""" # TODO: Can we get rid of this struct and just pass both these things around..? state: StateT """The state of the graph.""" deps: DepsT """Dependencies for the graph.""" ``` #### state ```python state: StateT ``` The state of the graph. #### deps ```python deps: DepsT ``` Dependencies for the graph. ### BaseNode Bases: `ABC`, `Generic[StateT, DepsT, NodeRunEndT]` Base class for a node. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python class BaseNode(ABC, Generic[StateT, DepsT, NodeRunEndT]): """Base class for a node.""" docstring_notes: ClassVar[bool] = False """Set to `True` to generate mermaid diagram notes from the class's docstring. While this can add valuable information to the diagram, it can make diagrams harder to view, hence it is disabled by default. You can also customise notes overriding the [`get_note`][pydantic_graph.nodes.BaseNode.get_note] method. """ @abstractmethod async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT]: """Run the node. This is an abstract method that must be implemented by subclasses. !!! note "Return types used at runtime" The return type of this method are read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](mermaid.md) and enforced when running the graph. Args: ctx: The graph context. Returns: The next node to run or [`End`][pydantic_graph.nodes.End] to signal the end of the graph. """ ... 
def get_snapshot_id(self) -> str: if snapshot_id := getattr(self, '__snapshot_id', None): return snapshot_id else: self.__dict__['__snapshot_id'] = snapshot_id = generate_snapshot_id(self.get_node_id()) return snapshot_id def set_snapshot_id(self, snapshot_id: str) -> None: self.__dict__['__snapshot_id'] = snapshot_id @classmethod @cache def get_node_id(cls) -> str: """Get the ID of the node.""" return cls.__name__ @classmethod def get_note(cls) -> str | None: """Get a note about the node to render on mermaid charts. By default, this returns a note only if [`docstring_notes`][pydantic_graph.nodes.BaseNode.docstring_notes] is `True`. You can override this method to customise the node notes. """ if not cls.docstring_notes: return None docstring = cls.__doc__ # dataclasses get an automatic docstring which is just their signature, we don't want that if docstring and is_dataclass(cls) and docstring.startswith(f'{cls.__name__}('): docstring = None # pragma: no cover if docstring: # pragma: no branch # remove indentation from docstring import inspect docstring = inspect.cleandoc(docstring) return docstring @classmethod def get_node_def(cls, local_ns: dict[str, Any] | None) -> NodeDef[StateT, DepsT, NodeRunEndT]: """Get the node definition.""" type_hints = get_type_hints(cls.run, localns=local_ns, include_extras=True) try: return_hint = type_hints['return'] except KeyError as e: raise exceptions.GraphSetupError(f'Node {cls} is missing a return type hint on its `run` method') from e next_node_edges: dict[str, Edge] = {} end_edge: Edge | None = None returns_base_node: bool = False for return_type in _utils.get_union_args(return_hint): return_type, annotations = _utils.unpack_annotated(return_type) edge = next((a for a in annotations if isinstance(a, Edge)), Edge(None)) return_type_origin = get_origin(return_type) or return_type if return_type_origin is End: end_edge = edge elif return_type_origin is BaseNode: returns_base_node = True elif issubclass(return_type_origin, BaseNode): next_node_edges[return_type.get_node_id()] = edge else: raise exceptions.GraphSetupError(f'Invalid return type: {return_type}') return NodeDef( node=cls, node_id=cls.get_node_id(), note=cls.get_note(), next_node_edges=next_node_edges, end_edge=end_edge, returns_base_node=returns_base_node, ) def deep_copy(self) -> Self: """Returns a deep copy of the node.""" return copy.deepcopy(self) ``` #### docstring_notes ```python docstring_notes: bool = False ``` Set to `True` to generate mermaid diagram notes from the class's docstring. While this can add valuable information to the diagram, it can make diagrams harder to view, hence it is disabled by default. You can also customise notes overriding the get_note method. #### run ```python run( ctx: GraphRunContext[StateT, DepsT], ) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT] ``` Run the node. This is an abstract method that must be implemented by subclasses. Return types used at runtime The return type of this method are read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](../mermaid/) and enforced when running the graph. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `ctx` | `GraphRunContext[StateT, DepsT]` | The graph context. | *required* | Returns: | Type | Description | | --- | --- | | `BaseNode[StateT, DepsT, Any] | End[NodeRunEndT]` | The next node to run or End to signal the end of the graph. 
| Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @abstractmethod async def run(self, ctx: GraphRunContext[StateT, DepsT]) -> BaseNode[StateT, DepsT, Any] | End[NodeRunEndT]: """Run the node. This is an abstract method that must be implemented by subclasses. !!! note "Return types used at runtime" The return type of this method are read by `pydantic_graph` at runtime and used to define which nodes can be called next in the graph. This is displayed in [mermaid diagrams](mermaid.md) and enforced when running the graph. Args: ctx: The graph context. Returns: The next node to run or [`End`][pydantic_graph.nodes.End] to signal the end of the graph. """ ... ``` #### get_node_id ```python get_node_id() -> str ``` Get the ID of the node. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @classmethod @cache def get_node_id(cls) -> str: """Get the ID of the node.""" return cls.__name__ ``` #### get_note ```python get_note() -> str | None ``` Get a note about the node to render on mermaid charts. By default, this returns a note only if docstring_notes is `True`. You can override this method to customise the node notes. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @classmethod def get_note(cls) -> str | None: """Get a note about the node to render on mermaid charts. By default, this returns a note only if [`docstring_notes`][pydantic_graph.nodes.BaseNode.docstring_notes] is `True`. You can override this method to customise the node notes. """ if not cls.docstring_notes: return None docstring = cls.__doc__ # dataclasses get an automatic docstring which is just their signature, we don't want that if docstring and is_dataclass(cls) and docstring.startswith(f'{cls.__name__}('): docstring = None # pragma: no cover if docstring: # pragma: no branch # remove indentation from docstring import inspect docstring = inspect.cleandoc(docstring) return docstring ``` #### get_node_def ```python get_node_def( local_ns: dict[str, Any] | None, ) -> NodeDef[StateT, DepsT, NodeRunEndT] ``` Get the node definition. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @classmethod def get_node_def(cls, local_ns: dict[str, Any] | None) -> NodeDef[StateT, DepsT, NodeRunEndT]: """Get the node definition.""" type_hints = get_type_hints(cls.run, localns=local_ns, include_extras=True) try: return_hint = type_hints['return'] except KeyError as e: raise exceptions.GraphSetupError(f'Node {cls} is missing a return type hint on its `run` method') from e next_node_edges: dict[str, Edge] = {} end_edge: Edge | None = None returns_base_node: bool = False for return_type in _utils.get_union_args(return_hint): return_type, annotations = _utils.unpack_annotated(return_type) edge = next((a for a in annotations if isinstance(a, Edge)), Edge(None)) return_type_origin = get_origin(return_type) or return_type if return_type_origin is End: end_edge = edge elif return_type_origin is BaseNode: returns_base_node = True elif issubclass(return_type_origin, BaseNode): next_node_edges[return_type.get_node_id()] = edge else: raise exceptions.GraphSetupError(f'Invalid return type: {return_type}') return NodeDef( node=cls, node_id=cls.get_node_id(), note=cls.get_note(), next_node_edges=next_node_edges, end_edge=end_edge, returns_base_node=returns_base_node, ) ``` #### deep_copy ```python deep_copy() -> Self ``` Returns a deep copy of the node. 
Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python def deep_copy(self) -> Self: """Returns a deep copy of the node.""" return copy.deepcopy(self) ``` ### End Bases: `Generic[RunEndT]` Type to return from a node to signal the end of the graph. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @dataclass class End(Generic[RunEndT]): """Type to return from a node to signal the end of the graph.""" data: RunEndT """Data to return from the graph.""" def deep_copy_data(self) -> End[RunEndT]: """Returns a deep copy of the end of the run.""" if self.data is None: return self else: end = End(copy.deepcopy(self.data)) end.set_snapshot_id(self.get_snapshot_id()) return end def get_snapshot_id(self) -> str: if snapshot_id := getattr(self, '__snapshot_id', None): return snapshot_id else: self.__dict__['__snapshot_id'] = snapshot_id = generate_snapshot_id('end') return snapshot_id def set_snapshot_id(self, set_id: str) -> None: self.__dict__['__snapshot_id'] = set_id ``` #### data ```python data: RunEndT ``` Data to return from the graph. #### deep_copy_data ```python deep_copy_data() -> End[RunEndT] ``` Returns a deep copy of the end of the run. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python def deep_copy_data(self) -> End[RunEndT]: """Returns a deep copy of the end of the run.""" if self.data is None: return self else: end = End(copy.deepcopy(self.data)) end.set_snapshot_id(self.get_snapshot_id()) return end ``` ### Edge Annotation to apply a label to an edge in a graph. Source code in `pydantic_graph/pydantic_graph/nodes.py` ```python @dataclass(frozen=True) class Edge: """Annotation to apply a label to an edge in a graph.""" label: str | None """Label for the edge.""" ``` #### label ```python label: str | None ``` Label for the edge. ### DepsT ```python DepsT = TypeVar('DepsT', default=None, contravariant=True) ``` Type variable for the dependencies of a graph and node. ### RunEndT ```python RunEndT = TypeVar('RunEndT', covariant=True, default=None) ``` Covariant type variable for the return type of a graph run. ### NodeRunEndT ```python NodeRunEndT = TypeVar( "NodeRunEndT", covariant=True, default=Never ) ``` Covariant type variable for the return type of a node run. # `pydantic_graph.persistence` ### SnapshotStatus ```python SnapshotStatus = Literal[ "created", "pending", "running", "success", "error" ] ``` The status of a snapshot. - `'created'`: The snapshot has been created but not yet run. - `'pending'`: The snapshot has been retrieved with load_next but not yet run. - `'running'`: The snapshot is currently running. - `'success'`: The snapshot has been run successfully. - `'error'`: The snapshot has been run but an error occurred. ### NodeSnapshot Bases: `Generic[StateT, RunEndT]` History step describing the execution of a node in a graph. 
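You won't normally construct snapshots yourself (persistence implementations create them during a graph run), but a tiny hedged sketch can make the shape concrete; the `Ask` node and its state string below are invented for illustration:

```python
from __future__ import annotations

from dataclasses import dataclass

from pydantic_graph import BaseNode, End, GraphRunContext
from pydantic_graph.persistence import NodeSnapshot


@dataclass
class Ask(BaseNode[str, None, str]):
    async def run(self, ctx: GraphRunContext[str]) -> End[str]:
        return End(f'answered: {ctx.state}')


# A NodeSnapshot captures the graph state *before* the node runs, plus
# metadata that persistence implementations fill in as the node executes.
snapshot = NodeSnapshot(state='what is 2 + 2?', node=Ask())
print(snapshot.kind)      # 'node'
print(snapshot.status)    # 'created' until the node is actually run
print(snapshot.start_ts)  # None until the run starts
```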
Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @dataclass(kw_only=True) class NodeSnapshot(Generic[StateT, RunEndT]): """History step describing the execution of a node in a graph.""" state: StateT """The state of the graph before the node is run.""" node: Annotated[BaseNode[StateT, Any, RunEndT], _utils.CustomNodeSchema()] """The node to run next.""" start_ts: datetime | None = None """The timestamp when the node started running, `None` until the run starts.""" duration: float | None = None """The duration of the node run in seconds, if the node has been run.""" status: SnapshotStatus = 'created' """The status of the snapshot.""" kind: Literal['node'] = 'node' """The kind of history step, can be used as a discriminator when deserializing history.""" id: str = UNSET_SNAPSHOT_ID """Unique ID of the snapshot.""" def __post_init__(self) -> None: if self.id == UNSET_SNAPSHOT_ID: self.id = self.node.get_snapshot_id() ``` #### state ```python state: StateT ``` The state of the graph before the node is run. #### node ```python node: Annotated[ BaseNode[StateT, Any, RunEndT], CustomNodeSchema() ] ``` The node to run next. #### start_ts ```python start_ts: datetime | None = None ``` The timestamp when the node started running, `None` until the run starts. #### duration ```python duration: float | None = None ``` The duration of the node run in seconds, if the node has been run. #### status ```python status: SnapshotStatus = 'created' ``` The status of the snapshot. #### kind ```python kind: Literal['node'] = 'node' ``` The kind of history step, can be used as a discriminator when deserializing history. #### id ```python id: str = UNSET_SNAPSHOT_ID ``` Unique ID of the snapshot. ### EndSnapshot Bases: `Generic[StateT, RunEndT]` History step describing the end of a graph run. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @dataclass(kw_only=True) class EndSnapshot(Generic[StateT, RunEndT]): """History step describing the end of a graph run.""" state: StateT """The state of the graph at the end of the run.""" result: End[RunEndT] """The result of the graph run.""" ts: datetime = field(default_factory=_utils.now_utc) """The timestamp when the graph run ended.""" kind: Literal['end'] = 'end' """The kind of history step, can be used as a discriminator when deserializing history.""" id: str = UNSET_SNAPSHOT_ID """Unique ID of the snapshot.""" def __post_init__(self) -> None: if self.id == UNSET_SNAPSHOT_ID: self.id = self.node.get_snapshot_id() @property def node(self) -> End[RunEndT]: """Shim to get the [`result`][pydantic_graph.persistence.EndSnapshot.result]. Useful to allow `[snapshot.node for snapshot in persistence.history]`. """ return self.result ``` #### state ```python state: StateT ``` The state of the graph at the end of the run. #### result ```python result: End[RunEndT] ``` The result of the graph run. #### ts ```python ts: datetime = field(default_factory=now_utc) ``` The timestamp when the graph run ended. #### kind ```python kind: Literal['end'] = 'end' ``` The kind of history step, can be used as a discriminator when deserializing history. #### id ```python id: str = UNSET_SNAPSHOT_ID ``` Unique ID of the snapshot. #### node ```python node: End[RunEndT] ``` Shim to get the result. Useful to allow `[snapshot.node for snapshot in persistence.history]`. ### Snapshot ```python Snapshot = ( NodeSnapshot[StateT, RunEndT] | EndSnapshot[StateT, RunEndT] ) ``` A step in the history of a graph run. 
Graph.run returns a list of these steps describing the execution of the graph, together with the run return value. ### BaseStatePersistence Bases: `ABC`, `Generic[StateT, RunEndT]` Abstract base class for storing the state of a graph run. Each instance of a `BaseStatePersistence` subclass should be used for a single graph run. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python class BaseStatePersistence(ABC, Generic[StateT, RunEndT]): """Abstract base class for storing the state of a graph run. Each instance of a `BaseStatePersistence` subclass should be used for a single graph run. """ @abstractmethod async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: """Snapshot the state of a graph, when the next step is to run a node. This method should add a [`NodeSnapshot`][pydantic_graph.persistence.NodeSnapshot] to persistence. Args: state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError @abstractmethod async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: """Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence. This method will generally call [`snapshot_node`][pydantic_graph.persistence.BaseStatePersistence.snapshot_node] but should do so in an atomic way. Args: snapshot_id: The ID of the snapshot to check. state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError @abstractmethod async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: """Snapshot the state of a graph when the graph has ended. This method should add an [`EndSnapshot`][pydantic_graph.persistence.EndSnapshot] to persistence. Args: state: The state of the graph. end: data from the end of the run. """ raise NotImplementedError @abstractmethod def record_run(self, snapshot_id: str) -> AbstractAsyncContextManager[None]: """Record the run of the node, or error if the node is already running. Args: snapshot_id: The ID of the snapshot to record. Raises: GraphNodeRunningError: if the node status it not `'created'` or `'pending'`. LookupError: if the snapshot ID is not found in persistence. Returns: An async context manager that records the run of the node. In particular this should set: - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'running'` and [`NodeSnapshot.start_ts`][pydantic_graph.persistence.NodeSnapshot.start_ts] when the run starts. - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'success'` or `'error'` and [`NodeSnapshot.duration`][pydantic_graph.persistence.NodeSnapshot.duration] when the run finishes. """ raise NotImplementedError @abstractmethod async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: """Retrieve a node snapshot with status `'created`' and set its status to `'pending'`. This is used by [`Graph.iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence] to get the next node to run. Returns: The snapshot, or `None` if no snapshot with status `'created`' exists. """ raise NotImplementedError @abstractmethod async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: """Load the entire history of snapshots. `load_all` is not used by pydantic-graph itself, instead it's provided to make it convenient to get all [snapshots][pydantic_graph.persistence.Snapshot] from persistence. Returns: The list of snapshots. 
""" raise NotImplementedError def set_graph_types(self, graph: Graph[StateT, Any, RunEndT]) -> None: """Set the types of the state and run end from a graph. You generally won't need to customise this method, instead implement [`set_types`][pydantic_graph.persistence.BaseStatePersistence.set_types] and [`should_set_types`][pydantic_graph.persistence.BaseStatePersistence.should_set_types]. """ if self.should_set_types(): with _utils.set_nodes_type_context(graph.get_nodes()): self.set_types(*graph.inferred_types) def should_set_types(self) -> bool: """Whether types need to be set. Implementations should override this method to return `True` when types have not been set if they are needed. """ return False def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: """Set the types of the state and run end. This can be used to create [type adapters][pydantic.TypeAdapter] for serializing and deserializing snapshots, e.g. with [`build_snapshot_list_type_adapter`][pydantic_graph.persistence.build_snapshot_list_type_adapter]. Args: state_type: The state type. run_end_type: The run end type. """ pass ``` #### snapshot_node ```python snapshot_node( state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None ``` Snapshot the state of a graph, when the next step is to run a node. This method should add a NodeSnapshot to persistence. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `state` | `StateT` | The state of the graph. | *required* | | `next_node` | `BaseNode[StateT, Any, RunEndT]` | The next node to run. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: """Snapshot the state of a graph, when the next step is to run a node. This method should add a [`NodeSnapshot`][pydantic_graph.persistence.NodeSnapshot] to persistence. Args: state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError ``` #### snapshot_node_if_new ```python snapshot_node_if_new( snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT], ) -> None ``` Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence. This method will generally call snapshot_node but should do so in an atomic way. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `snapshot_id` | `str` | The ID of the snapshot to check. | *required* | | `state` | `StateT` | The state of the graph. | *required* | | `next_node` | `BaseNode[StateT, Any, RunEndT]` | The next node to run. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: """Snapshot the state of a graph if the snapshot ID doesn't already exist in persistence. This method will generally call [`snapshot_node`][pydantic_graph.persistence.BaseStatePersistence.snapshot_node] but should do so in an atomic way. Args: snapshot_id: The ID of the snapshot to check. state: The state of the graph. next_node: The next node to run. """ raise NotImplementedError ``` #### snapshot_end ```python snapshot_end(state: StateT, end: End[RunEndT]) -> None ``` Snapshot the state of a graph when the graph has ended. This method should add an EndSnapshot to persistence. 
Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `state` | `StateT` | The state of the graph. | *required* | | `end` | `End[RunEndT]` | data from the end of the run. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: """Snapshot the state of a graph when the graph has ended. This method should add an [`EndSnapshot`][pydantic_graph.persistence.EndSnapshot] to persistence. Args: state: The state of the graph. end: data from the end of the run. """ raise NotImplementedError ``` #### record_run ```python record_run( snapshot_id: str, ) -> AbstractAsyncContextManager[None] ``` Record the run of the node, or error if the node is already running. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `snapshot_id` | `str` | The ID of the snapshot to record. | *required* | Raises: | Type | Description | | --- | --- | | `GraphNodeRunningError` | if the node status it not 'created' or 'pending'. | | `LookupError` | if the snapshot ID is not found in persistence. | Returns: | Type | Description | | --- | --- | | `AbstractAsyncContextManager[None]` | An async context manager that records the run of the node. | In particular this should set: - NodeSnapshot.status to `'running'` and NodeSnapshot.start_ts when the run starts. - NodeSnapshot.status to `'success'` or `'error'` and NodeSnapshot.duration when the run finishes. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod def record_run(self, snapshot_id: str) -> AbstractAsyncContextManager[None]: """Record the run of the node, or error if the node is already running. Args: snapshot_id: The ID of the snapshot to record. Raises: GraphNodeRunningError: if the node status it not `'created'` or `'pending'`. LookupError: if the snapshot ID is not found in persistence. Returns: An async context manager that records the run of the node. In particular this should set: - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'running'` and [`NodeSnapshot.start_ts`][pydantic_graph.persistence.NodeSnapshot.start_ts] when the run starts. - [`NodeSnapshot.status`][pydantic_graph.persistence.NodeSnapshot.status] to `'success'` or `'error'` and [`NodeSnapshot.duration`][pydantic_graph.persistence.NodeSnapshot.duration] when the run finishes. """ raise NotImplementedError ``` #### load_next ```python load_next() -> NodeSnapshot[StateT, RunEndT] | None ``` Retrieve a node snapshot with status `'created`' and set its status to `'pending'`. This is used by Graph.iter_from_persistence to get the next node to run. Returns: The snapshot, or `None` if no snapshot with status `'created`' exists. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: """Retrieve a node snapshot with status `'created`' and set its status to `'pending'`. This is used by [`Graph.iter_from_persistence`][pydantic_graph.graph.Graph.iter_from_persistence] to get the next node to run. Returns: The snapshot, or `None` if no snapshot with status `'created`' exists. """ raise NotImplementedError ``` #### load_all ```python load_all() -> list[Snapshot[StateT, RunEndT]] ``` Load the entire history of snapshots. `load_all` is not used by pydantic-graph itself, instead it's provided to make it convenient to get all snapshots from persistence. 
Returns: The list of snapshots. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python @abstractmethod async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: """Load the entire history of snapshots. `load_all` is not used by pydantic-graph itself, instead it's provided to make it convenient to get all [snapshots][pydantic_graph.persistence.Snapshot] from persistence. Returns: The list of snapshots. """ raise NotImplementedError ``` #### set_graph_types ```python set_graph_types(graph: Graph[StateT, Any, RunEndT]) -> None ``` Set the types of the state and run end from a graph. You generally won't need to customise this method, instead implement set_types and should_set_types. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def set_graph_types(self, graph: Graph[StateT, Any, RunEndT]) -> None: """Set the types of the state and run end from a graph. You generally won't need to customise this method, instead implement [`set_types`][pydantic_graph.persistence.BaseStatePersistence.set_types] and [`should_set_types`][pydantic_graph.persistence.BaseStatePersistence.should_set_types]. """ if self.should_set_types(): with _utils.set_nodes_type_context(graph.get_nodes()): self.set_types(*graph.inferred_types) ``` #### should_set_types ```python should_set_types() -> bool ``` Whether types need to be set. Implementations should override this method to return `True` when types have not been set if they are needed. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def should_set_types(self) -> bool: """Whether types need to be set. Implementations should override this method to return `True` when types have not been set if they are needed. """ return False ``` #### set_types ```python set_types( state_type: type[StateT], run_end_type: type[RunEndT] ) -> None ``` Set the types of the state and run end. This can be used to create type adapters for serializing and deserializing snapshots, e.g. with build_snapshot_list_type_adapter. Parameters: | Name | Type | Description | Default | | --- | --- | --- | --- | | `state_type` | `type[StateT]` | The state type. | *required* | | `run_end_type` | `type[RunEndT]` | The run end type. | *required* | Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: """Set the types of the state and run end. This can be used to create [type adapters][pydantic.TypeAdapter] for serializing and deserializing snapshots, e.g. with [`build_snapshot_list_type_adapter`][pydantic_graph.persistence.build_snapshot_list_type_adapter]. Args: state_type: The state type. run_end_type: The run end type. """ pass ``` ### build_snapshot_list_type_adapter ```python build_snapshot_list_type_adapter( state_t: type[StateT], run_end_t: type[RunEndT] ) -> TypeAdapter[list[Snapshot[StateT, RunEndT]]] ``` Build a type adapter for a list of snapshots. This method should be called from within set_types where context variables will be set such that Pydantic can create a schema for NodeSnapshot.node. Source code in `pydantic_graph/pydantic_graph/persistence/__init__.py` ```python def build_snapshot_list_type_adapter( state_t: type[StateT], run_end_t: type[RunEndT] ) -> pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]]: """Build a type adapter for a list of snapshots. 
This method should be called from within [`set_types`][pydantic_graph.persistence.BaseStatePersistence.set_types] where context variables will be set such that Pydantic can create a schema for [`NodeSnapshot.node`][pydantic_graph.persistence.NodeSnapshot.node]. """ return pydantic.TypeAdapter(list[Annotated[Snapshot[state_t, run_end_t], pydantic.Discriminator('kind')]]) ``` In memory state persistence. This module provides simple in memory state persistence for graphs. ### SimpleStatePersistence Bases: `BaseStatePersistence[StateT, RunEndT]` Simple in memory state persistence that just hold the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default. Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python @dataclass class SimpleStatePersistence(BaseStatePersistence[StateT, RunEndT]): """Simple in memory state persistence that just hold the latest snapshot. If no state persistence implementation is provided when running a graph, this is used by default. """ last_snapshot: Snapshot[StateT, RunEndT] | None = None """The last snapshot.""" async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: self.last_snapshot = NodeSnapshot(state=state, node=next_node) async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: if self.last_snapshot and self.last_snapshot.id == snapshot_id: return # pragma: no cover else: await self.snapshot_node(state, next_node) async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: self.last_snapshot = EndSnapshot(state=state, result=end) @asynccontextmanager async def record_run(self, snapshot_id: str) -> AsyncIterator[None]: if self.last_snapshot is None or snapshot_id != self.last_snapshot.id: raise LookupError(f'No snapshot found with id={snapshot_id!r}') assert isinstance(self.last_snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' exceptions.GraphNodeStatusError.check(self.last_snapshot.status) self.last_snapshot.status = 'running' self.last_snapshot.start_ts = _utils.now_utc() start = perf_counter() try: yield except Exception: self.last_snapshot.duration = perf_counter() - start self.last_snapshot.status = 'error' raise else: self.last_snapshot.duration = perf_counter() - start self.last_snapshot.status = 'success' async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: if isinstance(self.last_snapshot, NodeSnapshot) and self.last_snapshot.status == 'created': self.last_snapshot.status = 'pending' return copy.deepcopy(self.last_snapshot) async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: raise NotImplementedError('load is not supported for SimpleStatePersistence') ``` #### last_snapshot ```python last_snapshot: Snapshot[StateT, RunEndT] | None = None ``` The last snapshot. ### FullStatePersistence Bases: `BaseStatePersistence[StateT, RunEndT]` In memory state persistence that hold a list of snapshots. Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python @dataclass class FullStatePersistence(BaseStatePersistence[StateT, RunEndT]): """In memory state persistence that hold a list of snapshots.""" deep_copy: bool = True """Whether to deep copy the state and nodes when storing them. Defaults to `True` so even if nodes or state are modified after the snapshot is taken, the persistence history will record the value at the time of the snapshot. 
""" history: list[Snapshot[StateT, RunEndT]] = field(default_factory=list) """List of snapshots taken during the graph run.""" _snapshots_type_adapter: pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]] | None = field( default=None, init=False, repr=False ) async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: snapshot = NodeSnapshot( state=self._prep_state(state), node=next_node.deep_copy() if self.deep_copy else next_node, ) self.history.append(snapshot) async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: if not any(s.id == snapshot_id for s in self.history): await self.snapshot_node(state, next_node) async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: snapshot = EndSnapshot( state=self._prep_state(state), result=end.deep_copy_data() if self.deep_copy else end, ) self.history.append(snapshot) @asynccontextmanager async def record_run(self, snapshot_id: str) -> AsyncIterator[None]: try: snapshot = next(s for s in self.history if s.id == snapshot_id) except StopIteration as e: raise LookupError(f'No snapshot found with id={snapshot_id!r}') from e assert isinstance(snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' exceptions.GraphNodeStatusError.check(snapshot.status) snapshot.status = 'running' snapshot.start_ts = _utils.now_utc() start = perf_counter() try: yield except Exception: snapshot.duration = perf_counter() - start snapshot.status = 'error' raise else: snapshot.duration = perf_counter() - start snapshot.status = 'success' async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: if snapshot := next((s for s in self.history if isinstance(s, NodeSnapshot) and s.status == 'created'), None): snapshot.status = 'pending' return copy.deepcopy(snapshot) async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: return self.history def should_set_types(self) -> bool: return self._snapshots_type_adapter is None def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: self._snapshots_type_adapter = build_snapshot_list_type_adapter(state_type, run_end_type) def dump_json(self, *, indent: int | None = None) -> bytes: """Dump the history to JSON bytes.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `dump_json`' return self._snapshots_type_adapter.dump_json(self.history, indent=indent) def load_json(self, json_data: str | bytes | bytearray) -> None: """Load the history from JSON.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `load_json`' self.history = self._snapshots_type_adapter.validate_json(json_data) def _prep_state(self, state: StateT) -> StateT: """Prepare state for snapshot, uses [`copy.deepcopy`][copy.deepcopy] by default.""" if not self.deep_copy or state is None: return state else: return copy.deepcopy(state) ``` #### deep_copy ```python deep_copy: bool = True ``` Whether to deep copy the state and nodes when storing them. Defaults to `True` so even if nodes or state are modified after the snapshot is taken, the persistence history will record the value at the time of the snapshot. #### history ```python history: list[Snapshot[StateT, RunEndT]] = field( default_factory=list ) ``` List of snapshots taken during the graph run. #### dump_json ```python dump_json(*, indent: int | None = None) -> bytes ``` Dump the history to JSON bytes. 
Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python def dump_json(self, *, indent: int | None = None) -> bytes: """Dump the history to JSON bytes.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `dump_json`' return self._snapshots_type_adapter.dump_json(self.history, indent=indent) ``` #### load_json ```python load_json(json_data: str | bytes | bytearray) -> None ``` Load the history from JSON. Source code in `pydantic_graph/pydantic_graph/persistence/in_mem.py` ```python def load_json(self, json_data: str | bytes | bytearray) -> None: """Load the history from JSON.""" assert self._snapshots_type_adapter is not None, 'type adapter must be set to use `load_json`' self.history = self._snapshots_type_adapter.validate_json(json_data) ``` ### FileStatePersistence Bases: `BaseStatePersistence[StateT, RunEndT]` File based state persistence that hold graph run state in a JSON file. Source code in `pydantic_graph/pydantic_graph/persistence/file.py` ````python @dataclass class FileStatePersistence(BaseStatePersistence[StateT, RunEndT]): """File based state persistence that hold graph run state in a JSON file.""" json_file: Path """Path to the JSON file where the snapshots are stored. You should use a different file for each graph run, but a single file should be reused for multiple steps of the same run. For example if you have a run ID of the form `run_123abc`, you might create a `FileStatePersistence` thus: ```py from pathlib import Path from pydantic_graph import FullStatePersistence run_id = 'run_123abc' persistence = FullStatePersistence(Path('runs') / f'{run_id}.json') ``` """ _snapshots_type_adapter: pydantic.TypeAdapter[list[Snapshot[StateT, RunEndT]]] | None = field( default=None, init=False, repr=False ) async def snapshot_node(self, state: StateT, next_node: BaseNode[StateT, Any, RunEndT]) -> None: await self._append_save(NodeSnapshot(state=state, node=next_node)) async def snapshot_node_if_new( self, snapshot_id: str, state: StateT, next_node: BaseNode[StateT, Any, RunEndT] ) -> None: async with self._lock(): snapshots = await self.load_all() if not any(s.id == snapshot_id for s in snapshots): # pragma: no branch await self._append_save(NodeSnapshot(state=state, node=next_node), lock=False) async def snapshot_end(self, state: StateT, end: End[RunEndT]) -> None: await self._append_save(EndSnapshot(state=state, result=end)) @asynccontextmanager async def record_run(self, snapshot_id: str) -> AsyncIterator[None]: async with self._lock(): snapshots = await self.load_all() try: snapshot = next(s for s in snapshots if s.id == snapshot_id) except StopIteration as e: raise LookupError(f'No snapshot found with id={snapshot_id!r}') from e assert isinstance(snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' exceptions.GraphNodeStatusError.check(snapshot.status) snapshot.status = 'running' snapshot.start_ts = _utils.now_utc() await self._save(snapshots) start = perf_counter() try: yield except Exception: duration = perf_counter() - start async with self._lock(): await _graph_utils.run_in_executor(self._after_run_sync, snapshot_id, duration, 'error') raise else: snapshot.duration = perf_counter() - start async with self._lock(): await _graph_utils.run_in_executor(self._after_run_sync, snapshot_id, snapshot.duration, 'success') async def load_next(self) -> NodeSnapshot[StateT, RunEndT] | None: async with self._lock(): snapshots = await self.load_all() if snapshot := next((s for s in snapshots if isinstance(s, NodeSnapshot) and 
s.status == 'created'), None): snapshot.status = 'pending' await self._save(snapshots) return snapshot def should_set_types(self) -> bool: """Whether types need to be set.""" return self._snapshots_type_adapter is None def set_types(self, state_type: type[StateT], run_end_type: type[RunEndT]) -> None: self._snapshots_type_adapter = build_snapshot_list_type_adapter(state_type, run_end_type) async def load_all(self) -> list[Snapshot[StateT, RunEndT]]: return await _graph_utils.run_in_executor(self._load_sync) def _load_sync(self) -> list[Snapshot[StateT, RunEndT]]: assert self._snapshots_type_adapter is not None, 'snapshots type adapter must be set' try: content = self.json_file.read_bytes() except FileNotFoundError: return [] else: return self._snapshots_type_adapter.validate_json(content) def _after_run_sync(self, snapshot_id: str, duration: float, status: SnapshotStatus) -> None: snapshots = self._load_sync() snapshot = next(s for s in snapshots if s.id == snapshot_id) assert isinstance(snapshot, NodeSnapshot), 'Only NodeSnapshot can be recorded' snapshot.duration = duration snapshot.status = status self._save_sync(snapshots) async def _save(self, snapshots: list[Snapshot[StateT, RunEndT]]) -> None: await _graph_utils.run_in_executor(self._save_sync, snapshots) def _save_sync(self, snapshots: list[Snapshot[StateT, RunEndT]]) -> None: assert self._snapshots_type_adapter is not None, 'snapshots type adapter must be set' self.json_file.write_bytes(self._snapshots_type_adapter.dump_json(snapshots, indent=2)) async def _append_save(self, snapshot: Snapshot[StateT, RunEndT], *, lock: bool = True) -> None: assert self._snapshots_type_adapter is not None, 'snapshots type adapter must be set' async with AsyncExitStack() as stack: if lock: await stack.enter_async_context(self._lock()) snapshots = await self.load_all() snapshots.append(snapshot) await self._save(snapshots) @asynccontextmanager async def _lock(self, *, timeout: float = 1.0) -> AsyncIterator[None]: """Lock a file by checking and writing a `.pydantic-graph-persistence-lock` to it. Args: timeout: how long to wait for the lock Returns: an async context manager that holds the lock """ lock_file = self.json_file.parent / f'{self.json_file.name}.pydantic-graph-persistence-lock' lock_id = secrets.token_urlsafe().encode() with anyio.fail_after(timeout): while not await _file_append_check(lock_file, lock_id): await anyio.sleep(0.01) try: yield finally: await _graph_utils.run_in_executor(lock_file.unlink, missing_ok=True) ```` #### json_file ```python json_file: Path ``` Path to the JSON file where the snapshots are stored. You should use a different file for each graph run, but a single file should be reused for multiple steps of the same run. For example if you have a run ID of the form `run_123abc`, you might create a `FileStatePersistence` thus: ```py from pathlib import Path from pydantic_graph import FullStatePersistence run_id = 'run_123abc' persistence = FullStatePersistence(Path('runs') / f'{run_id}.json') ``` #### should_set_types ```python should_set_types() -> bool ``` Whether types need to be set. Source code in `pydantic_graph/pydantic_graph/persistence/file.py` ```python def should_set_types(self) -> bool: """Whether types need to be set.""" return self._snapshots_type_adapter is None ``` # Evals # Evals *Evals* is shorthand for both AI system *Evaluation* as a broad topic and for specific *Evaluation Metrics* or *Evaluators* as individual tests. 
Ironically, the overloading of this term makes it difficult to evaluate what people are even talking about when they say "Evals" (without further context). Warning Unlike unit tests, evals are an emerging art/science; anyone who claims to know exactly how your evals should be defined can safely be ignored. ## Pydantic Evals Package **Pydantic Evals** is a powerful evaluation framework designed to help you systematically test and evaluate the performance and accuracy of the systems you build, from augmented LLMs to multi-agent systems. Install Pydantic Evals as part of the Pydantic AI (agent framework) package, or stand-alone. We've designed Pydantic Evals to be useful while not being too opinionated since we (along with everyone else) are still figuring out best practices. We'd love your [feedback](../help/) on the package and how we can improve it. In Beta Pydantic Evals support was [introduced](https://github.com/pydantic/pydantic-ai/pull/935) in v0.0.47 and is currently in beta. The API is subject to change and the documentation is incomplete. ## Code-First Evaluation Pydantic Evals follows a **code-first approach** where you define all evaluation components (datasets, experiments, tasks, cases and evaluators) in Python code, or as serialized data loaded by Python code. This differs from platforms with fully web-based configuration. When you run an *Experiment* you'll see a progress indicator and can print the results wherever you run your python code (IDE, terminal, etc). You also get a report object back that you can serialize and store or send to a notebook or other application for further visualization and analysis. If you are using [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/), your experiment results automatically appear in the Logfire web interface for visualization, comparison, and collaborative analysis. Logfire serves as a observability layer - you write and run evals in code, then view and analyze results in the web UI. ## Installation To install the Pydantic Evals package, run: ```bash pip install pydantic-evals ``` ```bash uv add pydantic-evals ``` `pydantic-evals` does not depend on `pydantic-ai`, but has an optional dependency on `logfire` if you'd like to use OpenTelemetry traces in your evals, or send evaluation results to [logfire](https://pydantic.dev/logfire). ```bash pip install 'pydantic-evals[logfire]' ``` ```bash uv add 'pydantic-evals[logfire]' ``` ## Pydantic Evals Data Model ### Data Model Diagram ```text Dataset (1) ──────────── (Many) Case │ │ │ │ └─── (Many) Experiment ──┴─── (Many) Case results │ └─── (1) Task │ └─── (Many) Evaluator ``` ### Key Relationships 1. **Dataset → Cases**: One Dataset contains many Cases 1. **Dataset → Experiments**: One Dataset can be used across many Experiments over time 1. **Experiment → Case results**: One Experiment generates results by executing each Case 1. **Experiment → Task**: One Experiment evaluates one defined Task 1. **Experiment → Evaluators**: One Experiment uses multiple Evaluators. Dataset-wide Evaluators are run against all Cases, and Case-specific Evaluators against their respective Cases ### Data Flow 1. **Dataset creation**: Define cases and evaluators in YAML/JSON, or directly in Python 1. **Experiment execution**: Run `dataset.evaluate_sync(task_function)` 1. **Cases run**: Each Case is executed against the Task 1. **Evaluation**: Evaluators score the Task outputs for each Case 1. 
**Results**: All Case results are collected into a summary report

A metaphor

A useful metaphor (although not perfect) is to think of evals like a **Unit Testing** framework:

- **Cases + Evaluators** are your individual unit tests - each one defines a specific scenario you want to test, complete with inputs and expected outcomes. Just like a unit test, a case asks: *"Given this input, does my system produce the right output?"*
- **Datasets** are like test suites - they are the scaffolding that holds your unit tests together. They group related cases and define shared evaluation criteria that should apply across all tests in the suite.
- **Experiments** are like running your entire test suite and getting a report. When you execute `dataset.evaluate_sync(my_ai_function)`, you're running all your cases against your AI system and collecting the results - just like running `pytest` and getting a summary of passes, failures, and performance metrics.

The key difference from traditional unit testing is that AI systems are probabilistic. If you're type checking, you'll still get a simple pass/fail, but scores for text outputs are likely qualitative and/or categorical, and more open to interpretation.

## Datasets and Cases

In Pydantic Evals, everything begins with `Dataset`s and `Case`s:

- Case: A single test scenario corresponding to Task inputs. Can also optionally have a name, expected outputs, metadata, and evaluators.
- Dataset: A collection of test Cases designed for the evaluation of a specific task or function.

simple_eval_dataset.py

```python
from pydantic_evals import Case, Dataset

case1 = Case(
    name='simple_case',
    inputs='What is the capital of France?',
    expected_output='Paris',
    metadata={'difficulty': 'easy'},
)

dataset = Dataset(cases=[case1])
```

*(This example is complete, it can be run "as is")*

## Evaluators

Evaluators analyze and score the results of your Task when tested against a Case. They can be classic unit tests: deterministic, code-based checks, such as testing model output format with a regex, or checking for the appearance of PII or sensitive data. Alternatively, Evaluators can assess non-deterministic model outputs for qualities like accuracy, precision/recall, hallucinations, or instruction-following.

While both kinds of testing are useful in LLM systems, classical code-based tests are cheaper and easier than tests which require either human or machine review of model outputs. We encourage you to look for quick wins of this type when setting up a test framework for your system.

Pydantic Evals includes several built-in evaluators and allows you to define custom evaluators:

simple_eval_evaluator.py

```python
from dataclasses import dataclass

from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.evaluators.common import IsInstance

from simple_eval_dataset import dataset

dataset.add_evaluator(IsInstance(type_name='str'))  # (1)!


@dataclass
class MyEvaluator(Evaluator):
    async def evaluate(self, ctx: EvaluatorContext[str, str]) -> float:  # (2)!
        if ctx.output == ctx.expected_output:
            return 1.0
        elif (
            isinstance(ctx.output, str)
            and ctx.expected_output.lower() in ctx.output.lower()
        ):
            return 0.8
        else:
            return 0.0


dataset.add_evaluator(MyEvaluator())
```

1. You can add built-in evaluators to a dataset using the add_evaluator method.
1. This custom evaluator returns a simple score based on whether the output matches the expected output.
*(This example is complete, it can be run "as is")* ## Running Experiments Performing evaluations involves running a task against all cases in a dataset, also known as running an "experiment". Putting the above two examples together and using the more declarative `evaluators` kwarg to Dataset: simple_eval_complete.py ```python from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator, EvaluatorContext, IsInstance case1 = Case( # (1)! name='simple_case', inputs='What is the capital of France?', expected_output='Paris', metadata={'difficulty': 'easy'}, ) class MyEvaluator(Evaluator[str, str]): def evaluate(self, ctx: EvaluatorContext[str, str]) -> float: if ctx.output == ctx.expected_output: return 1.0 elif ( isinstance(ctx.output, str) and ctx.expected_output.lower() in ctx.output.lower() ): return 0.8 else: return 0.0 dataset = Dataset( cases=[case1], evaluators=[IsInstance(type_name='str'), MyEvaluator()], # (3)! ) async def guess_city(question: str) -> str: # (4)! return 'Paris' report = dataset.evaluate_sync(guess_city) # (5)! report.print(include_input=True, include_output=True, include_durations=False) # (6)! """ Evaluation Summary: guess_city ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ Scores ┃ Assertions ┃ ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ │ simple_case │ What is the capital of France? │ Paris │ MyEvaluator: 1.00 │ ✔ │ ├─────────────┼────────────────────────────────┼─────────┼───────────────────┼────────────┤ │ Averages │ │ │ MyEvaluator: 1.00 │ 100.0% ✔ │ └─────────────┴────────────────────────────────┴─────────┴───────────────────┴────────────┘ """ ``` 1. Create a test case as above 1. Also create a custom evaluator function as above 1. Create a Dataset with test cases, also set the evaluators when creating the dataset 1. Our function to evaluate. 1. Run the evaluation with evaluate_sync, which runs the function against all test cases in the dataset, and returns an EvaluationReport object. 1. Print the report with print, which shows the results of the evaluation, including input and output. We have omitted duration here just to keep the printed output from changing from run to run. *(This example is complete, it can be run "as is")* ## Evaluation with `LLMJudge` In this example we evaluate a method for generating recipes based on customer orders. judge_recipes.py ```python from __future__ import annotations from typing import Any from pydantic import BaseModel from pydantic_ai import Agent, format_as_xml from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import IsInstance, LLMJudge class CustomerOrder(BaseModel): # (1)! dish_name: str dietary_restriction: str | None = None class Recipe(BaseModel): ingredients: list[str] steps: list[str] recipe_agent = Agent( 'groq:llama-3.3-70b-versatile', output_type=Recipe, system_prompt=( 'Generate a recipe to cook the dish that meets the dietary restrictions.' ), ) async def transform_recipe(customer_order: CustomerOrder) -> Recipe: # (2)! r = await recipe_agent.run(format_as_xml(customer_order)) return r.output recipe_dataset = Dataset[CustomerOrder, Recipe, Any]( # (3)! cases=[ Case( name='vegetarian_recipe', inputs=CustomerOrder( dish_name='Spaghetti Bolognese', dietary_restriction='vegetarian' ), expected_output=None, # (4) metadata={'focus': 'vegetarian'}, evaluators=( LLMJudge( # (5)! 
rubric='Recipe should not contain meat or animal products', ), ), ), Case( name='gluten_free_recipe', inputs=CustomerOrder( dish_name='Chocolate Cake', dietary_restriction='gluten-free' ), expected_output=None, metadata={'focus': 'gluten-free'}, # Case-specific evaluator with a focused rubric evaluators=( LLMJudge( rubric='Recipe should not contain gluten or wheat products', ), ), ), ], evaluators=[ # (6)! IsInstance(type_name='Recipe'), LLMJudge( rubric='Recipe should have clear steps and relevant ingredients', include_input=True, model='anthropic:claude-3-7-sonnet-latest', # (7)! ), ], ) report = recipe_dataset.evaluate_sync(transform_recipe) print(report) """ Evaluation Summary: transform_recipe ┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ Case ID ┃ Assertions ┃ Duration ┃ ┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩ │ vegetarian_recipe │ ✔✔✔ │ 10ms │ ├────────────────────┼────────────┼──────────┤ │ gluten_free_recipe │ ✔✔✔ │ 10ms │ ├────────────────────┼────────────┼──────────┤ │ Averages │ 100.0% ✔ │ 10ms │ └────────────────────┴────────────┴──────────┘ """ ``` 1. Define models for our task — Input for recipe generation task and output of the task. 1. Define our recipe generation function - this is the task we want to evaluate. 1. Create a dataset with different test cases and different rubrics. 1. No expected output, we'll let the LLM judge the quality. 1. Case-specific evaluator with a focused rubric using LLMJudge. 1. Dataset-level evaluators that apply to all cases, including a general quality rubric for all recipes 1. By default `LLMJudge` uses `openai:gpt-4o`, here we use a specific Anthropic model. *(This example is complete, it can be run "as is")* ## Saving and Loading Datasets Datasets can be saved to and loaded from YAML or JSON files. save_load_dataset_example.py ```python from pathlib import Path from judge_recipes import CustomerOrder, Recipe, recipe_dataset from pydantic_evals import Dataset recipe_transforms_file = Path('recipe_transform_tests.yaml') recipe_dataset.to_file(recipe_transforms_file) # (1)! 
print(recipe_transforms_file.read_text()) """ # yaml-language-server: $schema=recipe_transform_tests_schema.json cases: - name: vegetarian_recipe inputs: dish_name: Spaghetti Bolognese dietary_restriction: vegetarian metadata: focus: vegetarian evaluators: - LLMJudge: Recipe should not contain meat or animal products - name: gluten_free_recipe inputs: dish_name: Chocolate Cake dietary_restriction: gluten-free metadata: focus: gluten-free evaluators: - LLMJudge: Recipe should not contain gluten or wheat products evaluators: - IsInstance: Recipe - LLMJudge: rubric: Recipe should have clear steps and relevant ingredients model: anthropic:claude-3-7-sonnet-latest include_input: true """ # Load dataset from file loaded_dataset = Dataset[CustomerOrder, Recipe, dict].from_file(recipe_transforms_file) print(f'Loaded dataset with {len(loaded_dataset.cases)} cases') #> Loaded dataset with 2 cases ``` *(This example is complete, it can be run "as is")* ## Parallel Evaluation You can control concurrency during evaluation (this might be useful to prevent exceeding a rate limit): parallel_evaluation_example.py ```python import asyncio import time from pydantic_evals import Case, Dataset # Create a dataset with multiple test cases dataset = Dataset( cases=[ Case( name=f'case_{i}', inputs=i, expected_output=i * 2, ) for i in range(5) ] ) async def double_number(input_value: int) -> int: """Function that simulates work by sleeping for a tenth of a second before returning double the input.""" await asyncio.sleep(0.1) # Simulate work return input_value * 2 # Run evaluation with unlimited concurrency t0 = time.time() report_default = dataset.evaluate_sync(double_number) print(f'Evaluation took less than 0.5s: {time.time() - t0 < 0.5}') #> Evaluation took less than 0.5s: True report_default.print(include_input=True, include_output=True, include_durations=False) # (1)! """ Evaluation Summary: double_number ┏━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ ┡━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━┩ │ case_0 │ 0 │ 0 │ ├──────────┼────────┼─────────┤ │ case_1 │ 1 │ 2 │ ├──────────┼────────┼─────────┤ │ case_2 │ 2 │ 4 │ ├──────────┼────────┼─────────┤ │ case_3 │ 3 │ 6 │ ├──────────┼────────┼─────────┤ │ case_4 │ 4 │ 8 │ ├──────────┼────────┼─────────┤ │ Averages │ │ │ └──────────┴────────┴─────────┘ """ # Run evaluation with limited concurrency t0 = time.time() report_limited = dataset.evaluate_sync(double_number, max_concurrency=1) print(f'Evaluation took more than 0.5s: {time.time() - t0 > 0.5}') #> Evaluation took more than 0.5s: True report_limited.print(include_input=True, include_output=True, include_durations=False) # (2)! """ Evaluation Summary: double_number ┏━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ ┡━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━┩ │ case_0 │ 0 │ 0 │ ├──────────┼────────┼─────────┤ │ case_1 │ 1 │ 2 │ ├──────────┼────────┼─────────┤ │ case_2 │ 2 │ 4 │ ├──────────┼────────┼─────────┤ │ case_3 │ 3 │ 6 │ ├──────────┼────────┼─────────┤ │ case_4 │ 4 │ 8 │ ├──────────┼────────┼─────────┤ │ Averages │ │ │ └──────────┴────────┴─────────┘ """ ``` 1. We have omitted duration here just to keep the printed output from changing from run to run. 1. We have omitted duration here just to keep the printed output from changing from run to run. *(This example is complete, it can be run "as is")* ## OpenTelemetry Integration Pydantic Evals integrates with OpenTelemetry for tracing. The EvaluatorContext includes a property called `span_tree` which returns a SpanTree. 
The `SpanTree` provides a way to query and analyze the spans generated during function execution. This provides a way to access the results of instrumentation during evaluation. Note If you just want to write unit tests that ensure that specific spans are produced during calls to your evaluation task, it's usually better to just use the `logfire.testing.capfire` fixture directly. There are two main ways this is useful. opentelemetry_example.py ```python import asyncio from typing import Any import logfire from pydantic_evals import Case, Dataset from pydantic_evals.evaluators import Evaluator from pydantic_evals.evaluators.context import EvaluatorContext from pydantic_evals.otel.span_tree import SpanQuery logfire.configure( # ensure that an OpenTelemetry tracer is configured send_to_logfire='if-token-present' ) class SpanTracingEvaluator(Evaluator[str, str]): """Evaluator that analyzes the span tree generated during function execution.""" def evaluate(self, ctx: EvaluatorContext[str, str]) -> dict[str, Any]: # Get the span tree from the context span_tree = ctx.span_tree if span_tree is None: return {'has_spans': False, 'performance_score': 0.0} # Find all spans with "processing" in the name processing_spans = span_tree.find(lambda node: 'processing' in node.name) # Calculate total processing time total_processing_time = sum( (span.duration.total_seconds() for span in processing_spans), 0.0 ) # Check for error spans error_query: SpanQuery = {'name_contains': 'error'} has_errors = span_tree.any(error_query) # Calculate a performance score performance_score = 1.0 if total_processing_time < 2.0 else 0.5 return { 'has_spans': True, 'has_errors': has_errors, 'performance_score': 0 if has_errors else performance_score, } async def process_text(text: str) -> str: """Function that processes text with OpenTelemetry instrumentation.""" with logfire.span('process_text'): # Simulate initial processing with logfire.span('text_processing'): await asyncio.sleep(0.1) processed = text.strip().lower() # Simulate additional processing with logfire.span('additional_processing'): if 'error' in processed: with logfire.span('error_handling'): logfire.error(f'Error detected in text: {text}') return f'Error processing: {text}' await asyncio.sleep(0.2) processed = processed.replace(' ', '_') return f'Processed: {processed}' # Create test cases dataset = Dataset( cases=[ Case( name='normal_text', inputs='Hello World', expected_output='Processed: hello_world', ), Case( name='text_with_error', inputs='Contains error marker', expected_output='Error processing: Contains error marker', ), ], evaluators=[SpanTracingEvaluator()], ) # Run evaluation - spans are automatically captured since logfire is configured report = dataset.evaluate_sync(process_text) # Print the report report.print(include_input=True, include_output=True, include_durations=False) # (1)! 
""" Evaluation Summary: process_text ┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ ┃ Case ID ┃ Inputs ┃ Outputs ┃ Scores ┃ Assertions ┃ ┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ │ normal_text │ Hello World │ Processed: hello_world │ performance_score: 1.00 │ ✔✗ │ ├─────────────────┼───────────────────────┼─────────────────────────────────────────┼──────────────────────────┼────────────┤ │ text_with_error │ Contains error marker │ Error processing: Contains error marker │ performance_score: 0 │ ✔✔ │ ├─────────────────┼───────────────────────┼─────────────────────────────────────────┼──────────────────────────┼────────────┤ │ Averages │ │ │ performance_score: 0.500 │ 75.0% ✔ │ └─────────────────┴───────────────────────┴─────────────────────────────────────────┴──────────────────────────┴────────────┘ """ ``` 1. We have omitted duration here just to keep the printed output from changing from run to run. *(This example is complete, it can be run "as is")* ## Generating Test Datasets Pydantic Evals allows you to generate test datasets using LLMs with generate_dataset. Datasets can be generated in either JSON or YAML format, in both cases a JSON schema file is generated alongside the dataset and referenced in the dataset, so you should get type checking and auto-completion in your editor. generate_dataset_example.py ```python from __future__ import annotations from pathlib import Path from pydantic import BaseModel, Field from pydantic_evals import Dataset from pydantic_evals.generation import generate_dataset class QuestionInputs(BaseModel, use_attribute_docstrings=True): # (1)! """Model for question inputs.""" question: str """A question to answer""" context: str | None = None """Optional context for the question""" class AnswerOutput(BaseModel, use_attribute_docstrings=True): # (2)! """Model for expected answer outputs.""" answer: str """The answer to the question""" confidence: float = Field(ge=0, le=1) """Confidence level (0-1)""" class MetadataType(BaseModel, use_attribute_docstrings=True): # (3)! """Metadata model for test cases.""" difficulty: str """Difficulty level (easy, medium, hard)""" category: str """Question category""" async def main(): dataset = await generate_dataset( # (4)! dataset_type=Dataset[QuestionInputs, AnswerOutput, MetadataType], n_examples=2, extra_instructions=""" Generate question-answer pairs about world capitals and landmarks. Make sure to include both easy and challenging questions. """, ) output_file = Path('questions_cases.yaml') dataset.to_file(output_file) # (5)! print(output_file.read_text()) """ # yaml-language-server: $schema=questions_cases_schema.json name: null cases: - name: Easy Capital Question inputs: question: What is the capital of France? context: null metadata: difficulty: easy category: Geography expected_output: answer: Paris confidence: 0.95 evaluators: - EqualsExpected - name: Challenging Landmark Question inputs: question: Which world-famous landmark is located on the banks of the Seine River? context: null metadata: difficulty: hard category: Landmarks expected_output: answer: Eiffel Tower confidence: 0.9 evaluators: - EqualsExpected evaluators: [] """ ``` 1. Define the schema for the inputs to the task. 1. Define the schema for the expected outputs of the task. 1. Define the schema for the metadata of the test cases. 1. 
Call generate_dataset to create a Dataset with 2 cases conforming to the schema. 1. Save the dataset to a YAML file; this will also write `questions_cases_schema.json` with the JSON schema for `questions_cases.yaml` to make editing easier. The magic `yaml-language-server` comment is supported by at least VS Code and JetBrains/PyCharm (more details [here](https://github.com/redhat-developer/yaml-language-server#using-inlined-schema)). *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* You can also write datasets as JSON files: generate_dataset_example_json.py ```python from pathlib import Path from pydantic_evals import Dataset from pydantic_evals.generation import generate_dataset from generate_dataset_example import AnswerOutput, MetadataType, QuestionInputs async def main(): dataset = await generate_dataset( # (1)! dataset_type=Dataset[QuestionInputs, AnswerOutput, MetadataType], n_examples=2, extra_instructions=""" Generate question-answer pairs about world capitals and landmarks. Make sure to include both easy and challenging questions. """, ) output_file = Path('questions_cases.json') dataset.to_file(output_file) # (2)! print(output_file.read_text()) """ { "$schema": "questions_cases_schema.json", "name": null, "cases": [ { "name": "Easy Capital Question", "inputs": { "question": "What is the capital of France?", "context": null }, "metadata": { "difficulty": "easy", "category": "Geography" }, "expected_output": { "answer": "Paris", "confidence": 0.95 }, "evaluators": [ "EqualsExpected" ] }, { "name": "Challenging Landmark Question", "inputs": { "question": "Which world-famous landmark is located on the banks of the Seine River?", "context": null }, "metadata": { "difficulty": "hard", "category": "Landmarks" }, "expected_output": { "answer": "Eiffel Tower", "confidence": 0.9 }, "evaluators": [ "EqualsExpected" ] } ], "evaluators": [] } """ ``` 1. Generate the Dataset exactly as above. 1. Save the dataset to a JSON file; this will also write `questions_cases_schema.json` with the JSON schema for `questions_cases.json`. This time the `$schema` key is included in the JSON file to define the schema for IDEs to use while you edit the file; there's no formal spec for this, but it works in VS Code and PyCharm and is discussed at length in [json-schema-org/json-schema-spec#828](https://github.com/json-schema-org/json-schema-spec/issues/828). *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Integration with Logfire Pydantic Evals is implemented using OpenTelemetry to record traces of the evaluation process. These traces contain all the information included in the terminal output as attributes, but also include full tracing from the executions of the evaluation task function. You can send these traces to any OpenTelemetry-compatible backend, including [Pydantic Logfire](https://logfire.pydantic.dev/docs/guides/web-ui/evals/). All you need to do is configure Logfire via `logfire.configure`: logfire_integration.py ```python import logfire from judge_recipes import recipe_dataset, transform_recipe logfire.configure( send_to_logfire='if-token-present', # (1)! environment='development', # (2)! service_name='evals', # (3)! ) recipe_dataset.evaluate_sync(transform_recipe) ``` 1. The `send_to_logfire` argument controls when traces are sent to Logfire. You can set it to `'if-token-present'` to send data to Logfire only if the `LOGFIRE_TOKEN` environment variable is set.
See the [Logfire configuration docs](https://logfire.pydantic.dev/docs/reference/configuration/) for more details. 1. The `environment` argument sets the environment for the traces. It's a good idea to set this to `'development'` when running tests or evaluations and sending data to a project with production data, to make it easier to filter these traces out while reviewing data from your production environment(s). 1. The `service_name` argument sets the service name for the traces. This is displayed in the Logfire UI to help you identify the source of the associated spans. Logfire has some special integration with Pydantic Evals traces, including a table view of the evaluation results on the evaluation root span (which is generated in each call to Dataset.evaluate), and a detailed view of the inputs and outputs for the execution of each case. In addition, any OpenTelemetry spans generated during the evaluation process will be sent to Logfire, allowing you to visualize the full execution of the code called during the evaluation process. This can be especially helpful when attempting to write evaluators that make use of the `span_tree` property of the EvaluatorContext, as described in the [OpenTelemetry Integration](#opentelemetry-integration) section above. This allows you to write evaluations that depend on information about which code paths were executed during the call to the task function without needing to manually instrument the code being evaluated, as long as the code being evaluated is already adequately instrumented with OpenTelemetry. In the case of Pydantic AI agents, for example, this can be used to ensure specific tools are (or are not) called during the execution of specific cases. Using OpenTelemetry in this way also means that all data used to evaluate the task executions will be accessible in the traces produced by production runs of the code, making it straightforward to perform the same evaluations on production data. ## API Reference For comprehensive coverage of all classes, methods, and configuration options, see the detailed [API Reference documentation](https://ai.pydantic.dev/api/pydantic_evals/dataset/). ## Next Steps 1. **Start with simple evaluations** using basic evaluators like [`IsInstance`](https://ai.pydantic.dev/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.IsInstance) and [`EqualsExpected`](https://ai.pydantic.dev/api/pydantic_evals/evaluators/#pydantic_evals.evaluators.EqualsExpected) 1. **Integrate with Logfire** to visualize results and enable team collaboration 1. **Build comprehensive test suites** with diverse cases covering edge cases and performance requirements 1. **Implement custom evaluators** for domain-specific quality metrics 1. **Automate evaluation runs** as part of your development and deployment pipeline # Durable Execution # Durable Execution with DBOS [DBOS](https://www.dbos.dev/) is a lightweight [durable execution](https://docs.dbos.dev/architecture) library natively integrated with Pydantic AI. ## Durable Execution DBOS workflows make your program **durable** by checkpointing its state in a database. If your program ever fails, all your workflows will automatically resume from the last completed step when it restarts. - **Workflows** must be deterministic and generally cannot include I/O. - **Steps** may perform I/O (network, disk, API calls). If a step fails, it restarts from the beginning. Every workflow input and step output is durably stored in the system database.
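To make this split concrete before we get to agents, here's a minimal sketch of plain DBOS code, independent of Pydantic AI. It assumes the `@DBOS.workflow()` and `@DBOS.step()` decorators from the DBOS library linked above; the function names and the SQLite config are illustrative only.

dbos_basics_sketch.py
```python
from dbos import DBOS, DBOSConfig

# Illustrative config, mirroring the agent example below; Postgres is recommended in production.
config: DBOSConfig = {
    'name': 'dbos_basics',
    'system_database_url': 'sqlite:///dbos_basics.sqlite',
}
DBOS(config=config)


@DBOS.step()
def fetch_greeting(name: str) -> str:
    """A step: may perform I/O; its output is checkpointed in the system database."""
    return f'Hello, {name}!'


@DBOS.workflow()
def greeting_workflow(name: str) -> str:
    """A workflow: deterministic code that composes steps and resumes from the last completed step after a failure."""
    greeting = fetch_greeting(name)
    return greeting.upper()


if __name__ == '__main__':
    DBOS.launch()
    print(greeting_workflow('world'))
```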
When workflow execution fails, whether from crashes, network issues, or server restarts, DBOS leverages these checkpoints to recover workflows from their last completed step. DBOS **queues** provide durable, database-backed alternatives to systems like Celery or BullMQ, supporting features such as concurrency limits, rate limits, timeouts, and prioritization. See the [DBOS docs](https://docs.dbos.dev/architecture) for details. The diagram below shows the overall architecture of an agentic application in DBOS. DBOS runs fully in-process as a library. Functions remain normal Python functions but are checkpointed into a database (Postgres or SQLite). ```text Clients (HTTP, RPC, Kafka, etc.) | v +------------------------------------------------------+ | Application Servers | | | | +----------------------------------------------+ | | | Pydantic AI + DBOS Libraries | | | | | | | | [ Workflows (Agent Run Loop) ] | | | | [ Steps (Tool, MCP, Model) ] | | | | [ Queues ] [ Cron Jobs ] [ Messaging ] | | | +----------------------------------------------+ | | | +------------------------------------------------------+ | v +------------------------------------------------------+ | Database | | (Stores workflow and step state, schedules tasks) | +------------------------------------------------------+ ``` See the [DBOS documentation](https://docs.dbos.dev/architecture) for more information. ## Durable Agent Any agent can be wrapped in a DBOSAgent to get durable execution. `DBOSAgent` automatically:, - Wraps `Agent.run` and `Agent.run_sync` as DBOS workflows. - Wraps [model requests](../../models/overview/) and [MCP communication](../../mcp/client/) as DBOS steps. Custom tool functions and event stream handlers are **not automatically wrapped** by DBOS. If they involve non-deterministic behavior or perform I/O, you should explicitly decorate them with `@DBOS.step`. The original agent, model, and MCP server can still be used as normal outside the DBOS workflow. Here is a simple but complete example of wrapping an agent for durable execution. All it requires is to install Pydantic AI with the DBOS [open-source library](https://github.com/dbos-inc/dbos-transact-py): ```bash pip install pydantic-ai[dbos] ``` ```bash uv add pydantic-ai[dbos] ``` Or if you're using the slim package, you can install it with the `dbos` optional group: ```bash pip install pydantic-ai-slim[dbos] ``` ```bash uv add pydantic-ai-slim[dbos] ``` dbos_agent.py ```python from dbos import DBOS, DBOSConfig from pydantic_ai import Agent from pydantic_ai.durable_exec.dbos import DBOSAgent dbos_config: DBOSConfig = { 'name': 'pydantic_dbos_agent', 'system_database_url': 'sqlite:///dbostest.sqlite', # (3)! } DBOS(config=dbos_config) agent = Agent( 'gpt-5', instructions="You're an expert in geography.", name='geography', # (4)! ) dbos_agent = DBOSAgent(agent) # (1)! async def main(): DBOS.launch() result = await dbos_agent.run('What is the capital of Mexico?') # (2)! print(result.output) #> Mexico City (Ciudad de México, CDMX) ``` 1. Workflows and `DBOSAgent` must be defined before `DBOS.launch()` so that recovery can correctly find all workflows. 1. DBOSAgent.run() works like Agent.run(), but runs as a DBOS workflow and executes model requests, decorated tool calls, and MCP communication as DBOS steps. 1. This example uses SQLite. Postgres is recommended for production. 1. The agent's `name` is used to uniquely identify its workflows. 
*(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* Because DBOS workflows need to be defined before calling `DBOS.launch()`, and the `DBOSAgent` instance automatically registers `run` and `run_sync` as workflows, the `DBOSAgent` needs to be defined before calling `DBOS.launch()` as well. For more information on how to use DBOS in Python applications, see their [Python SDK guide](https://docs.dbos.dev/python/programming-guide). ## DBOS Integration Considerations When using DBOS with Pydantic AI agents, there are a few important considerations to ensure workflows and toolsets behave correctly. ### Agent and Toolset Requirements Each agent instance must have a unique `name` so DBOS can correctly resume workflows after a failure or restart. Tools and event stream handlers are not automatically wrapped by DBOS. You can decide how to integrate them: - Decorate with `@DBOS.step` if the function involves non-determinism or I/O. - Skip the decorator if durability isn't needed, so you avoid the extra DB checkpoint write. - If the function needs to enqueue tasks or invoke other DBOS workflows, run it inside the agent's main workflow (not as a step). Other than that, any agent and toolset will just work! ### Agent Run Context and Dependencies DBOS checkpoints workflow inputs/outputs and step outputs into a database using [`pickle`](https://docs.python.org/3/library/pickle.html). This means you need to make sure the [dependencies](../../dependencies/) object provided to DBOSAgent.run() or DBOSAgent.run_sync(), as well as tool outputs, can be serialized using pickle. You may also want to keep the inputs and outputs small (under ~2 MB). PostgreSQL and SQLite support up to 1 GB per field, but large objects may impact performance. ### Streaming Because DBOS cannot stream output directly to the workflow or step call site, Agent.run_stream() and Agent.run_stream_events() are not supported when running inside a DBOS workflow. Instead, you can implement streaming by setting an event_stream_handler on the `Agent` or `DBOSAgent` instance and using DBOSAgent.run(). The event stream handler function will receive the agent run context and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](../../agents/#streaming-all-events). ## Step Configuration You can customize DBOS step behavior, such as retries, by passing StepConfig objects to the `DBOSAgent` constructor: - `mcp_step_config`: The DBOS step config to use for MCP server communication. No retries if omitted. - `model_step_config`: The DBOS step config to use for model request steps. No retries if omitted. For custom tools, you can annotate them directly with [`@DBOS.step`](https://docs.dbos.dev/python/reference/decorators#step) or [`@DBOS.workflow`](https://docs.dbos.dev/python/reference/decorators#workflow) decorators as needed. These decorators have no effect outside DBOS workflows, so tools remain usable in non-DBOS agents. ## Step Retries On top of the automatic retries for request failures that DBOS will perform, Pydantic AI and various provider API clients also have their own request retry logic. Enabling these at the same time may cause the request to be retried more often than expected, with improper `Retry-After` handling.
When using DBOS, it's recommended to not use [HTTP Request Retries](../../retries/) and to turn off your provider API client's own retry logic, for example by setting `max_retries=0` on a [custom `OpenAIProvider` API client](../../models/openai/#custom-openai-client). You can customize DBOS's retry policy using [step configuration](#step-configuration). ## Observability with Logfire DBOS can be configured to generate OpenTelemetry spans for each workflow and step execution, and Pydantic AI emits spans for each agent run, model request, and tool invocation. You can send these spans to [Pydantic Logfire](../../logfire/) to get a full, end-to-end view of what's happening in your application. For more information about DBOS logging and tracing, please see the [DBOS docs](https://docs.dbos.dev/python/tutorials/logging-and-tracing) for details. # Durable Execution Pydantic AI allows you to build durable agents that can preserve their progress across transient API failures and application errors or restarts, and handle long-running, asynchronous, and human-in-the-loop workflows with production-grade reliability. Durable agents have full support for [streaming](../../agents/#streaming-all-events) and [MCP](../../mcp/client/), with the added benefit of fault tolerance. Pydantic AI natively supports three durable execution solutions: - [Temporal](../temporal/) - [DBOS](../dbos/) - [Prefect](../prefect/) These integrations only use Pydantic AI's public interface, so they also serve as a reference for integrating with other durable systems. # Durable Execution with Prefect [Prefect](https://www.prefect.io/) is a workflow orchestration framework for building resilient data pipelines in Python, natively integrated with Pydantic AI. ## Durable Execution Prefect 3.0 brings [transactional semantics](https://www.prefect.io/blog/transactional-ml-pipelines-with-prefect-3-0) to your Python workflows, allowing you to group tasks into atomic units and define failure modes. If any part of a transaction fails, the entire transaction can be rolled back to a clean state. - **Flows** are the top-level entry points for your workflow. They can contain tasks and other flows. - **Tasks** are individual units of work that can be retried, cached, and monitored independently. Prefect 3.0's approach to transactional orchestration makes your workflows automatically **idempotent**: rerunnable without duplication or inconsistency across any environment. Every task is executed within a transaction that governs when and where the task's result record is persisted. If the task runs again under an identical context, it will not re-execute but instead load its previous result. The diagram below shows the overall architecture of an agentic application with Prefect. Prefect uses client-side task orchestration by default, with optional server connectivity for advanced features like scheduling and monitoring. 
```text +---------------------+ | Prefect Server | (Monitoring, | or Cloud | scheduling, UI, +---------------------+ orchestration) ^ | Flow state, | Schedule flows, metadata, | track execution logs | | +------------------------------------------------------+ | Application Process | | +----------------------------------------------+ | | | Flow (Agent.run) | | | +----------------------------------------------+ | | | | | | | v v v | | +-----------+ +------------+ +-------------+ | | | Task | | Task | | Task | | | | (Tool) | | (MCP Tool) | | (Model API) | | | +-----------+ +------------+ +-------------+ | | | | | | | Cache & Cache & Cache & | | persist persist persist | | to to to | | v v v | | +----------------------------------------------+ | | | Result Storage (Local FS, S3, etc.) | | | +----------------------------------------------+ | +------------------------------------------------------+ | | | v v v [External APIs, services, databases, etc.] ``` See the [Prefect documentation](https://docs.prefect.io/) for more information. ## Durable Agent Any agent can be wrapped in a PrefectAgent to get durable execution. `PrefectAgent` automatically: - Wraps Agent.run and Agent.run_sync as Prefect flows. - Wraps [model requests](../../models/overview/) as Prefect tasks. - Wraps [tool calls](../../tools/) as Prefect tasks (configurable per-tool). - Wraps [MCP communication](../../mcp/client/) as Prefect tasks. Event stream handlers are **automatically wrapped** by Prefect when running inside a Prefect flow. Each event from the stream is processed in a separate Prefect task for durability. You can customize the task behavior using the `event_stream_handler_task_config` parameter when creating the `PrefectAgent`. Do **not** manually decorate event stream handlers with `@task`. For examples, see the [streaming docs](../../agents/#streaming-all-events) The original agent, model, and MCP server can still be used as normal outside the Prefect flow. Here is a simple but complete example of wrapping an agent for durable execution. All it requires is to install Pydantic AI with Prefect: ```bash pip install pydantic-ai[prefect] ``` ```bash uv add pydantic-ai[prefect] ``` Or if you're using the slim package, you can install it with the `prefect` optional group: ```bash pip install pydantic-ai-slim[prefect] ``` ```bash uv add pydantic-ai-slim[prefect] ``` prefect_agent.py ```python from pydantic_ai import Agent from pydantic_ai.durable_exec.prefect import PrefectAgent agent = Agent( 'gpt-4o', instructions="You're an expert in geography.", name='geography', # (1)! ) prefect_agent = PrefectAgent(agent) # (2)! async def main(): result = await prefect_agent.run('What is the capital of Mexico?') # (3)! print(result.output) #> Mexico City (Ciudad de México, CDMX) ``` 1. The agent's `name` is used to uniquely identify its flows and tasks. 1. Wrapping the agent with `PrefectAgent` enables durable execution for all agent runs. 1. PrefectAgent.run() works like Agent.run(), but runs as a Prefect flow and executes model requests, decorated tool calls, and MCP communication as Prefect tasks. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* For more information on how to use Prefect in Python applications, see their [Python documentation](https://docs.prefect.io/v3/how-to-guides/workflows/write-and-run). ## Prefect Integration Considerations When using Prefect with Pydantic AI agents, there are a few important considerations to ensure workflows behave correctly. 
### Agent Requirements Each agent instance must have a unique `name` so Prefect can correctly identify and track its flows and tasks. ### Tool Wrapping Agent tools are automatically wrapped as Prefect tasks, which means they benefit from: - **Retry logic**: Failed tool calls can be retried automatically - **Caching**: Tool results are cached based on their inputs - **Observability**: Tool execution is tracked in the Prefect UI You can customize tool task behavior using `tool_task_config` (applies to all tools) or `tool_task_config_by_name` (per-tool configuration): prefect_agent_config.py ```python from pydantic_ai import Agent from pydantic_ai.durable_exec.prefect import PrefectAgent, TaskConfig agent = Agent('gpt-4o', name='my_agent') @agent.tool_plain def fetch_data(url: str) -> str: # This tool will be wrapped as a Prefect task ... prefect_agent = PrefectAgent( agent, tool_task_config=TaskConfig(retries=3), # Default for all tools tool_task_config_by_name={ 'fetch_data': TaskConfig(timeout_seconds=10.0), # Specific to fetch_data 'simple_tool': None, # Disable task wrapping for simple_tool }, ) ``` Set a tool's config to `None` in `tool_task_config_by_name` to disable task wrapping for that specific tool. ### Streaming When running inside a Prefect flow, Agent.run_stream() works but doesn't provide real-time streaming because Prefect tasks consume their entire execution before returning results. The method will execute fully and return the complete result at once. For real-time streaming behavior inside Prefect flows, you can set an event_stream_handler on the `Agent` or `PrefectAgent` instance and use PrefectAgent.run(). **Note**: Event stream handlers behave differently when running inside a Prefect flow versus outside: - **Outside a flow**: The handler receives events as they stream from the model - **Inside a flow**: Each event is wrapped as a Prefect task for durability, which may affect timing but ensures reliability The event stream handler function will receive the agent run context and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](../../agents/#streaming-all-events). 
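To illustrate the streaming pattern described above, here's a rough sketch of attaching an event stream handler to an agent before wrapping it with `PrefectAgent`. It assumes `event_stream_handler` can be passed to the `Agent` constructor as described above; the event type is left as `Any`, and printing each event is just a placeholder for forwarding it to a queue or frontend.

prefect_streaming_sketch.py
```python
from collections.abc import AsyncIterable
from typing import Any

from pydantic_ai import Agent, RunContext
from pydantic_ai.durable_exec.prefect import PrefectAgent


async def event_stream_handler(
    ctx: RunContext[None],
    events: AsyncIterable[Any],  # events from the model stream and tool execution
) -> None:
    async for event in events:
        # Placeholder: forward the event to a message queue, websocket, etc.
        print(event)


agent = Agent(
    'gpt-4o',
    name='geography',
    instructions="You're an expert in geography.",
    event_stream_handler=event_stream_handler,
)
prefect_agent = PrefectAgent(agent)


async def main():
    result = await prefect_agent.run('What is the capital of Mexico?')  # events pass through the handler
    print(result.output)
```

As noted above, inside a Prefect flow each event is processed in its own Prefect task; outside a flow, the handler receives events directly as they stream from the model.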
## Task Configuration You can customize Prefect task behavior, such as retries and timeouts, by passing TaskConfig objects to the `PrefectAgent` constructor: - `mcp_task_config`: Configuration for MCP server communication tasks - `model_task_config`: Configuration for model request tasks - `tool_task_config`: Default configuration for all tool calls - `tool_task_config_by_name`: Per-tool task configuration (overrides `tool_task_config`) - `event_stream_handler_task_config`: Configuration for event stream handler tasks (applies when running inside a Prefect flow) Available `TaskConfig` options: - `retries`: Maximum number of retries for the task (default: `0`) - `retry_delay_seconds`: Delay between retries in seconds (can be a single value or list for exponential backoff, default: `1.0`) - `timeout_seconds`: Maximum time in seconds for the task to complete - `cache_policy`: Custom Prefect cache policy for the task - `persist_result`: Whether to persist the task result - `result_storage`: Prefect result storage for the task (e.g., `'s3-bucket/my-storage'` or a `WritableFileSystem` block) - `log_prints`: Whether to log print statements from the task (default: `False`) Example: prefect_agent_config.py ```python from pydantic_ai import Agent from pydantic_ai.durable_exec.prefect import PrefectAgent, TaskConfig agent = Agent( 'gpt-4o', instructions="You're an expert in geography.", name='geography', ) prefect_agent = PrefectAgent( agent, model_task_config=TaskConfig( retries=3, retry_delay_seconds=[1.0, 2.0, 4.0], # Exponential backoff timeout_seconds=30.0, ), ) async def main(): result = await prefect_agent.run('What is the capital of France?') print(result.output) #> Paris ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### Retry Considerations Pydantic AI and provider API clients have their own retry logic. When using Prefect, you may want to: - Disable [HTTP Request Retries](../../retries/) in Pydantic AI - Turn off your provider API client's retry logic (e.g., `max_retries=0` on a [custom OpenAI client](../../models/openai/#custom-openai-client)) - Rely on Prefect's task-level retry configuration for consistency This prevents requests from being retried multiple times at different layers. ## Caching and Idempotency Prefect 3.0 provides built-in caching and transactional semantics. Tasks with identical inputs will not re-execute if their results are already cached, making workflows naturally idempotent and resilient to failures. - **Task inputs**: Messages, settings, parameters, tool arguments, and serializable dependencies **Note**: For user dependencies to be included in cache keys, they must be serializable (e.g., Pydantic models or basic Python types). Non-serializable dependencies are automatically excluded from cache computation. ## Observability with Prefect and Logfire Prefect provides a built-in UI for monitoring flow runs, task executions, and failures. You can: - View real-time flow run status - Debug failures with full stack traces - Set up alerts and notifications To access the Prefect UI, you can either: 1. Use [Prefect Cloud](https://www.prefect.io/cloud) (managed service) 1. Run a local [Prefect server](https://docs.prefect.io/v3/how-to-guides/self-hosted/server-cli) with `prefect server start` You can also use [Pydantic Logfire](../../logfire/) for detailed observability. 
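For instance, a minimal sketch of enabling Logfire instrumentation alongside a Prefect-wrapped agent might look like this (it assumes the `logfire` package is installed and you've authenticated or set a `LOGFIRE_TOKEN`):

prefect_logfire_sketch.py
```python
import logfire

from pydantic_ai import Agent
from pydantic_ai.durable_exec.prefect import PrefectAgent

logfire.configure()  # configure Logfire; requires a write token
logfire.instrument_pydantic_ai()  # emit spans for agent runs, model requests, and tool calls

agent = Agent('gpt-4o', name='geography')
prefect_agent = PrefectAgent(agent)


async def main():
    result = await prefect_agent.run('What is the capital of France?')
    print(result.output)
```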
When using both Prefect and Logfire, you'll get complementary views: - **Prefect**: Workflow-level orchestration, task status, and retry history - **Logfire**: Fine-grained tracing of agent runs, model requests, and tool invocations When using Logfire with Prefect, you can enable distributed tracing to see spans for your Prefect runs included with your agent runs, model requests, and tool invocations. For more information about Prefect monitoring, see the [Prefect documentation](https://docs.prefect.io/). ## Deployments and Scheduling To deploy and schedule a `PrefectAgent`, wrap it in a Prefect flow and use the flow's [`serve()`](https://docs.prefect.io/v3/how-to-guides/deployments/create-deployments#create-a-deployment-with-serve) or [`deploy()`](https://docs.prefect.io/v3/how-to-guides/deployments/deploy-via-python) methods: serve_agent.py ```python from prefect import flow from pydantic_ai import Agent from pydantic_ai.durable_exec.prefect import PrefectAgent agent = Agent( 'openai:gpt-4o', name='daily_report_agent', instructions='Generate a daily summary report.', ) prefect_agent = PrefectAgent(agent) @flow async def daily_report_flow(user_prompt: str): """Generate a daily report using the agent.""" result = await prefect_agent.run(user_prompt) return result.output # Serve the flow with a daily schedule if __name__ == '__main__': daily_report_flow.serve( name='daily-report-deployment', cron='0 9 * * *', # Run daily at 9am parameters={'user_prompt': "Generate today's report"}, tags=['production', 'reports'], ) ``` The `serve()` method accepts scheduling options: - **`cron`**: Cron schedule string (e.g., `'0 9 * * *'` for daily at 9am) - **`interval`**: Schedule interval in seconds or as a timedelta - **`rrule`**: iCalendar RRule schedule string For production deployments with Docker, Kubernetes, or other infrastructure, use the flow's [`deploy()`](https://docs.prefect.io/v3/how-to-guides/deployments/deploy-via-python) method. See the [Prefect deployment documentation](https://docs.prefect.io/v3/how-to-guides/deployments/create-deploymentsy) for more information. # Durable Execution with Temporal [Temporal](https://temporal.io) is a popular [durable execution](https://docs.temporal.io/evaluate/understanding-temporal#durable-execution) platform that's natively supported by Pydantic AI. ## Durable Execution In Temporal's durable execution implementation, a program that crashes or encounters an exception while interacting with a model or API will retry until it can successfully complete. Temporal relies primarily on a replay mechanism to recover from failures. As the program makes progress, Temporal saves key inputs and decisions, allowing a re-started program to pick up right where it left off. The key to making this work is to separate the application's repeatable (deterministic) and non-repeatable (non-deterministic) parts: 1. Deterministic pieces, termed [**workflows**](https://docs.temporal.io/workflow-definition), execute the same way when re-run with the same inputs. 1. Non-deterministic pieces, termed [**activities**](https://docs.temporal.io/activities), can run arbitrary code, performing I/O and any other operations. Workflow code can run for extended periods and, if interrupted, resume exactly where it left off. Critically, workflow code generally *cannot* include any kind of I/O, over the network, disk, etc. Activity code faces no restrictions on I/O or external interactions, but if an activity fails part-way through it is restarted from the beginning. 
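Before getting to Pydantic AI agents, here's a minimal sketch of this workflow/activity split using the raw `temporalio` SDK; the activity and workflow here are illustrative only, and registering them with a worker is omitted for brevity.

temporal_basics_sketch.py
```python
from datetime import timedelta

from temporalio import activity, workflow


@activity.defn
async def fetch_forecast(city: str) -> str:
    # Activity: free to perform I/O; if it fails part-way through,
    # it is retried from the beginning.
    return f'Sunny in {city}'


@workflow.defn
class ForecastWorkflow:
    @workflow.run
    async def run(self, city: str) -> str:
        # Workflow: deterministic code that delegates I/O to activities
        # and can resume exactly where it left off.
        return await workflow.execute_activity(
            fetch_forecast,
            city,
            start_to_close_timeout=timedelta(seconds=30),
        )
```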
Note If you are familiar with celery, it may be helpful to think of Temporal activities as similar to celery tasks, but where you wait for the task to complete and obtain its result before proceeding to the next step in the workflow. However, Temporal workflows and activities offer a great deal more flexibility and functionality than celery tasks. See the [Temporal documentation](https://docs.temporal.io/evaluate/understanding-temporal#temporal-application-the-building-blocks) for more information In the case of Pydantic AI agents, integration with Temporal means that [model requests](../../models/overview/), [tool calls](../../tools/) that may require I/O, and [MCP server communication](../../mcp/client/) all need to be offloaded to Temporal activities due to their I/O requirements, while the logic that coordinates them (i.e. the agent run) lives in the workflow. Code that handles a scheduled job or web request can then execute the workflow, which will in turn execute the activities as needed. The diagram below shows the overall architecture of an agentic application in Temporal. The Temporal Server is responsible for tracking program execution and making sure the associated state is preserved reliably (i.e., stored to an internal database, and possibly replicated across cloud regions). Temporal Server manages data in encrypted form, so all data processing occurs on the Worker, which runs the workflow and activities. ```text +---------------------+ | Temporal Server | (Stores workflow state, +---------------------+ schedules activities, ^ persists progress) | Save state, | Schedule Tasks, progress, | load state on resume timeouts | | +------------------------------------------------------+ | Worker | | +----------------------------------------------+ | | | Workflow Code | | | | (Agent Run Loop) | | | +----------------------------------------------+ | | | | | | | v v v | | +-----------+ +------------+ +-------------+ | | | Activity | | Activity | | Activity | | | | (Tool) | | (MCP Tool) | | (Model API) | | | +-----------+ +------------+ +-------------+ | | | | | | +------------------------------------------------------+ | | | v v v [External APIs, services, databases, etc.] ``` See the [Temporal documentation](https://docs.temporal.io/evaluate/understanding-temporal#temporal-application-the-building-blocks) for more information. ## Durable Agent Any agent can be wrapped in a TemporalAgent to get a durable agent that can be used inside a deterministic Temporal workflow, by automatically offloading all work that requires I/O (namely model requests, tool calls, and MCP server communication) to non-deterministic activities. At the time of wrapping, the agent's [model](../../models/overview/) and [toolsets](../../toolsets/) (including function tools registered on the agent and MCP servers) are frozen, activities are dynamically created for each, and the original model and toolsets are wrapped to call on the worker to execute the corresponding activities instead of directly performing the actions inside the workflow. The original agent can still be used as normal outside the Temporal workflow, but any changes to its model or toolsets after wrapping will not be reflected in the durable agent. Here is a simple but complete example of wrapping an agent for durable execution, creating a Temporal workflow with durable execution logic, connecting to a Temporal server, and running the workflow from non-durable code. 
All it requires is a Temporal server to be [running locally](https://github.com/temporalio/temporal#download-and-start-temporal-server-locally): ```sh brew install temporal temporal server start-dev ``` temporal_agent.py ```python import uuid from temporalio import workflow from temporalio.client import Client from temporalio.worker import Worker from pydantic_ai import Agent from pydantic_ai.durable_exec.temporal import ( AgentPlugin, PydanticAIPlugin, TemporalAgent, ) agent = Agent( 'gpt-5', instructions="You're an expert in geography.", name='geography', # (10)! ) temporal_agent = TemporalAgent(agent) # (1)! @workflow.defn class GeographyWorkflow: # (2)! @workflow.run async def run(self, prompt: str) -> str: result = await temporal_agent.run(prompt) # (3)! return result.output async def main(): client = await Client.connect( # (4)! 'localhost:7233', # (5)! plugins=[PydanticAIPlugin()], # (6)! ) async with Worker( # (7)! client, task_queue='geography', workflows=[GeographyWorkflow], plugins=[AgentPlugin(temporal_agent)], # (8)! ): output = await client.execute_workflow( # (9)! GeographyWorkflow.run, args=['What is the capital of Mexico?'], id=f'geography-{uuid.uuid4()}', task_queue='geography', ) print(output) #> Mexico City (Ciudad de México, CDMX) ``` 1. The original `Agent` cannot be used inside a deterministic Temporal workflow, but the `TemporalAgent` can. 1. As explained above, the workflow represents a deterministic piece of code that can use non-deterministic activities for operations that require I/O. 1. TemporalAgent.run() works just like Agent.run(), but it will automatically offload model requests, tool calls, and MCP server communication to Temporal activities. 1. We connect to the Temporal server which keeps track of workflow and activity execution. 1. This assumes the Temporal server is [running locally](https://github.com/temporalio/temporal#download-and-start-temporal-server-locally). 1. The PydanticAIPlugin tells Temporal to use Pydantic for serialization and deserialization, and to treat UserError exceptions as non-retryable. 1. We start the worker that will listen on the specified task queue and run workflows and activities. In a real world application, this might be run in a separate service. 1. The AgentPlugin registers the `TemporalAgent`'s activities with the worker. 1. We call on the server to execute the workflow on a worker that's listening on the specified task queue. 1. The agent's `name` is used to uniquely identify its activities. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* In a real world application, the agent, workflow, and worker are typically defined separately from the code that calls for a workflow to be executed. Because Temporal workflows need to be defined at the top level of the file and the `TemporalAgent` instance is needed inside the workflow and when starting the worker (to register the activities), it needs to be defined at the top level of the file as well. For more information on how to use Temporal in Python applications, see their [Python SDK guide](https://docs.temporal.io/develop/python). ## Temporal Integration Considerations There are a few considerations specific to agents and toolsets when using Temporal for durable execution. These are important to understand to ensure that your agents and toolsets work correctly with Temporal's workflow and activity model. 
### Agent Names and Toolset IDs To ensure that Temporal knows what code to run when an activity fails or is interrupted and then restarted, even if your code is changed in between, each activity needs to have a name that's stable and unique. When `TemporalAgent` dynamically creates activities for the wrapped agent's model requests and toolsets (specifically those that implement their own tool listing and calling, i.e. FunctionToolset and MCPServer), their names are derived from the agent's name and the toolsets' ids. These fields are normally optional, but are required to be set when using Temporal. They should not be changed once the durable agent has been deployed to production as this would break active workflows. Other than that, any agent and toolset will just work! ### Instructions Functions, Output Functions, and History Processors Pydantic AI runs non-async [instructions](../../agents/#instructions) and [system prompt](../../agents/#system-prompts) functions, [history processors](../../message-history/#processing-message-history), [output functions](../../output/#output-functions), and [output validators](../../output/#output-validator-functions) in threads, which are not supported inside Temporal workflows and require an activity. Ensure that these functions are async instead. Synchronous tool functions are supported, as tools are automatically run in activities unless this is [explicitly disabled](#activity-configuration). Still, it's recommended to make tool functions async as well to improve performance. ### Agent Run Context and Dependencies As workflows and activities run in separate processes, any values passed between them need to be serializable. As these payloads are stored in the workflow execution event history, Temporal limits their size to 2MB. To account for these limitations, tool functions and the [event stream handler](#streaming) running inside activities receive a limited version of the agent's RunContext, and it's your responsibility to make sure that the [dependencies](../../dependencies/) object provided to TemporalAgent.run() can be serialized using Pydantic. Specifically, only the `deps`, `retries`, `tool_call_id`, `tool_name`, `tool_call_approved`, `retry`, `max_retries` and `run_step` fields are available by default, and trying to access `model`, `usage`, `prompt`, `messages`, or `tracer` will raise an error. If you need one or more of these attributes to be available inside activities, you can create a TemporalRunContext subclass with custom `serialize_run_context` and `deserialize_run_context` class methods and pass it to TemporalAgent as `run_context_type`. ### Streaming Because Temporal activities cannot stream output directly to the activity call site, Agent.run_stream(), Agent.run_stream_events(), and Agent.iter() are not supported. Instead, you can implement streaming by setting an event_stream_handler on the `Agent` or `TemporalAgent` instance and using TemporalAgent.run() inside the workflow. The event stream handler function will receive the agent run context and an async iterable of events from the model's streaming response and the agent's execution of tools. For examples, see the [streaming docs](../../agents/#streaming-all-events). 
As the streaming model request activity, workflow, and workflow execution call all take place in separate processes, passing data between them requires some care: - To get data from the workflow call site or workflow to the event stream handler, you can use a [dependencies object](#agent-run-context-and-dependencies). - To get data from the event stream handler to the workflow, workflow call site, or a frontend, you need to use an external system that the event stream handler can write to and the event consumer can read from, like a message queue. You can use the dependency object to make sure the same connection string or other unique ID is available in all the places that need it. ## Activity Configuration Temporal activity configuration, like timeouts and retry policies, can be customized by passing [`temporalio.workflow.ActivityConfig`](https://python.temporal.io/temporalio.workflow.ActivityConfig.html) objects to the `TemporalAgent` constructor: - `activity_config`: The base Temporal activity config to use for all activities. If no config is provided, a `start_to_close_timeout` of 60 seconds is used. - `model_activity_config`: The Temporal activity config to use for model request activities. This is merged with the base activity config. - `toolset_activity_config`: The Temporal activity config to use for get-tools and call-tool activities for specific toolsets identified by ID. This is merged with the base activity config. - `tool_activity_config`: The Temporal activity config to use for specific tool call activities identified by toolset ID and tool name. This is merged with the base and toolset-specific activity configs. If a tool does not use I/O, you can specify `False` to disable using an activity. Note that the tool is required to be defined as an `async` function as non-async tools are run in threads which are non-deterministic and thus not supported outside of activities. ## Activity Retries On top of the automatic retries for request failures that Temporal will perform, Pydantic AI and various provider API clients also have their own request retry logic. Enabling these at the same time may cause the request to be retried more often than expected, with improper `Retry-After` handling. When using Temporal, it's recommended to not use [HTTP Request Retries](../../retries/) and to turn off your provider API client's own retry logic, for example by setting `max_retries=0` on a [custom `OpenAIProvider` API client](../../models/openai/#custom-openai-client). You can customize Temporal's retry policy using [activity configuration](#activity-configuration). ## Observability with Logfire Temporal generates telemetry events and metrics for each workflow and activity execution, and Pydantic AI generates events for each agent run, model request and tool call. These can be sent to [Pydantic Logfire](../../logfire/) to get a complete picture of what's happening in your application. To use Logfire with Temporal, you need to pass a LogfirePlugin object to Temporal's `Client.connect()`: logfire_plugin.py ```py from temporalio.client import Client from pydantic_ai.durable_exec.temporal import LogfirePlugin, PydanticAIPlugin async def main(): client = await Client.connect( 'localhost:7233', plugins=[PydanticAIPlugin(), LogfirePlugin()], ) ``` By default, the `LogfirePlugin` will instrument Temporal (including metrics) and Pydantic AI and send all data to Logfire. 
To customize Logfire configuration and instrumentation, you can pass a `logfire_setup` function to the `LogfirePlugin` constructor and return a custom `Logfire` instance (i.e. the result of `logfire.configure()`). To disable sending Temporal metrics to Logfire, you can pass `metrics=False` to the `LogfirePlugin` constructor.

## Known Issues

### Pandas

When `logfire.info` is used inside an activity and the `pandas` package is among your project's dependencies, you may encounter the following error, which seems to be the result of an import race condition:

```text
AttributeError: partially initialized module 'pandas' has no attribute '_pandas_parser_CAPI' (most likely due to a circular import)
```

To fix this, you can use the [`temporalio.workflow.unsafe.imports_passed_through()`](https://python.temporal.io/temporalio.workflow.unsafe.html#imports_passed_through) context manager to proactively import the package and not have it be reloaded in the workflow sandbox:

temporal_activity.py

```python
from temporalio import workflow

with workflow.unsafe.imports_passed_through():
    import pandas
```

# MCP

# Client

Pydantic AI can act as an [MCP client](https://modelcontextprotocol.io/quickstart/client), connecting to MCP servers to use their tools.

## Install

You need to either install [`pydantic-ai`](../../install/), or [`pydantic-ai-slim`](../../install/#slim-install) with the `mcp` optional group:

```bash
pip install "pydantic-ai-slim[mcp]"
```

```bash
uv add "pydantic-ai-slim[mcp]"
```

## Usage

Pydantic AI comes with three ways to connect to MCP servers:

- MCPServerStreamableHTTP, which connects to an MCP server using the [Streamable HTTP](https://modelcontextprotocol.io/introduction#streamable-http) transport
- MCPServerSSE, which connects to an MCP server using the [HTTP SSE](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) transport
- MCPServerStdio, which runs the server as a subprocess and connects to it using the [stdio](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) transport

Examples of all three are shown below.

Each MCP server instance is a [toolset](../../toolsets/) and can be registered with an Agent using the `toolsets` argument.

You can use the `async with agent` context manager to open and close connections to all registered servers (and in the case of stdio servers, start and stop the subprocesses) around the context where they'll be used in agent runs. You can also use `async with server` to manage the connection or subprocess of a specific server, for example if you'd like to use it with multiple agents.

If you don't explicitly enter one of these context managers to set up the server, this will be done automatically when it's needed (e.g. to list the available tools or call a specific tool), but it's more efficient to do so around the entire context where you expect the servers to be used.

### Streamable HTTP Client

MCPServerStreamableHTTP connects over HTTP using the [Streamable HTTP](https://modelcontextprotocol.io/introduction#streamable-http) transport to a server.

Note

MCPServerStreamableHTTP requires an MCP server to be running and accepting HTTP connections before running the agent. Running the server is not managed by Pydantic AI.

Before creating the Streamable HTTP client, we need to run a server that supports the Streamable HTTP transport.
streamable_http_server.py ```python from mcp.server.fastmcp import FastMCP app = FastMCP() @app.tool() def add(a: int, b: int) -> int: return a + b if __name__ == '__main__': app.run(transport='streamable-http') ``` Then we can create the client: mcp_streamable_http_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStreamableHTTP server = MCPServerStreamableHTTP('http://localhost:8000/mcp') # (1)! agent = Agent('openai:gpt-4o', toolsets=[server]) # (2)! async def main(): async with agent: # (3)! result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` 1. Define the MCP server with the URL used to connect. 1. Create an agent with the MCP server attached. 1. Create a client session to connect to the server. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* **What's happening here?** - The model receives the prompt "What is 7 plus 5?" - The model decides "Oh, I've got this `add` tool, that will be a good way to answer this question" - The model returns a tool call - Pydantic AI sends the tool call to the MCP server using the Streamable HTTP transport - The model is called again with the return value of running the `add` tool (12) - The model returns the final answer You can visualise this clearly, and even see the tool call, by adding three lines of code to instrument the example with [logfire](https://logfire.pydantic.dev/docs): mcp_sse_client_logfire.py ```python import logfire logfire.configure() logfire.instrument_pydantic_ai() ``` ### SSE Client MCPServerSSE connects over HTTP using the [HTTP + Server Sent Events transport](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#http-with-sse) to a server. Note The SSE transport in MCP is deprecated, you should use Streamable HTTP instead. Before creating the SSE client, we need to run a server that supports the SSE transport. sse_server.py ```python from mcp.server.fastmcp import FastMCP app = FastMCP() @app.tool() def add(a: int, b: int) -> int: return a + b if __name__ == '__main__': app.run(transport='sse') ``` Then we can create the client: mcp_sse_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE server = MCPServerSSE('http://localhost:3001/sse') # (1)! agent = Agent('openai:gpt-4o', toolsets=[server]) # (2)! async def main(): async with agent: # (3)! result = await agent.run('What is 7 plus 5?') print(result.output) #> The answer is 12. ``` 1. Define the MCP server with the URL used to connect. 1. Create an agent with the MCP server attached. 1. Create a client session to connect to the server. *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ### MCP "stdio" Server MCP also offers [stdio transport](https://spec.modelcontextprotocol.io/specification/2024-11-05/basic/transports/#stdio) where the server is run as a subprocess and communicates with the client over `stdin` and `stdout`. In this case, you'd use the MCPServerStdio class. In this example [mcp-run-python](https://github.com/pydantic/mcp-run-python) is used as the MCP server. mcp_stdio_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio( # (1)! 
'uv', args=['run', 'mcp-run-python', 'stdio'], timeout=10 ) agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: result = await agent.run('How many days between 2000-01-01 and 2025-03-18?') print(result.output) #> There are 9,208 days between January 1, 2000, and March 18, 2025. ``` 1. See [MCP Run Python](https://github.com/pydantic/mcp-run-python) for more information. ## Loading MCP Servers from Configuration Instead of creating MCP server instances individually in code, you can load multiple servers from a JSON configuration file using load_mcp_servers(). This is particularly useful when you need to manage multiple MCP servers or want to configure servers externally without modifying code. ### Configuration Format The configuration file should be a JSON file with an `mcpServers` object containing server definitions. Each server is identified by a unique key and contains the configuration for that server type: mcp_config.json ```json { "mcpServers": { "python-runner": { "command": "uv", "args": ["run", "mcp-run-python", "stdio"] }, "weather-api": { "url": "http://localhost:3001/sse" }, "calculator": { "url": "http://localhost:8000/mcp" } } } ``` Note The MCP server is only inferred to be an SSE server because of the `/sse` suffix. Any other server with the "url" field will be inferred to be a Streamable HTTP server. We made this decision given that the SSE transport is deprecated. ### Usage mcp_config_loader.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import load_mcp_servers # Load all servers from configuration file servers = load_mcp_servers('mcp_config.json') # Create agent with all loaded servers agent = Agent('openai:gpt-5', toolsets=servers) async def main(): async with agent: result = await agent.run('What is 7 plus 5?') print(result.output) ``` *(This example is complete, it can be run "as is" — you'll need to add `asyncio.run(main())` to run `main`)* ## Tool call customization The MCP servers provide the ability to set a `process_tool_call` which allows the customization of tool call requests and their responses. A common use case for this is to inject metadata to the requests which the server call needs: mcp_process_tool_call.py ```python from typing import Any from pydantic_ai import Agent, RunContext from pydantic_ai.mcp import CallToolFunc, MCPServerStdio, ToolResult from pydantic_ai.models.test import TestModel async def process_tool_call( ctx: RunContext[int], call_tool: CallToolFunc, name: str, tool_args: dict[str, Any], ) -> ToolResult: """A tool call processor that passes along the deps.""" return await call_tool(name, tool_args, {'deps': ctx.deps}) server = MCPServerStdio('python', args=['mcp_server.py'], process_tool_call=process_tool_call) agent = Agent( model=TestModel(call_tools=['echo_deps']), deps_type=int, toolsets=[server] ) async def main(): async with agent: result = await agent.run('Echo with deps set to 42', deps=42) print(result.output) #> {"echo_deps":{"echo":"This is an echo message","deps":42}} ``` How to access the metadata is MCP server SDK specific. 
For example with the [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk), it is accessible via the [`ctx: Context`](https://github.com/modelcontextprotocol/python-sdk#context) argument that can be included on tool call handlers: mcp_server.py ```python from typing import Any from mcp.server.fastmcp import Context, FastMCP from mcp.server.session import ServerSession mcp = FastMCP('Pydantic AI MCP Server') log_level = 'unset' @mcp.tool() async def echo_deps(ctx: Context[ServerSession, None]) -> dict[str, Any]: """Echo the run context. Args: ctx: Context object containing request and session information. Returns: Dictionary with an echo message and the deps. """ await ctx.info('This is an info message') deps: Any = getattr(ctx.request_context.meta, 'deps') return {'echo': 'This is an echo message', 'deps': deps} if __name__ == '__main__': mcp.run() ``` ## Using Tool Prefixes to Avoid Naming Conflicts When connecting to multiple MCP servers that might provide tools with the same name, you can use the `tool_prefix` parameter to avoid naming conflicts. This parameter adds a prefix to all tool names from a specific server. This allows you to use multiple servers that might have overlapping tool names without conflicts: mcp_tool_prefix_http_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE # Create two servers with different prefixes weather_server = MCPServerSSE( 'http://localhost:3001/sse', tool_prefix='weather' # Tools will be prefixed with 'weather_' ) calculator_server = MCPServerSSE( 'http://localhost:3002/sse', tool_prefix='calc' # Tools will be prefixed with 'calc_' ) # Both servers might have a tool named 'get_data', but they'll be exposed as: # - 'weather_get_data' # - 'calc_get_data' agent = Agent('openai:gpt-4o', toolsets=[weather_server, calculator_server]) ``` ## Tool metadata MCP tools can include metadata that provides additional information about the tool's characteristics, which can be useful when filtering tools. The `meta`, `annotations`, and `output_schema` fields can be found on the `metadata` dict on the ToolDefinition object that's passed to filter functions. ## Custom TLS / SSL configuration In some environments you need to tweak how HTTPS connections are established – for example to trust an internal Certificate Authority, present a client certificate for **mTLS**, or (during local development only!) disable certificate verification altogether. All HTTP-based MCP client classes (MCPServerStreamableHTTP and MCPServerSSE) expose an `http_client` parameter that lets you pass your own pre-configured [`httpx.AsyncClient`](https://www.python-httpx.org/async/). mcp_custom_tls_client.py ```python import ssl import httpx from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerSSE # Trust an internal / self-signed CA ssl_ctx = ssl.create_default_context(cafile='/etc/ssl/private/my_company_ca.pem') # OPTIONAL: if the server requires **mutual TLS** load your client certificate ssl_ctx.load_cert_chain(certfile='/etc/ssl/certs/client.crt', keyfile='/etc/ssl/private/client.key',) http_client = httpx.AsyncClient( verify=ssl_ctx, timeout=httpx.Timeout(10.0), ) server = MCPServerSSE( 'http://localhost:3001/sse', http_client=http_client, # (1)! ) agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: result = await agent.run('How many days between 2000-01-01 and 2025-03-18?') print(result.output) #> There are 9,208 days between January 1, 2000, and March 18, 2025. ``` 1. 
When you supply `http_client`, Pydantic AI re-uses this client for every request. Anything supported by **httpx** (`verify`, `cert`, custom proxies, timeouts, etc.) therefore applies to all MCP traffic. ## MCP Sampling What is MCP Sampling? In MCP [sampling](https://modelcontextprotocol.io/docs/concepts/sampling) is a system by which an MCP server can make LLM calls via the MCP client - effectively proxying requests to an LLM via the client over whatever transport is being used. Sampling is extremely useful when MCP servers need to use Gen AI but you don't want to provision them each with their own LLM credentials or when a public MCP server would like the connecting client to pay for LLM calls. Confusingly it has nothing to do with the concept of "sampling" in observability, or frankly the concept of "sampling" in any other domain. Sampling Diagram Here's a mermaid diagram that may or may not make the data flow clearer: ``` sequenceDiagram participant LLM participant MCP_Client as MCP client participant MCP_Server as MCP server MCP_Client->>LLM: LLM call LLM->>MCP_Client: LLM tool call response MCP_Client->>MCP_Server: tool call MCP_Server->>MCP_Client: sampling "create message" MCP_Client->>LLM: LLM call LLM->>MCP_Client: LLM text response MCP_Client->>MCP_Server: sampling response MCP_Server->>MCP_Client: tool call response ``` Pydantic AI supports sampling as both a client and server. See the [server](../server/#mcp-sampling) documentation for details on how to use sampling within a server. Sampling is automatically supported by Pydantic AI agents when they act as a client. To be able to use sampling, an MCP server instance needs to have a sampling_model set. This can be done either directly on the server using the constructor keyword argument or the property, or by using agent.set_mcp_sampling_model() to set the agent's model or one specified as an argument as the sampling model on all MCP servers registered with that agent. Let's say we have an MCP server that wants to use sampling (in this case to generate an SVG as per the tool arguments). Sampling MCP Server generate_svg.py ````python import re from pathlib import Path from mcp import SamplingMessage from mcp.server.fastmcp import Context, FastMCP from mcp.types import TextContent app = FastMCP() @app.tool() async def image_generator(ctx: Context, subject: str, style: str) -> str: prompt = f'{subject=} {style=}' # `ctx.session.create_message` is the sampling call result = await ctx.session.create_message( [SamplingMessage(role='user', content=TextContent(type='text', text=prompt))], max_tokens=1_024, system_prompt='Generate an SVG image as per the user input', ) assert isinstance(result.content, TextContent) path = Path(f'{subject}_{style}.svg') # remove triple backticks if the svg was returned within markdown if m := re.search(r'^```\w*$(.+?)```$', result.content.text, re.S | re.M): path.write_text(m.group(1)) else: path.write_text(result.content.text) return f'See {path}' if __name__ == '__main__': # run the server via stdio app.run() ```` Using this server with an `Agent` will automatically allow sampling: sampling_mcp_client.py ```python from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio server = MCPServerStdio('python', args=['generate_svg.py']) agent = Agent('openai:gpt-4o', toolsets=[server]) async def main(): async with agent: agent.set_mcp_sampling_model() result = await agent.run('Create an image of a robot in a punk style.') print(result.output) #> Image file written to robot_punk.svg. 
```

*(This example is complete, it can be run "as is")*

You can disallow sampling by setting allow_sampling=False when creating the server reference, e.g.:

sampling_disallowed.py

```python
from pydantic_ai.mcp import MCPServerStdio

server = MCPServerStdio(
    'python',
    args=['generate_svg.py'],
    allow_sampling=False,
)
```

## Elicitation

In MCP, [elicitation](https://modelcontextprotocol.io/docs/concepts/elicitation) allows a server to request [structured input](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#supported-schema-types) from the client for missing or additional context during a session. Elicitation lets the server essentially say "Hold on - I need to know X before I can continue" rather than requiring everything upfront or taking a shot in the dark.

### How Elicitation works

Elicitation introduces a new protocol message type called [`ElicitRequest`](https://modelcontextprotocol.io/specification/2025-06-18/schema#elicitrequest), which is sent from the server to the client when it needs additional information. The client can then respond with an [`ElicitResult`](https://modelcontextprotocol.io/specification/2025-06-18/schema#elicitresult) or an `ErrorData` message.

Here's a typical interaction:

- The user makes a request to the MCP server (e.g. "Book a table at that Italian place")
- The server identifies that it needs more information (e.g. "Which Italian place?", "What date and time?")
- The server sends an `ElicitRequest` to the client asking for the missing information.
- The client receives the request and presents it to the user (e.g. via a terminal prompt, GUI dialog, or web interface).
- The user provides the requested information, or declines or cancels the request.
- The client sends an `ElicitResult` back to the server with the user's response.
- With the structured data, the server can continue processing the original request.

This allows for a more interactive and user-friendly experience, especially for multi-stage workflows. Instead of requiring all information upfront, the server can ask for it as needed, making the interaction feel more natural.

### Setting up Elicitation

To enable elicitation, provide an elicitation_callback function when creating your MCP server instance:

restaurant_server.py

```python
from mcp.server.fastmcp import Context, FastMCP
from pydantic import BaseModel, Field

mcp = FastMCP(name='Restaurant Booking')


class BookingDetails(BaseModel):
    """Schema for restaurant booking information."""

    restaurant: str = Field(description='Choose a restaurant')
    party_size: int = Field(description='Number of people', ge=1, le=8)
    date: str = Field(description='Reservation date (DD-MM-YYYY)')


@mcp.tool()
async def book_table(ctx: Context) -> str:
    """Book a restaurant table with user input."""
    # Ask user for booking details using Pydantic schema
    result = await ctx.elicit(message='Please provide your booking details:', schema=BookingDetails)

    if result.action == 'accept' and result.data:
        booking = result.data
        return f'✅ Booked table for {booking.party_size} at {booking.restaurant} on {booking.date}'
    elif result.action == 'decline':
        return 'No problem! Maybe another time.'
    else:  # cancel
        return 'Booking cancelled.'


if __name__ == '__main__':
    mcp.run(transport='stdio')
```

This server demonstrates elicitation by requesting structured booking details from the client when the `book_table` tool is called.
Here's how to create a client that handles these elicitation requests: client_example.py ```python import asyncio from typing import Any from mcp.client.session import ClientSession from mcp.shared.context import RequestContext from mcp.types import ElicitRequestParams, ElicitResult from pydantic_ai import Agent from pydantic_ai.mcp import MCPServerStdio async def handle_elicitation( context: RequestContext[ClientSession, Any, Any], params: ElicitRequestParams, ) -> ElicitResult: """Handle elicitation requests from MCP server.""" print(f'\n{params.message}') if not params.requestedSchema: response = input('Response: ') return ElicitResult(action='accept', content={'response': response}) # Collect data for each field properties = params.requestedSchema['properties'] data = {} for field, info in properties.items(): description = info.get('description', field) value = input(f'{description}: ') # Convert to proper type based on JSON schema if info.get('type') == 'integer': data[field] = int(value) else: data[field] = value # Confirm confirm = input('\nConfirm booking? (y/n/c): ').lower() if confirm == 'y': print('Booking details:', data) return ElicitResult(action='accept', content=data) elif confirm == 'n': return ElicitResult(action='decline') else: return ElicitResult(action='cancel') # Set up MCP server connection restaurant_server = MCPServerStdio( 'python', args=['restaurant_server.py'], elicitation_callback=handle_elicitation ) # Create agent agent = Agent('openai:gpt-4o', toolsets=[restaurant_server]) async def main(): """Run the agent to book a restaurant table.""" async with agent: result = await agent.run('Book me a table') print(f'\nResult: {result.output}') if __name__ == '__main__': asyncio.run(main()) ``` ### Supported Schema Types MCP elicitation supports string, number, boolean, and enum types with flat object structures only. These limitations ensure reliable cross-client compatibility. See [supported schema types](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#supported-schema-types) for details. ### Security MCP Elicitation requires careful handling - servers must not request sensitive information, and clients must implement user approval controls with clear explanations. See [security considerations](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation#security-considerations) for details. # Model Context Protocol (MCP) Pydantic AI supports [Model Context Protocol (MCP)](https://modelcontextprotocol.io) in two ways: 1. [Agents](../../agents/) act as an MCP Client, connecting to MCP servers to use their tools, [learn more …](../client/) 1. Agents can be used within MCP servers, [learn more …](../server/) ## What is MCP? The Model Context Protocol is a standardized protocol that allow AI applications (including programmatic agents like Pydantic AI, coding agents like [cursor](https://www.cursor.com/), and desktop applications like [Claude Desktop](https://claude.ai/download)) to connect to external tools and services using a common interface. As with other protocols, the dream of MCP is that a wide range of applications can speak to each other without the need for specific integrations. There is a great list of MCP servers at [github.com/modelcontextprotocol/servers](https://github.com/modelcontextprotocol/servers). 
Some examples of what this means: - Pydantic AI could use a web search service implemented as an MCP server to implement a deep research agent - Cursor could connect to the [Pydantic Logfire](https://github.com/pydantic/logfire-mcp) MCP server to search logs, traces and metrics to gain context while fixing a bug - Pydantic AI, or any other MCP client could connect to our [Run Python](https://github.com/pydantic/mcp-run-python) MCP server to run arbitrary Python code in a sandboxed environment # Server Pydantic AI models can also be used within MCP Servers. ## MCP Server Here's a simple example of a [Python MCP server](https://github.com/modelcontextprotocol/python-sdk) using Pydantic AI within a tool call: mcp_server.py ```py from mcp.server.fastmcp import FastMCP from pydantic_ai import Agent server = FastMCP('Pydantic AI Server') server_agent = Agent( 'anthropic:claude-3-5-haiku-latest', system_prompt='always reply in rhyme' ) @server.tool() async def poet(theme: str) -> str: """Poem generator""" r = await server_agent.run(f'write a poem about {theme}') return r.output if __name__ == '__main__': server.run() ``` ## Simple client This server can be queried with any MCP client. Here is an example using the Python SDK directly: mcp_client.py ```py import asyncio import os from mcp import ClientSession, StdioServerParameters from mcp.client.stdio import stdio_client async def client(): server_params = StdioServerParameters( command='python', args=['mcp_server.py'], env=os.environ ) async with stdio_client(server_params) as (read, write): async with ClientSession(read, write) as session: await session.initialize() result = await session.call_tool('poet', {'theme': 'socks'}) print(result.content[0].text) """ Oh, socks, those garments soft and sweet, That nestle softly 'round our feet, From cotton, wool, or blended thread, They keep our toes from feeling dread. """ if __name__ == '__main__': asyncio.run(client()) ``` ## MCP Sampling What is MCP Sampling? See the [MCP client docs](../client/#mcp-sampling) for details of what MCP sampling is, and how you can support it when using Pydantic AI as an MCP client. When Pydantic AI agents are used within MCP servers, they can use sampling via MCPSamplingModel. We can extend the above example to use sampling so instead of connecting directly to the LLM, the agent calls back through the MCP client to make LLM calls. mcp_server_sampling.py ```py from mcp.server.fastmcp import Context, FastMCP from pydantic_ai import Agent from pydantic_ai.models.mcp_sampling import MCPSamplingModel server = FastMCP('Pydantic AI Server with sampling') server_agent = Agent(system_prompt='always reply in rhyme') @server.tool() async def poet(ctx: Context, theme: str) -> str: """Poem generator""" r = await server_agent.run(f'write a poem about {theme}', model=MCPSamplingModel(session=ctx.session)) return r.output if __name__ == '__main__': server.run() # run the server over stdio ``` The [above](#simple-client) client does not support sampling, so if you tried to use it with this server you'd get an error. 
The simplest way to support sampling in an MCP client is to [use](../client/#mcp-sampling) a Pydantic AI agent as the client, but if you wanted to support sampling with the vanilla MCP SDK, you could do so like this: mcp_client_sampling.py ```py import asyncio from typing import Any from mcp import ClientSession, StdioServerParameters from mcp.client.stdio import stdio_client from mcp.shared.context import RequestContext from mcp.types import ( CreateMessageRequestParams, CreateMessageResult, ErrorData, TextContent, ) async def sampling_callback( context: RequestContext[ClientSession, Any], params: CreateMessageRequestParams ) -> CreateMessageResult | ErrorData: print('sampling system prompt:', params.systemPrompt) #> sampling system prompt: always reply in rhyme print('sampling messages:', params.messages) """ sampling messages: [ SamplingMessage( role='user', content=TextContent( type='text', text='write a poem about socks', annotations=None, meta=None, ), ) ] """ # TODO get the response content by calling an LLM... response_content = 'Socks for a fox.' return CreateMessageResult( role='assistant', content=TextContent(type='text', text=response_content), model='fictional-llm', ) async def client(): server_params = StdioServerParameters(command='python', args=['mcp_server_sampling.py']) async with stdio_client(server_params) as (read, write): async with ClientSession(read, write, sampling_callback=sampling_callback) as session: await session.initialize() result = await session.call_tool('poet', {'theme': 'socks'}) print(result.content[0].text) #> Socks for a fox. if __name__ == '__main__': asyncio.run(client()) ``` *(This example is complete, it can be run "as is")* # Optional # Command Line Interface (CLI) **Pydantic AI** comes with a CLI, `clai` (pronounced "clay") which you can use to interact with various LLMs from the command line. It provides a convenient way to chat with language models and quickly get answers right in the terminal. We originally developed this CLI for our own use, but found ourselves using it so frequently that we decided to share it as part of the Pydantic AI package. We plan to continue adding new features, such as interaction with MCP servers, access to tools, and more. ## Usage You'll need to set an environment variable depending on the provider you intend to use. E.g. if you're using OpenAI, set the `OPENAI_API_KEY` environment variable: ```bash export OPENAI_API_KEY='your-api-key-here' ``` Then with [`uvx`](https://docs.astral.sh/uv/guides/tools/), run: ```bash uvx clai ``` Or to install `clai` globally [with `uv`](https://docs.astral.sh/uv/guides/tools/#installing-tools), run: ```bash uv tool install clai ... clai ``` Or with `pip`, run: ```bash pip install clai ... clai ``` Either way, running `clai` will start an interactive session where you can chat with the AI model. 
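If you just want a one-off answer rather than an interactive session, you can also pass the prompt directly as an argument (the `--agent` example further below does the same), for example:

```bash
uvx clai "What is the capital of France?"
```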
Special commands available in interactive mode: - `/exit`: Exit the session - `/markdown`: Show the last response in markdown format - `/multiline`: Toggle multiline input mode (use Ctrl+D to submit) - `/cp`: Copy the last response to clipboard ### Help To get help on the CLI, use the `--help` flag: ```bash uvx clai --help ``` ### Choose a model You can specify which model to use with the `--model` flag: ```bash uvx clai --model anthropic:claude-sonnet-4-0 ``` (a full list of models available can be printed with `uvx clai --list-models`) ### Custom Agents You can specify a custom agent using the `--agent` flag with a module path and variable name: custom_agent.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4.1', instructions='You always respond in Italian.') ``` Then run: ```bash uvx clai --agent custom_agent:agent "What's the weather today?" ``` The format must be `module:variable` where: - `module` is the importable Python module path - `variable` is the name of the Agent instance in that module Additionally, you can directly launch CLI mode from an `Agent` instance using `Agent.to_cli_sync()`: agent_to_cli_sync.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4.1', instructions='You always respond in Italian.') agent.to_cli_sync() ``` You can also use the async interface with `Agent.to_cli()`: agent_to_cli.py ```python from pydantic_ai import Agent agent = Agent('openai:gpt-4.1', instructions='You always respond in Italian.') async def main(): await agent.to_cli() ``` *(You'll need to add `asyncio.run(main())` to run `main`)* ### Message History Both `Agent.to_cli()` and `Agent.to_cli_sync()` support a `message_history` parameter, allowing you to continue an existing conversation or provide conversation context: agent_with_history.py ```python from pydantic_ai import ( Agent, ModelMessage, ModelRequest, ModelResponse, TextPart, UserPromptPart, ) agent = Agent('openai:gpt-4.1') # Create some conversation history message_history: list[ModelMessage] = [ ModelRequest([UserPromptPart(content='What is 2+2?')]), ModelResponse([TextPart(content='2+2 equals 4.')]) ] # Start CLI with existing conversation context agent.to_cli_sync(message_history=message_history) ``` The CLI will start with the provided conversation history, allowing the agent to refer back to previous exchanges and maintain context throughout the session. We'd love you to contribute to Pydantic AI! ## Installation and Setup Clone your fork and cd into the repo directory ```bash git clone git@github.com:/pydantic-ai.git cd pydantic-ai ``` Install `uv` (version 0.4.30 or later), `pre-commit` and `deno`: - [`uv` install docs](https://docs.astral.sh/uv/getting-started/installation/) - [`pre-commit` install docs](https://pre-commit.com/#install) - [`deno` install docs](https://docs.deno.com/runtime/getting_started/installation/) To install `pre-commit` you can run the following command: ```bash uv tool install pre-commit ``` For `deno`, you can run the following, or check [their documentation](https://docs.deno.com/runtime/getting_started/installation/) for alternative installation methods: ```bash curl -fsSL https://deno.land/install.sh | sh ``` Install `pydantic-ai`, all dependencies and pre-commit hooks ```bash make install ``` ## Running Tests etc. We use `make` to manage most commands you'll need to run. 
For details on available commands, run:

```bash
make help
```

To run code formatting, linting, static type checks, and tests with coverage report generation, run:

```bash
make
```

## Documentation Changes

To run the documentation page locally, run:

```bash
uv run mkdocs serve
```

## Rules for adding new models to Pydantic AI

To avoid an excessive workload for the maintainers of Pydantic AI, we can't accept all model contributions, so we're setting the following rules for when we'll accept new models and when we won't. This should hopefully reduce the chances of disappointment and wasted work.

- To add a new model with an extra dependency, that dependency needs > 500k monthly downloads from PyPI consistently over 3 months or more
- To add a new model which uses another model's logic internally and has no extra dependencies, that model's GitHub org needs > 20k stars in total
- For any other model that's just a custom URL and API key, we're happy to add a one-paragraph description with a link and instructions on the URL to use
- For any other model that requires more logic, we recommend you release your own Python package `pydantic-ai-xxx`, which depends on [`pydantic-ai-slim`](../install/#slim-install) and implements a model that inherits from our Model ABC

If you're unsure about adding a model, please [create an issue](https://github.com/pydantic/pydantic-ai/issues).

# Pydantic Logfire

# Debugging and Monitoring

Applications that use LLMs have some challenges that are well known and understood: LLMs are **slow**, **unreliable** and **expensive**. These applications also have some challenges that most developers have encountered much less often: LLMs are **fickle** and **non-deterministic**. Subtle changes in a prompt can completely change a model's performance, and there's no `EXPLAIN` query you can run to understand why.

Warning

From a software engineer's point of view, you can think of LLMs as the worst database you've ever heard of, but worse. If LLMs weren't so bloody useful, we'd never touch them.

To build successful applications with LLMs, we need new tools to understand both model performance and the behavior of applications that rely on them.

LLM Observability tools that just let you understand how your model is performing are useless: making API calls to an LLM is easy, it's building that into an application that's hard.

## Pydantic Logfire

[Pydantic Logfire](https://pydantic.dev/logfire) is an observability platform developed by the team who created and maintain Pydantic Validation and Pydantic AI. Logfire aims to let you understand your entire application: Gen AI, classic predictive AI, HTTP traffic, database queries and everything else a modern application needs, all using OpenTelemetry.

Pydantic Logfire is a commercial product

Logfire is a commercially supported, hosted platform with an extremely generous and perpetual [free tier](https://pydantic.dev/pricing/). You can sign up and start using Logfire in a couple of minutes. Logfire can also be self-hosted on the enterprise tier.

Pydantic AI has built-in (but optional) support for Logfire. That means if the `logfire` package is installed and configured and agent instrumentation is enabled, detailed information about agent runs is sent to Logfire. Otherwise there's virtually no overhead and nothing is sent.

Here's an example showing details of running the [Weather Agent](../examples/weather-agent/) in Logfire:

A trace is generated for the agent run, and spans are emitted for each model request and tool call.
## Using Logfire To use Logfire, you'll need a Logfire [account](https://logfire.pydantic.dev). The Logfire Python SDK is included with `pydantic-ai`: ```bash pip install pydantic-ai ``` ```bash uv add pydantic-ai ``` Or if you're using the slim package, you can install it with the `logfire` optional group: ```bash pip install "pydantic-ai-slim[logfire]" ``` ```bash uv add "pydantic-ai-slim[logfire]" ``` Then authenticate your local environment with Logfire: ```bash logfire auth ``` ```bash uv run logfire auth ``` And configure a project to send data to: ```bash logfire projects new ``` ```bash uv run logfire projects new ``` (Or use an existing project with `logfire projects use`) This will write to a `.logfire` directory in the current working directory, which the Logfire SDK will use for configuration at run time. With that, you can start using Logfire to instrument Pydantic AI code: instrument_pydantic_ai.py ```python import logfire from pydantic_ai import Agent logfire.configure() # (1)! logfire.instrument_pydantic_ai() # (2)! agent = Agent('openai:gpt-4o', instructions='Be concise, reply with one sentence.') result = agent.run_sync('Where does "hello world" come from?') # (3)! print(result.output) """ The first known use of "hello, world" was in a 1974 textbook about the C programming language. """ ``` 1. logfire.configure() configures the SDK, by default it will find the write token from the `.logfire` directory, but you can also pass a token directly. 1. logfire.instrument_pydantic_ai() enables instrumentation of Pydantic AI. 1. Since we've enabled instrumentation, a trace will be generated for each run, with spans emitted for models calls and tool function execution *(This example is complete, it can be run "as is")* Which will display in Logfire thus: The [Logfire documentation](https://logfire.pydantic.dev/docs/) has more details on how to use Logfire, including how to instrument other libraries like [HTTPX](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) and [FastAPI](https://logfire.pydantic.dev/docs/integrations/web-frameworks/fastapi/). Since Logfire is built on [OpenTelemetry](https://opentelemetry.io/), you can use the Logfire Python SDK to send data to any OpenTelemetry collector, see [below](#using-opentelemetry). ### Debugging To demonstrate how Logfire can let you visualise the flow of a Pydantic AI run, here's the view you get from Logfire while running the [chat app examples](../examples/chat-app/): ### Monitoring Performance We can also query data with SQL in Logfire to monitor the performance of an application. Here's a real world example of using Logfire to monitor Pydantic AI runs inside Logfire itself: ### Monitoring HTTP Requests "F\*\*k you, show me the prompt." As per Hamel Husain's influential 2024 blog post ["Fuck You, Show Me The Prompt."](https://hamel.dev/blog/posts/prompt/) (bear with the capitalization, the point is valid), it's often useful to be able to view the raw HTTP requests and responses made to model providers. To observe raw HTTP requests made to model providers, you can use Logfire's [HTTPX instrumentation](https://logfire.pydantic.dev/docs/integrations/http-clients/httpx/) since all provider SDKs use the [HTTPX](https://www.python-httpx.org/) library internally. with_logfire_instrument_httpx.py ```py import logfire from pydantic_ai import Agent logfire.configure() logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) # (1)! 
agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` 1. See the logfire.instrument_httpx docs more details, `capture_all=True` means both headers and body are captured for both the request and response. without_logfire_instrument_httpx.py ```py import logfire from pydantic_ai import Agent logfire.configure() logfire.instrument_pydantic_ai() agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` ## Using OpenTelemetry Pydantic AI's instrumentation uses [OpenTelemetry](https://opentelemetry.io/) (OTel), which Logfire is based on. This means you can debug and monitor Pydantic AI with any OpenTelemetry backend. Pydantic AI follows the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/), so while we think you'll have the best experience using the Logfire platform , you should be able to use any OTel service with GenAI support. ### Logfire with an alternative OTel backend You can use the Logfire SDK completely freely and send the data to any OpenTelemetry backend. Here's an example of configuring the Logfire library to send data to the excellent [otel-tui](https://github.com/ymtdzzz/otel-tui) — an open source terminal based OTel backend and viewer (no association with Pydantic Validation). Run `otel-tui` with docker (see [the otel-tui readme](https://github.com/ymtdzzz/otel-tui) for more instructions): Terminal ```text docker run --rm -it -p 4318:4318 --name otel-tui ymtdzzz/otel-tui:latest ``` then run, otel_tui.py ```python import os import logfire from pydantic_ai import Agent os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318' # (1)! logfire.configure(send_to_logfire=False) # (2)! logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> Paris ``` 1. Set the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable to the URL of your OpenTelemetry backend. If you're using a backend that requires authentication, you may need to set [other environment variables](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/). Of course, these can also be set outside the process, e.g. with `export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318`. 1. We configure Logfire to disable sending data to the Logfire OTel backend itself. If you removed `send_to_logfire=False`, data would be sent to both Logfire and your OpenTelemetry backend. Running the above code will send tracing data to `otel-tui`, which will display like this: Running the [weather agent](../examples/weather-agent/) example connected to `otel-tui` shows how it can be used to visualise a more complex trace: For more information on using the Logfire SDK to send data to alternative backends, see [the Logfire documentation](https://logfire.pydantic.dev/docs/how-to-guides/alternative-backends/). ### OTel without Logfire You can also emit OpenTelemetry data from Pydantic AI without using Logfire at all. To do this, you'll need to install and configure the OpenTelemetry packages you need. 
To run the following examples, use:

Terminal

```text
uv run \
  --with 'pydantic-ai-slim[openai]' \
  --with opentelemetry-sdk \
  --with opentelemetry-exporter-otlp \
  raw_otel.py
```

raw_otel.py

```python
import os

from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.trace import set_tracer_provider

from pydantic_ai import Agent

os.environ['OTEL_EXPORTER_OTLP_ENDPOINT'] = 'http://localhost:4318'
exporter = OTLPSpanExporter()
span_processor = BatchSpanProcessor(exporter)
tracer_provider = TracerProvider()
tracer_provider.add_span_processor(span_processor)
set_tracer_provider(tracer_provider)

Agent.instrument_all()
agent = Agent('openai:gpt-4o')
result = agent.run_sync('What is the capital of France?')
print(result.output)
#> Paris
```

### Alternative Observability backends

Because Pydantic AI uses OpenTelemetry for observability, you can easily configure it to send data to any OpenTelemetry-compatible backend, not just our observability platform [Pydantic Logfire](#pydantic-logfire).

The following providers have dedicated documentation on Pydantic AI:

- [Langfuse](https://langfuse.com/docs/integrations/pydantic-ai)
- [W&B Weave](https://weave-docs.wandb.ai/guides/integrations/pydantic_ai/)
- [Arize](https://arize.com/docs/ax/observe/tracing-integrations-auto/pydantic-ai)
- [Openlayer](https://www.openlayer.com/docs/integrations/pydantic-ai)
- [OpenLIT](https://docs.openlit.io/latest/integrations/pydantic)
- [LangWatch](https://docs.langwatch.ai/integration/python/integrations/pydantic-ai)
- [Patronus AI](https://docs.patronus.ai/docs/percival/pydantic)
- [Opik](https://www.comet.com/docs/opik/tracing/integrations/pydantic-ai)
- [mlflow](https://mlflow.org/docs/latest/genai/tracing/integrations/listing/pydantic_ai)
- [Agenta](https://docs.agenta.ai/observability/integrations/pydanticai)
- [Confident AI](https://documentation.confident-ai.com/docs/llm-tracing/integrations/pydanticai)

## Advanced usage

### Configuring data format

Pydantic AI follows the [OpenTelemetry Semantic Conventions for Generative AI systems](https://opentelemetry.io/docs/specs/semconv/gen-ai/). Specifically, it follows version 1.37.0 of the conventions by default, with a few exceptions.

Certain span and attribute names are not spec compliant by default for compatibility reasons, but can be made compliant by passing InstrumentationSettings(version=3) (the default is currently `version=2`). This will change the following:

- The span name `agent run` becomes `invoke_agent {gen_ai.agent.name}` (with the agent name filled in)
- The span name `running tool` becomes `execute_tool {gen_ai.tool.name}` (with the tool name filled in)
- The attribute name `tool_arguments` becomes `gen_ai.tool.call.arguments`
- The attribute name `tool_response` becomes `gen_ai.tool.call.result`

To use [OpenTelemetry semantic conventions version 1.36.0](https://github.com/open-telemetry/semantic-conventions/blob/v1.36.0/docs/gen-ai/README.md) or older, pass InstrumentationSettings(version=1).

Moreover, those semantic conventions specify that messages should be captured as individual events (logs) that are children of the request span, whereas by default, Pydantic AI instead collects these events into a JSON array which is set as a single large attribute called `events` on the request span.
To change this, use `event_mode='logs'`: instrumentation_settings_event_mode.py ```python import logfire from pydantic_ai import Agent logfire.configure() logfire.instrument_pydantic_ai(version=1, event_mode='logs') agent = Agent('openai:gpt-4o') result = agent.run_sync('What is the capital of France?') print(result.output) #> The capital of France is Paris. ``` This won't look as good in the Logfire UI, and will also be removed from Pydantic AI in a future release, but may be useful for backwards compatibility. Note that the OpenTelemetry Semantic Conventions are still experimental and are likely to change. ### Setting OpenTelemetry SDK providers By default, the global `TracerProvider` and `EventLoggerProvider` are used. These are set automatically by `logfire.configure()`. They can also be set by the `set_tracer_provider` and `set_event_logger_provider` functions in the OpenTelemetry Python SDK. You can set custom providers with InstrumentationSettings. instrumentation_settings_providers.py ```python from opentelemetry.sdk._events import EventLoggerProvider from opentelemetry.sdk.trace import TracerProvider from pydantic_ai import Agent, InstrumentationSettings instrumentation_settings = InstrumentationSettings( tracer_provider=TracerProvider(), event_logger_provider=EventLoggerProvider(), ) agent = Agent('openai:gpt-4o', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` ### Instrumenting a specific `Model` instrumented_model_example.py ```python from pydantic_ai import Agent from pydantic_ai.models.instrumented import InstrumentationSettings, InstrumentedModel settings = InstrumentationSettings() model = InstrumentedModel('openai:gpt-4o', settings) agent = Agent(model) ``` ### Excluding binary content excluding_binary_content.py ```python from pydantic_ai import Agent, InstrumentationSettings instrumentation_settings = InstrumentationSettings(include_binary_content=False) agent = Agent('openai:gpt-4o', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` ### Excluding prompts and completions For privacy and security reasons, you may want to monitor your agent's behavior and performance without exposing sensitive user data or proprietary prompts in your observability platform. Pydantic AI allows you to exclude the actual content from instrumentation events while preserving the structural information needed for debugging and monitoring. When `include_content=False` is set, Pydantic AI will exclude sensitive content from OpenTelemetry events, including user prompts and model completions, tool call arguments and responses, and any other message content. excluding_sensitive_content.py ```python from pydantic_ai import Agent from pydantic_ai.models.instrumented import InstrumentationSettings instrumentation_settings = InstrumentationSettings(include_content=False) agent = Agent('openai:gpt-4o', instrument=instrumentation_settings) # or to instrument all agents: Agent.instrument_all(instrumentation_settings) ``` This setting is particularly useful in production environments where compliance requirements or data sensitivity concerns make it necessary to limit what content is sent to your observability platform. # Unit testing Writing unit tests for Pydantic AI code is just like unit tests for any other Python code. Because for the most part they're nothing new, we have pretty well established tools and patterns for writing and running these kinds of tests. 
Unless you're really sure you know better, you'll probably want to follow roughly this strategy: - Use [`pytest`](https://docs.pytest.org/en/stable/) as your test harness - If you find yourself typing out long assertions, use [inline-snapshot](https://15r10nk.github.io/inline-snapshot/latest/) - Similarly, [dirty-equals](https://dirty-equals.helpmanual.io/latest/) can be useful for comparing large data structures - Use TestModel or FunctionModel in place of your actual model to avoid the usage, latency and variability of real LLM calls - Use Agent.override to replace an agent's model, dependencies, or toolsets inside your application logic - Set ALLOW_MODEL_REQUESTS=False globally to block any requests from being made to non-test models accidentally ### Unit testing with `TestModel` The simplest and fastest way to exercise most of your application code is using TestModel, this will (by default) call all tools in the agent, then return either plain text or a structured response depending on the return type of the agent. `TestModel` is not magic The "clever" (but not too clever) part of `TestModel` is that it will attempt to generate valid structured data for [function tools](../tools/) and [output types](../output/#structured-output) based on the schema of the registered tools. There's no ML or AI in `TestModel`, it's just plain old procedural Python code that tries to generate data that satisfies the JSON schema of a tool. The resulting data won't look pretty or relevant, but it should pass Pydantic's validation in most cases. If you want something more sophisticated, use FunctionModel and write your own data generation logic. Let's write unit tests for the following application code: weather_app.py ```python import asyncio from datetime import date from pydantic_ai import Agent, RunContext from fake_database import DatabaseConn # (1)! from weather_service import WeatherService # (2)! weather_agent = Agent( 'openai:gpt-4o', deps_type=WeatherService, system_prompt='Providing a weather forecast at the locations the user provides.', ) @weather_agent.tool def weather_forecast( ctx: RunContext[WeatherService], location: str, forecast_date: date ) -> str: if forecast_date < date.today(): # (3)! return ctx.deps.get_historic_weather(location, forecast_date) else: return ctx.deps.get_forecast(location, forecast_date) async def run_weather_forecast( # (4)! user_prompts: list[tuple[str, int]], conn: DatabaseConn ): """Run weather forecast for a list of user prompts and save.""" async with WeatherService() as weather_service: async def run_forecast(prompt: str, user_id: int): result = await weather_agent.run(prompt, deps=weather_service) await conn.store_forecast(user_id, result.output) # run all prompts in parallel await asyncio.gather( *(run_forecast(prompt, user_id) for (prompt, user_id) in user_prompts) ) ``` 1. `DatabaseConn` is a class that holds a database connection 1. `WeatherService` has methods to get weather forecasts and historic data about the weather 1. We need to call a different endpoint depending on whether the date is in the past or the future, you'll see why this nuance is important below 1. This function is the code we want to test, together with the agent it uses Here we have a function that takes a list of `(user_prompt, user_id)` tuples, gets a weather forecast for each prompt, and stores the result in the database. 
**We want to test this code without having to mock certain objects or modify our code so we can pass test objects in.** Here's how we would write tests using TestModel: test_weather_app.py ```python from datetime import timezone import pytest from dirty_equals import IsNow, IsStr from pydantic_ai import models, capture_run_messages, RequestUsage from pydantic_ai.models.test import TestModel from pydantic_ai import ( ModelResponse, SystemPromptPart, TextPart, ToolCallPart, ToolReturnPart, UserPromptPart, ModelRequest, ) from fake_database import DatabaseConn from weather_app import run_weather_forecast, weather_agent pytestmark = pytest.mark.anyio # (1)! models.ALLOW_MODEL_REQUESTS = False # (2)! async def test_forecast(): conn = DatabaseConn() user_id = 1 with capture_run_messages() as messages: with weather_agent.override(model=TestModel()): # (3)! prompt = 'What will the weather be like in London on 2024-11-28?' await run_weather_forecast([(prompt, user_id)], conn) # (4)! forecast = await conn.get_forecast(user_id) assert forecast == '{"weather_forecast":"Sunny with a chance of rain"}' # (5)! assert messages == [ # (6)! ModelRequest( parts=[ SystemPromptPart( content='Providing a weather forecast at the locations the user provides.', timestamp=IsNow(tz=timezone.utc), ), UserPromptPart( content='What will the weather be like in London on 2024-11-28?', timestamp=IsNow(tz=timezone.utc), # (7)! ), ] ), ModelResponse( parts=[ ToolCallPart( tool_name='weather_forecast', args={ 'location': 'a', 'forecast_date': '2024-01-01', # (8)! }, tool_call_id=IsStr(), ) ], usage=RequestUsage( input_tokens=71, output_tokens=7, ), model_name='test', timestamp=IsNow(tz=timezone.utc), ), ModelRequest( parts=[ ToolReturnPart( tool_name='weather_forecast', content='Sunny with a chance of rain', tool_call_id=IsStr(), timestamp=IsNow(tz=timezone.utc), ), ], ), ModelResponse( parts=[ TextPart( content='{"weather_forecast":"Sunny with a chance of rain"}', ) ], usage=RequestUsage( input_tokens=77, output_tokens=16, ), model_name='test', timestamp=IsNow(tz=timezone.utc), ), ] ``` 1. We're using [anyio](https://anyio.readthedocs.io/en/stable/) to run async tests. 1. This is a safety measure to make sure we don't accidentally make real requests to the LLM while testing, see ALLOW_MODEL_REQUESTS for more details. 1. We're using Agent.override to replace the agent's model with TestModel, the nice thing about `override` is that we can replace the model inside agent without needing access to the agent `run*` methods call site. 1. Now we call the function we want to test inside the `override` context manager. 1. But default, `TestModel` will return a JSON string summarising the tools calls made, and what was returned. If you wanted to customise the response to something more closely aligned with the domain, you could add custom_output_text='Sunny' when defining `TestModel`. 1. So far we don't actually know which tools were called and with which values, we can use capture_run_messages to inspect messages from the most recent run and assert the exchange between the agent and the model occurred as expected. 1. The IsNow helper allows us to use declarative asserts even with data which will contain timestamps that change over time. 1. `TestModel` isn't doing anything clever to extract values from the prompt, so these values are hardcoded. 
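To illustrate the `custom_output_text` customisation mentioned in the notes above, here's a minimal sketch, assumed to live in the same test module so the `anyio` marker and `ALLOW_MODEL_REQUESTS` guard still apply; the tools are still called, but the final response is our fixed text rather than the default JSON summary:

```python
from pydantic_ai.models.test import TestModel

from fake_database import DatabaseConn
from weather_app import run_weather_forecast, weather_agent


async def test_forecast_custom_output():
    conn = DatabaseConn()
    with weather_agent.override(model=TestModel(custom_output_text='Sunny')):
        prompt = 'What will the weather be like in London on 2024-11-28?'
        await run_weather_forecast([(prompt, 1)], conn)

    # The canned text, not a JSON summary of the tool calls, is what gets stored.
    assert await conn.get_forecast(1) == 'Sunny'
```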
### Unit testing with `FunctionModel` The above tests are a great start, but careful readers will notice that `WeatherService.get_forecast` is never called, since `TestModel` calls `weather_forecast` with a date in the past. To fully exercise `weather_forecast`, we need to use FunctionModel to customise how the tool is called. Here's an example of using `FunctionModel` to test the `weather_forecast` tool with custom inputs: test_weather_app2.py ```python import re import pytest from pydantic_ai import models from pydantic_ai import ( ModelMessage, ModelResponse, TextPart, ToolCallPart, ) from pydantic_ai.models.function import AgentInfo, FunctionModel from fake_database import DatabaseConn from weather_app import run_weather_forecast, weather_agent pytestmark = pytest.mark.anyio models.ALLOW_MODEL_REQUESTS = False def call_weather_forecast( # (1)! messages: list[ModelMessage], info: AgentInfo ) -> ModelResponse: if len(messages) == 1: # first call, call the weather forecast tool user_prompt = messages[0].parts[-1] m = re.search(r'\d{4}-\d{2}-\d{2}', user_prompt.content) assert m is not None args = {'location': 'London', 'forecast_date': m.group()} # (2)! return ModelResponse(parts=[ToolCallPart('weather_forecast', args)]) else: # second call, return the forecast msg = messages[-1].parts[0] assert msg.part_kind == 'tool-return' return ModelResponse(parts=[TextPart(f'The forecast is: {msg.content}')]) async def test_forecast_future(): conn = DatabaseConn() user_id = 1 with weather_agent.override(model=FunctionModel(call_weather_forecast)): # (3)! prompt = 'What will the weather be like in London on 2032-01-01?' await run_weather_forecast([(prompt, user_id)], conn) forecast = await conn.get_forecast(user_id) assert forecast == 'The forecast is: Rainy with a chance of sun' ``` 1. We define a function `call_weather_forecast` that will be called by `FunctionModel` in place of the LLM. This function has access to the list of ModelMessages that make up the run, and to AgentInfo, which contains information about the agent and its function tools and output tools. 1. Our function is slightly intelligent in that it tries to extract a date from the prompt, but it just hard-codes the location. 1. We use FunctionModel to replace the agent's model with our custom function. ### Overriding model via pytest fixtures If you're writing lots of tests that all require the model to be overridden, you can use [pytest fixtures](https://docs.pytest.org/en/6.2.x/fixture.html) to override the model with TestModel or FunctionModel in a reusable way. Here's an example of a fixture that overrides the model with `TestModel`: test_agent.py ```python import pytest from pydantic_ai.models.test import TestModel from weather_app import weather_agent @pytest.fixture def override_weather_agent(): with weather_agent.override(model=TestModel()): yield async def test_forecast(override_weather_agent: None): ... # test code here ``` # Examples # Agent User Interaction (AG-UI) Example of using Pydantic AI agents with the [AG-UI Dojo](https://github.com/ag-ui-protocol/ag-ui/tree/main/typescript-sdk/apps/dojo) example app. See the [AG-UI docs](../../ag-ui/) for more information about the AG-UI integration. Demonstrates: - [AG-UI](../../ag-ui/) - [Tools](../../tools/) ## Prerequisites - An [OpenAI API key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), you will need two command line windows.
### Pydantic AI AG-UI backend Set up your OpenAI API key: ```bash export OPENAI_API_KEY= ``` Start the Pydantic AI AG-UI example backend: ```bash python -m pydantic_ai_examples.ag_ui ``` ```bash uv run -m pydantic_ai_examples.ag_ui ``` ### AG-UI Dojo example frontend Next, run the AG-UI Dojo example frontend. 1. Clone the [AG-UI repository](https://github.com/ag-ui-protocol/ag-ui) ```shell git clone https://github.com/ag-ui-protocol/ag-ui.git ``` 1. Change into the `ag-ui/typescript-sdk` directory ```shell cd ag-ui/typescript-sdk ``` 1. Run the Dojo app following the [official instructions](https://github.com/ag-ui-protocol/ag-ui/tree/main/typescript-sdk/apps/dojo#development-setup) 1. Visit 1. Select the `Pydantic AI` view from the sidebar ## Feature Examples ### Agentic Chat This demonstrates a basic agent interaction including Pydantic AI server-side tools and AG-UI client-side tools. If you've [run the example](#running-the-example), you can view it at . #### Agent Tools - `time` - Pydantic AI tool to check the current time for a time zone - `background` - AG-UI tool to set the background color of the client window #### Agent Prompts ```text What is the time in New York? ``` ```text Change the background to blue ``` A complex example which mixes both AG-UI and Pydantic AI tools: ```text Perform the following steps, waiting for the response of each step before continuing: 1. Get the time 2. Set the background to red 3. Get the time 4. Report how long the background set took by diffing the two times ``` #### Agentic Chat - Code [ag_ui/api/agentic_chat.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/agentic_chat.py) ```py """Agentic Chat feature.""" from __future__ import annotations from datetime import datetime from zoneinfo import ZoneInfo from pydantic_ai import Agent agent = Agent('openai:gpt-4o-mini') app = agent.to_ag_ui() @agent.tool_plain async def current_time(timezone: str = 'UTC') -> str: """Get the current time in ISO format. Args: timezone: The timezone to use. Returns: The current time in ISO format string. """ tz: ZoneInfo = ZoneInfo(timezone) return datetime.now(tz=tz).isoformat() ``` ### Agentic Generative UI Demonstrates a long-running task where the agent sends updates to the frontend to let the user know what's happening. If you've [run the example](#running-the-example), you can view it at .
#### Plan Prompts ```text Create a plan for breakfast and execute it ``` #### Agentic Generative UI - Code [ag_ui/api/agentic_generative_ui.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/agentic_generative_ui.py) ```py """Agentic Generative UI feature.""" from __future__ import annotations from textwrap import dedent from typing import Any, Literal from pydantic import BaseModel, Field from ag_ui.core import EventType, StateDeltaEvent, StateSnapshotEvent from pydantic_ai import Agent StepStatus = Literal['pending', 'completed'] class Step(BaseModel): """Represents a step in a plan.""" description: str = Field(description='The description of the step') status: StepStatus = Field( default='pending', description='The status of the step (e.g., pending, completed)', ) class Plan(BaseModel): """Represents a plan with multiple steps.""" steps: list[Step] = Field(default_factory=list, description='The steps in the plan') class JSONPatchOp(BaseModel): """A class representing a JSON Patch operation (RFC 6902).""" op: Literal['add', 'remove', 'replace', 'move', 'copy', 'test'] = Field( description='The operation to perform: add, remove, replace, move, copy, or test', ) path: str = Field(description='JSON Pointer (RFC 6901) to the target location') value: Any = Field( default=None, description='The value to apply (for add, replace operations)', ) from_: str | None = Field( default=None, alias='from', description='Source path (for move, copy operations)', ) agent = Agent( 'openai:gpt-4o-mini', instructions=dedent( """ When planning use tools only, without any other messages. IMPORTANT: - Use the `create_plan` tool to set the initial state of the steps - Use the `update_plan_step` tool to update the status of each step - Do NOT repeat the plan or summarise it in a message - Do NOT confirm the creation or updates in a message - Do NOT ask the user for additional information or next steps Only one plan can be active at a time, so do not call the `create_plan` tool again until all the steps in current plan are completed. """ ), ) @agent.tool_plain async def create_plan(steps: list[str]) -> StateSnapshotEvent: """Create a plan with multiple steps. Args: steps: List of step descriptions to create the plan. Returns: StateSnapshotEvent containing the initial state of the steps. """ plan: Plan = Plan( steps=[Step(description=step) for step in steps], ) return StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot=plan.model_dump(), ) @agent.tool_plain async def update_plan_step( index: int, description: str | None = None, status: StepStatus | None = None ) -> StateDeltaEvent: """Update the plan with new steps or changes. Args: index: The index of the step to update. description: The new description for the step. status: The new status for the step. Returns: StateDeltaEvent containing the changes made to the plan. """ changes: list[JSONPatchOp] = [] if description is not None: changes.append( JSONPatchOp( op='replace', path=f'/steps/{index}/description', value=description ) ) if status is not None: changes.append( JSONPatchOp(op='replace', path=f'/steps/{index}/status', value=status) ) return StateDeltaEvent( type=EventType.STATE_DELTA, delta=changes, ) app = agent.to_ag_ui() ``` ### Human in the Loop Demonstrates simple human in the loop workflow where the agent comes up with a plan and the user can approve it using checkboxes. 
#### Task Planning Tools - `generate_task_steps` - AG-UI tool to generate and confirm steps #### Task Planning Prompt ```text Generate a list of steps for cleaning a car for me to review ``` #### Human in the Loop - Code [ag_ui/api/human_in_the_loop.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/human_in_the_loop.py) ```py """Human in the Loop Feature. No special handling is required for this feature. """ from __future__ import annotations from textwrap import dedent from pydantic_ai import Agent agent = Agent( 'openai:gpt-4o-mini', instructions=dedent( """ When planning tasks use tools only, without any other messages. IMPORTANT: - Use the `generate_task_steps` tool to display the suggested steps to the user - Never repeat the plan, or send a message detailing steps - If accepted, confirm the creation of the plan and the number of selected (enabled) steps only - If not accepted, ask the user for more information, DO NOT use the `generate_task_steps` tool again """ ), ) app = agent.to_ag_ui() ``` ### Predictive State Updates Demonstrates how to use the predictive state updates feature to update the state of the UI based on agent responses, including user interaction via user confirmation. If you've [run the example](#running-the-example), you can view it at . #### Story Tools - `write_document` - AG-UI tool to write the document to a window - `document_predict_state` - Pydantic AI tool that enables document state prediction for the `write_document` tool This also shows how to use custom instructions based on shared state information. #### Story Example Starting document text ```markdown Bruce was a good dog, ``` Agent prompt ```text Help me complete my story about bruce the dog, is should be no longer than a sentence. ``` #### Predictive State Updates - Code [ag_ui/api/predictive_state_updates.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/predictive_state_updates.py) ```py """Predictive State feature.""" from __future__ import annotations from textwrap import dedent from pydantic import BaseModel from ag_ui.core import CustomEvent, EventType from pydantic_ai import Agent, RunContext from pydantic_ai.ag_ui import StateDeps class DocumentState(BaseModel): """State for the document being written.""" document: str = '' agent = Agent('openai:gpt-4o-mini', deps_type=StateDeps[DocumentState]) # Tools which return AG-UI events will be sent to the client as part of the # event stream, single events and iterables of events are supported. @agent.tool_plain async def document_predict_state() -> list[CustomEvent]: """Enable document state prediction. Returns: CustomEvent containing the event to enable state prediction. """ return [ CustomEvent( type=EventType.CUSTOM, name='PredictState', value=[ { 'state_key': 'document', 'tool': 'write_document', 'tool_argument': 'document', }, ], ), ] @agent.instructions() async def story_instructions(ctx: RunContext[StateDeps[DocumentState]]) -> str: """Provide instructions for writing document if present. Args: ctx: The run context containing document state information. Returns: Instructions string for the document writing agent. """ return dedent( f"""You are a helpful assistant for writing documents. Before you start writing, you MUST call the `document_predict_state` tool to enable state prediction. To present the document to the user for review, you MUST use the `write_document` tool. When you have written the document, DO NOT repeat it as a message. 
If accepted briefly summarize the changes you made, 2 sentences max, otherwise ask the user to clarify what they want to change. This is the current document: {ctx.deps.state.document} """ ) app = agent.to_ag_ui(deps=StateDeps(DocumentState())) ``` ### Shared State Demonstrates how to use shared state between the UI and the agent. State sent to the agent is detected by a function-based instruction. This then validates the data using a custom Pydantic model before using it to create the instructions for the agent to follow and send results to the client using an AG-UI tool. If you've [run the example](#running-the-example), you can view it at . #### Recipe Tools - `display_recipe` - AG-UI tool to display the recipe in a graphical format #### Recipe Example 1. Customise the basic settings of your recipe 1. Click `Improve with AI` #### Shared State - Code [ag_ui/api/shared_state.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/shared_state.py) ```py """Shared State feature.""" from __future__ import annotations from enum import StrEnum from textwrap import dedent from pydantic import BaseModel, Field from ag_ui.core import EventType, StateSnapshotEvent from pydantic_ai import Agent, RunContext from pydantic_ai.ag_ui import StateDeps class SkillLevel(StrEnum): """The level of skill required for the recipe.""" BEGINNER = 'Beginner' INTERMEDIATE = 'Intermediate' ADVANCED = 'Advanced' class SpecialPreferences(StrEnum): """Special preferences for the recipe.""" HIGH_PROTEIN = 'High Protein' LOW_CARB = 'Low Carb' SPICY = 'Spicy' BUDGET_FRIENDLY = 'Budget-Friendly' ONE_POT_MEAL = 'One-Pot Meal' VEGETARIAN = 'Vegetarian' VEGAN = 'Vegan' class CookingTime(StrEnum): """The cooking time of the recipe.""" FIVE_MIN = '5 min' FIFTEEN_MIN = '15 min' THIRTY_MIN = '30 min' FORTY_FIVE_MIN = '45 min' SIXTY_PLUS_MIN = '60+ min' class Ingredient(BaseModel): """A class representing an ingredient in a recipe.""" icon: str = Field( default='ingredient', description="The icon emoji (not emoji code like '\x1f35e', but the actual emoji like 🥕) of the ingredient", ) name: str amount: str class Recipe(BaseModel): """A class representing a recipe.""" skill_level: SkillLevel = Field( default=SkillLevel.BEGINNER, description='The skill level required for the recipe', ) special_preferences: list[SpecialPreferences] = Field( default_factory=list, description='Any special preferences for the recipe', ) cooking_time: CookingTime = Field( default=CookingTime.FIVE_MIN, description='The cooking time of the recipe' ) ingredients: list[Ingredient] = Field( default_factory=list, description='Ingredients for the recipe', ) instructions: list[str] = Field( default_factory=list, description='Instructions for the recipe' ) class RecipeSnapshot(BaseModel): """A class representing the state of the recipe.""" recipe: Recipe = Field( default_factory=Recipe, description='The current state of the recipe' ) agent = Agent('openai:gpt-4o-mini', deps_type=StateDeps[RecipeSnapshot]) @agent.tool_plain async def display_recipe(recipe: Recipe) -> StateSnapshotEvent: """Display the recipe to the user. Args: recipe: The recipe to display. Returns: StateSnapshotEvent containing the recipe snapshot. """ return StateSnapshotEvent( type=EventType.STATE_SNAPSHOT, snapshot={'recipe': recipe}, ) @agent.instructions async def recipe_instructions(ctx: RunContext[StateDeps[RecipeSnapshot]]) -> str: """Instructions for the recipe generation agent. Args: ctx: The run context containing recipe state information.
Returns: Instructions string for the recipe generation agent. """ return dedent( f""" You are a helpful assistant for creating recipes. IMPORTANT: - Create a complete recipe using the existing ingredients - Append new ingredients to the existing ones - Use the `display_recipe` tool to present the recipe to the user - Do NOT repeat the recipe in the message, use the tool instead - Do NOT run the `display_recipe` tool multiple times in a row Once you have created the updated recipe and displayed it to the user, summarise the changes in one sentence, don't describe the recipe in detail or send it as a message to the user. The current state of the recipe is: {ctx.deps.state.recipe.model_dump_json(indent=2)} """, ) app = agent.to_ag_ui(deps=StateDeps(RecipeSnapshot())) ``` ### Tool Based Generative UI Demonstrates customised rendering for tool output with user confirmation. If you've [run the example](#running-the-example), you can view it at . #### Haiku Tools - `generate_haiku` - AG-UI tool to display a haiku in English and Japanese #### Haiku Prompt ```text Generate a haiku about formula 1 ``` #### Tool Based Generative UI - Code [ag_ui/api/tool_based_generative_ui.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/ag_ui/api/tool_based_generative_ui.py) ```py """Tool Based Generative UI feature. No special handling is required for this feature. """ from __future__ import annotations from pydantic_ai import Agent agent = Agent('openai:gpt-4o-mini') app = agent.to_ag_ui() ``` # Bank Support Small but complete example of using Pydantic AI to build a support agent for a bank. Demonstrates: - [dynamic system prompt](../../agents/#system-prompts) - [structured `output_type`](../../output/#structured-output) - [tools](../../tools/) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.bank_support ``` ```bash uv run -m pydantic_ai_examples.bank_support ``` (or `PYDANTIC_AI_MODEL=gemini-1.5-flash ...`) ## Example Code [bank_support.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/bank_support.py) ```py """Small but complete example of using Pydantic AI to build a support agent for a bank. Run with: uv run -m pydantic_ai_examples.bank_support """ from dataclasses import dataclass from pydantic import BaseModel from pydantic_ai import Agent, RunContext class DatabaseConn: """This is a fake database for example purposes. In reality, you'd be connecting to an external database (e.g. PostgreSQL) to get information about customers. """ @classmethod async def customer_name(cls, *, id: int) -> str | None: if id == 123: return 'John' @classmethod async def customer_balance(cls, *, id: int, include_pending: bool) -> float: if id == 123: if include_pending: return 123.45 else: return 100.00 else: raise ValueError('Customer not found') @dataclass class SupportDependencies: customer_id: int db: DatabaseConn class SupportOutput(BaseModel): support_advice: str """Advice returned to the customer""" block_card: bool """Whether to block their card or not""" risk: int """Risk level of query""" support_agent = Agent( 'openai:gpt-4o', deps_type=SupportDependencies, output_type=SupportOutput, instructions=( 'You are a support agent in our bank, give the ' 'customer support and judge the risk level of their query. ' "Reply using the customer's name."
), ) @support_agent.instructions async def add_customer_name(ctx: RunContext[SupportDependencies]) -> str: customer_name = await ctx.deps.db.customer_name(id=ctx.deps.customer_id) return f"The customer's name is {customer_name!r}" @support_agent.tool async def customer_balance( ctx: RunContext[SupportDependencies], include_pending: bool ) -> str: """Returns the customer's current account balance.""" balance = await ctx.deps.db.customer_balance( id=ctx.deps.customer_id, include_pending=include_pending, ) return f'${balance:.2f}' if __name__ == '__main__': deps = SupportDependencies(customer_id=123, db=DatabaseConn()) result = support_agent.run_sync('What is my balance?', deps=deps) print(result.output) """ support_advice='Hello John, your current account balance, including pending transactions, is $123.45.' block_card=False risk=1 """ result = support_agent.run_sync('I just lost my card!', deps=deps) print(result.output) """ support_advice="I'm sorry to hear that, John. We are temporarily blocking your card to prevent unauthorized transactions." block_card=True risk=8 """ ``` # Chat App with FastAPI Simple chat app example build with FastAPI. Demonstrates: - [reusing chat history](../../message-history/) - [serializing messages](../../message-history/#accessing-messages-from-results) - [streaming responses](../../output/#streamed-results) This demonstrates storing chat history between requests and using it to give the model context for new responses. Most of the complex logic here is between `chat_app.py` which streams the response to the browser, and `chat_app.ts` which renders messages in the browser. ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.chat_app ``` ```bash uv run -m pydantic_ai_examples.chat_app ``` Then open the app at [localhost:8000](http://localhost:8000). ## Example Code Python code that runs the chat app: [chat_app.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/chat_app.py) ```py """Simple chat app example build with FastAPI. 
Run with: uv run -m pydantic_ai_examples.chat_app """ from __future__ import annotations as _annotations import asyncio import json import sqlite3 from collections.abc import AsyncIterator, Callable from concurrent.futures.thread import ThreadPoolExecutor from contextlib import asynccontextmanager from dataclasses import dataclass from datetime import datetime, timezone from functools import partial from pathlib import Path from typing import Annotated, Any, Literal, TypeVar import fastapi import logfire from fastapi import Depends, Request from fastapi.responses import FileResponse, Response, StreamingResponse from typing_extensions import LiteralString, ParamSpec, TypedDict from pydantic_ai import ( Agent, ModelMessage, ModelMessagesTypeAdapter, ModelRequest, ModelResponse, TextPart, UnexpectedModelBehavior, UserPromptPart, ) # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() agent = Agent('openai:gpt-4o') THIS_DIR = Path(__file__).parent @asynccontextmanager async def lifespan(_app: fastapi.FastAPI): async with Database.connect() as db: yield {'db': db} app = fastapi.FastAPI(lifespan=lifespan) logfire.instrument_fastapi(app) @app.get('/') async def index() -> FileResponse: return FileResponse((THIS_DIR / 'chat_app.html'), media_type='text/html') @app.get('/chat_app.ts') async def main_ts() -> FileResponse: """Get the raw typescript code, it's compiled in the browser, forgive me.""" return FileResponse((THIS_DIR / 'chat_app.ts'), media_type='text/plain') async def get_db(request: Request) -> Database: return request.state.db @app.get('/chat/') async def get_chat(database: Database = Depends(get_db)) -> Response: msgs = await database.get_messages() return Response( b'\n'.join(json.dumps(to_chat_message(m)).encode('utf-8') for m in msgs), media_type='text/plain', ) class ChatMessage(TypedDict): """Format of messages sent to the browser.""" role: Literal['user', 'model'] timestamp: str content: str def to_chat_message(m: ModelMessage) -> ChatMessage: first_part = m.parts[0] if isinstance(m, ModelRequest): if isinstance(first_part, UserPromptPart): assert isinstance(first_part.content, str) return { 'role': 'user', 'timestamp': first_part.timestamp.isoformat(), 'content': first_part.content, } elif isinstance(m, ModelResponse): if isinstance(first_part, TextPart): return { 'role': 'model', 'timestamp': m.timestamp.isoformat(), 'content': first_part.content, } raise UnexpectedModelBehavior(f'Unexpected message type for chat app: {m}') @app.post('/chat/') async def post_chat( prompt: Annotated[str, fastapi.Form()], database: Database = Depends(get_db) ) -> StreamingResponse: async def stream_messages(): """Streams new line delimited JSON `Message`s to the client.""" # stream the user prompt so that can be displayed straight away yield ( json.dumps( { 'role': 'user', 'timestamp': datetime.now(tz=timezone.utc).isoformat(), 'content': prompt, } ).encode('utf-8') + b'\n' ) # get the chat history so far to pass as context to the agent messages = await database.get_messages() # run the agent with the user prompt and the chat history async with agent.run_stream(prompt, message_history=messages) as result: async for text in result.stream_output(debounce_by=0.01): # text here is a `str` and the frontend wants # JSON encoded ModelResponse, so we create one m = ModelResponse(parts=[TextPart(text)], timestamp=result.timestamp()) yield 
json.dumps(to_chat_message(m)).encode('utf-8') + b'\n' # add new messages (e.g. the user prompt and the agent response in this case) to the database await database.add_messages(result.new_messages_json()) return StreamingResponse(stream_messages(), media_type='text/plain') P = ParamSpec('P') R = TypeVar('R') @dataclass class Database: """Rudimentary database to store chat messages in SQLite. The SQLite standard library package is synchronous, so we use a thread pool executor to run queries asynchronously. """ con: sqlite3.Connection _loop: asyncio.AbstractEventLoop _executor: ThreadPoolExecutor @classmethod @asynccontextmanager async def connect( cls, file: Path = THIS_DIR / '.chat_app_messages.sqlite' ) -> AsyncIterator[Database]: with logfire.span('connect to DB'): loop = asyncio.get_event_loop() executor = ThreadPoolExecutor(max_workers=1) con = await loop.run_in_executor(executor, cls._connect, file) slf = cls(con, loop, executor) try: yield slf finally: await slf._asyncify(con.close) @staticmethod def _connect(file: Path) -> sqlite3.Connection: con = sqlite3.connect(str(file)) con = logfire.instrument_sqlite3(con) cur = con.cursor() cur.execute( 'CREATE TABLE IF NOT EXISTS messages (id INT PRIMARY KEY, message_list TEXT);' ) con.commit() return con async def add_messages(self, messages: bytes): await self._asyncify( self._execute, 'INSERT INTO messages (message_list) VALUES (?);', messages, commit=True, ) await self._asyncify(self.con.commit) async def get_messages(self) -> list[ModelMessage]: c = await self._asyncify( self._execute, 'SELECT message_list FROM messages order by id' ) rows = await self._asyncify(c.fetchall) messages: list[ModelMessage] = [] for row in rows: messages.extend(ModelMessagesTypeAdapter.validate_json(row[0])) return messages def _execute( self, sql: LiteralString, *args: Any, commit: bool = False ) -> sqlite3.Cursor: cur = self.con.cursor() cur.execute(sql, args) if commit: self.con.commit() return cur async def _asyncify( self, func: Callable[P, R], *args: P.args, **kwargs: P.kwargs ) -> R: return await self._loop.run_in_executor( # type: ignore self._executor, partial(func, **kwargs), *args, # type: ignore ) if __name__ == '__main__': import uvicorn uvicorn.run( 'pydantic_ai_examples.chat_app:app', reload=True, reload_dirs=[str(THIS_DIR)] ) ``` Simple HTML page to render the app: [chat_app.html](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/chat_app.html) ```html Chat App

<!-- abridged: the full chat_app.html is at the link above. It is a small page titled "Chat App" with a #conversation element the messages are rendered into, a prompt form with the placeholder "Ask me anything...", a #spinner shown while the response streams, and an #error banner reading "Error occurred, check the browser developer console for more information." -->
``` TypeScript to handle rendering the messages, to keep this simple (and at the risk of offending frontend developers) the typescript code is passed to the browser as plain text and transpiled in the browser. [chat_app.ts](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/chat_app.ts) ```ts // BIG FAT WARNING: to avoid the complexity of npm, this typescript is compiled in the browser // there's currently no static type checking import { marked } from 'https://cdnjs.cloudflare.com/ajax/libs/marked/15.0.0/lib/marked.esm.js' const convElement = document.getElementById('conversation') const promptInput = document.getElementById('prompt-input') as HTMLInputElement const spinner = document.getElementById('spinner') // stream the response and render messages as each chunk is received // data is sent as newline-delimited JSON async function onFetchResponse(response: Response): Promise { let text = '' let decoder = new TextDecoder() if (response.ok) { const reader = response.body.getReader() while (true) { const {done, value} = await reader.read() if (done) { break } text += decoder.decode(value) addMessages(text) spinner.classList.remove('active') } addMessages(text) promptInput.disabled = false promptInput.focus() } else { const text = await response.text() console.error(`Unexpected response: ${response.status}`, {response, text}) throw new Error(`Unexpected response: ${response.status}`) } } // The format of messages, this matches pydantic-ai both for brevity and understanding // in production, you might not want to keep this format all the way to the frontend interface Message { role: string content: string timestamp: string } // take raw response text and render messages into the `#conversation` element // Message timestamp is assumed to be a unique identifier of a message, and is used to deduplicate // hence you can send data about the same message multiple times, and it will be updated // instead of creating a new message elements function addMessages(responseText: string) { const lines = responseText.split('\n') const messages: Message[] = lines.filter(line => line.length > 1).map(j => JSON.parse(j)) for (const message of messages) { // we use the timestamp as a crude element id const {timestamp, role, content} = message const id = `msg-${timestamp}` let msgDiv = document.getElementById(id) if (!msgDiv) { msgDiv = document.createElement('div') msgDiv.id = id msgDiv.title = `${role} at ${timestamp}` msgDiv.classList.add('border-top', 'pt-2', role) convElement.appendChild(msgDiv) } msgDiv.innerHTML = marked.parse(content) } window.scrollTo({ top: document.body.scrollHeight, behavior: 'smooth' }) } function onError(error: any) { console.error(error) document.getElementById('error').classList.remove('d-none') document.getElementById('spinner').classList.remove('active') } async function onSubmit(e: SubmitEvent): Promise { e.preventDefault() spinner.classList.add('active') const body = new FormData(e.target as HTMLFormElement) promptInput.value = '' promptInput.disabled = true const response = await fetch('/chat/', {method: 'POST', body}) await onFetchResponse(response) } // call onSubmit when the form is submitted (e.g. 
user clicks the send button or hits Enter) document.querySelector('form').addEventListener('submit', (e) => onSubmit(e).catch(onError)) // load messages on page load fetch('/chat/').then(onFetchResponse).catch(onError) ``` # Data Analyst Sometimes in an agent workflow, the agent does not need to know the exact tool output, but still needs to process the tool output in some ways. This is especially common in data analytics: the agent needs to know that the result of a query tool is a `DataFrame` with certain named columns, but not necessarily the content of every single row. With Pydantic AI, you can use a [dependencies object](../../dependencies/) to store the result from one tool and use it in another tool. In this example, we'll build an agent that analyzes the [Rotten Tomatoes movie review dataset from Cornell](https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes). Demonstrates: - [agent dependencies](../../dependencies/) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.data_analyst ``` ```bash uv run -m pydantic_ai_examples.data_analyst ``` Output (debug): > Based on my analysis of the Cornell Movie Review dataset (rotten_tomatoes), there are **4,265 negative comments** in the training split. These are the reviews labeled as 'neg' (represented by 0 in the dataset). ## Example Code [data_analyst.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/data_analyst.py) ```py from dataclasses import dataclass, field import datasets import duckdb import pandas as pd from pydantic_ai import Agent, ModelRetry, RunContext @dataclass class AnalystAgentDeps: output: dict[str, pd.DataFrame] = field(default_factory=dict) def store(self, value: pd.DataFrame) -> str: """Store the output in deps and return the reference such as Out[1] to be used by the LLM.""" ref = f'Out[{len(self.output) + 1}]' self.output[ref] = value return ref def get(self, ref: str) -> pd.DataFrame: if ref not in self.output: raise ModelRetry( f'Error: {ref} is not a valid variable reference. Check the previous messages and try again.' ) return self.output[ref] analyst_agent = Agent( 'openai:gpt-4o', deps_type=AnalystAgentDeps, instructions='You are a data analyst and your job is to analyze the data according to the user request.', ) @analyst_agent.tool def load_dataset( ctx: RunContext[AnalystAgentDeps], path: str, split: str = 'train', ) -> str: """Load the `split` of dataset `dataset_name` from huggingface. Args: ctx: Pydantic AI agent RunContext path: name of the dataset in the form of `/` split: load the split of the dataset (default: "train") """ # begin load data from hf builder = datasets.load_dataset_builder(path) # pyright: ignore[reportUnknownMemberType] splits: dict[str, datasets.SplitInfo] = builder.info.splits or {} # pyright: ignore[reportUnknownMemberType] if split not in splits: raise ModelRetry( f'{split} is not valid for dataset {path}. 
Valid splits are {",".join(splits.keys())}' ) builder.download_and_prepare() # pyright: ignore[reportUnknownMemberType] dataset = builder.as_dataset(split=split) assert isinstance(dataset, datasets.Dataset) dataframe = dataset.to_pandas() assert isinstance(dataframe, pd.DataFrame) # end load data from hf # store the dataframe in the deps and get a ref like "Out[1]" ref = ctx.deps.store(dataframe) # construct a summary of the loaded dataset output = [ f'Loaded the dataset as `{ref}`.', f'Description: {dataset.info.description}' if dataset.info.description else None, f'Features: {dataset.info.features!r}' if dataset.info.features else None, ] return '\n'.join(filter(None, output)) @analyst_agent.tool def run_duckdb(ctx: RunContext[AnalystAgentDeps], dataset: str, sql: str) -> str: """Run DuckDB SQL query on the DataFrame. Note that the virtual table name used in DuckDB SQL must be `dataset`. Args: ctx: Pydantic AI agent RunContext dataset: reference string to the DataFrame sql: the query to be executed using DuckDB """ data = ctx.deps.get(dataset) result = duckdb.query_df(df=data, virtual_table_name='dataset', sql_query=sql) # pass the result as ref (because DuckDB SQL can select many rows, creating another huge dataframe) ref = ctx.deps.store(result.df()) # pyright: ignore[reportUnknownMemberType] return f'Executed SQL, result is `{ref}`' @analyst_agent.tool def display(ctx: RunContext[AnalystAgentDeps], name: str) -> str: """Display at most 5 rows of the dataframe.""" dataset = ctx.deps.get(name) return dataset.head().to_string() # pyright: ignore[reportUnknownMemberType] if __name__ == '__main__': deps = AnalystAgentDeps() result = analyst_agent.run_sync( user_prompt='Count how many negative comments are there in the dataset `cornell-movie-review-data/rotten_tomatoes`', deps=deps, ) print(result.output) ``` ## Appendix ### Choosing a Model This example requires using a model that understands DuckDB SQL. You can check with `clai`: ```sh > clai -m bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0 clai - Pydantic AI CLI v0.0.1.dev920+41dd069 with bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0 clai ➤ do you understand duckdb sql? # DuckDB SQL Yes, I understand DuckDB SQL. DuckDB is an in-process analytical SQL database that uses syntax similar to PostgreSQL. It specializes in analytical queries and is designed for high-performance analysis of structured data. Some key features of DuckDB SQL include: • OLAP (Online Analytical Processing) optimized • Columnar-vectorized query execution • Standard SQL support with PostgreSQL compatibility • Support for complex analytical queries • Efficient handling of CSV/Parquet/JSON files I can help you with DuckDB SQL queries, schema design, optimization, or other DuckDB-related questions. ``` Example of a multi-agent flow where one agent delegates work to another, then hands off control to a third agent. Demonstrates: - [agent delegation](../../multi-agent-applications/#agent-delegation) - [programmatic agent hand-off](../../multi-agent-applications/#programmatic-agent-hand-off) - [usage limits](../../agents/#usage-limits) In this scenario, a group of agents work together to find the best flight for a user. 
The control flow for this example can be summarised as follows: ``` graph TD START --> search_agent("search agent") search_agent --> extraction_agent("extraction agent") extraction_agent --> search_agent search_agent --> human_confirm("human confirm") human_confirm --> search_agent search_agent --> FAILED human_confirm --> find_seat_function("find seat function") find_seat_function --> human_seat_choice("human seat choice") human_seat_choice --> find_seat_agent("find seat agent") find_seat_agent --> find_seat_function find_seat_function --> buy_flights("buy flights") buy_flights --> SUCCESS ``` ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.flight_booking ``` ```bash uv run -m pydantic_ai_examples.flight_booking ``` ## Example Code [flight_booking.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/flight_booking.py) ```py """Example of a multi-agent flow where one agent delegates work to another. In this scenario, a group of agents work together to find flights for a user. """ import datetime from dataclasses import dataclass from typing import Literal import logfire from pydantic import BaseModel, Field from rich.prompt import Prompt from pydantic_ai import ( Agent, ModelMessage, ModelRetry, RunContext, RunUsage, UsageLimits, ) # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class FlightDetails(BaseModel): """Details of the most suitable flight.""" flight_number: str price: int origin: str = Field(description='Three-letter airport code') destination: str = Field(description='Three-letter airport code') date: datetime.date class NoFlightFound(BaseModel): """When no valid flight is found.""" @dataclass class Deps: web_page_text: str req_origin: str req_destination: str req_date: datetime.date # This agent is responsible for controlling the flow of the conversation. search_agent = Agent[Deps, FlightDetails | NoFlightFound]( 'openai:gpt-4o', output_type=FlightDetails | NoFlightFound, # type: ignore retries=4, system_prompt=( 'Your job is to find the cheapest flight for the user on the given date. ' ), ) # This agent is responsible for extracting flight details from web page text. 
extraction_agent = Agent( 'openai:gpt-4o', output_type=list[FlightDetails], system_prompt='Extract all the flight details from the given text.', ) @search_agent.tool async def extract_flights(ctx: RunContext[Deps]) -> list[FlightDetails]: """Get details of all flights.""" # we pass the usage to the search agent so requests within this agent are counted result = await extraction_agent.run(ctx.deps.web_page_text, usage=ctx.usage) logfire.info('found {flight_count} flights', flight_count=len(result.output)) return result.output @search_agent.output_validator async def validate_output( ctx: RunContext[Deps], output: FlightDetails | NoFlightFound ) -> FlightDetails | NoFlightFound: """Procedural validation that the flight meets the constraints.""" if isinstance(output, NoFlightFound): return output errors: list[str] = [] if output.origin != ctx.deps.req_origin: errors.append( f'Flight should have origin {ctx.deps.req_origin}, not {output.origin}' ) if output.destination != ctx.deps.req_destination: errors.append( f'Flight should have destination {ctx.deps.req_destination}, not {output.destination}' ) if output.date != ctx.deps.req_date: errors.append(f'Flight should be on {ctx.deps.req_date}, not {output.date}') if errors: raise ModelRetry('\n'.join(errors)) else: return output class SeatPreference(BaseModel): row: int = Field(ge=1, le=30) seat: Literal['A', 'B', 'C', 'D', 'E', 'F'] class Failed(BaseModel): """Unable to extract a seat selection.""" # This agent is responsible for extracting the user's seat selection seat_preference_agent = Agent[None, SeatPreference | Failed]( 'openai:gpt-4o', output_type=SeatPreference | Failed, system_prompt=( "Extract the user's seat preference. " 'Seats A and F are window seats. ' 'Row 1 is the front row and has extra leg room. ' 'Rows 14, and 20 also have extra leg room. ' ), ) # in reality this would be downloaded from a booking site, # potentially using another agent to navigate the site flights_web_page = """ 1. Flight SFO-AK123 - Price: $350 - Origin: San Francisco International Airport (SFO) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 10, 2025 2. Flight SFO-AK456 - Price: $370 - Origin: San Francisco International Airport (SFO) - Destination: Fairbanks International Airport (FAI) - Date: January 10, 2025 3. Flight SFO-AK789 - Price: $400 - Origin: San Francisco International Airport (SFO) - Destination: Juneau International Airport (JNU) - Date: January 20, 2025 4. Flight NYC-LA101 - Price: $250 - Origin: San Francisco International Airport (SFO) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 10, 2025 5. Flight CHI-MIA202 - Price: $200 - Origin: Chicago O'Hare International Airport (ORD) - Destination: Miami International Airport (MIA) - Date: January 12, 2025 6. Flight BOS-SEA303 - Price: $120 - Origin: Boston Logan International Airport (BOS) - Destination: Ted Stevens Anchorage International Airport (ANC) - Date: January 12, 2025 7. Flight DFW-DEN404 - Price: $150 - Origin: Dallas/Fort Worth International Airport (DFW) - Destination: Denver International Airport (DEN) - Date: January 10, 2025 8. 
Flight ATL-HOU505 - Price: $180 - Origin: Hartsfield-Jackson Atlanta International Airport (ATL) - Destination: George Bush Intercontinental Airport (IAH) - Date: January 10, 2025 """ # restrict how many requests this app can make to the LLM usage_limits = UsageLimits(request_limit=15) async def main(): deps = Deps( web_page_text=flights_web_page, req_origin='SFO', req_destination='ANC', req_date=datetime.date(2025, 1, 10), ) message_history: list[ModelMessage] | None = None usage: RunUsage = RunUsage() # run the agent until a satisfactory flight is found while True: result = await search_agent.run( f'Find me a flight from {deps.req_origin} to {deps.req_destination} on {deps.req_date}', deps=deps, usage=usage, message_history=message_history, usage_limits=usage_limits, ) if isinstance(result.output, NoFlightFound): print('No flight found') break else: flight = result.output print(f'Flight found: {flight}') answer = Prompt.ask( 'Do you want to buy this flight, or keep searching? (buy/*search)', choices=['buy', 'search', ''], show_choices=False, ) if answer == 'buy': seat = await find_seat(usage) await buy_tickets(flight, seat) break else: message_history = result.all_messages( output_tool_return_content='Please suggest another flight' ) async def find_seat(usage: RunUsage) -> SeatPreference: message_history: list[ModelMessage] | None = None while True: answer = Prompt.ask('What seat would you like?') result = await seat_preference_agent.run( answer, message_history=message_history, usage=usage, usage_limits=usage_limits, ) if isinstance(result.output, SeatPreference): return result.output else: print('Could not understand seat preference. Please try again.') message_history = result.all_messages() async def buy_tickets(flight_details: FlightDetails, seat: SeatPreference): print(f'Purchasing flight {flight_details=!r} {seat=!r}...') if __name__ == '__main__': import asyncio asyncio.run(main()) ``` # Pydantic Model Simple example of using Pydantic AI to construct a Pydantic model from a text input. Demonstrates: - [structured `output_type`](../../output/#structured-output) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.pydantic_model ``` ```bash uv run -m pydantic_ai_examples.pydantic_model ``` This examples uses `openai:gpt-4o` by default, but it works well with other models, e.g. you can run it with Gemini using: ```bash PYDANTIC_AI_MODEL=gemini-1.5-pro python -m pydantic_ai_examples.pydantic_model ``` ```bash PYDANTIC_AI_MODEL=gemini-1.5-pro uv run -m pydantic_ai_examples.pydantic_model ``` (or `PYDANTIC_AI_MODEL=gemini-1.5-flash ...`) ## Example Code [pydantic_model.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/pydantic_model.py) ```py """Simple example of using Pydantic AI to construct a Pydantic model from a text input. 
Run with: uv run -m pydantic_ai_examples.pydantic_model """ import os import logfire from pydantic import BaseModel from pydantic_ai import Agent # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class MyModel(BaseModel): city: str country: str model = os.getenv('PYDANTIC_AI_MODEL', 'openai:gpt-4o') print(f'Using model: {model}') agent = Agent(model, output_type=MyModel) if __name__ == '__main__': result = agent.run_sync('The windy city in the US of A.') print(result.output) print(result.usage()) ``` # Question Graph Example of a graph for asking and evaluating questions. Demonstrates: - [`pydantic_graph`](../../graph/) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.question_graph ``` ```bash uv run -m pydantic_ai_examples.question_graph ``` ## Example Code [question_graph.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/question_graph.py) ```py """Example of a graph for asking and evaluating questions. Run with: uv run -m pydantic_ai_examples.question_graph """ from __future__ import annotations as _annotations from dataclasses import dataclass, field from pathlib import Path import logfire from groq import BaseModel from pydantic_ai import Agent, ModelMessage, format_as_xml from pydantic_graph import ( BaseNode, End, Graph, GraphRunContext, ) from pydantic_graph.persistence.file import FileStatePersistence # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() ask_agent = Agent('openai:gpt-4o', output_type=str) @dataclass class QuestionState: question: str | None = None ask_agent_messages: list[ModelMessage] = field(default_factory=list) evaluate_agent_messages: list[ModelMessage] = field(default_factory=list) @dataclass class Ask(BaseNode[QuestionState]): async def run(self, ctx: GraphRunContext[QuestionState]) -> Answer: result = await ask_agent.run( 'Ask a simple question with a single correct answer.', message_history=ctx.state.ask_agent_messages, ) ctx.state.ask_agent_messages += result.all_messages() ctx.state.question = result.output return Answer(result.output) @dataclass class Answer(BaseNode[QuestionState]): question: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Evaluate: answer = input(f'{self.question}: ') return Evaluate(answer) class EvaluationOutput(BaseModel, use_attribute_docstrings=True): correct: bool """Whether the answer is correct.""" comment: str """Comment on the answer, reprimand the user if the answer is wrong.""" evaluate_agent = Agent( 'openai:gpt-4o', output_type=EvaluationOutput, system_prompt='Given a question and answer, evaluate if the answer is correct.', ) @dataclass class Evaluate(BaseNode[QuestionState, None, str]): answer: str async def run( self, ctx: GraphRunContext[QuestionState], ) -> End[str] | Reprimand: assert ctx.state.question is not None result = await evaluate_agent.run( format_as_xml({'question': ctx.state.question, 'answer': self.answer}), message_history=ctx.state.evaluate_agent_messages, ) ctx.state.evaluate_agent_messages += result.all_messages() if result.output.correct: return End(result.output.comment) else: return Reprimand(result.output.comment) @dataclass class 
Reprimand(BaseNode[QuestionState]): comment: str async def run(self, ctx: GraphRunContext[QuestionState]) -> Ask: print(f'Comment: {self.comment}') ctx.state.question = None return Ask() question_graph = Graph( nodes=(Ask, Answer, Evaluate, Reprimand), state_type=QuestionState ) async def run_as_continuous(): state = QuestionState() node = Ask() end = await question_graph.run(node, state=state) print('END:', end.output) async def run_as_cli(answer: str | None): persistence = FileStatePersistence(Path('question_graph.json')) persistence.set_graph_types(question_graph) if snapshot := await persistence.load_next(): state = snapshot.state assert answer is not None, ( 'answer required, usage "uv run -m pydantic_ai_examples.question_graph cli "' ) node = Evaluate(answer) else: state = QuestionState() node = Ask() # debug(state, node) async with question_graph.iter(node, state=state, persistence=persistence) as run: while True: node = await run.next() if isinstance(node, End): print('END:', node.data) history = await persistence.load_all() print('history:', '\n'.join(str(e.node) for e in history), sep='\n') print('Finished!') break elif isinstance(node, Answer): print(node.question) break # otherwise just continue if __name__ == '__main__': import asyncio import sys try: sub_command = sys.argv[1] assert sub_command in ('continuous', 'cli', 'mermaid') except (IndexError, AssertionError): print( 'Usage:\n' ' uv run -m pydantic_ai_examples.question_graph mermaid\n' 'or:\n' ' uv run -m pydantic_ai_examples.question_graph continuous\n' 'or:\n' ' uv run -m pydantic_ai_examples.question_graph cli [answer]', file=sys.stderr, ) sys.exit(1) if sub_command == 'mermaid': print(question_graph.mermaid_code(start_node=Ask)) elif sub_command == 'continuous': asyncio.run(run_as_continuous()) else: a = sys.argv[2] if len(sys.argv) > 2 else None asyncio.run(run_as_cli(a)) ``` The mermaid diagram generated in this example looks like this: ``` --- title: question_graph --- stateDiagram-v2 [*] --> Ask Ask --> Answer: ask the question Answer --> Evaluate: answer the question Evaluate --> Congratulate Evaluate --> Castigate Congratulate --> [*]: success Castigate --> Ask: try again ``` # RAG RAG search example. This demo allows you to ask question of the [logfire](https://pydantic.dev/logfire) documentation. Demonstrates: - [tools](../../tools/) - [agent dependencies](../../dependencies/) - RAG search This is done by creating a database containing each section of the markdown documentation, then registering the search tool with the Pydantic AI agent. Logic for extracting sections from markdown files and a JSON file with that data is available in [this gist](https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992). [PostgreSQL with pgvector](https://github.com/pgvector/pgvector) is used as the search database, the easiest way to download and run pgvector is using Docker: ```bash mkdir postgres-data docker run --rm \ -e POSTGRES_PASSWORD=postgres \ -p 54320:5432 \ -v `pwd`/postgres-data:/var/lib/postgresql/data \ pgvector/pgvector:pg17 ``` As with the [SQL gen](../sql-gen/) example, we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running. We also mount the PostgreSQL `data` directory locally to persist the data if you need to stop and restart the container. 
With that running and [dependencies installed and environment variables set](../setup/#usage), we can build the search database with (**WARNING**: this requires the `OPENAI_API_KEY` env variable and will calling the OpenAI embedding API around 300 times to generate embeddings for each section of the documentation): ```bash python -m pydantic_ai_examples.rag build ``` ```bash uv run -m pydantic_ai_examples.rag build ``` (Note building the database doesn't use Pydantic AI right now, instead it uses the OpenAI SDK directly.) You can then ask the agent a question with: ```bash python -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" ``` ```bash uv run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" ``` ## Example Code [rag.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/rag.py) ```py """RAG example with pydantic-ai — using vector search to augment a chat agent. Run pgvector with: mkdir postgres-data docker run --rm -e POSTGRES_PASSWORD=postgres \ -p 54320:5432 \ -v `pwd`/postgres-data:/var/lib/postgresql/data \ pgvector/pgvector:pg17 Build the search DB with: uv run -m pydantic_ai_examples.rag build Ask the agent a question with: uv run -m pydantic_ai_examples.rag search "How do I configure logfire to work with FastAPI?" """ from __future__ import annotations as _annotations import asyncio import re import sys import unicodedata from contextlib import asynccontextmanager from dataclasses import dataclass import asyncpg import httpx import logfire import pydantic_core from openai import AsyncOpenAI from pydantic import TypeAdapter from typing_extensions import AsyncGenerator from pydantic_ai import Agent, RunContext # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_asyncpg() logfire.instrument_pydantic_ai() @dataclass class Deps: openai: AsyncOpenAI pool: asyncpg.Pool agent = Agent('openai:gpt-4o', deps_type=Deps) @agent.tool async def retrieve(context: RunContext[Deps], search_query: str) -> str: """Retrieve documentation sections based on a search query. Args: context: The call context. search_query: The search query. """ with logfire.span( 'create embedding for {search_query=}', search_query=search_query ): embedding = await context.deps.openai.embeddings.create( input=search_query, model='text-embedding-3-small', ) assert len(embedding.data) == 1, ( f'Expected 1 embedding, got {len(embedding.data)}, doc query: {search_query!r}' ) embedding = embedding.data[0].embedding embedding_json = pydantic_core.to_json(embedding).decode() rows = await context.deps.pool.fetch( 'SELECT url, title, content FROM doc_sections ORDER BY embedding <-> $1 LIMIT 8', embedding_json, ) return '\n\n'.join( f'# {row["title"]}\nDocumentation URL:{row["url"]}\n\n{row["content"]}\n' for row in rows ) async def run_agent(question: str): """Entry point to run the agent and perform RAG based question answering.""" openai = AsyncOpenAI() logfire.instrument_openai(openai) logfire.info('Asking "{question}"', question=question) async with database_connect(False) as pool: deps = Deps(openai=openai, pool=pool) answer = await agent.run(question, deps=deps) print(answer.output) ####################################################### # The rest of this file is dedicated to preparing the # # search database, and some utilities. 
# ####################################################### # JSON document from # https://gist.github.com/samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992 DOCS_JSON = ( 'https://gist.githubusercontent.com/' 'samuelcolvin/4b5bb9bb163b1122ff17e29e48c10992/raw/' '80c5925c42f1442c24963aaf5eb1a324d47afe95/logfire_docs.json' ) async def build_search_db(): """Build the search database.""" async with httpx.AsyncClient() as client: response = await client.get(DOCS_JSON) response.raise_for_status() sections = sessions_ta.validate_json(response.content) openai = AsyncOpenAI() logfire.instrument_openai(openai) async with database_connect(True) as pool: with logfire.span('create schema'): async with pool.acquire() as conn: async with conn.transaction(): await conn.execute(DB_SCHEMA) sem = asyncio.Semaphore(10) async with asyncio.TaskGroup() as tg: for section in sections: tg.create_task(insert_doc_section(sem, openai, pool, section)) async def insert_doc_section( sem: asyncio.Semaphore, openai: AsyncOpenAI, pool: asyncpg.Pool, section: DocsSection, ) -> None: async with sem: url = section.url() exists = await pool.fetchval('SELECT 1 FROM doc_sections WHERE url = $1', url) if exists: logfire.info('Skipping {url=}', url=url) return with logfire.span('create embedding for {url=}', url=url): embedding = await openai.embeddings.create( input=section.embedding_content(), model='text-embedding-3-small', ) assert len(embedding.data) == 1, ( f'Expected 1 embedding, got {len(embedding.data)}, doc section: {section}' ) embedding = embedding.data[0].embedding embedding_json = pydantic_core.to_json(embedding).decode() await pool.execute( 'INSERT INTO doc_sections (url, title, content, embedding) VALUES ($1, $2, $3, $4)', url, section.title, section.content, embedding_json, ) @dataclass class DocsSection: id: int parent: int | None path: str level: int title: str content: str def url(self) -> str: url_path = re.sub(r'\.md$', '', self.path) return ( f'https://logfire.pydantic.dev/docs/{url_path}/#{slugify(self.title, "-")}' ) def embedding_content(self) -> str: return '\n\n'.join((f'path: {self.path}', f'title: {self.title}', self.content)) sessions_ta = TypeAdapter(list[DocsSection]) # pyright: reportUnknownMemberType=false # pyright: reportUnknownVariableType=false @asynccontextmanager async def database_connect( create_db: bool = False, ) -> AsyncGenerator[asyncpg.Pool, None]: server_dsn, database = ( 'postgresql://postgres:postgres@localhost:54320', 'pydantic_ai_rag', ) if create_db: with logfire.span('check and create DB'): conn = await asyncpg.connect(server_dsn) try: db_exists = await conn.fetchval( 'SELECT 1 FROM pg_database WHERE datname = $1', database ) if not db_exists: await conn.execute(f'CREATE DATABASE {database}') finally: await conn.close() pool = await asyncpg.create_pool(f'{server_dsn}/{database}') try: yield pool finally: await pool.close() DB_SCHEMA = """ CREATE EXTENSION IF NOT EXISTS vector; CREATE TABLE IF NOT EXISTS doc_sections ( id serial PRIMARY KEY, url text NOT NULL UNIQUE, title text NOT NULL, content text NOT NULL, -- text-embedding-3-small returns a vector of 1536 floats embedding vector(1536) NOT NULL ); CREATE INDEX IF NOT EXISTS idx_doc_sections_embedding ON doc_sections USING hnsw (embedding vector_l2_ops); """ def slugify(value: str, separator: str, unicode: bool = False) -> str: """Slugify a string, to make it URL friendly.""" # Taken unchanged from https://github.com/Python-Markdown/markdown/blob/3.7/markdown/extensions/toc.py#L38 if not unicode: # Replace Extended Latin 
characters with ASCII, i.e. `žlutý` => `zluty` value = unicodedata.normalize('NFKD', value) value = value.encode('ascii', 'ignore').decode('ascii') value = re.sub(r'[^\w\s-]', '', value).strip().lower() return re.sub(rf'[{separator}\s]+', separator, value) if __name__ == '__main__': action = sys.argv[1] if len(sys.argv) > 1 else None if action == 'build': asyncio.run(build_search_db()) elif action == 'search': if len(sys.argv) == 3: q = sys.argv[2] else: q = 'How do I configure logfire to work with FastAPI?' asyncio.run(run_agent(q)) else: print( 'uv run --extra examples -m pydantic_ai_examples.rag build|search', file=sys.stderr, ) sys.exit(1) ``` # Examples Here we include some examples of how to use Pydantic AI and what it can do. ## Usage These examples are distributed with `pydantic-ai` so you can run them either by cloning the [pydantic-ai repo](https://github.com/pydantic/pydantic-ai) or by simply installing `pydantic-ai` from PyPI with `pip` or `uv`. ### Installing required dependencies Either way, to run some examples you'll need to install the `examples` optional dependency group. If you've installed `pydantic-ai` via pip/uv, you can install the extra dependencies with: ```bash pip install "pydantic-ai[examples]" ``` ```bash uv add "pydantic-ai[examples]" ``` If you clone the repo, you should instead use `uv sync --extra examples` to install extra dependencies. ### Setting model environment variables These examples will need you to set up authentication with one or more of the LLMs; see the [model configuration](../../models/overview/) docs for details on how to do this. TL;DR: in most cases you'll need to set one of the following environment variables: ```bash export OPENAI_API_KEY=your-api-key ``` ```bash export GEMINI_API_KEY=your-api-key ``` ### Running Examples To run the examples (this will work whether you installed `pydantic_ai` or cloned the repo), run: ```bash python -m pydantic_ai_examples.<example_module_name> ``` ```bash uv run -m pydantic_ai_examples.<example_module_name> ``` For example, to run the very simple [`pydantic_model`](../pydantic-model/) example: ```bash python -m pydantic_ai_examples.pydantic_model ``` ```bash uv run -m pydantic_ai_examples.pydantic_model ``` If you like one-liners and you're using uv, you can run a pydantic-ai example with zero setup: ```bash OPENAI_API_KEY='your-api-key' \ uv run --with "pydantic-ai[examples]" \ -m pydantic_ai_examples.pydantic_model ``` ______________________________________________________________________ You'll probably want to edit examples in addition to just running them. You can copy the examples to a new directory with: ```bash python -m pydantic_ai_examples --copy-to examples/ ``` ```bash uv run -m pydantic_ai_examples --copy-to examples/ ``` # Slack Lead Qualifier with Modal In this example, we're going to build an agentic app that: - automatically researches each new member that joins a company's public Slack community to see how good of a fit they are for the company's commercial product, - sends this analysis into a (private) Slack channel, and - sends a daily summary of the top 5 leads from the previous 24 hours into a (different) Slack channel. We'll be deploying the app on [Modal](https://modal.com), as it lets you use Python to define an app with web endpoints, scheduled functions, and background functions, and deploy them with a CLI, without needing to set up or manage any infrastructure. 
It's a great way to lower the barrier for people in your organization to start building and deploying AI agents to make their jobs easier. We also add [Pydantic Logfire](https://pydantic.dev/logfire) to get observability into the app and agent as they're running in response to webhooks and the schedule. ## Screenshots This is what the analysis sent into Slack will look like: This is what the corresponding trace in [Logfire](https://pydantic.dev/logfire) will look like: All of these entries can be clicked on to get more details about what happened at that step, including the full conversation with the LLM and HTTP requests and responses. ## Prerequisites If you just want to see the code without actually going through the effort of setting up the bits necessary to run it, feel free to [jump ahead](#the-code). ### Slack app You need to have a Slack workspace and the necessary permissions to create apps. 1. Create a new Slack app using the instructions in the Slack quickstart guide. 1. In step 2, "Requesting scopes", request the following scopes: - [`users.read`](https://docs.slack.dev/reference/scopes/users.read) - [`users.read.email`](https://docs.slack.dev/reference/scopes/users.read.email) - [`users.profile.read`](https://docs.slack.dev/reference/scopes/users.profile.read) 1. In step 3, "Installing and authorizing the app", note down the Access Token as we're going to need to store it as a Secret in Modal. 1. You can skip steps 4 and 5. We're going to need to subscribe to the `team_join` event, but at this point you don't have a webhook URL yet. 1. Create the channels the app will post into, and add the Slack app to them: - `#new-slack-leads` - `#daily-slack-leads-summary` These names are hard-coded in the example. If you want to use different channels, you can clone the repo and change them in `examples/pydantic_ai_examples/slack_lead_qualifier/functions.py`. ### Logfire Write Token 1. If you don't have a Logfire account yet, create one. 1. Create a new project named, for example, `slack-lead-qualifier`. 1. Generate a new Write Token and note it down, as we're going to need to store it as a Secret in Modal. ### OpenAI API Key 1. If you don't have an OpenAI account yet, create one. 1. Create a new API Key in Settings and note it down, as we're going to need to store it as a Secret in Modal. ### Modal account 1. If you don't have a Modal account yet, create one. 1. Create 3 Secrets of type "Custom" in the Modal dashboard: - Name: `slack`, key: `SLACK_API_KEY`, value: the Slack Access Token you generated earlier - Name: `logfire`, key: `LOGFIRE_TOKEN`, value: the Logfire Write Token you generated earlier - Name: `openai`, key: `OPENAI_API_KEY`, value: the OpenAI API Key you generated earlier ## Usage 1. Make sure you have the [dependencies installed](../setup/#usage). 1. Authenticate with Modal: ```bash python/uv-run -m modal setup ``` 1. Run the example as an [ephemeral Modal app](https://modal.com/docs/guide/apps#ephemeral-apps), meaning it will only run until you quit it using Ctrl+C: ```bash python/uv-run -m modal serve -m pydantic_ai_examples.slack_lead_qualifier.modal ``` 1. Note down the URL after `Created web function web_app =>`; this is your webhook endpoint URL. 1. Go back to the Slack quickstart guide and follow step 4, "Configuring the app for event listening", to subscribe to the `team_join` event with the webhook endpoint URL you noted down as the Request URL. 
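Before wiring the URL into Slack, you may want to sanity-check that the endpoint is up. The snippet below is illustrative only and not part of the example code: it sends the same `url_verification` challenge that Slack sends, which the FastAPI app we'll define later in `app.py` simply echoes back. The URL is a placeholder for the webhook endpoint URL you noted down.

```py
# Illustrative sanity check only -- not part of the example code.
import httpx

# Replace with the webhook endpoint URL printed by `modal serve`.
WEBHOOK_URL = 'https://<your-webhook-endpoint-url>/'

response = httpx.post(
    WEBHOOK_URL,
    json={'type': 'url_verification', 'challenge': 'test-challenge'},
)
response.raise_for_status()
print(response.json())  # expected: {'challenge': 'test-challenge'}
```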
Now when someone new (possibly you with a throwaway email) joins the Slack workspace, you'll see the webhook event being processed in the terminal where you ran `modal serve` and in the Logfire Live view, and after waiting a few seconds you should see the result appear in the `#new-slack-leads` Slack channel! Faking a Slack signup You can also fake a Slack signup event and try out the agent like this, with any name or email you please: ```bash curl -X POST <your webhook endpoint URL> \ -H "Content-Type: application/json" \ -d '{ "type": "event_callback", "event": { "type": "team_join", "user": { "profile": { "email": "samuel@pydantic.dev", "first_name": "Samuel", "last_name": "Colvin", "display_name": "Samuel Colvin" } } } }' ``` Deploying to production If you'd like to deploy this app into your Modal workspace in a persistent fashion, you can use this command: ```bash python/uv-run -m modal deploy -m pydantic_ai_examples.slack_lead_qualifier.modal ``` You'll likely want to [download the code](https://github.com/pydantic/pydantic-ai/tree/main/examples/pydantic_ai_examples/slack_lead_qualifier) first, put it in a new repo, and then do [continuous deployment](https://modal.com/docs/guide/continuous-deployment#github-actions) using GitHub Actions. Don't forget to update the Slack event request URL to the new persistent URL! You'll also want to modify the [instructions for the agent](#agent) to your own situation. ## The code We're going to start with the basics, and then gradually build up into the full app. ### Models #### `Profile` First, we define a [Pydantic](https://docs.pydantic.dev) model that represents a Slack user profile. These are the fields we get from the [`team_join`](https://docs.slack.dev/reference/events/team_join) event that's sent to the webhook endpoint that we'll define in a bit. [slack_lead_qualifier/models.py (L11-L15)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L11-L15) ```py ... class Profile(BaseModel): first_name: str | None = None last_name: str | None = None display_name: str | None = None email: str ... ``` We also define a `Profile.as_prompt()` helper method that uses `format_as_xml` to turn the profile into a string that can be sent to the model. [slack_lead_qualifier/models.py (L7-L19)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L7-L19) ```py ... from pydantic_ai import format_as_xml ... class Profile(BaseModel): ... def as_prompt(self) -> str: return format_as_xml(self, root_tag='profile') ... ``` #### `Analysis` The second model we'll need represents the result of the analysis that the agent will perform. We include docstrings to provide additional context to the model on what these fields should contain. [slack_lead_qualifier/models.py (L23-L31)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L23-L31) ```py ... class Analysis(BaseModel): profile: Profile organization_name: str organization_domain: str job_title: str relevance: Annotated[int, Ge(1), Le(5)] """Estimated fit for Pydantic Logfire: 1 = low, 5 = high""" summary: str """One-sentence welcome note summarising who they are and how we might help""" ... ``` We also define an `Analysis.as_slack_blocks()` helper method that turns the analysis into some [Slack blocks](https://api.slack.com/reference/block-kit/blocks) that can be sent to the Slack API to post a new message. 
[slack_lead_qualifier/models.py (L23-L46)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/models.py#L23-L46) ```py ... class Analysis(BaseModel): ... def as_slack_blocks(self, include_relevance: bool = False) -> list[dict[str, Any]]: profile = self.profile relevance = f'({self.relevance}/5)' if include_relevance else '' return [ { 'type': 'markdown', 'text': f'[{profile.display_name}](mailto:{profile.email}), {self.job_title} at [**{self.organization_name}**](https://{self.organization_domain}) {relevance}', }, { 'type': 'markdown', 'text': self.summary, }, ] ``` ### Agent Now it's time to get into Pydantic AI and define the agent that will do the actual analysis! We specify the model we'll use (`openai:gpt-4o`), provide [instructions](../../agents/#instructions), give the agent access to the [DuckDuckGo search tool](../../common-tools/#duckduckgo-search-tool), and tell it to output either an `Analysis` or `None` using the [Native Output](../../output/#native-output) structured output mode. The real meat of the app is in the instructions that tell the agent how to evaluate each new Slack member. If you plan to use this app yourself, you'll of course want to modify them to your own situation. [slack_lead_qualifier/agent.py (L7-L40)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/agent.py#L7-L40) ```py ... from pydantic_ai import Agent, NativeOutput from pydantic_ai.common_tools.duckduckgo import duckduckgo_search_tool ... agent = Agent( 'openai:gpt-4o', instructions=dedent( """ When a new person joins our public Slack, please put together a brief snapshot so we can be most useful to them. **What to include** 1. **Who they are:** Any details about their professional role or projects (e.g. LinkedIn, GitHub, company bio). 2. **Where they work:** Name of the organisation and its domain. 3. **How we can help:** On a scale of 1–5, estimate how likely they are to benefit from **Pydantic Logfire** (our paid observability tool) based on factors such as company size, product maturity, or AI usage. *1 = probably not relevant, 5 = very strong fit.* **Our products (for context only)** • **Pydantic Validation** – Python data-validation (open source) • **Pydantic AI** – Python agent framework (open source) • **Pydantic Logfire** – Observability for traces, logs & metrics with first-class AI support (commercial) **How to research** • Use the provided DuckDuckGo search tool to research the person and the organization they work for, based on the email domain or what you find on e.g. LinkedIn and GitHub. • If you can't find enough to form a reasonable view, return **None**. """ ), tools=[duckduckgo_search_tool()], output_type=NativeOutput([Analysis, NoneType]), ) ... ``` #### `analyze_profile` We also define an `analyze_profile` helper function that takes a `Profile`, runs the agent, and returns an `Analysis` (or `None`), and instrument it using [Logfire](../../logfire/). [slack_lead_qualifier/agent.py (L44-L47)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/agent.py#L44-L47) ```py ... @logfire.instrument('Analyze profile') async def analyze_profile(profile: Profile) -> Analysis | None: result = await agent.run(profile.as_prompt()) return result.output ``` ### Analysis store The next building block we'll need is a place to store all the analyses that have been done so that we can look them up when we send the daily summary. 
Fortunately, Modal provides us with a convenient way to store some data that can be read back in a subsequent Modal run (webhook or scheduled): [`modal.Dict`](https://modal.com/docs/reference/modal.Dict). We define some convenience methods to easily add, list, and clear analyses. [slack_lead_qualifier/store.py (L4-L31)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/store.py#L4-L31) ```py ... import modal ... class AnalysisStore: @classmethod @logfire.instrument('Add analysis to store') async def add(cls, analysis: Analysis): await cls._get_store().put.aio(analysis.profile.email, analysis.model_dump()) @classmethod @logfire.instrument('List analyses from store') async def list(cls) -> list[Analysis]: return [ Analysis.model_validate(analysis) async for analysis in cls._get_store().values.aio() ] @classmethod @logfire.instrument('Clear analyses from store') async def clear(cls): await cls._get_store().clear.aio() @classmethod def _get_store(cls) -> modal.Dict: return modal.Dict.from_name('analyses', create_if_missing=True) # type: ignore ``` Note Note that `# type: ignore` on the last line -- unfortunately `modal` does not fully define its types, so we need this to stop our static type checker `pyright`, which we run over all Pydantic AI code including examples, from complaining. ### Send Slack message Next, we'll need a way to actually send a Slack message, so we define a simple function that uses Slack's [`chat.postMessage`](https://api.slack.com/methods/chat.postMessage) API. [slack_lead_qualifier/slack.py (L8-L30)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/slack.py#L8-L30) ```py ... API_KEY = os.getenv('SLACK_API_KEY') assert API_KEY, 'SLACK_API_KEY is not set' @logfire.instrument('Send Slack message') async def send_slack_message(channel: str, blocks: list[dict[str, Any]]): client = httpx.AsyncClient() response = await client.post( 'https://slack.com/api/chat.postMessage', json={ 'channel': channel, 'blocks': blocks, }, headers={ 'Authorization': f'Bearer {API_KEY}', }, timeout=5, ) response.raise_for_status() result = response.json() if not result.get('ok', False): error = result.get('error', 'Unknown error') raise Exception(f'Failed to send to Slack: {error}') ``` ### Features Now we can start putting these building blocks together to implement the actual features we want! #### `process_slack_member` This function takes a [`Profile`](#profile), [analyzes](#analyze_profile) it using the agent, adds it to the [`AnalysisStore`](#analysis-store), and [sends](#send-slack-message) the analysis into the `#new-slack-leads` channel. [slack_lead_qualifier/functions.py (L4-L45)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/functions.py#L4-L45) ```py ... from .agent import analyze_profile from .models import Profile from .slack import send_slack_message from .store import AnalysisStore ... NEW_LEAD_CHANNEL = '#new-slack-leads' ... @logfire.instrument('Process Slack member') async def process_slack_member(profile: Profile): analysis = await analyze_profile(profile) logfire.info('Analysis', analysis=analysis) if analysis is None: return await AnalysisStore().add(analysis) await send_slack_message( NEW_LEAD_CHANNEL, [ { 'type': 'header', 'text': { 'type': 'plain_text', 'text': f'New Slack member with score {analysis.relevance}/5', }, }, { 'type': 'divider', }, *analysis.as_slack_blocks(), ], ) ... 
``` #### `send_daily_summary` This function lists all of the analyses in the [`AnalysisStore`](#analysis-store), takes the top 5 by relevance, [sends](#send-slack-message) them into the `#daily-slack-leads-summary` channel, and clears the `AnalysisStore` so that the next daily run won't process these analyses again. [slack_lead_qualifier/functions.py (L8-L85)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/functions.py#L8-L85) ```py ... from .slack import send_slack_message from .store import AnalysisStore ... DAILY_SUMMARY_CHANNEL = '#daily-slack-leads-summary' ... @logfire.instrument('Send daily summary') async def send_daily_summary(): analyses = await AnalysisStore().list() logfire.info('Analyses', analyses=analyses) if len(analyses) == 0: return sorted_analyses = sorted(analyses, key=lambda x: x.relevance, reverse=True) top_analyses = sorted_analyses[:5] blocks = [ { 'type': 'header', 'text': { 'type': 'plain_text', 'text': f'Top {len(top_analyses)} new Slack members from the last 24 hours', }, }, ] for analysis in top_analyses: blocks.extend( [ { 'type': 'divider', }, *analysis.as_slack_blocks(include_relevance=True), ] ) await send_slack_message( DAILY_SUMMARY_CHANNEL, blocks, ) await AnalysisStore().clear() ``` ### Web app As it stands, neither of these functions is actually being called from anywhere. Let's implement a [FastAPI](https://fastapi.tiangolo.com/) endpoint to handle the `team_join` Slack webhook (also known as the [Slack Events API](https://docs.slack.dev/apis/events-api)) and call the [`process_slack_member`](#process_slack_member) function we just defined. We also instrument FastAPI using Logfire for good measure. [slack_lead_qualifier/app.py (L20-L36)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/app.py#L20-L36) ```py ... app = FastAPI() logfire.instrument_fastapi(app, capture_headers=True) @app.post('/') async def process_webhook(payload: dict[str, Any]) -> dict[str, Any]: if payload['type'] == 'url_verification': return {'challenge': payload['challenge']} elif ( payload['type'] == 'event_callback' and payload['event']['type'] == 'team_join' ): profile = Profile.model_validate(payload['event']['user']['profile']) process_slack_member(profile) return {'status': 'OK'} raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY) ``` #### `process_slack_member` with Modal I was a little sneaky there -- we're not actually calling the [`process_slack_member`](#process_slack_member) function we defined in `functions.py` directly, as Slack requires webhooks to respond within 3 seconds, and we need a bit more time than that to talk to the LLM, do some web searches, and send the Slack message. Instead, we're calling the following function defined alongside the app, which uses Modal's [`modal.Function.spawn`](https://modal.com/docs/reference/modal.Function#spawn) feature to run a function in the background. (If you're curious what the Modal side of this function looks like, you can [jump ahead](#backgrounded-process_slack_member).) Because `modal.py` (which we'll see in the next section) imports `app.py`, we import from `modal.py` inside the function definition; doing so at the top level would have resulted in a circular import error. 
We also pass along the current Logfire context to get [Distributed Tracing](https://logfire.pydantic.dev/docs/how-to-guides/distributed-tracing/), meaning that the background function execution will show up nested under the webhook request trace, so that we have everything related to that request in one place. [slack_lead_qualifier/app.py (L11-L16)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/app.py#L11-L16) ```py ... def process_slack_member(profile: Profile): from .modal import process_slack_member as _process_slack_member _process_slack_member.spawn( profile.model_dump(), logfire_ctx=get_context() ) ... ``` ### Modal app Now let's see how easy Modal makes it to deploy all of this. #### Set up Modal The first thing we do is define the Modal app, by specifying the base image to use (Debian with Python 3.13), all the Python packages it needs, and all of the secrets defined in the Modal interface that need to be made available during runtime. [slack_lead_qualifier/modal.py (L4-L21)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L4-L21) ```py ... import modal image = modal.Image.debian_slim(python_version='3.13').pip_install( 'pydantic', 'pydantic_ai_slim[openai,duckduckgo]', 'logfire[httpx,fastapi]', 'fastapi[standard]', 'httpx', ) app = modal.App( name='slack-lead-qualifier', image=image, secrets=[ modal.Secret.from_name('logfire'), modal.Secret.from_name('openai'), modal.Secret.from_name('slack'), ], ) ... ``` #### Set up Logfire Next, we define a function to set up Logfire instrumentation for Pydantic AI and HTTPX. We cannot do this at the top level of the file, as the requested packages (like `logfire`) will only be available within functions running on Modal (like the ones we'll define next). This file, `modal.py`, runs on your local machine and only has access to the `modal` package. [slack_lead_qualifier/modal.py (L25-L30)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L25-L30) ```py ... def setup_logfire(): import logfire logfire.configure(service_name=app.name) logfire.instrument_pydantic_ai() logfire.instrument_httpx(capture_all=True) ... ``` #### Web app To deploy a [web endpoint](https://modal.com/docs/guide/webhooks) on Modal, we simply define a function that returns an ASGI app (like FastAPI) and decorate it with `@app.function()` and `@modal.asgi_app()`. This `web_app` function will be run on Modal, so inside the function we can call the `setup_logfire` function that requires the `logfire` package, and import `app.py` which uses the other requested packages. By default, Modal spins up a container to handle a function call (like a web request) on-demand, meaning there's a little bit of startup time to each request. However, Slack requires webhooks to respond within 3 seconds, so we specify `min_containers=1` to keep the web endpoint running and ready to answer requests at all times. This is a bit annoying and wasteful, but fortunately [Modal's pricing](https://modal.com/pricing) is pretty reasonable, you get $30 free monthly compute, and they offer up to $50k in free credits for startup and academic researchers. [slack_lead_qualifier/modal.py (L34-L41)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L34-L41) ```py ... 
@app.function(min_containers=1) @modal.asgi_app() # type: ignore def web_app(): setup_logfire() from .app import app as _app return _app ... ``` Note Note that `# type: ignore` on the `@modal.asgi_app()` line -- unfortunately `modal` does not fully define its types, so we need this to stop our static type checker `pyright`, which we run over all Pydantic AI code including examples, from complaining. #### Scheduled `send_daily_summary` To define a [scheduled function](https://modal.com/docs/guide/cron), we can use the `@app.function()` decorator with a `schedule` argument. This Modal function will call our imported [`send_daily_summary`](#send_daily_summary) function every day at 8 am UTC. [slack_lead_qualifier/modal.py (L60-L66)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L60-L66) ```py ... @app.function(schedule=modal.Cron('0 8 * * *')) # Every day at 8am UTC async def send_daily_summary(): setup_logfire() from .functions import send_daily_summary as _send_daily_summary await _send_daily_summary() ``` #### Backgrounded `process_slack_member` Finally, we define a Modal function that wraps our [`process_slack_member`](#process_slack_member) function, so that it can run in the background. As you'll remember from when we [spawned this function from the web app](#process_slack_member-with-modal), we passed along the Logfire context to get [Distributed Tracing](https://logfire.pydantic.dev/docs/how-to-guides/distributed-tracing/), so we need to attach it here. [slack_lead_qualifier/modal.py (L45-L56)](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/slack_lead_qualifier/modal.py#L45-L56) ```py ... @app.function() async def process_slack_member(profile_raw: dict[str, Any], logfire_ctx: Any): setup_logfire() from logfire.propagate import attach_context from .functions import process_slack_member as _process_slack_member from .models import Profile with attach_context(logfire_ctx): profile = Profile.model_validate(profile_raw) await _process_slack_member(profile) ... ``` ## Conclusion And that's it! Now, assuming you've met the [prerequisites](#prerequisites), you can run or deploy the app using the commands under [usage](#usage). # SQL Generation Example demonstrating how to use Pydantic AI to generate SQL queries based on user input. Demonstrates: - [dynamic system prompt](../../agents/#system-prompts) - [structured `output_type`](../../output/#structured-output) - [output validation](../../output/#output-validator-functions) - [agent dependencies](../../dependencies/) ## Running the Example The resulting SQL is validated by running it as an `EXPLAIN` query on PostgreSQL. To run the example, you first need to run PostgreSQL, e.g. via Docker: ```bash docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres ``` *(we run postgres on port `54320` to avoid conflicts with any other postgres instances you may have running)* With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.sql_gen ``` ```bash uv run -m pydantic_ai_examples.sql_gen ``` or to use a custom prompt: ```bash python -m pydantic_ai_examples.sql_gen "find me errors" ``` ```bash uv run -m pydantic_ai_examples.sql_gen "find me errors" ``` This model uses `gemini-1.5-flash` by default since Gemini is good at single shot queries of this kind. 
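If you'd like to try this example with a different model without editing the file, note that Pydantic AI also lets you choose the model at run time rather than hard-coding it on the agent (the streamed markdown example further down relies on this). Here is a minimal, self-contained sketch; the model string and prompt are only illustrations, not part of the example code:

```py
# Illustration only -- not part of sql_gen.py.
from pydantic_ai import Agent

# No model is configured on the agent itself...
agent = Agent(instructions='Reply with a single word.')

# ...so one is chosen per run instead (requires the matching API key, e.g. OPENAI_API_KEY).
result = agent.run_sync('Which SQL keyword starts a query?', model='openai:gpt-4o')
print(result.output)
```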
## Example Code [sql_gen.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/sql_gen.py) ```py """Example demonstrating how to use Pydantic AI to generate SQL queries based on user input. Run postgres with: mkdir postgres-data docker run --rm -e POSTGRES_PASSWORD=postgres -p 54320:5432 postgres Run with: uv run -m pydantic_ai_examples.sql_gen "show me logs from yesterday, with level 'error'" """ import asyncio import sys from collections.abc import AsyncGenerator from contextlib import asynccontextmanager from dataclasses import dataclass from datetime import date from typing import Annotated, Any, TypeAlias import asyncpg import logfire from annotated_types import MinLen from devtools import debug from pydantic import BaseModel, Field from pydantic_ai import Agent, ModelRetry, RunContext, format_as_xml # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_asyncpg() logfire.instrument_pydantic_ai() DB_SCHEMA = """ CREATE TABLE records ( created_at timestamptz, start_timestamp timestamptz, end_timestamp timestamptz, trace_id text, span_id text, parent_span_id text, level log_level, span_name text, message text, attributes_json_schema text, attributes jsonb, tags text[], is_exception boolean, otel_status_message text, service_name text ); """ SQL_EXAMPLES = [ { 'request': 'show me records where foobar is false', 'response': "SELECT * FROM records WHERE attributes->>'foobar' = false", }, { 'request': 'show me records where attributes include the key "foobar"', 'response': "SELECT * FROM records WHERE attributes ? 'foobar'", }, { 'request': 'show me records from yesterday', 'response': "SELECT * FROM records WHERE start_timestamp::date > CURRENT_TIMESTAMP - INTERVAL '1 day'", }, { 'request': 'show me error records with the tag "foobar"', 'response': "SELECT * FROM records WHERE level = 'error' and 'foobar' = ANY(tags)", }, ] @dataclass class Deps: conn: asyncpg.Connection class Success(BaseModel): """Response when SQL could be successfully generated.""" sql_query: Annotated[str, MinLen(1)] explanation: str = Field( '', description='Explanation of the SQL query, as markdown' ) class InvalidRequest(BaseModel): """Response the user input didn't include enough information to generate SQL.""" error_message: str Response: TypeAlias = Success | InvalidRequest agent = Agent[Deps, Response]( 'google-gla:gemini-1.5-flash', # Type ignore while we wait for PEP-0747, nonetheless unions will work fine everywhere else output_type=Response, # type: ignore deps_type=Deps, ) @agent.system_prompt async def system_prompt() -> str: return f"""\ Given the following PostgreSQL table of records, your job is to write a SQL query that suits the user's request. 
Database schema: {DB_SCHEMA} today's date = {date.today()} {format_as_xml(SQL_EXAMPLES)} """ @agent.output_validator async def validate_output(ctx: RunContext[Deps], output: Response) -> Response: if isinstance(output, InvalidRequest): return output # gemini often adds extraneous backslashes to SQL output.sql_query = output.sql_query.replace('\\', '') if not output.sql_query.upper().startswith('SELECT'): raise ModelRetry('Please create a SELECT query') try: await ctx.deps.conn.execute(f'EXPLAIN {output.sql_query}') except asyncpg.exceptions.PostgresError as e: raise ModelRetry(f'Invalid query: {e}') from e else: return output async def main(): if len(sys.argv) == 1: prompt = 'show me logs from yesterday, with level "error"' else: prompt = sys.argv[1] async with database_connect( 'postgresql://postgres:postgres@localhost:54320', 'pydantic_ai_sql_gen' ) as conn: deps = Deps(conn) result = await agent.run(prompt, deps=deps) debug(result.output) # pyright: reportUnknownMemberType=false # pyright: reportUnknownVariableType=false @asynccontextmanager async def database_connect(server_dsn: str, database: str) -> AsyncGenerator[Any, None]: with logfire.span('check and create DB'): conn = await asyncpg.connect(server_dsn) try: db_exists = await conn.fetchval( 'SELECT 1 FROM pg_database WHERE datname = $1', database ) if not db_exists: await conn.execute(f'CREATE DATABASE {database}') finally: await conn.close() conn = await asyncpg.connect(f'{server_dsn}/{database}') try: with logfire.span('create schema'): async with conn.transaction(): if not db_exists: await conn.execute( "CREATE TYPE log_level AS ENUM ('debug', 'info', 'warning', 'error', 'critical')" ) await conn.execute(DB_SCHEMA) yield conn finally: await conn.close() if __name__ == '__main__': asyncio.run(main()) ``` This example shows how to stream markdown from an agent, using the [`rich`](https://github.com/Textualize/rich) library to highlight the output in the terminal. It'll run the example with both OpenAI and Google Gemini models if the required environment variables are set. Demonstrates: - [streaming text responses](../../output/#streaming-text) ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.stream_markdown ``` ```bash uv run -m pydantic_ai_examples.stream_markdown ``` ## Example Code [stream_markdown.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/stream_markdown.py) ```py """This example shows how to stream markdown from an agent, using the `rich` library to display the markdown. 
Run with: uv run -m pydantic_ai_examples.stream_markdown """ import asyncio import os import logfire from rich.console import Console, ConsoleOptions, RenderResult from rich.live import Live from rich.markdown import CodeBlock, Markdown from rich.syntax import Syntax from rich.text import Text from pydantic_ai import Agent from pydantic_ai.models import KnownModelName # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() agent = Agent() # models to try, and the appropriate env var models: list[tuple[KnownModelName, str]] = [ ('google-gla:gemini-2.0-flash', 'GEMINI_API_KEY'), ('openai:gpt-4o-mini', 'OPENAI_API_KEY'), ('groq:llama-3.3-70b-versatile', 'GROQ_API_KEY'), ] async def main(): prettier_code_blocks() console = Console() prompt = 'Show me a short example of using Pydantic.' console.log(f'Asking: {prompt}...', style='cyan') for model, env_var in models: if env_var in os.environ: console.log(f'Using model: {model}') with Live('', console=console, vertical_overflow='visible') as live: async with agent.run_stream(prompt, model=model) as result: async for message in result.stream_output(): live.update(Markdown(message)) console.log(result.usage()) else: console.log(f'{model} requires {env_var} to be set.') def prettier_code_blocks(): """Make rich code blocks prettier and easier to copy. From https://github.com/samuelcolvin/aicli/blob/v0.8.0/samuelcolvin_aicli.py#L22 """ class SimpleCodeBlock(CodeBlock): def __rich_console__( self, console: Console, options: ConsoleOptions ) -> RenderResult: code = str(self.text).rstrip() yield Text(self.lexer_name, style='dim') yield Syntax( code, self.lexer_name, theme=self.theme, background_color='default', word_wrap=True, ) yield Text(f'/{self.lexer_name}', style='dim') Markdown.elements['fence'] = SimpleCodeBlock if __name__ == '__main__': asyncio.run(main()) ``` Information about whales — an example of streamed structured response validation. Demonstrates: - [streaming structured output](../../output/#streaming-structured-output) This script streams structured responses from GPT-4 about whales, validates the data and displays it as a dynamic table using [`rich`](https://github.com/Textualize/rich) as the data is received. ## Running the Example With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.stream_whales ``` ```bash uv run -m pydantic_ai_examples.stream_whales ``` Should give an output like this: ## Example Code [stream_whales.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/stream_whales.py) ```py """Information about whales — an example of streamed structured response validation. This script streams structured responses from GPT-4 about whales, validates the data and displays it as a dynamic table using Rich as the data is received. 
Run with: uv run -m pydantic_ai_examples.stream_whales """ from typing import Annotated import logfire from pydantic import Field from rich.console import Console from rich.live import Live from rich.table import Table from typing_extensions import NotRequired, TypedDict from pydantic_ai import Agent # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() class Whale(TypedDict): name: str length: Annotated[ float, Field(description='Average length of an adult whale in meters.') ] weight: NotRequired[ Annotated[ float, Field(description='Average weight of an adult whale in kilograms.', ge=50), ] ] ocean: NotRequired[str] description: NotRequired[Annotated[str, Field(description='Short Description')]] agent = Agent('openai:gpt-4', output_type=list[Whale]) async def main(): console = Console() with Live('\n' * 36, console=console) as live: console.print('Requesting data...', style='cyan') async with agent.run_stream( 'Generate me details of 5 species of Whale.' ) as result: console.print('Response:', style='green') async for whales in result.stream_output(debounce_by=0.01): table = Table( title='Species of Whale', caption='Streaming Structured responses from GPT-4', width=120, ) table.add_column('ID', justify='right') table.add_column('Name') table.add_column('Avg. Length (m)', justify='right') table.add_column('Avg. Weight (kg)', justify='right') table.add_column('Ocean') table.add_column('Description', justify='right') for wid, whale in enumerate(whales, start=1): table.add_row( str(wid), whale['name'], f'{whale["length"]:0.0f}', f'{w:0.0f}' if (w := whale.get('weight')) else '…', whale.get('ocean') or '…', whale.get('description') or '…', ) live.update(table) if __name__ == '__main__': import asyncio asyncio.run(main()) ``` Example of Pydantic AI with multiple tools which the LLM needs to call in turn to answer a question. Demonstrates: - [tools](../../tools/) - [agent dependencies](../../dependencies/) - [streaming text responses](../../output/#streaming-text) - Building a [Gradio](https://www.gradio.app/) UI for the agent In this case the idea is a "weather" agent — the user can ask for the weather in multiple locations, the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather for those locations. ## Running the Example To run this example properly, you might want to add two extra API keys **(Note if either key is missing, the code will fall back to dummy data, so they're not required)**: - A weather API key from [tomorrow.io](https://www.tomorrow.io/weather-api/) set via `WEATHER_API_KEY` - A geocoding API key from [geocode.maps.co](https://geocode.maps.co/) set via `GEO_API_KEY` With [dependencies installed and environment variables set](../setup/#usage), run: ```bash python -m pydantic_ai_examples.weather_agent ``` ```bash uv run -m pydantic_ai_examples.weather_agent ``` ## Example Code [weather_agent.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/weather_agent.py) ```py """Example of Pydantic AI with multiple tools which the LLM needs to call in turn to answer a question. In this case the idea is a "weather" agent — the user can ask for the weather in multiple cities, the agent will use the `get_lat_lng` tool to get the latitude and longitude of the locations, then use the `get_weather` tool to get the weather. 
Run with: uv run -m pydantic_ai_examples.weather_agent """ from __future__ import annotations as _annotations import asyncio from dataclasses import dataclass from typing import Any import logfire from httpx import AsyncClient from pydantic import BaseModel from pydantic_ai import Agent, RunContext # 'if-token-present' means nothing will be sent (and the example will work) if you don't have logfire configured logfire.configure(send_to_logfire='if-token-present') logfire.instrument_pydantic_ai() @dataclass class Deps: client: AsyncClient weather_agent = Agent( 'openai:gpt-4.1-mini', # 'Be concise, reply with one sentence.' is enough for some models (like openai) to use # the below tools appropriately, but others like anthropic and gemini require a bit more direction. instructions='Be concise, reply with one sentence.', deps_type=Deps, retries=2, ) class LatLng(BaseModel): lat: float lng: float @weather_agent.tool async def get_lat_lng(ctx: RunContext[Deps], location_description: str) -> LatLng: """Get the latitude and longitude of a location. Args: ctx: The context. location_description: A description of a location. """ # NOTE: the response here will be random, and is not related to the location description. r = await ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/latlng', params={'location': location_description}, ) r.raise_for_status() return LatLng.model_validate_json(r.content) @weather_agent.tool async def get_weather(ctx: RunContext[Deps], lat: float, lng: float) -> dict[str, Any]: """Get the weather at a location. Args: ctx: The context. lat: Latitude of the location. lng: Longitude of the location. """ # NOTE: the responses here will be random, and are not related to the lat and lng. temp_response, descr_response = await asyncio.gather( ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/number', params={'min': 10, 'max': 30}, ), ctx.deps.client.get( 'https://demo-endpoints.pydantic.workers.dev/weather', params={'lat': lat, 'lng': lng}, ), ) temp_response.raise_for_status() descr_response.raise_for_status() return { 'temperature': f'{temp_response.text} °C', 'description': descr_response.text, } async def main(): async with AsyncClient() as client: logfire.instrument_httpx(client, capture_all=True) deps = Deps(client=client) result = await weather_agent.run( 'What is the weather like in London and in Wiltshire?', deps=deps ) print('Response:', result.output) if __name__ == '__main__': asyncio.run(main()) ``` ## Running the UI You can build multi-turn chat applications for your agent with [Gradio](https://www.gradio.app/), a framework for building AI web applications entirely in python. Gradio comes with built-in chat components and agent support so the entire UI will be implemented in a single python file! Here's what the UI looks like for the weather agent: ```bash pip install gradio>=5.9.0 python/uv-run -m pydantic_ai_examples.weather_agent_gradio ``` ## UI Code [weather_agent_gradio.py](https://github.com/pydantic/pydantic-ai/blob/main/examples/pydantic_ai_examples/weather_agent_gradio.py) ```py from __future__ import annotations as _annotations import json from httpx import AsyncClient from pydantic import BaseModel from pydantic_ai import ToolCallPart, ToolReturnPart from pydantic_ai_examples.weather_agent import Deps, weather_agent try: import gradio as gr except ImportError as e: raise ImportError( 'Please install gradio with `pip install gradio`. You must use python>=3.10.' 
) from e TOOL_TO_DISPLAY_NAME = {'get_lat_lng': 'Geocoding API', 'get_weather': 'Weather API'} client = AsyncClient() deps = Deps(client=client) async def stream_from_agent(prompt: str, chatbot: list[dict], past_messages: list): chatbot.append({'role': 'user', 'content': prompt}) yield gr.Textbox(interactive=False, value=''), chatbot, gr.skip() async with weather_agent.run_stream( prompt, deps=deps, message_history=past_messages ) as result: for message in result.new_messages(): for call in message.parts: if isinstance(call, ToolCallPart): call_args = call.args_as_json_str() metadata = { 'title': f'🛠️ Using {TOOL_TO_DISPLAY_NAME[call.tool_name]}', } if call.tool_call_id is not None: metadata['id'] = call.tool_call_id gr_message = { 'role': 'assistant', 'content': 'Parameters: ' + call_args, 'metadata': metadata, } chatbot.append(gr_message) if isinstance(call, ToolReturnPart): for gr_message in chatbot: if ( gr_message.get('metadata', {}).get('id', '') == call.tool_call_id ): if isinstance(call.content, BaseModel): json_content = call.content.model_dump_json() else: json_content = json.dumps(call.content) gr_message['content'] += f'\nOutput: {json_content}' yield gr.skip(), chatbot, gr.skip() chatbot.append({'role': 'assistant', 'content': ''}) async for message in result.stream_text(): chatbot[-1]['content'] = message yield gr.skip(), chatbot, gr.skip() past_messages = result.all_messages() yield gr.Textbox(interactive=True), gr.skip(), past_messages async def handle_retry(chatbot, past_messages: list, retry_data: gr.RetryData): new_history = chatbot[: retry_data.index] previous_prompt = chatbot[retry_data.index]['content'] past_messages = past_messages[: retry_data.index] async for update in stream_from_agent(previous_prompt, new_history, past_messages): yield update def undo(chatbot, past_messages: list, undo_data: gr.UndoData): new_history = chatbot[: undo_data.index] past_messages = past_messages[: undo_data.index] return chatbot[undo_data.index]['content'], new_history, past_messages def select_data(message: gr.SelectData) -> str: return message.value['text'] with gr.Blocks() as demo: gr.HTML( """

Weather Assistant

This assistant answers your weather questions.

""" ) past_messages = gr.State([]) chatbot = gr.Chatbot( label='Packing Assistant', type='messages', avatar_images=(None, 'https://ai.pydantic.dev/img/logo-white.svg'), examples=[ {'text': 'What is the weather like in Miami?'}, {'text': 'What is the weather like in London?'}, ], ) with gr.Row(): prompt = gr.Textbox( lines=1, show_label=False, placeholder='What is the weather like in New York City?', ) generation = prompt.submit( stream_from_agent, inputs=[prompt, chatbot, past_messages], outputs=[prompt, chatbot, past_messages], ) chatbot.example_select(select_data, None, [prompt]) chatbot.retry( handle_retry, [chatbot, past_messages], [prompt, chatbot, past_messages] ) chatbot.undo(undo, [chatbot, past_messages], [prompt, chatbot, past_messages]) if __name__ == '__main__': demo.launch() ```