pydantic_ai.settings
ModelSettings
Bases: TypedDict
Settings to configure an LLM.
Here we include only settings which apply to multiple models / model providers.
Source code in pydantic_ai_slim/pydantic_ai/settings.py
max_tokens
instance-attribute
max_tokens: int
The maximum number of tokens to generate before stopping.
Supported by:
- Gemini
- Anthropic
- OpenAI
- Groq
temperature
instance-attribute
temperature: float
Amount of randomness injected into the response.
Use temperature closer to 0.0 for analytical / multiple choice, and closer to a model's maximum temperature for creative and generative tasks.
Note that even with a temperature of 0.0, the results will not be fully deterministic.
Supported by:
- Gemini
- Anthropic
- OpenAI
- Groq
top_p
instance-attribute
top_p: float
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass.
So 0.1 means only the tokens comprising the top 10% probability mass are considered.
You should either alter temperature or top_p, but not both.
Supported by:
- Gemini
- Anthropic
- OpenAI
- Groq
timeout
instance-attribute
timeout: float | Timeout
Override the client-level default timeout for a request, in seconds.
Supported by:
- Gemini
- Anthropic
- OpenAI
- Groq
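A minimal sketch of passing these settings to a run. The model name and prompt are illustrative, and this assumes run_sync accepts a model_settings argument; since ModelSettings is a TypedDict, a plain dict literal is accepted:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

# Cap output length and make sampling as deterministic as the model allows.
result = agent.run_sync(
    'Summarise the rules of chess in one sentence.',
    model_settings={'max_tokens': 100, 'temperature': 0.0},
)
print(result.data)
```

Per the guidance above, set either temperature or top_p for a given run, not both.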
UsageLimits
dataclass
Limits on model usage.
The request count is tracked by pydantic_ai, and the request limit is checked before each request to the model. Token counts are provided in responses from the model, and the token limits are checked after each response.
Each of the limits can be set to None to disable that limit.
Source code in pydantic_ai_slim/pydantic_ai/settings.py
request_limit
class-attribute
instance-attribute
request_limit: int | None = 50
The maximum number of requests allowed to the model.
request_tokens_limit
class-attribute
instance-attribute
request_tokens_limit: int | None = None
The maximum number of tokens allowed in requests to the model.
response_tokens_limit
class-attribute
instance-attribute
response_tokens_limit: int | None = None
The maximum number of tokens allowed in responses from the model.
total_tokens_limit
class-attribute
instance-attribute
total_tokens_limit: int | None = None
The maximum number of tokens allowed in requests and responses combined.
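A sketch of applying these limits to a run, assuming the usage_limits keyword on Agent.run_sync (the model name and prompt are illustrative):

```python
from pydantic_ai import Agent
from pydantic_ai.settings import UsageLimits

agent = Agent('openai:gpt-4o')

# Allow at most 5 requests and 500 response tokens for this run;
# limits left at None (request_tokens_limit, total_tokens_limit) stay disabled.
result = agent.run_sync(
    'List three prime numbers.',
    usage_limits=UsageLimits(request_limit=5, response_tokens_limit=500),
)
```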
has_token_limits
has_token_limits() -> bool
Returns True if this instance places any limits on token counts.
If this returns False, the check_tokens method will never raise an error.
This is useful because if we have token limits, we need to check them after receiving each streamed message. If there are no limits, we can skip that processing in the streaming response iterator.
Source code in pydantic_ai_slim/pydantic_ai/settings.py
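A quick illustration of the streaming short-circuit described above: the default instance limits only the request count, so it reports no token limits:

```python
from pydantic_ai.settings import UsageLimits

# Default instance: request_limit=50, all token limits None.
assert not UsageLimits().has_token_limits()

# Setting any token limit flips the check on.
assert UsageLimits(total_tokens_limit=1_000).has_token_limits()
```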
check_before_request
check_before_request(usage: Usage) -> None
Raises a UsageLimitExceeded exception if the next request would exceed the request_limit.
Source code in pydantic_ai_slim/pydantic_ai/settings.py
check_tokens
check_tokens(usage: Usage) -> None
Raises a UsageLimitExceeded exception if the usage exceeds any of the token limits.
Source code in pydantic_ai_slim/pydantic_ai/settings.py
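A sketch of both checks tripping. The import paths for Usage and UsageLimitExceeded are assumptions here; verify them against your installed version:

```python
from pydantic_ai.exceptions import UsageLimitExceeded  # import path is an assumption
from pydantic_ai.settings import UsageLimits
from pydantic_ai.usage import Usage  # import path is an assumption

limits = UsageLimits(request_limit=1, total_tokens_limit=100)
usage = Usage(requests=1, request_tokens=80, response_tokens=40, total_tokens=120)

try:
    # requests already equals request_limit, so the next request would exceed it.
    limits.check_before_request(usage)
except UsageLimitExceeded as e:
    print(e)

try:
    # total_tokens (120) exceeds total_tokens_limit (100).
    limits.check_tokens(usage)
except UsageLimitExceeded as e:
    print(e)
```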