pydantic_ai.embeddings
EmbeddingModel
Bases: ABC
Abstract base class for embedding models.
Implement this class to create a custom embedding model. For most use cases, use one of the built-in implementations:
- OpenAIEmbeddingModel
- CohereEmbeddingModel
- GoogleEmbeddingModel
- BedrockEmbeddingModel
- SentenceTransformerEmbeddingModel
__init__
__init__(
*, settings: EmbeddingSettings | None = None
) -> None
Initialize the model with optional settings.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| settings | EmbeddingSettings \| None | Model-specific settings that will be used as defaults for this model. | None |
system
abstractmethod
property
system: str
The embedding model provider/system identifier (e.g., 'openai', 'cohere').
embed
abstractmethod
async
embed(
inputs: str | Sequence[str],
*,
input_type: EmbedInputType,
settings: EmbeddingSettings | None = None
) -> EmbeddingResult
Generate embeddings for the given inputs.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| inputs | str \| Sequence[str] | A single string or sequence of strings to embed. | required |
| input_type | EmbedInputType | Whether the inputs are queries or documents. | required |
| settings | EmbeddingSettings \| None | Optional settings to override the model's defaults. | None |

Returns:

| Type | Description |
|---|---|
| EmbeddingResult | An EmbeddingResult containing the embeddings and metadata. |
prepare_embed
prepare_embed(
inputs: str | Sequence[str],
settings: EmbeddingSettings | None = None,
) -> tuple[list[str], EmbeddingSettings]
Prepare the inputs and settings for embedding.
This method normalizes inputs to a list and merges settings.
Subclasses should call this at the start of their embed() implementation.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| inputs | str \| Sequence[str] | A single string or sequence of strings. | required |
| settings | EmbeddingSettings \| None | Optional settings to merge with defaults. | None |

Returns:

| Type | Description |
|---|---|
| tuple[list[str], EmbeddingSettings] | A tuple of (normalized inputs list, merged settings). |
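To make the subclassing flow concrete, here is a minimal sketch of a custom model that calls prepare_embed() before doing its own work. The ToyEmbeddingModel name, the character-code "vectors", and the assumption that EmbeddingResult can be constructed from exactly these keyword arguments are illustrative only, not part of the documented API.

```python
from collections.abc import Sequence

from pydantic_ai.embeddings import (
    EmbedInputType,
    EmbeddingModel,
    EmbeddingResult,
    EmbeddingSettings,
)


class ToyEmbeddingModel(EmbeddingModel):
    """Illustrative model that 'embeds' text as character codes."""

    @property
    def model_name(self) -> str:
        return 'toy-embedding'

    @property
    def system(self) -> str:
        return 'toy'

    async def embed(
        self,
        inputs: str | Sequence[str],
        *,
        input_type: EmbedInputType,
        settings: EmbeddingSettings | None = None,
    ) -> EmbeddingResult:
        # Normalize inputs to a list and merge per-call settings with the defaults.
        texts, _merged = self.prepare_embed(inputs, settings)
        vectors = [[float(ord(c)) for c in text[:8]] for text in texts]
        # The exact set of fields EmbeddingResult requires may differ from this sketch.
        return EmbeddingResult(
            embeddings=vectors,
            inputs=texts,
            input_type=input_type,
            model_name=self.model_name,
            provider_name=self.system,
        )
```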
max_input_tokens
async
max_input_tokens() -> int | None
Get the maximum number of tokens that can be input to the model.
Returns:

| Type | Description |
|---|---|
| int \| None | The maximum token count, or None if the limit is unknown. |
count_tokens
async
count_tokens(text: str) -> int
Count the number of tokens in the given text.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The text to tokenize and count. | required |

Returns:

| Type | Description |
|---|---|
| int | The number of tokens. |

Raises:

| Type | Description |
|---|---|
| NotImplementedError | If the model doesn't support token counting. |
| UserError | If the model or tokenizer is not supported. |
InstrumentedEmbeddingModel
dataclass
Bases: WrapperEmbeddingModel
Embedding model which wraps another model so that requests are instrumented with OpenTelemetry.
See the Debugging and Monitoring guide for more info.
instrumentation_settings
instance-attribute
instrumentation_settings: InstrumentationSettings = (
options or InstrumentationSettings()
)
Instrumentation settings for this model.
instrument_embedding_model
instrument_embedding_model(
model: EmbeddingModel,
instrument: InstrumentationSettings | bool,
) -> EmbeddingModel
Instrument an embedding model with OpenTelemetry/logfire.
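A brief usage sketch (assumes OpenTelemetry/Logfire is already configured and that OpenAIEmbeddingModel is available):

```python
from pydantic_ai.embeddings.instrumented import instrument_embedding_model
from pydantic_ai.embeddings.openai import OpenAIEmbeddingModel

model = OpenAIEmbeddingModel('text-embedding-3-small')
# Passing True uses the default InstrumentationSettings; embed calls now emit spans.
instrumented_model = instrument_embedding_model(model, True)
```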
EmbeddingResult
dataclass
The result of an embedding operation.
This class contains the generated embeddings along with metadata about the operation, including the original inputs, model information, usage statistics, and timing.
Example:
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')

async def main():
    result = await embedder.embed_query('What is AI?')

    # Access embeddings by index
    print(len(result.embeddings[0]))
    #> 1536

    # Access embeddings by original input text
    print(result['What is AI?'] == result.embeddings[0])
    #> True

    # Check usage
    print(f'Tokens used: {result.usage.input_tokens}')
    #> Tokens used: 3
embeddings
instance-attribute
The computed embedding vectors, one per input text.
Each embedding is a sequence of floats representing the text in vector space.
input_type
instance-attribute
input_type: EmbedInputType
Whether the inputs were embedded as queries or documents.
model_name
instance-attribute
model_name: str
The name of the model that generated these embeddings.
provider_name
instance-attribute
provider_name: str
The name of the provider (e.g., 'openai', 'cohere').
timestamp
class-attribute
instance-attribute
When the embedding request was made.
usage
class-attribute
instance-attribute
usage: RequestUsage = field(default_factory=RequestUsage)
Token usage statistics for this request.
provider_details
class-attribute
instance-attribute
Provider-specific details from the response.
provider_response_id
class-attribute
instance-attribute
provider_response_id: str | None = None
Unique identifier for this response from the provider, if available.
__getitem__
__getitem__(item: int | str) -> Sequence[float]
Get the embedding for an input by index or by the original input text.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| item | int \| str | Either an integer index or the original input string. | required |

Returns:

| Type | Description |
|---|---|
| Sequence[float] | The embedding vector for the specified input. |

Raises:

| Type | Description |
|---|---|
| IndexError | If the index is out of range. |
| ValueError | If the string is not found in the inputs. |
cost
cost() -> PriceCalculation
Calculate the cost of the embedding request.
Uses genai-prices for pricing data.
Returns:

| Type | Description |
|---|---|
| PriceCalculation | A price calculation object with the total price. |

Raises:

| Type | Description |
|---|---|
| LookupError | If pricing data is not available for this model/provider. |
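A short usage sketch; since the exact attributes of genai-prices' PriceCalculation aren't documented here, the example just prints the returned object:

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')

async def main():
    result = await embedder.embed_documents(['first document', 'second document'])
    # PriceCalculation is provided by the genai-prices package.
    print(result.cost())
```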
EmbeddingSettings
Bases: TypedDict
Common settings for configuring embedding models.
These settings apply across multiple embedding model providers. Not all settings are supported by all models - check the specific model's documentation for details.
Provider-specific settings classes (e.g.,
OpenAIEmbeddingSettings,
CohereEmbeddingSettings)
extend this with additional provider-prefixed options.
dimensions
instance-attribute
dimensions: int
The number of dimensions for the output embeddings.
Supported by:
- OpenAI
- Cohere
- Sentence Transformers
- Bedrock
- VoyageAI
truncate
instance-attribute
truncate: bool
Whether to truncate inputs that exceed the model's context length.
Defaults to False. If True, inputs that are too long will be truncated.
If False, an error will be raised for inputs that exceed the context length.
For more control over truncation, you can use
max_input_tokens() and
count_tokens() to implement
your own truncation logic.
Provider-specific truncation settings (e.g., cohere_truncate, bedrock_cohere_truncate)
take precedence if specified.
Supported by:
- Cohere
- Bedrock (Cohere and Nova models)
- VoyageAI
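As a rough sketch of the manual truncation approach described above (assuming the model supports token counting, and using a crude characters-per-token heuristic rather than a real tokenizer):

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')


async def embed_with_manual_truncation(text: str):
    limit = await embedder.max_input_tokens()
    if limit is not None and await embedder.count_tokens(text) > limit:
        # Crude heuristic: assume roughly 4 characters per token.
        text = text[: limit * 4]
    return await embedder.embed_documents(text)
```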
extra_headers
instance-attribute
Extra headers to send to the model.
Supported by:
- OpenAI
- Cohere
extra_body
instance-attribute
extra_body: object
Extra body to send to the model.
Supported by:
- OpenAI
- Cohere
merge_embedding_settings
merge_embedding_settings(
base: EmbeddingSettings | None,
overrides: EmbeddingSettings | None,
) -> EmbeddingSettings | None
Merge two sets of embedding settings, with overrides taking precedence.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| base | EmbeddingSettings \| None | Base settings (typically from the embedder or model). | required |
| overrides | EmbeddingSettings \| None | Settings that should override the base (typically per-call settings). | required |

Returns:

| Type | Description |
|---|---|
| EmbeddingSettings \| None | Merged settings, or None if both are None. |
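A small usage sketch (assuming the function can be imported from pydantic_ai.embeddings.settings):

```python
from pydantic_ai.embeddings.settings import merge_embedding_settings

base = {'dimensions': 512, 'truncate': True}
overrides = {'dimensions': 256}

merged = merge_embedding_settings(base, overrides)
assert merged is not None
# The override value wins for keys present in both settings dicts.
print(merged['dimensions'])
#> 256
```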
TestEmbeddingModel
dataclass
Bases: EmbeddingModel
A mock embedding model for testing.
This model returns deterministic embeddings (all 1.0 values) and tracks
the settings used in the last call via the last_settings attribute.
Example:
from pydantic_ai import Embedder
from pydantic_ai.embeddings import TestEmbeddingModel

test_model = TestEmbeddingModel()
embedder = Embedder('openai:text-embedding-3-small')

async def main():
    with embedder.override(model=test_model):
        await embedder.embed_query('test')
    assert test_model.last_settings is not None
last_settings
class-attribute
instance-attribute
last_settings: EmbeddingSettings | None = None
The settings used in the most recent embed call.
__init__
__init__(
model_name: str = "test",
*,
provider_name: str = "test",
dimensions: int = 8,
settings: EmbeddingSettings | None = None
)
Initialize the test embedding model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | The model name to report in results. | 'test' |
| provider_name | str | The provider name to report in results. | 'test' |
| dimensions | int | The number of dimensions for the generated embeddings. | 8 |
| settings | EmbeddingSettings \| None | Optional default settings for the model. | None |
WrapperEmbeddingModel
dataclass
Bases: EmbeddingModel
Base class for embedding models that wrap another model.
Use this as a base class to create custom embedding model wrappers that modify behavior (e.g., caching, logging, rate limiting) while delegating to an underlying model.
By default, all methods are passed through to the wrapped model. Override specific methods to customize behavior.
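As a sketch of the wrapper pattern, here is a hypothetical logging wrapper; the LoggingEmbeddingModel name is made up for the example, and the import locations are assumptions.

```python
from collections.abc import Sequence

from pydantic_ai.embeddings import EmbedInputType, EmbeddingResult, EmbeddingSettings
from pydantic_ai.embeddings.wrapper import WrapperEmbeddingModel


class LoggingEmbeddingModel(WrapperEmbeddingModel):
    """Logs each embed call, then delegates to the wrapped model."""

    async def embed(
        self,
        inputs: str | Sequence[str],
        *,
        input_type: EmbedInputType,
        settings: EmbeddingSettings | None = None,
    ) -> EmbeddingResult:
        count = 1 if isinstance(inputs, str) else len(inputs)
        print(f'embedding {count} input(s) as {input_type!r}')
        return await self.wrapped.embed(inputs, input_type=input_type, settings=settings)


# Strings are inferred to an EmbeddingModel, so a model name works here too.
model = LoggingEmbeddingModel('openai:text-embedding-3-small')
```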
wrapped
instance-attribute
wrapped: EmbeddingModel = (
infer_embedding_model(wrapped)
if isinstance(wrapped, str)
else wrapped
)
The underlying embedding model being wrapped.
__init__
__init__(wrapped: EmbeddingModel | str)
Initialize the wrapper with an embedding model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| wrapped | EmbeddingModel \| str | The model to wrap. Can be an EmbeddingModel instance or a model name string to infer. | required |
settings
property
settings: EmbeddingSettings | None
Get the settings from the wrapped embedding model.
KnownEmbeddingModelName
module-attribute
KnownEmbeddingModelName = TypeAliasType(
"KnownEmbeddingModelName",
Literal[
"google-gla:gemini-embedding-001",
"google-vertex:gemini-embedding-001",
"google-vertex:text-embedding-005",
"google-vertex:text-multilingual-embedding-002",
"openai:text-embedding-ada-002",
"openai:text-embedding-3-small",
"openai:text-embedding-3-large",
"cohere:embed-v4.0",
"cohere:embed-english-v3.0",
"cohere:embed-english-light-v3.0",
"cohere:embed-multilingual-v3.0",
"cohere:embed-multilingual-light-v3.0",
"voyageai:voyage-4-large",
"voyageai:voyage-4",
"voyageai:voyage-4-lite",
"voyageai:voyage-3-large",
"voyageai:voyage-3.5",
"voyageai:voyage-3.5-lite",
"voyageai:voyage-code-3",
"voyageai:voyage-finance-2",
"voyageai:voyage-law-2",
"voyageai:voyage-code-2",
"bedrock:amazon.titan-embed-text-v1",
"bedrock:amazon.titan-embed-text-v2:0",
"bedrock:cohere.embed-english-v3",
"bedrock:cohere.embed-multilingual-v3",
"bedrock:cohere.embed-v4:0",
"bedrock:amazon.nova-2-multimodal-embeddings-v1:0",
],
)
Known model names that can be used with the model parameter of Embedder.
KnownEmbeddingModelName is provided as a concise way to specify an embedding model.
infer_embedding_model
infer_embedding_model(
model: EmbeddingModel | KnownEmbeddingModelName | str,
*,
provider_factory: Callable[
[str], Provider[Any]
] = infer_provider
) -> EmbeddingModel
Infer the model from the name.
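For example (a short sketch):

```python
from pydantic_ai.embeddings import infer_embedding_model

# 'provider:model_name' strings are resolved to the matching EmbeddingModel implementation.
model = infer_embedding_model('openai:text-embedding-3-small')
print(model.system)
#> openai
```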
Embedder
dataclass
High-level interface for generating text embeddings.
The Embedder class provides a convenient way to generate vector embeddings from text
using various embedding model providers. It handles model inference, settings management,
and optional OpenTelemetry instrumentation.
Example:
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')

async def main():
    result = await embedder.embed_query('What is machine learning?')
    print(result.embeddings[0][:5])  # First 5 dimensions
    #> [1.0, 1.0, 1.0, 1.0, 1.0]
__init__
__init__(
model: EmbeddingModel | KnownEmbeddingModelName | str,
*,
settings: EmbeddingSettings | None = None,
defer_model_check: bool = True,
instrument: InstrumentationSettings | bool | None = None
) -> None
Initialize an Embedder.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model | EmbeddingModel \| KnownEmbeddingModelName \| str | The embedding model to use. Can be specified as an EmbeddingModel instance, a KnownEmbeddingModelName, or a 'provider:model_name' string such as 'openai:text-embedding-3-small'. | required |
| settings | EmbeddingSettings \| None | Optional default EmbeddingSettings used for all requests made with this embedder. | None |
| defer_model_check | bool | Whether to defer model validation until first use. Set to False to check the model name as soon as the embedder is created. | True |
| instrument | InstrumentationSettings \| bool \| None | OpenTelemetry instrumentation settings. Set to True to use the default settings, or pass an InstrumentationSettings instance to customize; if None, the value set by Embedder.instrument_all() is used. | None |
instrument
instance-attribute
instrument: InstrumentationSettings | bool | None = (
instrument
)
Options to automatically instrument with OpenTelemetry.
Set to True to use default instrumentation settings, which will use Logfire if it's configured.
Set to an instance of InstrumentationSettings to customize.
If this isn't set, then the last value set by
Embedder.instrument_all()
will be used, which defaults to False.
See the Debugging and Monitoring guide for more info.
instrument_all
staticmethod
instrument_all(
instrument: InstrumentationSettings | bool = True,
) -> None
Set the default instrumentation options for all embedders where instrument is not explicitly set.
This is useful for enabling instrumentation globally without modifying each embedder individually.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| instrument | InstrumentationSettings \| bool | Instrumentation settings to use as the default. Set to True to use the default settings, or pass an InstrumentationSettings instance to customize. | True |
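For example, a sketch that turns on instrumentation globally (assumes Logfire is installed and configured):

```python
import logfire

from pydantic_ai import Embedder

logfire.configure()
# All embedders created without an explicit `instrument` argument now use these defaults.
Embedder.instrument_all(True)

embedder = Embedder('openai:text-embedding-3-small')
```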
model
property
model: EmbeddingModel | KnownEmbeddingModelName | str
The embedding model used by this embedder.
override
override(
*,
model: (
EmbeddingModel
| KnownEmbeddingModelName
| str
| Unset
) = UNSET
) -> Iterator[None]
Context manager to temporarily override the embedding model.
Useful for testing or dynamically switching models.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model | EmbeddingModel \| KnownEmbeddingModelName \| str \| Unset | The embedding model to use within this context. | UNSET |
Example:
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')

async def main():
    # Temporarily use a different model
    with embedder.override(model='openai:text-embedding-3-large'):
        result = await embedder.embed_query('test')
        print(len(result.embeddings[0]))  # 3072 dimensions for large model
        #> 3072
embed_query
async
embed_query(
query: str | Sequence[str],
*,
settings: EmbeddingSettings | None = None
) -> EmbeddingResult
Embed one or more query texts.
Use this method when embedding search queries that will be compared against document embeddings. Some models optimize embeddings differently based on whether the input is a query or document.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| query | str \| Sequence[str] | A single query string or sequence of query strings to embed. | required |
| settings | EmbeddingSettings \| None | Optional settings to override the embedder's default settings for this call. | None |

Returns:

| Type | Description |
|---|---|
| EmbeddingResult | An EmbeddingResult containing the embeddings and metadata about the operation. |
embed_documents
async
embed_documents(
documents: str | Sequence[str],
*,
settings: EmbeddingSettings | None = None
) -> EmbeddingResult
Embed one or more document texts.
Use this method when embedding documents that will be stored and later searched against. Some models optimize embeddings differently based on whether the input is a query or document.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| documents | str \| Sequence[str] | A single document string or sequence of document strings to embed. | required |
| settings | EmbeddingSettings \| None | Optional settings to override the embedder's default settings for this call. | None |

Returns:

| Type | Description |
|---|---|
| EmbeddingResult | An EmbeddingResult containing the embeddings and metadata about the operation. |
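To show how embed_documents() and embed_query() are typically combined, here is a small retrieval sketch; the cosine similarity helper is plain Python, not part of the API.

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')

documents = ['Cats are small felines.', 'Python is a popular programming language.']


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


async def main():
    doc_result = await embedder.embed_documents(documents)
    query_result = await embedder.embed_query('Which language should I learn?')

    query_vec = query_result.embeddings[0]
    scores = [cosine(query_vec, doc_vec) for doc_vec in doc_result.embeddings]
    # The highest-scoring document is the best match for the query.
    print(documents[scores.index(max(scores))])
```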
embed
async
embed(
inputs: str | Sequence[str],
*,
input_type: EmbedInputType,
settings: EmbeddingSettings | None = None
) -> EmbeddingResult
Embed text inputs with explicit input type specification.
This is the low-level embedding method. For most use cases, prefer
embed_query() or
embed_documents().
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| inputs | str \| Sequence[str] | A single string or sequence of strings to embed. | required |
| input_type | EmbedInputType | The type of input, either 'query' or 'document'. | required |
| settings | EmbeddingSettings \| None | Optional settings to override the embedder's default settings for this call. | None |

Returns:

| Type | Description |
|---|---|
| EmbeddingResult | An EmbeddingResult containing the embeddings and metadata about the operation. |
max_input_tokens
async
max_input_tokens() -> int | None
Get the maximum number of tokens the model can accept as input.
Returns:

| Type | Description |
|---|---|
| int \| None | The maximum token count, or None if the limit is unknown. |
count_tokens
async
count_tokens(text: str) -> int
Count the number of tokens in the given text.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The text to tokenize and count. | required |

Returns:

| Type | Description |
|---|---|
| int | The number of tokens in the text. |

Raises:

| Type | Description |
|---|---|
| NotImplementedError | If the model doesn't support token counting. |
| UserError | If the model or tokenizer is not supported. |
embed_query_sync
embed_query_sync(
query: str | Sequence[str],
*,
settings: EmbeddingSettings | None = None
) -> EmbeddingResult
Synchronous version of embed_query().
embed_documents_sync
embed_documents_sync(
documents: str | Sequence[str],
*,
settings: EmbeddingSettings | None = None
) -> EmbeddingResult
Synchronous version of embed_documents().
embed_sync
embed_sync(
inputs: str | Sequence[str],
*,
input_type: EmbedInputType,
settings: EmbeddingSettings | None = None
) -> EmbeddingResult
Synchronous version of embed().
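The _sync variants are handy outside of an async context; a short sketch:

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')

# Blocks until the embeddings are returned; no event loop required.
result = embedder.embed_query_sync('What is AI?')
print(len(result.embeddings))
#> 1
```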
max_input_tokens_sync
max_input_tokens_sync() -> int | None
Synchronous version of max_input_tokens().
count_tokens_sync
count_tokens_sync(text: str) -> int
Synchronous version of count_tokens().
EmbedInputType
module-attribute
EmbedInputType = Literal['query', 'document']
The type of input to the embedding model.
- 'query': Text that will be used as a search query
- 'document': Text that will be stored and searched against
Some embedding models optimize differently for queries vs documents.
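For example, the low-level Embedder.embed() method takes the input type explicitly (a sketch):

```python
from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')

async def main():
    # 'document' marks the texts as corpus entries, equivalent to embed_documents().
    result = await embedder.embed(['Some text to index.'], input_type='document')
    print(result.input_type)
    #> document
```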
OpenAIEmbeddingModelName
module-attribute
OpenAIEmbeddingModelName = str | EmbeddingModel
Possible OpenAI embeddings model names.
See the OpenAI embeddings documentation for available models.
OpenAIEmbeddingSettings
Bases: EmbeddingSettings
Settings used for an OpenAI embedding model request.
All fields from EmbeddingSettings are supported.
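For example, a sketch that requests lower-dimensional vectors (the text-embedding-3 models accept a dimensions override):

```python
from pydantic_ai.embeddings.openai import OpenAIEmbeddingModel, OpenAIEmbeddingSettings

model = OpenAIEmbeddingModel(
    'text-embedding-3-small',
    # Used as defaults for every request made with this model.
    settings=OpenAIEmbeddingSettings(dimensions=256),
)
```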
OpenAIEmbeddingModel
dataclass
Bases: EmbeddingModel
OpenAI embedding model implementation.
This model works with OpenAI's embeddings API and any OpenAI-compatible providers.
Example:
from pydantic_ai.embeddings.openai import OpenAIEmbeddingModel
from pydantic_ai.providers.openai import OpenAIProvider

# Using OpenAI directly
model = OpenAIEmbeddingModel('text-embedding-3-small')

# Using an OpenAI-compatible provider
model = OpenAIEmbeddingModel(
    'text-embedding-3-small',
    provider=OpenAIProvider(base_url='https://my-provider.com/v1'),
)
__init__
__init__(
model_name: OpenAIEmbeddingModelName,
*,
provider: (
OpenAIEmbeddingsCompatibleProvider
| Literal["openai"]
| Provider[AsyncOpenAI]
) = "openai",
settings: EmbeddingSettings | None = None
)
Initialize an OpenAI embedding model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | OpenAIEmbeddingModelName | The name of the OpenAI model to use. See OpenAI's embedding models for available options. | required |
| provider | OpenAIEmbeddingsCompatibleProvider \| Literal['openai'] \| Provider[AsyncOpenAI] | The provider to use for authentication and API access. Can be a provider name string or an instantiated Provider. See OpenAI-compatible providers for a list of supported providers. | 'openai' |
| settings | EmbeddingSettings \| None | Model-specific settings that will be used as defaults for this model. | None |
LatestCohereEmbeddingModelNames
module-attribute
LatestCohereEmbeddingModelNames = Literal[
"embed-v4.0",
"embed-english-v3.0",
"embed-english-light-v3.0",
"embed-multilingual-v3.0",
"embed-multilingual-light-v3.0",
]
Latest Cohere embeddings models.
See the Cohere Embed documentation for available models and their capabilities.
CohereEmbeddingModelName
module-attribute
CohereEmbeddingModelName = (
str | LatestCohereEmbeddingModelNames
)
Possible Cohere embeddings model names.
CohereEmbeddingSettings
Bases: EmbeddingSettings
Settings used for a Cohere embedding model request.
All fields from EmbeddingSettings are supported,
plus Cohere-specific settings prefixed with cohere_.
cohere_input_type
instance-attribute
cohere_input_type: EmbedInputType
The Cohere-specific input type for the embedding.
Overrides the standard input_type argument. Options include:
'search_query', 'search_document', 'classification', 'clustering', and 'image'.
cohere_truncate
instance-attribute
cohere_truncate: V2EmbedRequestTruncate
The truncation strategy to use:
- 'NONE' (default): Raise an error if input exceeds max tokens.
- 'END': Truncate the end of the input text.
- 'START': Truncate the start of the input text.
Note: This setting overrides the standard truncate boolean setting when specified.
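A sketch combining these Cohere-specific options:

```python
from pydantic_ai.embeddings.cohere import CohereEmbeddingModel, CohereEmbeddingSettings

model = CohereEmbeddingModel(
    'embed-v4.0',
    settings=CohereEmbeddingSettings(
        # Embed for clustering and truncate over-long inputs at the end.
        cohere_input_type='clustering',
        cohere_truncate='END',
    ),
)
```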
CohereEmbeddingModel
dataclass
Bases: EmbeddingModel
Cohere embedding model implementation.
This model works with Cohere's embeddings API, which offers multilingual support and various model sizes.
Example:
from pydantic_ai.embeddings.cohere import CohereEmbeddingModel
model = CohereEmbeddingModel('embed-v4.0')
__init__
__init__(
model_name: CohereEmbeddingModelName,
*,
provider: (
Literal["cohere"] | Provider[AsyncClientV2]
) = "cohere",
settings: EmbeddingSettings | None = None
)
Initialize a Cohere embedding model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | CohereEmbeddingModelName | The name of the Cohere model to use. See Cohere Embed documentation for available models. | required |
| provider | Literal['cohere'] \| Provider[AsyncClientV2] | The provider to use for authentication and API access. Can be the string 'cohere' or an instantiated Provider. | 'cohere' |
| settings | EmbeddingSettings \| None | Model-specific settings that will be used as defaults for this model. | None |
LatestGoogleGLAEmbeddingModelNames
module-attribute
LatestGoogleGLAEmbeddingModelNames = Literal[
"gemini-embedding-001"
]
Latest Google Gemini API (GLA) embedding models.
See the Google Embeddings documentation for available models and their capabilities.
LatestGoogleVertexEmbeddingModelNames
module-attribute
LatestGoogleVertexEmbeddingModelNames = Literal[
"gemini-embedding-001",
"text-embedding-005",
"text-multilingual-embedding-002",
]
Latest Google Vertex AI embedding models.
See the Vertex AI Embeddings documentation for available models and their capabilities.
LatestGoogleEmbeddingModelNames
module-attribute
LatestGoogleEmbeddingModelNames = (
LatestGoogleGLAEmbeddingModelNames
| LatestGoogleVertexEmbeddingModelNames
)
All latest Google embedding models (union of GLA and Vertex AI models).
GoogleEmbeddingModelName
module-attribute
GoogleEmbeddingModelName = (
str | LatestGoogleEmbeddingModelNames
)
Possible Google embeddings model names.
GoogleEmbeddingSettings
Bases: EmbeddingSettings
Settings used for a Google embedding model request.
All fields from EmbeddingSettings are supported,
plus Google-specific settings prefixed with google_.
google_task_type
instance-attribute
google_task_type: str
The task type for the embedding.
Overrides the automatic task type selection based on input_type.
See Google's task type documentation
for available options.
google_title
instance-attribute
google_title: str
Optional title for the content being embedded.
Only applicable when task_type is RETRIEVAL_DOCUMENT.
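A sketch using these Google-specific options (SEMANTIC_SIMILARITY is one of the task types listed in Google's documentation):

```python
from pydantic_ai.embeddings.google import GoogleEmbeddingModel, GoogleEmbeddingSettings

model = GoogleEmbeddingModel(
    'gemini-embedding-001',
    settings=GoogleEmbeddingSettings(
        # Force a specific task type instead of deriving it from input_type.
        google_task_type='SEMANTIC_SIMILARITY',
    ),
)
```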
GoogleEmbeddingModel
dataclass
Bases: EmbeddingModel
Google embedding model implementation.
This model works with Google's embeddings API via the google-genai SDK,
supporting both the Gemini API (Google AI Studio) and Vertex AI.
Example:
from pydantic_ai.embeddings.google import GoogleEmbeddingModel
from pydantic_ai.providers.google import GoogleProvider

# Using Gemini API (requires GOOGLE_API_KEY env var)
model = GoogleEmbeddingModel('gemini-embedding-001')

# Using Vertex AI
model = GoogleEmbeddingModel(
    'gemini-embedding-001',
    provider=GoogleProvider(vertexai=True, project='my-project', location='us-central1'),
)
__init__
__init__(
model_name: GoogleEmbeddingModelName,
*,
provider: (
Literal["google-gla", "google-vertex"]
| Provider[Client]
) = "google-gla",
settings: EmbeddingSettings | None = None
)
Initialize a Google embedding model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | GoogleEmbeddingModelName | The name of the Google model to use. See Google Embeddings documentation for available models. | required |
| provider | Literal['google-gla', 'google-vertex'] \| Provider[Client] | The provider to use for authentication and API access. Can be 'google-gla', 'google-vertex', or an instantiated Provider. | 'google-gla' |
| settings | EmbeddingSettings \| None | Model-specific settings that will be used as defaults for this model. | None |
LatestBedrockEmbeddingModelNames
module-attribute
LatestBedrockEmbeddingModelNames = Literal[
"amazon.titan-embed-text-v1",
"amazon.titan-embed-text-v2:0",
"cohere.embed-english-v3",
"cohere.embed-multilingual-v3",
"cohere.embed-v4:0",
"amazon.nova-2-multimodal-embeddings-v1:0",
]
Latest Bedrock embedding model names.
See the Bedrock docs for available embedding models.
BedrockEmbeddingModelName
module-attribute
BedrockEmbeddingModelName = (
str | LatestBedrockEmbeddingModelNames
)
Possible Bedrock embedding model names.
BedrockEmbeddingSettings
Bases: EmbeddingSettings
Settings used for a Bedrock embedding model request.
All fields from EmbeddingSettings are supported,
plus Bedrock-specific settings prefixed with bedrock_.
All settings are optional - if not specified, model defaults are used.
Note on dimensions parameter support:
- Titan v1 (amazon.titan-embed-text-v1): Not supported (fixed: 1536)
- Titan v2 (amazon.titan-embed-text-v2:0): Supported (default: 1024, accepts 256/384/1024)
- Cohere v3 (cohere.embed-english-v3, cohere.embed-multilingual-v3): Not supported (fixed: 1024)
- Cohere v4 (cohere.embed-v4:0): Supported (default: 1536, accepts 256/512/1024/1536)
- Nova (amazon.nova-2-multimodal-embeddings-v1:0): Supported (default: 3072, accepts 256/384/1024/3072)

Unsupported settings are silently ignored.

Note on truncate parameter support:
- Titan models (amazon.titan-embed-text-v1, amazon.titan-embed-text-v2:0): Not supported
- Cohere models (all versions): Supported (default: False, maps to 'END' when True)
- Nova (amazon.nova-2-multimodal-embeddings-v1:0): Supported (default: False, maps to 'END' when True)
For fine-grained truncation control, use model-specific settings: bedrock_cohere_truncate or bedrock_nova_truncate.
Example
from pydantic_ai.embeddings.bedrock import BedrockEmbeddingSettings

# Use model defaults
settings = BedrockEmbeddingSettings()

# Customize specific settings for Titan v2:0
settings = BedrockEmbeddingSettings(
    dimensions=512,
    bedrock_titan_normalize=True,
)

# Customize specific settings for Cohere v4
settings = BedrockEmbeddingSettings(
    dimensions=512,
    bedrock_cohere_max_tokens=1000,
)
bedrock_titan_normalize
instance-attribute
bedrock_titan_normalize: bool
Whether to normalize embedding vectors for Titan models.
Supported by: amazon.titan-embed-text-v2:0 (default: True)
Not supported by: amazon.titan-embed-text-v1 (silently ignored)
When enabled, vectors are normalized for direct cosine similarity calculations.
bedrock_cohere_max_tokens
instance-attribute
bedrock_cohere_max_tokens: int
The maximum number of tokens to embed for Cohere models.
Supported by: cohere.embed-v4:0 (default: 128000)
Not supported by: cohere.embed-english-v3, cohere.embed-multilingual-v3
(silently ignored)
bedrock_cohere_input_type
instance-attribute
bedrock_cohere_input_type: Literal[
"search_document",
"search_query",
"classification",
"clustering",
]
The input type for Cohere models.
Supported by: All Cohere models (cohere.embed-english-v3, cohere.embed-multilingual-v3, cohere.embed-v4:0)
By default, embed_query() uses 'search_query' and embed_documents() uses 'search_document'.
Also accepts 'classification' or 'clustering'.
bedrock_cohere_truncate
instance-attribute
bedrock_cohere_truncate: Literal['NONE', 'START', 'END']
The truncation strategy for Cohere models. Overrides base truncate setting.
Supported by: All Cohere models (cohere.embed-english-v3, cohere.embed-multilingual-v3, cohere.embed-v4:0)
Default: 'NONE'
- 'NONE': Raise an error if input exceeds max tokens.
- 'START': Truncate the start of the input.
- 'END': Truncate the end of the input.
bedrock_nova_truncate
instance-attribute
bedrock_nova_truncate: Literal['NONE', 'START', 'END']
The truncation strategy for Nova models. Overrides base truncate setting.
Supported by: amazon.nova-2-multimodal-embeddings-v1:0
Default: 'NONE'
- 'NONE': Raise an error if input exceeds max tokens.
- 'START': Truncate the start of the input.
- 'END': Truncate the end of the input.
bedrock_nova_embedding_purpose
instance-attribute
bedrock_nova_embedding_purpose: Literal[
"GENERIC_INDEX",
"GENERIC_RETRIEVAL",
"TEXT_RETRIEVAL",
"CLASSIFICATION",
"CLUSTERING",
]
The embedding purpose for Nova models.
Supported by: amazon.nova-2-multimodal-embeddings-v1:0
By default, embed_query() uses 'GENERIC_RETRIEVAL' and embed_documents() uses 'GENERIC_INDEX'.
Also accepts 'TEXT_RETRIEVAL', 'CLASSIFICATION', or 'CLUSTERING'.
Note: Multimodal-specific purposes ('IMAGE_RETRIEVAL', 'VIDEO_RETRIEVAL',
'DOCUMENT_RETRIEVAL', 'AUDIO_RETRIEVAL') are not supported as this
embedding client only accepts text input.
bedrock_max_concurrency
instance-attribute
bedrock_max_concurrency: int
Maximum number of concurrent requests for models that don't support batch embedding.
Applies to: amazon.titan-embed-text-v1, amazon.titan-embed-text-v2:0,
amazon.nova-2-multimodal-embeddings-v1:0
When embedding multiple texts with models that only support single-text requests, this controls how many requests run in parallel. Defaults to 5.
BedrockEmbeddingModel
dataclass
Bases: EmbeddingModel
Bedrock embedding model implementation.
This model works with AWS Bedrock's embedding models including Amazon Titan Embeddings and Cohere Embed models.
Example:
from pydantic_ai.embeddings.bedrock import BedrockEmbeddingModel
from pydantic_ai.providers.bedrock import BedrockProvider

# Using default AWS credentials
model = BedrockEmbeddingModel('amazon.titan-embed-text-v2:0')

# Using explicit credentials
model = BedrockEmbeddingModel(
    'cohere.embed-english-v3',
    provider=BedrockProvider(
        region_name='us-east-1',
        aws_access_key_id='...',
        aws_secret_access_key='...',
    ),
)
__init__
__init__(
model_name: BedrockEmbeddingModelName,
*,
provider: (
Literal["bedrock"] | Provider[BaseClient]
) = "bedrock",
settings: EmbeddingSettings | None = None
)
Initialize a Bedrock embedding model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | BedrockEmbeddingModelName | The name of the Bedrock embedding model to use. See Bedrock embedding models for available options. | required |
| provider | Literal['bedrock'] \| Provider[BaseClient] | The provider to use for authentication and API access. Can be the string 'bedrock' or an instantiated Provider. | 'bedrock' |
| settings | EmbeddingSettings \| None | Model-specific settings that will be used as defaults for this model. | None |
max_input_tokens
async
max_input_tokens() -> int | None
Get the maximum number of tokens that can be input to the model.
LatestVoyageAIEmbeddingModelNames
module-attribute
LatestVoyageAIEmbeddingModelNames = Literal[
"voyage-4-large",
"voyage-4",
"voyage-4-lite",
"voyage-3-large",
"voyage-3.5",
"voyage-3.5-lite",
"voyage-code-3",
"voyage-finance-2",
"voyage-law-2",
"voyage-code-2",
]
Latest VoyageAI embedding models.
See VoyageAI Embeddings for available models and their capabilities.
VoyageAIEmbeddingModelName
module-attribute
VoyageAIEmbeddingModelName = (
str | LatestVoyageAIEmbeddingModelNames
)
Possible VoyageAI embedding model names.
VoyageAIEmbedInputType
module-attribute
VoyageAIEmbedInputType = Literal[
"query", "document", "none"
]
VoyageAI embedding input types.
- 'query': For search queries; prepends retrieval-optimized prefix.
- 'document': For documents; prepends document retrieval prefix.
- 'none': Direct embedding without any prefix.
VoyageAIEmbeddingSettings
Bases: EmbeddingSettings
Settings used for a VoyageAI embedding model request.
All fields from EmbeddingSettings are supported,
plus VoyageAI-specific settings prefixed with voyageai_.
voyageai_input_type
instance-attribute
voyageai_input_type: VoyageAIEmbedInputType
The VoyageAI-specific input type for the embedding.
Overrides the standard input_type argument. Options include:
'query', 'document', or 'none' for direct embedding without prefix.
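A sketch using the VoyageAI-specific input type:

```python
from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingModel, VoyageAIEmbeddingSettings

model = VoyageAIEmbeddingModel(
    'voyage-3.5',
    settings=VoyageAIEmbeddingSettings(
        # Embed the raw text without a query/document prefix.
        voyageai_input_type='none',
    ),
)
```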
VoyageAIEmbeddingModel
dataclass
Bases: EmbeddingModel
VoyageAI embedding model implementation.
VoyageAI provides state-of-the-art embedding models optimized for retrieval, with specialized models for code, finance, and legal domains.
Example:
from pydantic_ai.embeddings.voyageai import VoyageAIEmbeddingModel
model = VoyageAIEmbeddingModel('voyage-3.5')
__init__
__init__(
model_name: VoyageAIEmbeddingModelName,
*,
provider: (
Literal["voyageai"] | Provider[AsyncClient]
) = "voyageai",
settings: EmbeddingSettings | None = None
)
Initialize a VoyageAI embedding model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_name | VoyageAIEmbeddingModelName | The name of the VoyageAI model to use. See VoyageAI models for available options. | required |
| provider | Literal['voyageai'] \| Provider[AsyncClient] | The provider to use for authentication and API access. Can be the string 'voyageai' or an instantiated Provider. | 'voyageai' |
| settings | EmbeddingSettings \| None | Model-specific settings that will be used as defaults for this model. | None |
SentenceTransformersEmbeddingSettings
Bases: EmbeddingSettings
Settings used for a Sentence-Transformers embedding model request.
All fields from EmbeddingSettings are supported,
plus Sentence-Transformers-specific settings prefixed with sentence_transformers_.
sentence_transformers_device
instance-attribute
sentence_transformers_device: str
Device to run inference on.
Examples: 'cpu', 'cuda', 'cuda:0', 'mps' (Apple Silicon).
sentence_transformers_normalize_embeddings
instance-attribute
sentence_transformers_normalize_embeddings: bool
Whether to L2-normalize embeddings.
When True, all embeddings will have unit length, which is useful for
cosine similarity calculations.
sentence_transformers_batch_size
instance-attribute
sentence_transformers_batch_size: int
Batch size to use during encoding.
Larger batches may be faster but require more memory.
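A sketch combining these options (assumes a CUDA-capable GPU; use 'cpu' or 'mps' otherwise):

```python
from pydantic_ai.embeddings.sentence_transformers import (
    SentenceTransformerEmbeddingModel,
    SentenceTransformersEmbeddingSettings,
)

model = SentenceTransformerEmbeddingModel(
    'all-MiniLM-L6-v2',
    settings=SentenceTransformersEmbeddingSettings(
        sentence_transformers_device='cuda',
        # Unit-length vectors so dot product equals cosine similarity.
        sentence_transformers_normalize_embeddings=True,
        sentence_transformers_batch_size=64,
    ),
)
```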
SentenceTransformerEmbeddingModel
dataclass
Bases: EmbeddingModel
Local embedding model using the sentence-transformers library.
This model runs embeddings locally on your machine, which is useful for:
- Privacy-sensitive applications where data shouldn't leave your infrastructure
- Reducing API costs for high-volume embedding workloads
- Offline or air-gapped environments
Models are downloaded from Hugging Face on first use. See the Sentence-Transformers documentation for available models.
Example:
from sentence_transformers import SentenceTransformer
from pydantic_ai.embeddings.sentence_transformers import (
    SentenceTransformerEmbeddingModel,
)

# Using a model name (downloads from Hugging Face)
model = SentenceTransformerEmbeddingModel('all-MiniLM-L6-v2')

# Using an existing SentenceTransformer instance
st_model = SentenceTransformer('all-MiniLM-L6-v2')
model = SentenceTransformerEmbeddingModel(st_model)
__init__
__init__(
model: SentenceTransformer | str,
*,
settings: EmbeddingSettings | None = None
) -> None
Initialize a Sentence-Transformers embedding model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model | SentenceTransformer \| str | The model to use. Can be a model name string (downloaded from Hugging Face on first use) or an existing SentenceTransformer instance. | required |
| settings | EmbeddingSettings \| None | Model-specific settings that will be used as defaults for this model. | None |