Hugging Face

Hugging Face is an AI platform with all major open source models, datasets, MCPs, and demos. You can use Inference Providers to run open source models like DeepSeek R1 on scalable serverless infrastructure.

Install

To use HuggingFaceModel, you need to either install pydantic-ai, or install pydantic-ai-slim with the huggingface optional group:

pip install "pydantic-ai-slim[huggingface]"
uv add "pydantic-ai-slim[huggingface]"

Configuration

To use Hugging Face inference, you'll need to set up an account, which gives you a free-tier allowance on Inference Providers. To set up inference, follow these steps:

  1. Go to Hugging Face and sign up for an account.
  2. Create a new access token in Hugging Face.
  3. Set the HF_TOKEN environment variable to the token you just created.

Once you have a Hugging Face access token, you can set it as an environment variable:

export HF_TOKEN='hf_token'
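Before constructing an agent, you may want to fail fast if the variable was never exported. The helper below is a hypothetical sketch using only the standard library; it is not part of pydantic-ai:

```python
import os


def get_hf_token() -> str:
    # Read the Hugging Face token from the environment; raise a clear
    # error early if it has not been exported as shown above.
    token = os.environ.get('HF_TOKEN')
    if not token:
        raise RuntimeError("HF_TOKEN is not set; run: export HF_TOKEN='hf_token'")
    return token
```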

Usage

You can then use HuggingFaceModel by name:

from pydantic_ai import Agent

agent = Agent('huggingface:Qwen/Qwen3-235B-A22B')
...

Or initialise the model directly with just the model name:

from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel

model = HuggingFaceModel('Qwen/Qwen3-235B-A22B')
agent = Agent(model)
...

By default, HuggingFaceModel uses the HuggingFaceProvider, which automatically selects the first inference provider (Cerebras, Together AI, Cohere, etc.) available for the model, sorted by your preferred order in https://hf.co/settings/inference-providers.

Configure the provider

If you want to configure the provider in code, you can instantiate HuggingFaceProvider yourself and pass it to the model:

from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

model = HuggingFaceModel(
    'Qwen/Qwen3-235B-A22B',
    provider=HuggingFaceProvider(api_key='hf_token', provider_name='nebius'),
)
agent = Agent(model)
...

Custom Hugging Face client

HuggingFaceProvider also accepts a custom AsyncInferenceClient via the hf_client parameter, so you can customise the headers, bill_to (billing to an HF organization you're a member of), base_url, etc., as defined in the Hugging Face Hub Python library docs.

from huggingface_hub import AsyncInferenceClient

from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

client = AsyncInferenceClient(
    bill_to='openai',
    api_key='hf_token',
    provider='fireworks-ai',
)

model = HuggingFaceModel(
    'Qwen/Qwen3-235B-A22B',
    provider=HuggingFaceProvider(hf_client=client),
)
agent = Agent(model)
...