# Providers And Configuration

All Churro OCR backends use the same builder entry point:

```python
from churro_ocr.providers import OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="litellm",
        model="vertex_ai/gemini-2.5-flash",
    )
)
```

## Which OCR Backend Should You Use?

| Provider | Install extra | Good default when |
| --- | --- | --- |
| `litellm` | `llm` | you want hosted multimodal models routed through LiteLLM |
| `openai-compatible` | `local` | you have a local or self-hosted OpenAI-style server |
| `hf` | `hf` | you want local Transformers inference in-process |
| `vllm` | `vllm` | you want higher-throughput local serving |
| `azure` | `azure` | you want Azure Document Intelligence OCR |
| `mistral` | `mistral` | you want Mistral OCR |

## Hosted Providers

### LiteLLM

```python
from churro_ocr.providers import OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="litellm",
        model="vertex_ai/gemini-2.5-flash",
    )
)
```

Override transport or completion settings when you need to:

```python
from churro_ocr.providers import LiteLLMTransportConfig, OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="litellm",
        model="gpt-4.1-mini",
        transport=LiteLLMTransportConfig(
            api_base="https://example.invalid/v1",
            api_key="secret",
            api_version="2025-01-01-preview",
            completion_kwargs={"temperature": 0},
        ),
    )
)
```
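As an alternative to inlining credentials, LiteLLM-routed models typically pick up standard environment variables when no explicit `api_key` or `api_base` is passed through `LiteLLMTransportConfig`. A minimal sketch — the variable names below are LiteLLM/OpenAI conventions, not part of churro_ocr:

```python
import os

# Set provider credentials before building the backend; LiteLLM reads
# standard variables such as OPENAI_API_KEY when no explicit api_key is
# supplied. (Illustrative: these names follow LiteLLM/OpenAI conventions.)
os.environ.setdefault("OPENAI_API_KEY", "secret")
os.environ.setdefault("OPENAI_API_BASE", "https://example.invalid/v1")
```

Keeping secrets out of source code also makes the same spec reusable across environments.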

### Azure Document Intelligence

```python
from churro_ocr.providers import (
    AzureDocumentIntelligenceOptions,
    OCRBackendSpec,
    build_ocr_backend,
)

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="azure",
        options=AzureDocumentIntelligenceOptions(
            endpoint="https://<resource>.cognitiveservices.azure.com/",
            api_key="<azure-doc-intelligence-key>",
        ),
    )
)
```

### Mistral OCR

```python
from churro_ocr.providers import MistralOptions, OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="mistral",
        model="mistral-ocr-latest",
        options=MistralOptions(api_key="<mistral-api-key>"),
    )
)
```

## Local And Self-Hosted Providers

### OpenAI-compatible

```python
from churro_ocr.providers import (
    LiteLLMTransportConfig,
    OCRBackendSpec,
    build_ocr_backend,
)

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="openai-compatible",
        model="local-model",
        transport=LiteLLMTransportConfig(
            api_base="http://127.0.0.1:8000/v1",
            api_key="dummy",
        ),
    )
)
```

### Hugging Face

```python
from churro_ocr.providers import HuggingFaceOptions, OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="hf",
        model="stanford-oval/churro-3B",
        options=HuggingFaceOptions(
            model_kwargs={"device_map": "auto", "torch_dtype": "auto"},
        ),
    )
)
```

### vLLM

```python
from churro_ocr.providers import OCRBackendSpec, VLLMOptions, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="vllm",
        model="stanford-oval/churro-3B",
        options=VLLMOptions(),
    )
)
```
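If you would rather serve the model over HTTP than run vLLM in-process, one option is to launch vLLM's own OpenAI-compatible server and point the `openai-compatible` provider at it. A setup sketch using vLLM's CLI — the port is illustrative:

```shell
# Serve the same model behind an OpenAI-style HTTP API, then build the
# backend with provider="openai-compatible" and
# api_base="http://127.0.0.1:8000/v1" as shown above.
vllm serve stanford-oval/churro-3B --port 8000
```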

## OCRBackendSpec Reference

| Field | Meaning |
| --- | --- |
| `provider` | One of `litellm`, `openai-compatible`, `azure`, `mistral`, `hf`, or `vllm`. |
| `model` | Required for `litellm`, `openai-compatible`, `hf`, and `vllm`. Optional for `azure`. Defaults to `mistral-ocr-latest` when omitted for `mistral`. |
| `profile` | `None`, a built-in profile name, or a custom `OCRModelProfile`. |
| `transport` | Shared request-transport config for LiteLLM-based providers. |
| `options` | Provider-specific dataclass matching `provider`. |

## Provider Option Dataclasses

| Type | Used by | Required fields | Notes |
| --- | --- | --- | --- |
| `LiteLLMTransportConfig` | `litellm`, `openai-compatible`, `LLMPageDetector` | none at the dataclass level | Use this for transport, credentials, and completion settings. |
| `OpenAICompatibleOptions` | `openai-compatible` | none | Use `model_prefix` when your local server expects a provider prefix. |
| `HuggingFaceOptions` | `hf` | none | Carries runtime, processor, generation, and template options. |
| `VLLMOptions` | `vllm` | none | Carries runtime and sampling settings for vLLM. |
| `AzureDocumentIntelligenceOptions` | `azure` | `endpoint`, `api_key` | `model` is optional for Azure OCR in `OCRBackendSpec`. |
| `MistralOptions` | `mistral` | `api_key` | `model` defaults to `mistral-ocr-latest` when omitted. |

## Advanced Customization

### Custom Profiles And Templates

Most users should rely on the built-in model profiles. If you need to override prompt rendering for a custom Hugging Face model, pass a custom `OCRModelProfile`.

```python
from churro_ocr import HFChatTemplate
from churro_ocr.providers import (
    HuggingFaceOptions,
    OCRBackendSpec,
    OCRModelProfile,
    build_ocr_backend,
)

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="hf",
        model="your-org/your-vlm",
        profile=OCRModelProfile(
            profile_name="custom",
            template=HFChatTemplate(
                system_message="Transcribe the page exactly.",
                user_prompt=None,
            ),
        ),
        options=HuggingFaceOptions(model_kwargs={"device_map": "auto"}),
    )
)
```

### Prompt And Template Exports

Useful public template exports:

| Export | Module | Use case |
| --- | --- | --- |
| `HFChatTemplate` | `churro_ocr.templates` | Build a Hugging Face chat-style multimodal prompt. |
| `DEFAULT_OCR_TEMPLATE` | `churro_ocr.templates` | Generic OCR prompt template used by the default model profile. |
| `CHURRO_3B_XML_TEMPLATE` | `churro_ocr.templates` | Built-in template for `stanford-oval/churro-3B`. |
| `DOTS_OCR_1_5_OCR_TEMPLATE` | `churro_ocr.templates` | Built-in template for `kristaller486/dots.ocr-1.5`. |
| `OCRPromptTemplate` | `churro_ocr.templates` | Base protocol for custom profile integration. |

Useful public prompt exports:

| Export | Module | Use case |
| --- | --- | --- |
| `DEFAULT_OCR_SYSTEM_PROMPT` | `churro_ocr.prompts` | Default system instruction for generic OCR prompting. |
| `DEFAULT_OCR_USER_PROMPT` | `churro_ocr.prompts` | Default user prompt for plain OCR output. |
| `DEFAULT_MARKDOWN_OCR_USER_PROMPT` | `churro_ocr.prompts` | Default user prompt when markdown-style OCR output is preferred. |
| `DEFAULT_OCR_OUTPUT_TAG` | `churro_ocr.prompts` | Shared tag name used by the default OCR postprocessor. |
| `DEFAULT_BOUNDARY_DETECTION_PROMPT` | `churro_ocr.prompts` | Default prompt used by LLM-based page and text-block boundary detection helpers. |
| `strip_ocr_output_tag(...)` | `churro_ocr.prompts` | Remove the default OCR wrapper tag from model output. |
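As a rough model of what `strip_ocr_output_tag(...)` does, assuming the default postprocessor wraps output in an XML-style pair named by `DEFAULT_OCR_OUTPUT_TAG` — the tag value and regex below are assumptions, not the library's implementation; use the real helper from `churro_ocr.prompts` in practice:

```python
import re

OUTPUT_TAG = "ocr_output"  # assumed tag name; use DEFAULT_OCR_OUTPUT_TAG in practice

def strip_output_tag(text: str, tag: str = OUTPUT_TAG) -> str:
    """Remove a single <tag>...</tag> wrapper if present (illustrative)."""
    match = re.fullmatch(rf"\s*<{tag}>(.*)</{tag}>\s*", text, flags=re.DOTALL)
    return match.group(1).strip() if match else text.strip()

strip_output_tag("<ocr_output>Page text</ocr_output>")  # → "Page text"
```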