# Providers And Configuration
All Churro OCR backends use the same builder entry point:
```python
from churro_ocr.providers import OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="litellm",
        model="vertex_ai/gemini-2.5-flash",
    )
)
```
## Which OCR Backend Should You Use?

| Provider | Install extra | Good default when |
|---|---|---|
| `litellm` | | you want hosted multimodal models routed through LiteLLM |
| `openai-compatible` | | you have a local or self-hosted OpenAI-style server |
| `hf` | | you want local Transformers inference in-process |
| `vllm` | | you want higher-throughput local serving |
| `azure` | | you want Azure Document Intelligence OCR |
| `mistral` | | you want Mistral OCR |
## Recommended Starting Points

| Situation | Good default | Why |
|---|---|---|
| hosted OCR | `litellm` | easiest hosted path with the standard builder interface |
| local OCR | `hf` | first-party local model support in-process |
| higher-throughput local serving | `vllm` | better fit when you want a served local backend |
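The recommendations above can be sketched as a small selector. Note that `pick_provider` is an illustrative helper written for this page, not a `churro_ocr` export; it only maps the two questions in the table to a `provider` string:

```python
# Illustrative helper (not part of churro_ocr): map the recommendation
# table to a `provider` value for OCRBackendSpec.
def pick_provider(hosted: bool, high_throughput: bool = False) -> str:
    """Return the recommended provider string."""
    if hosted:
        return "litellm"  # easiest hosted path
    if high_throughput:
        return "vllm"     # served local backend
    return "hf"           # in-process Transformers inference
```

The returned string can then be passed as `provider=` when constructing an `OCRBackendSpec`.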
## Hosted Providers
### LiteLLM

```python
from churro_ocr.providers import OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="litellm",
        model="vertex_ai/gemini-2.5-flash",
    )
)
```
Override transport or completion settings when you need to:
```python
from churro_ocr.providers import LiteLLMTransportConfig, OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="litellm",
        model="gpt-4.1-mini",
        transport=LiteLLMTransportConfig(
            api_base="https://example.invalid/v1",
            api_key="secret",
            api_version="2025-01-01-preview",
            completion_kwargs={"temperature": 0},
        ),
    )
)
```
### Azure Document Intelligence

```python
from churro_ocr.providers import (
    AzureDocumentIntelligenceOptions,
    OCRBackendSpec,
    build_ocr_backend,
)

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="azure",
        options=AzureDocumentIntelligenceOptions(
            endpoint="https://<resource>.cognitiveservices.azure.com/",
            api_key="<azure-doc-intelligence-key>",
        ),
    )
)
```
### Mistral OCR

```python
from churro_ocr.providers import MistralOptions, OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="mistral",
        model="mistral-ocr-latest",
        options=MistralOptions(api_key="<mistral-api-key>"),
    )
)
```
## Local And Self-Hosted Providers
### OpenAI-compatible

```python
from churro_ocr.providers import (
    LiteLLMTransportConfig,
    OCRBackendSpec,
    build_ocr_backend,
)

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="openai-compatible",
        model="local-model",
        transport=LiteLLMTransportConfig(
            api_base="http://127.0.0.1:8000/v1",
            api_key="dummy",
        ),
    )
)
```
### Hugging Face

```python
from churro_ocr.providers import HuggingFaceOptions, OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="hf",
        model="stanford-oval/churro-3B",
        options=HuggingFaceOptions(
            model_kwargs={"device_map": "auto", "torch_dtype": "auto"},
        ),
    )
)
```
### vLLM

```python
from churro_ocr.providers import OCRBackendSpec, VLLMOptions, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="vllm",
        model="stanford-oval/churro-3B",
        options=VLLMOptions(),
    )
)
```
## OCRBackendSpec Reference

| Field | Meaning |
|---|---|
| `provider` | One of `litellm`, `openai-compatible`, `hf`, `vllm`, `azure`, `mistral`. |
| `model` | Required for providers that route to a named model; the `azure` example above omits it. |
| `profile` | Optional `OCRModelProfile` override for prompt rendering (see Advanced Customization). |
| `transport` | Shared request transport config for LiteLLM-based providers. |
| `options` | Provider-specific dataclass matching the chosen provider. |
## Provider Option Dataclasses

| Type | Used by | Required fields | Notes |
|---|---|---|---|
| `LiteLLMTransportConfig` | `litellm` | None at the dataclass level | Use this for transport, credentials, and completion settings. |
| `LiteLLMTransportConfig` | `openai-compatible` | None | Use `api_base` to point at your local or self-hosted server. |
| `HuggingFaceOptions` | `hf` | None | Carries runtime, processor, generation, and template options. |
| `VLLMOptions` | `vllm` | None | Carries runtime and sampling settings for vLLM. |
| `AzureDocumentIntelligenceOptions` | `azure` | | |
| `MistralOptions` | `mistral` | | |
## Advanced Customization
### Custom Profiles And Templates
Most users should rely on the built-in model profiles. If you need to override prompt rendering for a custom Hugging Face model, pass a custom `OCRModelProfile`.
```python
from churro_ocr import HFChatTemplate
from churro_ocr.providers import (
    HuggingFaceOptions,
    OCRBackendSpec,
    OCRModelProfile,
    build_ocr_backend,
)

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="hf",
        model="your-org/your-vlm",
        profile=OCRModelProfile(
            profile_name="custom",
            template=HFChatTemplate(
                system_message="Transcribe the page exactly.",
                user_prompt=None,
            ),
        ),
        options=HuggingFaceOptions(model_kwargs={"device_map": "auto"}),
    )
)
```
## Prompt And Template Exports

Useful public template exports:

| Export | Module | Use case |
|---|---|---|
| `HFChatTemplate` | `churro_ocr` | Build a Hugging Face chat-style multimodal prompt. |
| | | Generic OCR prompt template used by the default model profile. |
| | | Built-in template for |
| | | Built-in template for |
| | | Base protocol for custom profile integration. |
Useful public prompt exports:

| Export | Module | Use case |
|---|---|---|
| | | Default system instruction for generic OCR prompting. |
| | | Default user prompt for plain OCR output. |
| | | Default user prompt when markdown-style OCR output is preferred. |
| | | Shared tag name used by the default OCR postprocessor. |
| | | Default prompt used by LLM-based page and text-block boundary detection helpers. |
| | | Remove the default OCR wrapper tag from model output. |
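Removing a wrapper tag can be sketched as below. This is an illustrative stand-in, not the library's helper: the function and the `"page"` tag used in the comment are hypothetical, and in practice you would use the stripping export and shared tag-name constant listed above:

```python
import re

# Illustrative sketch of stripping a wrapper tag from model output.
# The tag name is passed in; churro_ocr exports the real tag constant.
def strip_wrapper_tag(text: str, tag: str) -> str:
    """Remove <tag>...</tag> around the transcription, if present."""
    pattern = rf"^\s*<{re.escape(tag)}>\s*(.*?)\s*</{re.escape(tag)}>\s*$"
    match = re.match(pattern, text, flags=re.DOTALL)
    return match.group(1) if match else text.strip()
```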