# Providers And Configuration
Use this page to choose a backend and install the matching runtime. For custom profiles, prompt templates, and response helpers, continue with Advanced Customization.
All Churro OCR backends use the same builder entry point:
```python
from churro_ocr.providers import OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="hf",
        model="stanford-oval/churro-3B",
    )
)
```
## Runtime Install Matrix
Install the base package first as shown in Getting Started.
Commands on this page assume the CLI is installed and available as `churro-ocr`.
| Provider or feature | Install command | Good default when |
|---|---|---|
| `litellm` | `churro-ocr install litellm` | you want hosted multimodal OCR routed through LiteLLM |
| `openai-compatible` | `churro-ocr install openai-compatible` | you have a local or self-hosted OpenAI-style server |
| `hf` | `churro-ocr install hf` | you want local Transformers inference in-process |
| `azure` | `churro-ocr install azure` | you want Azure Document Intelligence OCR or page detection |
| `mistral` | `churro-ocr install mistral` | you want Mistral OCR |
| `all` | `churro-ocr install all` | you want every optional runtime in one environment |
`hf` and `all` also install a PyTorch runtime.
Pass `--torch-backend <name>` when you need a specific build, for example `churro-ocr install hf --torch-backend cu126`.
## Recommended Starting Points
| Situation | Good default | Why |
|---|---|---|
| local OCR with no API account | `hf` | matches the quickest credential-free onboarding path |
| hosted OCR | `litellm` | easiest hosted path with the standard builder interface |
| layout-heavy local OCR | `hf` with `datalab-to/chandra-ocr-2` | built-in profile matches Chandra’s layout-oriented defaults |
| higher-throughput local serving | `openai-compatible` | good when you already run a served local backend such as vLLM or llama.cpp |
| managed OCR APIs | `azure` or `mistral` | provider-managed OCR without local model weights |
## Minimal Provider Examples
### Hugging Face
```python
from churro_ocr.providers import OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="hf",
        model="stanford-oval/churro-3B",
    )
)
```
Built-in model-specific profiles are resolved automatically for known models such as `stanford-oval/churro-3B`, `datalab-to/chandra-ocr-2`, `deepseek-ai/DeepSeek-OCR-2`, `FireRedTeam/FireRed-OCR`, `nanonets/Nanonets-OCR2-3B`, `baidu/Qianfan-OCR`, `zai-org/GLM-OCR`, `kristaller486/dots.ocr-1.5`, `rednote-hilab/dots.mocr`, `infly/Infinity-Parser-7B`, `opendatalab/MinerU2.5-2509-1.2B`, `PaddlePaddle/PaddleOCR-VL-1.5`, `LiquidAI/LFM2.5-VL-1.6B`, and the supported olmOCR checkpoints.
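For example, swapping in the Chandra checkpoint from this list is enough to pick up its layout-oriented built-in profile; nothing else in the spec changes:

```python
from churro_ocr.providers import OCRBackendSpec, build_ocr_backend

# Known model IDs resolve to their built-in profiles automatically,
# so no prompt or template configuration is needed here.
backend = build_ocr_backend(
    OCRBackendSpec(
        provider="hf",
        model="datalab-to/chandra-ocr-2",
    )
)
```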
For `FireRedTeam/FireRed-OCR`, the built-in `hf` and `openai-compatible` backends use the model’s published Markdown-conversion prompt. The OCR result preserves the raw markdown in metadata, and repo-local benchmark evaluation normalizes that markdown or embedded HTML back to plain text before metrics are computed.
For `nanonets/Nanonets-OCR2-3B`, the built-in `hf` and `openai-compatible` backends use the model’s published structured-markdown OCR prompt. The OCR result preserves the raw markdown in metadata, and tagged markdown or embedded HTML is normalized back to plain text for evaluation-friendly output.
For `baidu/Qianfan-OCR`, the built-in `hf` and `openai-compatible` backends use the published `Parse this document to Markdown.` prompt. The OCR result preserves the raw markdown in metadata, and repo-local benchmark evaluation normalizes that markdown or embedded HTML back to plain text before metrics are computed.
For `zai-org/GLM-OCR`, the built-in `hf` and `openai-compatible` backends both use the model’s documented `Text Recognition:` prompt.
For `infly/Infinity-Parser-7B`, the built-in `hf` and `openai-compatible` backends use the documented markdown-conversion prompt and treat the response as markdown or embedded HTML. The OCR result preserves the raw markdown in metadata, and repo-local benchmark evaluation normalizes that markdown or HTML back to plain text before metrics are computed.
For `opendatalab/MinerU2.5-2509-1.2B`, the built-in `hf` and `openai-compatible` backends both run the model’s two-step layout-plus-block pipeline and return markdown with embedded HTML tables when needed. Repo-local benchmark evaluation normalizes that markdown or HTML back to plain text before metrics are computed.
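For the models above that preserve raw markdown, this page does not define the result type or the metadata key involved; the sketch below is purely illustrative, with `recognize`, `text`, `metadata`, and `"raw_markdown"` all assumed names (see the Provider APIs reference for the actual signatures):

```python
# Purely illustrative: `recognize`, `.text`, `.metadata`, and the
# "raw_markdown" key are assumed names, not documented API.
result = backend.recognize("page-001.png")
print(result.text)                          # normalized plain-text output
print(result.metadata.get("raw_markdown"))  # raw markdown, when preserved
```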
### LiteLLM
```python
from churro_ocr.providers import OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="litellm",
        model="vertex_ai/gemini-2.5-flash",
    )
)
```
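Credentials follow LiteLLM’s own conventions rather than anything specific to this builder; as a sketch, an OpenAI-routed multimodal model would read LiteLLM’s standard `OPENAI_API_KEY` environment variable (the model string below is an illustration, not a recommendation):

```python
import os

from churro_ocr.providers import OCRBackendSpec, build_ocr_backend

# LiteLLM resolves provider credentials from the environment;
# OPENAI_API_KEY is its standard variable for OpenAI-routed models.
os.environ.setdefault("OPENAI_API_KEY", "<openai-api-key>")

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="litellm",
        model="openai/gpt-4o-mini",
    )
)
```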
### OpenAI-compatible
```python
from churro_ocr.providers import (
    LiteLLMTransportConfig,
    OCRBackendSpec,
    build_ocr_backend,
)

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="openai-compatible",
        model="local-model",
        transport=LiteLLMTransportConfig(
            api_base="http://127.0.0.1:8000/v1",
        ),
    )
)
```
If you want to use vLLM or llama.cpp, serve it separately and point this backend at that server’s OpenAI-compatible endpoint. See the official vLLM serving docs or the official llama.cpp serving docs.
### Azure Document Intelligence
```python
from churro_ocr.providers import (
    AzureDocumentIntelligenceOptions,
    OCRBackendSpec,
    build_ocr_backend,
)

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="azure",
        options=AzureDocumentIntelligenceOptions(
            endpoint="https://<resource>.cognitiveservices.azure.com/",
            api_key="<azure-doc-intelligence-key>",
        ),
    )
)
```
### Mistral OCR
```python
from churro_ocr.providers import MistralOptions, OCRBackendSpec, build_ocr_backend

backend = build_ocr_backend(
    OCRBackendSpec(
        provider="mistral",
        model="mistral-ocr-2512",
        options=MistralOptions(api_key="<mistral-api-key>"),
    )
)
```
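To keep the key out of source, the same spec can read it from the environment; `MISTRAL_API_KEY` is a conventional variable name here, not one this page prescribes:

```python
import os

from churro_ocr.providers import MistralOptions, OCRBackendSpec, build_ocr_backend

# MISTRAL_API_KEY is a conventional environment-variable name for the
# key; this page does not prescribe a specific one.
backend = build_ocr_backend(
    OCRBackendSpec(
        provider="mistral",
        model="mistral-ocr-2512",
        options=MistralOptions(api_key=os.environ["MISTRAL_API_KEY"]),
    )
)
```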
## Next Steps
- Use OCR Workflows for Python recipes built on these backends.
- Use CLI for shell commands, quick checks, and page extraction.
- Use Advanced Customization for custom `OCRModelProfile` work, prompt/template exports, and response helpers.
- Use the Provider APIs, templates API, and prompts API when you need exact type definitions and signatures.