CLI#

Use the CLI when you want to validate a backend, transcribe one image, or extract page crops without writing Python.

Use churro-ocr --help or python -m churro_ocr --help to inspect the top-level commands.

Install the CLI#

Python 3.12 or newer is required.

uv tool install churro-ocr

If you are adding churro-ocr to a project instead, use uv add churro-ocr and prefix the commands on this page with uv run.

Install a Runtime#

Choose the optional runtime that matches the backend or feature you want to use:

Target	Command	Use it when
`hf`	`churro-ocr install hf`	you want local Transformers OCR in-process
`llm`	`churro-ocr install llm`	you want hosted multimodal OCR through LiteLLM-backed providers
`local`	`churro-ocr install local`	you have a local or self-hosted OpenAI-style server
`azure`	`churro-ocr install azure`	you want Azure Document Intelligence OCR or page detection
`mistral`	`churro-ocr install mistral`	you want Mistral OCR
`pdf`	`churro-ocr install pdf`	you want `extract-pages --pdf` or PDF workflows in Python
`all`	`churro-ocr install all`	you want every optional runtime in one environment

Use --torch-backend with hf or all when you need a specific PyTorch build:

churro-ocr install hf --torch-backend cu126

The examples below use the local hf path first. For backend choice and Python setup, continue with Providers And Configuration.

First Successful Transcription#

churro-ocr transcribe \
  --image scan.png \
  --backend hf \
  --model stanford-oval/churro-3B

`transcribe` Examples#

Write OCR Text To A File#

churro-ocr transcribe \
  --image scan.png \
  --backend hf \
  --model stanford-oval/churro-3B \
  --output output.txt

This writes the OCR text to output.txt and prints that written path to stdout.

OCR With LiteLLM#

churro-ocr transcribe \
  --image scan.png \
  --backend litellm \
  --model vertex_ai/gemini-2.5-flash

OCR With A Local OpenAI-compatible Server#

churro-ocr transcribe \
  --image scan.png \
  --backend openai-compatible \
  --model local-model \
  --base-url http://127.0.0.1:8000/v1

For vLLM or llama.cpp, serve the model separately with its OpenAI-compatible server and then use this same openai-compatible route. See the official vLLM serving docs or the official llama.cpp serving docs.

`extract-pages` Examples#

Extract Pages From An Image#

churro-ocr extract-pages \
  --image spread.jpg \
  --output-dir pages/

This writes sequential PNG files such as page_0000.png, page_0001.png, and so on, and prints each written path to stdout.

Extract Pages With Azure Page Detection#

churro-ocr extract-pages \
  --image spread.jpg \
  --output-dir pages/ \
  --page-detector azure \
  --endpoint https://<resource>.cognitiveservices.azure.com/ \
  --api-key <azure-doc-intelligence-key>

Extract Pages From A PDF#

Install pdf first if you have not already:

churro-ocr install pdf

Then extract rasterized PDF pages as PNG files:

churro-ocr extract-pages \
  --pdf document.pdf \
  --output-dir pages/ \
  --dpi 300 \
  --trim-margin 30

Use Page Detection when you want the Python API for detection only. Use OCR Workflows when you want page detection and OCR together in Python.

Command Contracts#

`transcribe` Backends#

`--backend` value	Required flags	Notes
`litellm`	`--model`	Uses LiteLLM credentials and routing. `--base-url`, `--api-key`, and `--api-version` are optional transport overrides.
`openai-compatible`	`--model`, `--base-url`	For local or self-hosted OpenAI-style servers. `--api-key` is optional.
`azure`	`--endpoint`, `--api-key`	`--model` is optional.
`mistral`	`--api-key`, `--model`	`--model` must be either `mistral-ocr-2505` or `mistral-ocr-2512`.
`hf`	`--model`	Local Transformers OCR.

`extract-pages` Detectors#

`--page-detector` value	Required flags	Notes
`none`	none	Default behavior. Treats the whole image or rasterized PDF page as one crop.
`llm`	`--model`	Uses `LLMPageDetector`. `--base-url`, `--api-key`, and `--api-version` are optional transport overrides.
`azure`	`--endpoint`, `--api-key`	Uses Azure Document Intelligence layout detection.

Additional Rules#

transcribe requires exactly one --image.
--output writes OCR text to a file and prints the written path.
extract-pages requires exactly one of --image or --pdf.
--dpi only affects the --pdf path because PDFs are rasterized before page detection.
--trim-margin expands each detected crop by the requested number of pixels, clipped to image bounds.