Getting Started

churro-ocr is the Python package and CLI for OCR on one-page images, photographed spreads, and PDFs. This page takes the shortest path to one successful local transcription before branching into task-specific guides.

Prerequisites

  • Python 3.12 or newer

  • uv available on PATH
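Both prerequisites can be checked from a script before installing anything. The following is a minimal sketch using only the standard library; the function name missing_prerequisites and its injectable parameters are illustrative assumptions, not part of churro-ocr.

```python
import shutil
import sys


def missing_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper, not part of churro-ocr. The parameters exist only
    so the checks can be exercised without changing the real environment.
    """
    problems = []
    # The guide asks for Python 3.12 or newer.
    if tuple(version_info[:2]) < (3, 12):
        problems.append(
            f"Python 3.12+ required, found {version_info[0]}.{version_info[1]}"
        )
    # The guide asks for uv to be discoverable on PATH.
    if which("uv") is None:
        problems.append("uv not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Run with no arguments, the script prints one line per unmet prerequisite and nothing when the environment looks ready.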

Install the CLI

For the CLI-first workflow used in this guide, install churro-ocr as a uv tool:

uv tool install churro-ocr

If you are adding churro-ocr to a project instead, use uv add churro-ocr and prefix the CLI commands below with uv run.
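If a script has to work with either install style, the prefix rule above can be captured in one small helper. This is a sketch only; churro_command and its project_install flag are hypothetical names introduced for illustration.

```python
def churro_command(*args, project_install=False):
    """Build the argv list for a churro-ocr CLI call.

    Hypothetical helper. project_install=True models the `uv add churro-ocr`
    setup, where CLI commands are prefixed with `uv run`.
    """
    base = ["churro-ocr", *args]
    return ["uv", "run", *base] if project_install else base


# Tool install: the CLI is called directly.
print(churro_command("install", "hf"))
# Project install: the same command, prefixed with `uv run`.
print(churro_command("install", "hf", project_install=True))
```

Returning a list rather than a single string keeps the result safe to pass to subprocess.run without shell quoting.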

Install the First Runtime

The canonical getting-started path uses the local Hugging Face backend and the stanford-oval/churro-3B model.

churro-ocr install hf

Use --torch-backend with hf when you need a specific PyTorch build:

churro-ocr install hf --torch-backend cu126
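For provisioning scripts, the runtime install can be driven from Python. The sketch below assumes only the install hf command and the --torch-backend flag shown above; the helper names hf_install_command and install_hf_runtime are hypothetical, and cu126 is just the value from the example.

```python
import subprocess


def hf_install_command(torch_backend=None):
    """Build the argv for installing the hf runtime (hypothetical helper)."""
    cmd = ["churro-ocr", "install", "hf"]
    # Add the flag only when a specific PyTorch build was requested.
    if torch_backend:
        cmd += ["--torch-backend", torch_backend]
    return cmd


def install_hf_runtime(torch_backend=None):
    """Run the install; requires churro-ocr on PATH.

    check=True raises subprocess.CalledProcessError on a non-zero exit code.
    """
    subprocess.run(hf_install_command(torch_backend), check=True)
```

Calling install_hf_runtime("cu126") would then run the second command shown above.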

For hosted providers, self-hosted OpenAI-compatible servers, Azure, Mistral, or PDF support, continue with Providers And Configuration.

First Successful Run

churro-ocr transcribe \
  --image scan.png \
  --backend hf \
  --model stanford-oval/churro-3B

This prints the OCR text to stdout. Add --output output.txt when you want the CLI to write the text to a file instead.
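The same run can be scripted by calling the CLI as a subprocess and reading its stdout. This is a minimal sketch that uses only the flags shown above; transcribe_command and transcribe are hypothetical wrapper names, not functions provided by churro-ocr.

```python
import subprocess
from pathlib import Path

DEFAULT_MODEL = "stanford-oval/churro-3B"


def transcribe_command(image, output=None, backend="hf", model=DEFAULT_MODEL):
    """Build the argv for `churro-ocr transcribe` (hypothetical helper)."""
    cmd = [
        "churro-ocr", "transcribe",
        "--image", str(image),
        "--backend", backend,
        "--model", model,
    ]
    # With --output the CLI writes the text to a file instead of stdout.
    if output is not None:
        cmd += ["--output", str(output)]
    return cmd


def transcribe(image, backend="hf", model=DEFAULT_MODEL):
    """Run the CLI and return the OCR text it prints to stdout.

    Requires churro-ocr and the hf runtime to be installed.
    """
    result = subprocess.run(
        transcribe_command(image, backend=backend, model=model),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

For example, text = transcribe(Path("scan.png")) would hold the same text the command above prints.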

If You’re Writing Python Next

Goal                                          Start with
OCR one page or one image                     OCRClient
Detect page crops only                        DocumentPageDetector
Run an end-to-end image or PDF OCR workflow   DocumentOCRPipeline
Tune provider options directly                build_ocr_backend(...) + OCRBackendSpec

For the page-and-pipeline mental model behind those types, read Core Concepts.

Where To Go Next

  • CLI: Stay in the shell for OCR checks, page extraction, and runtime installs.

  • OCR Workflows: Use the Python API for single-page OCR, PDFs, photographed spreads, and async flows.

  • Page Detection: Extract page crops without OCR, or choose a detector backend for boundary discovery.

  • Providers And Configuration: Choose another backend, install its runtime, and see minimal provider setup examples.

  • Core Concepts: Learn the DocumentPage and pipeline model that ties the APIs together.

Working From the Source Code

If you are developing from a clone instead of installing from PyPI, use the contributor instructions in Contributing.