`churro_ocr.prompts`

Public prompt defaults used by churro-ocr backends.

churro_ocr.prompts.parse_chandra_response(text)

Extract plain text and metadata from a Chandra HTML-layout response.

churro_ocr.prompts.parse_olmocr_response(text)

Extract plain text and metadata from an olmOCR YAML-front-matter response.
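To illustrate the general idea of separating front matter from body text, here is a minimal standalone sketch. It does not call churro-ocr: the function name `split_front_matter_sketch`, the returned `(text, metadata)` tuple, and the sample field names are all assumptions made for illustration, and the real parser may handle YAML types and edge cases differently.

```python
from __future__ import annotations


def split_front_matter_sketch(text: str) -> tuple[str, dict[str, str]]:
    """Illustrative stand-in, NOT the churro-ocr implementation."""
    lines = text.strip().splitlines()
    # Front matter must start with a '---' delimiter line.
    if not lines or lines[0].strip() != "---":
        return text.strip(), {}
    # Find the closing '---' delimiter; without one, treat everything as body.
    closing = [i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"]
    if not closing:
        return text.strip(), {}
    end = closing[0]
    metadata: dict[str, str] = {}
    for line in lines[1:end]:
        # Keep values as raw strings; a real YAML parser would also convert types.
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return body, metadata


# Hypothetical sample response; the field names are assumptions.
sample = "---\nprimary_language: en\nis_table: False\n---\nHello world"
body, meta = split_front_matter_sketch(sample)
```

Keeping the metadata values as strings avoids a YAML dependency in the sketch; a full parser would typically convert booleans and numbers.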

churro_ocr.prompts.strip_ocr_output_tag(text, *, output_tag=DEFAULT_OCR_OUTPUT_TAG)

Remove outer OCR output tags and any stray tag tokens when present.

Parameters:

Returns:

OCR text with the outer wrapper removed when present.

Return type:

str
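
The behavior described above can be approximated with a short regular-expression sketch. This is a standalone illustration, not the library's code: the helper name `strip_output_tag_sketch` and the tag name `ocr_output` are hypothetical, since the actual value of `DEFAULT_OCR_OUTPUT_TAG` is not shown on this page.

```python
import re

# Hypothetical tag name; the real DEFAULT_OCR_OUTPUT_TAG value is an unknown here.
ASSUMED_OUTPUT_TAG = "ocr_output"


def strip_output_tag_sketch(text: str, *, output_tag: str = ASSUMED_OUTPUT_TAG) -> str:
    """Illustrative stand-in, NOT the churro-ocr implementation."""
    tag = re.escape(output_tag)
    # Prefer the content of a complete outer <tag>...</tag> wrapper when one exists.
    match = re.search(rf"<{tag}>(.*)</{tag}>", text, flags=re.DOTALL)
    body = match.group(1) if match else text
    # Drop any stray opening or closing tag tokens, such as an unclosed wrapper.
    body = re.sub(rf"</?{tag}>", "", body)
    return body.strip()


wrapped = strip_output_tag_sketch("<ocr_output>Hello\nworld</ocr_output>")
unclosed = strip_output_tag_sketch("<ocr_output>unclosed")
```

Text without any wrapper passes through unchanged apart from whitespace trimming, which mirrors the "when present" wording in the description.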

churro_ocr.prompts.strip_rich_ocr_markup_to_plain_text(text)

Best-effort plain-text conversion for OCR markdown/HTML output.
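
As a rough picture of what a best-effort conversion might involve, the sketch below drops HTML tags with the standard-library `html.parser` and then removes a few common markdown markers. The name `to_plain_text_sketch` and the specific set of markers are assumptions; the library's own rules are not documented on this page and may differ.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def to_plain_text_sketch(text: str) -> str:
    """Illustrative stand-in, NOT the churro-ocr implementation."""
    collector = _TextCollector()
    collector.feed(text)
    collector.close()
    plain = "".join(collector.parts)
    # Remove leading markdown heading hashes at the start of a line.
    plain = re.sub(r"^[ ]{0,3}#{1,6}[ \t]+", "", plain, flags=re.MULTILINE)
    # Remove strong-emphasis markers and inline-code backticks.
    plain = re.sub(r"\*\*|__|`", "", plain)
    return plain.strip()


result = to_plain_text_sketch("# Title\n\n<p>Some <b>bold</b> and **strong** text</p>")
# result == "Title\n\nSome bold and strong text"
```

Being best-effort, a pass like this can also strip characters that were meant literally, for example double underscores in an identifier.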
