churro_ocr.templates#

Template helpers for model-specific OCR input rendering.

churro_ocr.templates.build_ocr_conversation(template, page)[source]#

Build an OCR conversation from a template or template callable.

Parameters:
Returns:

Structured OCR conversation for page.

Return type:

list[dict[str, Any]]

class churro_ocr.templates.HFChatTemplate[source]#

Bases: object

Template for processor/tokenizer chat-template OCR models.

Parameters:
  • system_message – Optional system message prepended to the conversation.

  • user_prompt – Optional user-side text prompt appended with the image.

  • include_image – Whether to include the page image in the user message.

__init__(system_message=None, user_prompt=None, include_image=True)#
Parameters:
  • system_message (str | None)

  • user_prompt (str | None)

  • include_image (bool)

Return type:

None

build_conversation(page)[source]#

Build a structured multimodal conversation for one OCR page.

Parameters:

page (DocumentPage) – Page to represent in the conversation.

Returns:

Conversation payload suitable for chat-template OCR models.

Return type:

list[dict[str, Any]]

class churro_ocr.templates.OCRPromptTemplate[source]#

Bases: Protocol

Protocol for OCR templates that build model conversations.

__init__(*args, **kwargs)#
build_conversation(page)[source]#

Build a model conversation for one page.

Parameters:

page (DocumentPage) – Page to convert into a model-specific prompt payload.

Returns:

Structured conversation ready for backend-specific rendering.

Return type:

list[dict[str, Any]]