olmocr is a toolkit for converting PDFs and other image-based document formats
into clean, readable plain text.

Features:
* Convert PDF-, PNG-, and JPEG-based documents into clean Markdown
* Support for equations, tables, handwriting, and complex formatting
* Automatic removal of headers and footers
* Convert into text with a natural reading order, even in the presence of
  figures, multi-column layouts, and insets
* Efficient: less than $200 USD per million pages converted
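
As a rough illustration of the cost figure in the list above, the sketch below scales the quoted upper bound of $200 USD per million pages to a job of a given size. The helper `estimate_max_cost_usd` and the constant name are hypothetical and are not part of olmocr. The result is only an upper-bound estimate, and actual costs will vary.

```python
# Upper bound quoted in the feature list: under $200 USD per million pages.
# The constant and function names are illustrative, not olmocr APIs.
COST_PER_MILLION_PAGES_USD = 200.0


def estimate_max_cost_usd(num_pages: int) -> float:
    """Return an upper-bound cost estimate in USD for converting num_pages."""
    if num_pages < 0:
        raise ValueError("num_pages must be non-negative")
    # Linear scaling: (pages / 1,000,000) * cost per million pages.
    return num_pages * COST_PER_MILLION_PAGES_USD / 1_000_000


if __name__ == "__main__":
    # A 50,000-page archive at the quoted upper bound.
    print(f"${estimate_max_cost_usd(50_000):.2f}")
```

At that rate a single page costs at most $0.0002, so a 50,000-page archive would come in at no more than $10.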
