---
license: openrail
library_name: transformers
tags:
- ocr
- vlm
---

# Chandra

Chandra is an OCR model that outputs markdown, HTML, and JSON. It is highly accurate at extracting text from images and PDFs while preserving layout information.

You can try Chandra in the free [playground](https://www.datalab.to/playground), or via the hosted [Datalab API](https://www.datalab.to/).

## Features

- Converts documents to markdown, HTML, or JSON with detailed layout information
- Good handwriting support
- Reconstructs forms accurately, including checkboxes
- Good support for tables, math, and complex layouts
- Extracts images and diagrams, with captions and structured data
- Supports 40+ languages

## Quickstart

The easiest way to get started is with the CLI tools:

```shell
pip install chandra-ocr

# With vLLM (chandra_vllm starts a local vLLM server first)
chandra_vllm
chandra input.pdf ./output

# With Hugging Face transformers
chandra input.pdf ./output --method hf

# Interactive Streamlit app
chandra_app
```

## Benchmarks

Higher is better; the best score in each column is in bold.

| **Model** | ArXiv | Old Scans Math | Tables | Old Scans | Headers and Footers | Multi column | Long tiny text | Base | Overall |
|:----------|:--------:|:--------------:|:--------:|:---------:|:-------------------:|:------------:|:--------------:|:--------:|:--------------:|
| Datalab Chandra v0.1.0 | 81.4 | **80.3** | **89.4** | **50.0** | 88.3 | **81.0** | **91.6** | **99.9** | **82.7 ± 0.9** |
| Datalab Marker v1.10.0 | **83.8** | 69.7 | 74.8 | 32.3 | 86.6 | 79.4 | 85.7 | 99.6 | 76.5 ± 1.0 |
| Mistral OCR API | 77.2 | 67.5 | 60.6 | 29.3 | 93.6 | 71.3 | 77.1 | 99.4 | 72.0 ± 1.1 |
| Deepseek OCR | 75.2 | 67.9 | 79.1 | 32.9 | **96.1** | 66.3 | 78.5 | 97.7 | 74.2 ± 1.0 |
| GPT-4o (Anchored) | 53.5 | 74.5 | 70.0 | 40.7 | 93.8 | 69.3 | 60.6 | 96.8 | 69.9 ± 1.1 |
| Gemini Flash 2 (Anchored) | 54.5 | 56.1 | 72.1 | 34.2 | 64.7 | 61.5 | 71.5 | 95.6 | 63.8 ± 1.2 |
| Qwen 3 VL | 70.2 | 75.1 | 45.6 | 37.5 | 89.1 | 62.1 | 43.0 | 94.3 | 64.6 ± 1.1 |
| olmOCR v0.3.0 | 78.6 | 79.9 | 72.9 | 43.9 | 95.1 | 77.3 | 81.2 | 98.9 | 78.5 ± 1.1 |

## Examples

| Type | Name | Link |
|------|------|------|
| Tables | Water Damage Form | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/tables/water_damage.png) |
| Tables | 10K Filing | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/tables/10k.png) |
| Forms | Handwritten Form | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/forms/handwritten_form.png) |
| Forms | Lease Agreement | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/forms/lease.png) |
| Handwriting | Doctor Note | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/handwriting/doctor_note.png) |
| Handwriting | Math Homework | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/handwriting/math_hw.png) |
| Books | Geography Textbook | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/books/geo_textbook_page.png) |
| Books | Exercise Problems | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/books/exercises.png) |
| Math | Attention Diagram | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/math/attn_all.png) |
| Math | Worksheet | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/math/worksheet.png) |
| Math | EGA Page | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/math/ega.png) |
| Newspapers | New York Times | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/newspapers/nyt.png) |
| Newspapers | LA Times | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/newspapers/la_times.png) |
| Other | Transcript | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/other/transcript.png) |
| Other | Flowchart | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/other/flowchart.png) |

## Usage

### Installation

```shell
pip install chandra-ocr
```

### From code

```python
from chandra.model import InferenceManager
from chandra.model.schema import BatchInputItem

# Run chandra_vllm first to start a local vLLM server if you use method="vllm";
# pass method="hf" instead to run inference through Hugging Face transformers.
# You can also start your own vLLM server with the datalab-to/chandra model.
manager = InferenceManager(method="vllm")

batch = [
    BatchInputItem(
        image=PIL_IMAGE,  # a PIL.Image of the page to OCR
        prompt_type="ocr_layout"
    )
]
result = manager.generate(batch)[0]
print(result.markdown)
```
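
If you are starting from a PDF rather than a single image, one option is to rasterize each page into a PIL image and send all pages as one batch. The sketch below is illustrative rather than part of the chandra API: `pdf2image` (which needs poppler installed) and the DPI value are assumptions, while `InferenceManager`, `BatchInputItem`, and `result.markdown` are used exactly as in the example above.

```python
# Minimal sketch: OCR every page of a PDF with the API shown above.
# pdf2image is NOT a chandra dependency; it is only one way to turn
# PDF pages into PIL images (it requires poppler on your system).
from pdf2image import convert_from_path

from chandra.model import InferenceManager
from chandra.model.schema import BatchInputItem

pages = convert_from_path("input.pdf", dpi=200)  # one PIL.Image per page

manager = InferenceManager(method="vllm")  # start chandra_vllm first
batch = [BatchInputItem(image=page, prompt_type="ocr_layout") for page in pages]

# generate() returns one result per batch item, in order
for page_number, result in enumerate(manager.generate(batch), start=1):
    print(f"--- page {page_number} ---")
    print(result.markdown)
```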

### With transformers

```python
from transformers import AutoModel, AutoProcessor
from chandra.model.hf import generate_hf
from chandra.model.schema import BatchInputItem
from chandra.output import parse_markdown

model = AutoModel.from_pretrained("datalab-to/chandra").cuda()
model.processor = AutoProcessor.from_pretrained("datalab-to/chandra")

batch = [
    BatchInputItem(
        image=PIL_IMAGE,  # a PIL.Image of the page to OCR
        prompt_type="ocr_layout"
    )
]

result = generate_hf(batch, model)[0]
markdown = parse_markdown(result.raw)
```
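
To process many page images with the transformers path, you can build one batch from a folder of images and write the parsed markdown for each page to disk. The sketch below is only an illustration: the `pages/` and `output/` directories, the PNG glob, and the PIL/pathlib usage are assumptions; the model loading, `generate_hf`, and `parse_markdown` calls mirror the example above.

```python
# Minimal sketch: run the transformers path over a folder of page images
# and save one markdown file per page. Directory names are placeholders.
from pathlib import Path

from PIL import Image
from transformers import AutoModel, AutoProcessor

from chandra.model.hf import generate_hf
from chandra.model.schema import BatchInputItem
from chandra.output import parse_markdown

model = AutoModel.from_pretrained("datalab-to/chandra").cuda()
model.processor = AutoProcessor.from_pretrained("datalab-to/chandra")

image_paths = sorted(Path("pages").glob("*.png"))
batch = [
    BatchInputItem(image=Image.open(p).convert("RGB"), prompt_type="ocr_layout")
    for p in image_paths
]

out_dir = Path("output")
out_dir.mkdir(exist_ok=True)
for path, result in zip(image_paths, generate_hf(batch, model)):
    (out_dir / f"{path.stem}.md").write_text(parse_markdown(result.raw))
```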

## Credits

Thank you to the following open source projects:

- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [vLLM](https://github.com/vllm-project/vllm)
- [olmOCR](https://github.com/allenai/olmocr)
- [Qwen 3 VL](https://github.com/QwenLM/Qwen3)