---
license: openrail
library_name: transformers
tags:
- ocr
- vlm
---

# Chandra

Chandra is an OCR model that converts images and PDFs to Markdown, HTML, and JSON. It extracts text with high accuracy while preserving layout information.

You can try Chandra in the free playground [here](https://www.datalab.to/playground), or via the hosted API [here](https://www.datalab.to/).

## Features

- Converts documents to Markdown, HTML, or JSON with detailed layout information
- Good handwriting support
- Reconstructs forms accurately, including checkboxes
- Good support for tables, math, and complex layouts
- Extracts images and diagrams, with captions and structured data
- Supports 40+ languages

## Quickstart

The easiest way to get started is with the CLI tools:

```shell
pip install chandra-ocr

# With vLLM (chandra_vllm starts a local vLLM server)
chandra_vllm
chandra input.pdf ./output

# With Hugging Face transformers
chandra input.pdf ./output --method hf

# Interactive Streamlit app
chandra_app
```

## Benchmarks

| **Model** | ArXiv | Old Scans Math | Tables | Old Scans | Headers and Footers | Multi column | Long tiny text | Base | Overall |
|:----------|:--------:|:--------------:|:--------:|:---------:|:-------------------:|:------------:|:--------------:|:--------:|:--------------:|
| Datalab Chandra v0.1.0 | 81.4 | **80.3** | **89.4** | **50.0** | 88.3 | **81.0** | **91.6** | **99.9** | **82.7 ± 0.9** |
| Datalab Marker v1.10.0 | **83.8** | 69.7 | 74.8 | 32.3 | 86.6 | 79.4 | 85.7 | 99.6 | 76.5 ± 1.0 |
| Mistral OCR API | 77.2 | 67.5 | 60.6 | 29.3 | 93.6 | 71.3 | 77.1 | 99.4 | 72.0 ± 1.1 |
| Deepseek OCR | 75.2 | 67.9 | 79.1 | 32.9 | 96.1 | 66.3 | 78.5 | 97.7 | 74.2 ± 1.0 |
| GPT-4o (Anchored) | 53.5 | 74.5 | 70.0 | 40.7 | 93.8 | 69.3 | 60.6 | 96.8 | 69.9 ± 1.1 |
| Gemini Flash 2 (Anchored) | 54.5 | 56.1 | 72.1 | 34.2 | 64.7 | 61.5 | 71.5 | 95.6 | 63.8 ± 1.2 |
| Qwen 3 VL | 70.2 | 75.1 | 45.6 | 37.5 | 89.1 | 62.1 | 43.0 | 94.3 | 64.6 ± 1.1 |
| olmOCR v0.3.0 | 78.6 | 79.9 | 72.9 | 43.9 | **95.1** | 77.3 | 81.2 | 98.9 | 78.5 ± 1.1 |

## Examples

| Type | Name | Link |
|------|------|------|
| Tables | Water Damage Form | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/tables/water_damage.png) |
| Tables | 10K Filing | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/tables/10k.png) |
| Forms | Handwritten Form | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/forms/handwritten_form.png) |
| Forms | Lease Agreement | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/forms/lease.png) |
| Handwriting | Doctor Note | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/handwriting/doctor_note.png) |
| Handwriting | Math Homework | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/handwriting/math_hw.png) |
| Books | Geography Textbook | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/books/geo_textbook_page.png) |
| Books | Exercise Problems | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/books/exercises.png) |
| Math | Attention Diagram | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/math/attn_all.png) |
| Math | Worksheet | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/math/worksheet.png) |
| Math | EGA Page | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/math/ega.png) |
| Newspapers | New York Times | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/newspapers/nyt.png) |
| Newspapers | LA Times | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/newspapers/la_times.png) |
| Other | Transcript | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/other/transcript.png) |
| Other | Flowchart | [View](https://github.com/datalab-to/chandra/blob/master/assets/examples/other/flowchart.png) |

## Usage

### Installation

```shell
pip install chandra-ocr
```

### From code

```python
from chandra.model import InferenceManager
from chandra.model.schema import BatchInputItem

# Use method="vllm" with a running vLLM server (start one with chandra_vllm,
# or serve the datalab-to/chandra model yourself). Use method="hf" to run
# the model directly with transformers instead.
manager = InferenceManager(method="vllm")

batch = [
    BatchInputItem(
        image=PIL_IMAGE,  # a PIL.Image of the page to OCR
        prompt_type="ocr_layout",
    )
]
result = manager.generate(batch)[0]
print(result.markdown)
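
For multi-page documents, you may want to send pages to the model in fixed-size batches rather than all at once. A minimal sketch of that pattern, assuming you already have a list of PIL page images — the `chunked` helper and the batch size of 8 are illustrative, not part of the chandra API:

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from `items`."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


# With chandra (assumes a running vLLM server; see Quickstart):
# from chandra.model import InferenceManager
# from chandra.model.schema import BatchInputItem
#
# manager = InferenceManager(method="vllm")
# for pages in chunked(page_images, 8):
#     batch = [BatchInputItem(image=p, prompt_type="ocr_layout") for p in pages]
#     for result in manager.generate(batch):
#         print(result.markdown)
```

Batching this way keeps memory use bounded on long PDFs while still amortizing inference overhead across several pages per call.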

### With transformers

```python
from transformers import AutoModel, AutoProcessor
from chandra.model.hf import generate_hf
from chandra.model.schema import BatchInputItem
from chandra.output import parse_markdown

model = AutoModel.from_pretrained("datalab-to/chandra").cuda()
model.processor = AutoProcessor.from_pretrained("datalab-to/chandra")

batch = [
    BatchInputItem(
        image=PIL_IMAGE,  # a PIL.Image of the page to OCR
        prompt_type="ocr_layout",
    )
]

result = generate_hf(batch, model)[0]
markdown = parse_markdown(result.raw)
```

## Credits

Thank you to the following open source projects:

- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [vLLM](https://github.com/vllm-project/vllm)
- [olmOCR](https://github.com/allenai/olmocr)
- [Qwen 3 VL](https://github.com/QwenLM/Qwen3)