Commit 1e0d977 by YuanLiuuuuuu · verified · 1 Parent(s): e2b2087

Update README.md

  1. README.md +7 -7
README.md CHANGED
@@ -32,7 +32,7 @@ POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Docume
  </a>
  </p>
 
- We are delighted to announce that the WePOINTS family has welcomed a new member: [POINTS-Reader](https://github.com/Tencent/POINTS-Reader), a vision-language model for end-to-end document conversion.
+ We are delighted to announce that the WePOINTS family has welcomed a new member: POINTS-Reader, a vision-language model for end-to-end document conversion.
 
  ## News
 
@@ -51,7 +51,7 @@ We are delighted to announce that the WePOINTS family has welcomed a new member:
 
  ## Results
 
- We take the following results from [OmniDocBench](https://github.com/opendatalab/OmniDocBench/tree/main) and POINTS-Reader for comparison:
+ For comparison, we use the results reported by [OmniDocBench](https://github.com/opendatalab/OmniDocBench/tree/main) and POINTS-Reader. Compared with the version submitted to EMNLP 2025, the current release provides (1) improved performance and (2) support for Chinese documents. Both enhancements build upon the methods proposed in this paper.
 
  <table style="width: 92%; margin: auto; border-collapse: collapse;">
  <thead>
@@ -607,9 +607,9 @@ prompt = (
  image_path = '/path/to/your/local/image'
  model_path = 'tencent/POINTS-Reader'
  model = AutoModelForCausalLM.from_pretrained(model_path,
- trust_remote_code=True,
- torch_dtype=torch.float16,
- device_map='cuda')
+ trust_remote_code=True,
+ torch_dtype=torch.float16,
+ device_map='cuda')
  tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
  image_processor = Qwen2ImageProcessorForPOINTSV15.from_pretrained(model_path)
  content = [
@@ -647,8 +647,8 @@ We will create a Pull Request to SGLang, please stay tuned.
 
  ## Known Issues
 
- - **Complex Document Parsing**: POINTS-Reader can struggle with complex layouts (e.g., newspapers), often producing repeated or missing content.
- - **Handwritten Document Parsing**: It also has difficulty handling handwritten inputs (e.g., receipts, notes), which can lead to recognition errors or omissions.
+ - **Complex Document Parsing**: POINTS-Reader can struggle with complex layouts (e.g., newspapers), often producing repeated or missing content.
+ - **Handwritten Document Parsing**: It also has difficulty handling handwritten inputs (e.g., receipts, notes), which can lead to recognition errors or omissions.
  - **Multi-language Document Parsing**: POINTS-Reader currently supports only English and Chinese, limiting its effectiveness on other languages.
 
  ## Citation