
k4tel committed on
Commit b141f06 · verified · 1 Parent(s): c01bc40

Update README.md

  1. README.md +31 -30
README.md CHANGED
@@ -26,21 +26,20 @@ There are currently 2 version of the model available for download, both of them
26
  but different data annotations. The latest `v5.3` is considered to be default and can be found in the `main` branch
27
  of HF 😊 hub [^1] 🔗
28
 
29
- | Version | Base | Pages | PDFs | Description |
30
- |--------:|----------------------------------|:-----:|:---------:|:------------------------------------------------------------------------------------|
31
- | `v2.0` | `vit-base-patch16-224` | 10073 | **3896** | annotations with mistakes, more heterogenous data |
32
- | `v2.1` | `vit-base-patch16-224` | 11940 | **5002** | `main`: more diverse pages in each category, less annotation mistakes |
33
- | `v2.2` | `vit-base-patch16-224` | 15855 | **5730** | same data as `v2.1` + some restored pages from `v2.0` |
34
- | `v3.2` | `vit-base-patch16-384` | 15855 | **5730** | same data as `v2.2`, but a bit larger model base with higher resolution |
35
- | `v5.2` | `vit-large-patch16-384` | 15855 | **5730** | same data as `v2.2`, but the largest model base with higher resolution |
36
- | `v1.2` | `efficientnetv2_s.in21k` | 15855 | **5730** | same data as `v2.2`, but the smallest model base (CNN) |
37
- | `v4.2` | `efficientnetv2_l.in21k_ft_in1k` | 15855 | **5730** | same data as `v2.2`, CNN base model smaller than the largest, may be more accurate |
38
- | `v2.3` | `vit-base-patch16-224` | 38625 | **37328** | new data annotation phase data, more single-page documents used |
39
- | `v3.3` | `vit-base-patch16-384` | 38625 | **37328** | same data as `v2.3`, but a bit larger model base with higher resolution |
40
- | `v5.3` | `vit-large-patch16-384` | 38625 | **37328** | same data as `v2.3`, but the largest model base with higher resolution |
41
- | `v1.3` | `efficientnetv2_m.in21k_ft_in1k` | 38625 | **37328** | same data as `v2.3`, but the smallest model base (CNN) |
42
- | `v4.3` | `regnety_160.swag_ft_in1k` | 38625 | **37328** | same data as `v2.3`, CNN base model bigger than the smallest,, may be more accurate |
43
- | `v6.3` | `regnety_640.seer` | 38625 | **37328** | same data as `v2.3`, CNN base model smaller than the largest, may be less accurate |
44
 
45
 
46
  | **Version** | **Parameters (M)** | Resolution (px) | Revision |
@@ -53,6 +52,7 @@ of HF 😊 hub [^1] 🔗
53
  | `vit-large-patch16-384` | 305 | 384 | v5.X |
54
  | `regnety_640.seer` | 281 | 384 | v6.3 |
55
 
 
56
  | Base Model | Revision | max_cat | Best_Prec (%) | Best_Acc (%) | Fold | Note |
57
  |--------------------------------------------|----------|---------|---------------|--------------|------|--------------|
58
  | **google/vit-base-patch16-224** | **v2.3** | 14,000 | **98.79** | **98.79** | 5 | OK & Small |
@@ -63,7 +63,7 @@ of HF 😊 hub [^1] 🔗
63
  | microsoft/dit-large | v11.3 | 14,000 | 98.53 | 98.53 | 2 | |
64
  | timm/regnety_120.sw_in12k_ft_in1k | v12.3 | 14,000 | 98.29 | 98.29 | 3 | |
65
  | **timm/regnety_160.swag_ft_in1k** | **v4.3** | 14,000 | **99.17** | **99.16** | 1 | Best & Small |
66
- | **timm/regnety_640.seer** | **v6.3** | 14,000 | **98.79** | **98.79** | 5 | OK & Large |
67
  | timm/tf_efficientnetv2_l.in21k_ft_in1k | v8.3 | 14,000 | 98.62 | 98.62 | 5 | |
68
  | **timm/tf_efficientnetv2_m.in21k_ft_in1k** | **v1.3** | 14,000 | **98.83** | **98.83** | 1 | Good & Small |
69
  | timm/tf_efficientnetv2_s.in21k | v7.3 | 14,000 | 97.90 | 97.87 | 1 | |
@@ -182,13 +182,14 @@ During training the following transforms were applied randomly with a 50% chance
182
  | `v3.2` | 96.49 | 99.94 |
183
  | `v4.2` | 97.73 | 99.87 |
184
  | `v5.2` | 97.86 | 99.87 |
185
- | `v1.3` | 98.83 | 99.96 |
186
  | `v2.3` | 98.79 | 99.96 |
187
  | `v3.3` | 98.92 | 99.98 |
188
- | `v4.3` | **99.16** | 99.96 |
189
  | `v5.3` | **99.12** | 99.94 |
190
  | `v6.3` | 98.79 | 99.94 |
191
 
 
192
  **v2.2** Evaluation set's accuracy (**Top-1**): **97.54%**
193
 
194
  ![TOP-1 confusion matrix - trained ViT](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/20250701-1136_model_v220105p_conf_mat_TOP-1.png?raw=true)
@@ -211,27 +212,27 @@ During training the following transforms were applied randomly with a 50% chance
211
 
212
  **v1.3** Evaluation set's accuracy (**Top-1**): **98.83%**
213
 
214
- ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/model_v13_conf_mat_TOP-1.png?raw=true)
215
 
216
  **v2.3** Evaluation set's accuracy (**Top-1**): **98.79%**
217
 
218
- ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/model_v23_conf_mat_TOP-1.png?raw=true)
219
 
220
  **v3.3** Evaluation set's accuracy (**Top-1**): **98.92%**
221
 
222
- ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/model_v33_conf_mat_TOP-1.png?raw=true)
223
 
224
  **v4.3** Evaluation set's accuracy (**Top-1**): **98.16%**
225
 
226
- ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/model_v43_conf_mat_TOP-1.png?raw=true)
227
 
228
  **v5.3** Evaluation set's accuracy (**Top-1**): **99.12%**
229
 
230
- ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/model_v53_conf_mat_TOP-1.png?raw=true)
231
 
232
  **v6.3** Evaluation set's accuracy (**Top-1**): **98.79%**
233
 
234
- ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/model_v63_conf_mat_TOP-1.png?raw=true)
235
 
236
 
237
 
@@ -257,17 +258,17 @@ During training the following transforms were applied randomly with a 50% chance
257
 
258
  - **v4.2** Manually ✍ **checked** evaluation dataset results (TOP-3): [model_TOP-3_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20250710-1921_model_v120106l_TOP-3_EVAL.csv) 🔗
259
 
260
- - **v1.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251019-1328_5449_model_v13_TOP-1_EVAL.csv) 📎
261
 
262
- - **v2.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1136_5449_model_v23_TOP-1_EVAL.csv) 📎
263
 
264
- - **v3.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-0827_5449_model_v33_TOP-1_EVAL.csv) 📎
265
 
266
- - **v4.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1135_5449_model_v43_TOP-1_EVAL.csv) 📎
267
 
268
- - **v5.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251019-1411_5449_model_v53_TOP-1_EVAL.csv) 📎
269
 
270
- - **v6.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1123_5449_model_v63_TOP-1_EVAL.csv) 📎
271
 
272
 
273
  #### Table columns
 
26
  but different data annotations. The latest `v5.3` is considered to be default and can be found in the `main` branch
27
  of HF 😊 hub [^1] 🔗
28
 
29
+ | Version | Base | Pages | PDFs | Description |
30
+ |--------:|----------------------------------|:-----:|:---------:|:-----------------------------------------------------------------------------------|
31
+ | `v2.0` | `vit-base-patch16-224` | 10073 | **3896** | annotations with mistakes, more heterogenous data |
32
+ | `v2.1` | `vit-base-patch16-224` | 11940 | **5002** | `main`: more diverse pages in each category, less annotation mistakes |
33
+ | `v2.2` | `vit-base-patch16-224` | 15855 | **5730** | same data as `v2.1` + some restored pages from `v2.0` |
34
+ | `v3.2` | `vit-base-patch16-384` | 15855 | **5730** | same data as `v2.2`, but a bit larger model base with higher resolution |
35
+ | `v5.2` | `vit-large-patch16-384` | 15855 | **5730** | same data as `v2.2`, but the largest model base with higher resolution |
36
+ | `v1.2` | `efficientnetv2_s.in21k` | 15855 | **5730** | same data as `v2.2`, but the smallest model base (CNN) |
37
+ | `v4.2` | `efficientnetv2_l.in21k_ft_in1k` | 15855 | **5730** | same data as `v2.2`, CNN base model smaller than the largest, may be more accurate |
38
+ | `v2.3` | `vit-base-patch16-224` | 38625 | **37328** | new data annotation phase data, more single-page documents used, transformer model |
39
+ | `v3.3` | `vit-base-patch16-384` | 38625 | **37328** | same data as `v2.3`, but a bit larger model base with higher resolution |
40
+ | `v5.3` | `vit-large-patch16-384` | 38625 | **37328** | same data as `v2.3`, but the largest model base with higher resolution |
41
+ | `v1.3` | `efficientnetv2_m.in21k_ft_in1k` | 38625 | **37328** | same data as `v2.3`, but the smallest model base (CNN) |
42
+ | `v4.3` | `regnety_160.swag_ft_in1k` | 38625 | **37328** | same data as `v2.3`, CNN base model bigger than the smallest, may be more accurate |
 
43
 
44
 
45
  | **Version** | **Parameters (M)** | Resolution (px) | Revision |
 
52
  | `vit-large-patch16-384` | 305 | 384 | v5.X |
53
  | `regnety_640.seer` | 281 | 384 | v6.3 |
54
 
55
+
56
  | Base Model | Revision | max_cat | Best_Prec (%) | Best_Acc (%) | Fold | Note |
57
  |--------------------------------------------|----------|---------|---------------|--------------|------|--------------|
58
  | **google/vit-base-patch16-224** | **v2.3** | 14,000 | **98.79** | **98.79** | 5 | OK & Small |
 
63
  | microsoft/dit-large | v11.3 | 14,000 | 98.53 | 98.53 | 2 | |
64
  | timm/regnety_120.sw_in12k_ft_in1k | v12.3 | 14,000 | 98.29 | 98.29 | 3 | |
65
  | **timm/regnety_160.swag_ft_in1k** | **v4.3** | 14,000 | **99.17** | **99.16** | 1 | Best & Small |
66
+ | timm/regnety_640.see | v6.3 | 14,000 | 98.79 | 98.79 | 5 | OK & Large |
67
  | timm/tf_efficientnetv2_l.in21k_ft_in1k | v8.3 | 14,000 | 98.62 | 98.62 | 5 | |
68
  | **timm/tf_efficientnetv2_m.in21k_ft_in1k** | **v1.3** | 14,000 | **98.83** | **98.83** | 1 | Good & Small |
69
  | timm/tf_efficientnetv2_s.in21k | v7.3 | 14,000 | 97.90 | 97.87 | 1 | |
 
182
  | `v3.2` | 96.49 | 99.94 |
183
  | `v4.2` | 97.73 | 99.87 |
184
  | `v5.2` | 97.86 | 99.87 |
185
+ | `v1.3` | 96.81 | 99.78 |
186
  | `v2.3` | 98.79 | 99.96 |
187
  | `v3.3` | 98.92 | 99.98 |
188
+ | `v4.3` | 98.92 | **100.0** |
189
  | `v5.3` | **99.12** | 99.94 |
190
  | `v6.3` | 98.79 | 99.94 |
191
 
192
+
193
  **v2.2** Evaluation set's accuracy (**Top-1**): **97.54%**
194
 
195
  ![TOP-1 confusion matrix - trained ViT](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/20250701-1136_model_v220105p_conf_mat_TOP-1.png?raw=true)
 
212
 
213
  **v1.3** Evaluation set's accuracy (**Top-1**): **98.83%**
214
 
215
+ ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/20251020-1835_model_v13_conf_mat_TOP-1.png?raw=true)
216
 
217
  **v2.3** Evaluation set's accuracy (**Top-1**): **98.79%**
218
 
219
+ ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/20251020-1841_model_v23_conf_mat_TOP-1.png?raw=true)
220
 
221
  **v3.3** Evaluation set's accuracy (**Top-1**): **98.92%**
222
 
223
+ ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/20251020-1849_model_v33_conf_mat_TOP-1.png?raw=true)
224
 
225
  **v4.3** Evaluation set's accuracy (**Top-1**): **98.16%**
226
 
227
+ ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/20251020-1856_model_v43_conf_mat_TOP-1.png?raw=true)
228
 
229
  **v5.3** Evaluation set's accuracy (**Top-1**): **99.12%**
230
 
231
+ ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/20251020-1905_model_v53_conf_mat_TOP-1.png?raw=true)
232
 
233
  **v6.3** Evaluation set's accuracy (**Top-1**): **98.79%**
234
 
235
+ ![TOP-1 confusion matrix](https://github.com/ufal/atrium-page-classification/blob/vit/result/plots/20251020-1913_model_v63_conf_mat_TOP-1.png?raw=true)
236
 
237
 
238
 
 
258
 
259
  - **v4.2** Manually ✍ **checked** evaluation dataset results (TOP-3): [model_TOP-3_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20250710-1921_model_v120106l_TOP-3_EVAL.csv) 🔗
260
 
261
+ - **v1.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1825_5449_model_v13_TOP-1_EVAL.csv) 📎
262
 
263
+ - **v2.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1835_5449_model_v23_TOP-1_EVAL.csv) 📎
264
 
265
+ - **v3.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1841_5449_model_v33_TOP-1_EVAL.csv) 📎
266
 
267
+ - **v4.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1849_5449_model_v43_TOP-1_EVAL.csv) 📎
268
 
269
+ - **v5.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1856_5449_model_v53_TOP-1_EVAL.csv) 📎
270
 
271
+ - **v6.3** Manually ✍ **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1906_5449_model_v63_TOP-1_EVAL.csv) 📎
272
 
273
 
274
  #### Table columns