Update README.md
        README.md
    CHANGED
    
    | @@ -26,21 +26,20 @@ There are currently 2 version of the model available for download, both of them | |
| 26 | 
             
            but different data annotations. The latest `v5.3` is considered to be default and can be found in the `main` branch
         | 
| 27 | 
             
            of HF hub [^1]
         | 
| 28 |  | 
| 29 | 
            -
            | Version | Base                             | Pages |   PDFs    | Description | 
| 30 | 
            -
             | 
| 31 | 
            -
            |  `v2.0` | `vit-base-patch16-224`           | 10073 | **3896**  | annotations with mistakes, more heterogeneous data | 
| 32 | 
            -
            |  `v2.1` | `vit-base-patch16-224`           | 11940 | **5002**  | `main`: more diverse pages in each category, fewer annotation mistakes | 
| 33 | 
            -
            |  `v2.2` | `vit-base-patch16-224`           | 15855 | **5730**  | same data as `v2.1` + some restored pages from `v2.0` | 
| 34 | 
            -
            |  `v3.2` | `vit-base-patch16-384`           | 15855 | **5730**  | same data as `v2.2`, but a bit larger model base with higher resolution | 
| 35 | 
            -
            |  `v5.2` | `vit-large-patch16-384`          | 15855 | **5730**  | same data as `v2.2`, but the largest model base with higher resolution | 
| 36 | 
            -
            |  `v1.2` | `efficientnetv2_s.in21k`         | 15855 | **5730**  | same data as `v2.2`, but the smallest model base (CNN) | 
| 37 | 
            -
            |  `v4.2` | `efficientnetv2_l.in21k_ft_in1k` | 15855 | **5730**  | same data as `v2.2`, CNN base model smaller than the largest, may be more accurate | 
| 38 | 
            -
            |  `v2.3` | `vit-base-patch16-224`           | 38625 | **37328** | new data annotation phase data, more single-page documents used | 
| 39 | 
            -
            |  `v3.3` | `vit-base-patch16-384`           | 38625 | **37328** | same data as `v2.3`, but a bit larger model base with higher resolution | 
| 40 | 
            -
            |  `v5.3` | `vit-large-patch16-384`          | 38625 | **37328** | same data as `v2.3`, but the largest model base with higher resolution | 
| 41 | 
            -
            |  `v1.3` | `efficientnetv2_m.in21k_ft_in1k` | 38625 | **37328** | same data as `v2.3`, but the smallest model base (CNN) | 
| 42 | 
            -
            |  `v4.3` | `regnety_160.swag_ft_in1k`       | 38625 | **37328** | same data as `v2.3`, CNN base model bigger than the smallest | 
| 43 | 
            -
            |  `v6.3` | `regnety_640.seer`               | 38625 | **37328** | same data as `v2.3`, CNN base model smaller than the largest, may be less accurate  |
         | 
| 44 |  | 
| 45 |  | 
| 46 | 
             
            | **Version**                      | **Parameters (M)** | Resolution (px) | Revision |
         | 
| @@ -53,6 +52,7 @@ of HF hub [^1] | |
| 53 | 
             
            | `vit-large-patch16-384`          | 305                | 384             | v5.X     |
         | 
| 54 | 
             
            | `regnety_640.seer`               | 281                | 384             | v6.3     |
         | 
| 55 |  | 
|  | |
| 56 | 
             
            | Base Model                                 | Revision | max_cat | Best_Prec (%) | Best_Acc (%) | Fold | Note         |
         | 
| 57 | 
             
            |--------------------------------------------|----------|---------|---------------|--------------|------|--------------|
         | 
| 58 | 
             
            | **google/vit-base-patch16-224**            | **v2.3** | 14,000  | **98.79**     | **98.79**    | 5    | OK & Small   |
         | 
| @@ -63,7 +63,7 @@ of HF hub [^1] | |
| 63 | 
             
            | microsoft/dit-large                        | v11.3    | 14,000  | 98.53         | 98.53        | 2    |              |
         | 
| 64 | 
             
            | timm/regnety_120.sw_in12k_ft_in1k          | v12.3    | 14,000  | 98.29         | 98.29        | 3    |              |
         | 
| 65 | 
             
            | **timm/regnety_160.swag_ft_in1k**          | **v4.3** | 14,000  | **99.17**     | **99.16**    | 1    | Best & Small |
         | 
| 66 | 
            -
            |  | 
| 67 | 
             
            | timm/tf_efficientnetv2_l.in21k_ft_in1k     | v8.3     | 14,000  | 98.62         | 98.62        | 5    |              |
         | 
| 68 | 
             
            | **timm/tf_efficientnetv2_m.in21k_ft_in1k** | **v1.3** | 14,000  | **98.83**     | **98.83**    | 1    | Good & Small |
         | 
| 69 | 
             
            | timm/tf_efficientnetv2_s.in21k             | v7.3     | 14,000  | 97.90         | 97.87        | 1    |              |
         | 
| @@ -182,13 +182,14 @@ During training the following transforms were applied randomly with a 50% chance | |
| 182 | 
             
            | `v3.2`       | 96.49     | 99.94     |
         | 
| 183 | 
             
            | `v4.2`       | 97.73     | 99.87     |
         | 
| 184 | 
             
            | `v5.2`       | 97.86     | 99.87     |
         | 
| 185 | 
            -
            | `v1.3`       |  | 
| 186 | 
             
            | `v2.3`       | 98.79     | 99.96     |
         | 
| 187 | 
             
            | `v3.3`       | 98.92     | 99.98     |
         | 
| 188 | 
            -
            | `v4.3`       | ** | 
| 189 | 
             
            | `v5.3`       | **99.12** | 99.94     |
         | 
| 190 | 
             
            | `v6.3`       | 98.79     | 99.94     |
         | 
| 191 |  | 
|  | |
| 192 | 
             
            **v2.2** Evaluation set's accuracy (**Top-1**):  **97.54%** 
         | 
| 193 |  | 
| 194 | 
             
            
         | 
| @@ -211,27 +212,27 @@ During training the following transforms were applied randomly with a 50% chance | |
| 211 |  | 
| 212 | 
             
            **v1.3** Evaluation set's accuracy (**Top-1**):  **98.83%** 
         | 
| 213 |  | 
| 214 | 
            -
            :  **98.79%** 
         | 
| 217 |  | 
| 218 | 
            -
            :  **98.92%** 
         | 
| 221 |  | 
| 222 | 
            -
            :  **98.16%** 
         | 
| 225 |  | 
| 226 | 
            -
            :  **99.12%** 
         | 
| 229 |  | 
| 230 | 
            -
            :  **98.79%** 
         | 
| 233 |  | 
| 234 | 
            -
            : [model_TOP-3_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20250710-1921_model_v120106l_TOP-3_EVAL.csv)
         | 
| 259 |  | 
| 260 | 
            -
            - **v1.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/ | 
| 261 |  | 
| 262 | 
            -
            - **v2.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020- | 
| 263 |  | 
| 264 | 
            -
            - **v3.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020- | 
| 265 |  | 
| 266 | 
            -
            - **v4.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020- | 
| 267 |  | 
| 268 | 
            -
            - **v5.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/ | 
| 269 |  | 
| 270 | 
            -
            - **v6.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020- | 
| 271 |  | 
| 272 |  | 
| 273 | 
             
            #### Table columns
         | 
|  | |
| 26 | 
             
            but different data annotations. The latest `v5.3` is considered to be default and can be found in the `main` branch
         | 
| 27 | 
             
            of HF hub [^1]
         | 
| 28 |  | 
| 29 | 
            +
            | Version | Base                             | Pages |   PDFs    | Description                                                                        |
         | 
| 30 | 
            +
            |--------:|----------------------------------|:-----:|:---------:|:-----------------------------------------------------------------------------------|
         | 
| 31 | 
            +
            |  `v2.0` | `vit-base-patch16-224`           | 10073 | **3896**  | annotations with mistakes, more heterogeneous data                                 |
         | 
| 32 | 
            +
            |  `v2.1` | `vit-base-patch16-224`           | 11940 | **5002**  | `main`: more diverse pages in each category, fewer annotation mistakes             |
         | 
| 33 | 
            +
            |  `v2.2` | `vit-base-patch16-224`           | 15855 | **5730**  | same data as `v2.1` + some restored pages from `v2.0`                              |
         | 
| 34 | 
            +
            |  `v3.2` | `vit-base-patch16-384`           | 15855 | **5730**  | same data as `v2.2`, but a bit larger model base with higher resolution            |
         | 
| 35 | 
            +
            |  `v5.2` | `vit-large-patch16-384`          | 15855 | **5730**  | same data as `v2.2`, but the largest model base with higher resolution             |
         | 
| 36 | 
            +
            |  `v1.2` | `efficientnetv2_s.in21k`         | 15855 | **5730**  | same data as `v2.2`, but the smallest model base (CNN)                             |
         | 
| 37 | 
            +
            |  `v4.2` | `efficientnetv2_l.in21k_ft_in1k` | 15855 | **5730**  | same data as `v2.2`, CNN base model smaller than the largest, may be more accurate |
         | 
| 38 | 
            +
            |  `v2.3` | `vit-base-patch16-224`           | 38625 | **37328** | new data annotation phase data, more single-page documents used, transformer model |
         | 
| 39 | 
            +
            |  `v3.3` | `vit-base-patch16-384`           | 38625 | **37328** | same data as `v2.3`, but a bit larger model base with higher resolution            |
         | 
| 40 | 
            +
            |  `v5.3` | `vit-large-patch16-384`          | 38625 | **37328** | same data as `v2.3`, but the largest model base with higher resolution             |
         | 
| 41 | 
            +
            |  `v1.3` | `efficientnetv2_m.in21k_ft_in1k` | 38625 | **37328** | same data as `v2.3`, but the smallest model base (CNN)                             |
         | 
| 42 | 
            +
            |  `v4.3` | `regnety_160.swag_ft_in1k`       | 38625 | **37328** | same data as `v2.3`, CNN base model bigger than the smallest, may be more accurate |
         | 
|  | |
| 43 |  | 
| 44 |  | 
| 45 | 
             
            | **Version**                      | **Parameters (M)** | Resolution (px) | Revision |
         | 
|  | |
| 52 | 
             
            | `vit-large-patch16-384`          | 305                | 384             | v5.X     |
         | 
| 53 | 
             
            | `regnety_640.seer`               | 281                | 384             | v6.3     |
         | 
| 54 |  | 
| 55 | 
            +
             | 
| 56 | 
             
            | Base Model                                 | Revision | max_cat | Best_Prec (%) | Best_Acc (%) | Fold | Note         |
         | 
| 57 | 
             
            |--------------------------------------------|----------|---------|---------------|--------------|------|--------------|
         | 
| 58 | 
             
            | **google/vit-base-patch16-224**            | **v2.3** | 14,000  | **98.79**     | **98.79**    | 5    | OK & Small   |
         | 
|  | |
| 63 | 
             
            | microsoft/dit-large                        | v11.3    | 14,000  | 98.53         | 98.53        | 2    |              |
         | 
| 64 | 
             
            | timm/regnety_120.sw_in12k_ft_in1k          | v12.3    | 14,000  | 98.29         | 98.29        | 3    |              |
         | 
| 65 | 
             
            | **timm/regnety_160.swag_ft_in1k**          | **v4.3** | 14,000  | **99.17**     | **99.16**    | 1    | Best & Small |
         | 
| 66 | 
            +
            | timm/regnety_640.seer                      | v6.3     | 14,000  | 98.79         | 98.79        | 5    | OK & Large   |
         | 
| 67 | 
             
            | timm/tf_efficientnetv2_l.in21k_ft_in1k     | v8.3     | 14,000  | 98.62         | 98.62        | 5    |              |
         | 
| 68 | 
             
            | **timm/tf_efficientnetv2_m.in21k_ft_in1k** | **v1.3** | 14,000  | **98.83**     | **98.83**    | 1    | Good & Small |
         | 
| 69 | 
             
            | timm/tf_efficientnetv2_s.in21k             | v7.3     | 14,000  | 97.90         | 97.87        | 1    |              |
         | 
|  | |
| 182 | 
             
            | `v3.2`       | 96.49     | 99.94     |
         | 
| 183 | 
             
            | `v4.2`       | 97.73     | 99.87     |
         | 
| 184 | 
             
            | `v5.2`       | 97.86     | 99.87     |
         | 
| 185 | 
            +
            | `v1.3`       | 96.81     | 99.78     |
         | 
| 186 | 
             
            | `v2.3`       | 98.79     | 99.96     |
         | 
| 187 | 
             
            | `v3.3`       | 98.92     | 99.98     |
         | 
| 188 | 
            +
            | `v4.3`       | 98.92     | **100.0** |
         | 
| 189 | 
             
            | `v5.3`       | **99.12** | 99.94     |
         | 
| 190 | 
             
            | `v6.3`       | 98.79     | 99.94     |
         | 
| 191 |  | 
| 192 | 
            +
             | 
| 193 | 
             
            **v2.2** Evaluation set's accuracy (**Top-1**):  **97.54%** 
         | 
| 194 |  | 
| 195 | 
             
            
         | 
|  | |
| 212 |  | 
| 213 | 
             
            **v1.3** Evaluation set's accuracy (**Top-1**):  **98.83%** 
         | 
| 214 |  | 
| 215 | 
            +
            
         | 
| 216 |  | 
| 217 | 
             
            **v2.3** Evaluation set's accuracy (**Top-1**):  **98.79%** 
         | 
| 218 |  | 
| 219 | 
            +
            
         | 
| 220 |  | 
| 221 | 
             
            **v3.3** Evaluation set's accuracy (**Top-1**):  **98.92%** 
         | 
| 222 |  | 
| 223 | 
            +
            
         | 
| 224 |  | 
| 225 | 
             
            **v4.3** Evaluation set's accuracy (**Top-1**):  **98.16%** 
         | 
| 226 |  | 
| 227 | 
            +
            
         | 
| 228 |  | 
| 229 | 
             
            **v5.3** Evaluation set's accuracy (**Top-1**):  **99.12%** 
         | 
| 230 |  | 
| 231 | 
            +
            
         | 
| 232 |  | 
| 233 | 
             
            **v6.3** Evaluation set's accuracy (**Top-1**):  **98.79%** 
         | 
| 234 |  | 
| 235 | 
            +
            
         | 
| 236 |  | 
| 237 |  | 
| 238 |  | 
|  | |
| 258 |  | 
| 259 | 
             
            - **v4.2** Manually **checked** evaluation dataset results (TOP-3): [model_TOP-3_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20250710-1921_model_v120106l_TOP-3_EVAL.csv)
         | 
| 260 |  | 
| 261 | 
            +
            - **v1.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1825_5449_model_v13_TOP-1_EVAL.csv)
         | 
| 262 |  | 
| 263 | 
            +
            - **v2.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1835_5449_model_v23_TOP-1_EVAL.csv)
         | 
| 264 |  | 
| 265 | 
            +
            - **v3.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1841_5449_model_v33_TOP-1_EVAL.csv)
         | 
| 266 |  | 
| 267 | 
            +
            - **v4.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1849_5449_model_v43_TOP-1_EVAL.csv)
         | 
| 268 |  | 
| 269 | 
            +
            - **v5.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1856_5449_model_v53_TOP-1_EVAL.csv)
         | 
| 270 |  | 
| 271 | 
            +
            - **v6.3** Manually **checked** evaluation dataset (TOP-1): [model_TOP-1_EVAL.csv](https://github.com/ufal/atrium-page-classification/blob/vit/result/tables/20251020-1906_5449_model_v63_TOP-1_EVAL.csv)
         | 
| 272 |  | 
| 273 |  | 
| 274 | 
             
            #### Table columns
         | 

