Spaces: Metric-AI/ArmBench-LLM


Bagratuni committed on Mar 12

Commit 1e273fd

1 Parent(s): be5a444

commit

Files changed (1)

app.py +8 -14


@@ -76,9 +76,12 @@ def main():
                     To submit a model for evaluation, please follow these steps:
                     1. **Evaluate your model**:
                         - Follow the evaluation script provided here: [https://github.com/Anania-AI/Arm-LLM-Benchmark](https://github.com/Anania-AI/Arm-LLM-Benchmark)
                     2. **Format your submission file**:
-                        - After evaluation, you will get a `result.json` file. Ensure the file follows this format:
-                        ```json
                         {
                             "mmlu_results": [
                                 {
@@ -95,18 +98,9 @@ def main():
                                 ...
                             ]
                         }
-                        ```
-                    3. **Important Notes**:
-                        - For **mmlu_results**:
-                            - The following categories must be included in the mmlu_results for the model to be considered valid:
-                            - ```["Biology", "Business", "Chemistry", "Computer Science", "Economics", "Engineering", "Health", "History", "Law", "Math", "Other", "Philosophy", "Physics", "Psychology", "Average"] ```
-                            - If any of these categories are missing, the model will not be added to the evaluation.
-                        - For **unified_exam_results**:
-                            - The following categories must be included in the unified_exam_results for the model to be considered valid:
-                            - ```["Average", "Armenian language and literature", "Armenian history", "Mathematics"] ```
-                            - If any of these categories are missing, the model will not be added to the evaluation.
-                    4. **Submit your model**:
-                        - Add the `Arm-LLM-Bench` tag and the `result.json` file to your model card.
                         - Click on the "Refresh Data" button in this app, and you will see your model's results.
                     """
                 )

                     To submit a model for evaluation, please follow these steps:
                     1. **Evaluate your model**:
                         - Follow the evaluation script provided here: [https://github.com/Anania-AI/Arm-LLM-Benchmark](https://github.com/Anania-AI/Arm-LLM-Benchmark)
+                        - For more details about the evaluation process, read the README in the Arm-LLM-Benchmark GitHub repository.
                     2. **Format your submission file**:
+                        - After evaluation, you will get a `results.json` file. Ensure the file follows this format:
+                    ```json
                         {
                             "mmlu_results": [
                                 {
                                 ...
                             ]
                         }
+                    ```
+                    3. **Submit your model**:
+                        - Add the `Arm-LLM-Bench` tag and the `results.json` file to your model card.
                         - Click on the "Refresh Data" button in this app, and you will see your model's results.
                     """
                 )
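
The notes removed in this commit required certain categories to be present in `mmlu_results` and `unified_exam_results`. A minimal sketch of a local pre-submission check follows. The category lists are copied from the removed lines; the entry field name `category` and both function names are hypothetical, since the diff elides the layout of each result entry.

```python
# Category lists copied verbatim from the notes removed in this commit.
REQUIRED_MMLU = [
    "Biology", "Business", "Chemistry", "Computer Science", "Economics",
    "Engineering", "Health", "History", "Law", "Math", "Other",
    "Philosophy", "Physics", "Psychology", "Average",
]
REQUIRED_UNIFIED_EXAM = [
    "Average", "Armenian language and literature",
    "Armenian history", "Mathematics",
]


def missing_categories(entries, required, key="category"):
    """Return the required categories that no entry in `entries` carries.

    `key` is an assumed field name; adjust it to match the real file.
    """
    present = {e.get(key) for e in entries if isinstance(e, dict)}
    return [name for name in required if name not in present]


def validate_results(results, key="category"):
    """Map each results section to its list of missing categories."""
    return {
        "mmlu_results": missing_categories(
            results.get("mmlu_results", []), REQUIRED_MMLU, key
        ),
        "unified_exam_results": missing_categories(
            results.get("unified_exam_results", []), REQUIRED_UNIFIED_EXAM, key
        ),
    }
```

To use it, load the file with `json.load(open("results.json"))` and pass the resulting dict to `validate_results`. An empty list for both sections means every category named in the removed notes is present, under the field-name assumption above.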