abdev-leaderboard


loodvanniekerkginkgo committed 2 days ago

Commit 767c884 (1 parent: 094a347)

Updating documentation (h/t Diya Mohan)

Files changed (5)

about.py +19 -5
app.py +1 -1
assets/prediction_explainer_cv.png +0 -3
assets/{prediction_explainer.png → prediction_explainer_v3.png} +2 -2
constants.py +1 -0

about.py CHANGED

@@ -6,6 +6,7 @@ from constants import (
     FAQ_TAB_NAME,
     SLACK_URL,
     TUTORIAL_URL,
+    GITHUB_URL,
 )
 WEBSITE_HEADER = f"""
@@ -59,7 +60,7 @@ ABOUT_TEXT = f"""
 1. **Create a Hugging Face account** [here](https://huggingface.co/join) if you don't have one yet (this is used to track unique submissions and to access the GDPa1 dataset).
 2. **Register your team** on the [Competition Registration](https://datapoints.ginkgo.bio/ai-competitions/2025-abdev-competition) page.
-3. **Build a model** using cross-validation on the [GDPa1](https://huggingface.co/datasets/ginkgo-datapoints/GDPa1) dataset, using the `hierarchical_cluster_IgG_isotype_stratified_fold` column to split the dataset into folds, and write out all cross-validation predictions to a CSV file.
+3. **Build a model** using cross-validation on the [GDPa1](https://huggingface.co/datasets/ginkgo-datapoints/GDPa1) dataset, using the `hierarchical_cluster_IgG_isotype_stratified_fold` column to split the dataset into folds, and write out all cross-validation predictions to a CSV file. You may also use outside datasets, but still need to report these cross-validation predictions.
 4. **Use your model to make predictions** on the private test set (download the 80 private test set sequences from the {SUBMIT_TAB_NAME} tab).
 5. **Submit your training and test set predictions** on the {SUBMIT_TAB_NAME} tab by uploading both your cross-validation and private test set CSV files.
@@ -69,6 +70,13 @@ Check out our introductory tutorial on training an antibody developability predi
 ---
+#### Data and models
+You may use any data and models you like for the main competition, since all code/methods can be kept private and you just submit predictions.
+For the open-source prize, you must train on the GDPa1 dataset using cross-validation and must use all public models/data.
+---
 #### Acknowledgements
 We gratefully acknowledge [Tamarind Bio](https://www.tamarind.bio/)'s help in running the following models which are on the leaderboard:
@@ -84,11 +92,14 @@ We're working on getting more public models added, so that participants have mor
 #### How to contribute?
-We'd like to add more existing developability models to the leaderboard. Some examples of models we'd like to add:
+Check out the GitHub repository ({GITHUB_URL}) for a bunch of runnable models and Jupyter notebooks to get started, or to contribute your own models.
+We'd like to add more existing developability models to the leaderboard. Some examples of models we'd like to onboard (also tracked in the GitHub repository):
 - Absolute folding stability models (for Thermostability)
 - PROPERMAB
 - AbMelt (requires GROMACS for MD simulations)
 If you would like to form a team or discuss ideas, join the [Slack community]({SLACK_URL}) co-hosted by Bits in Bio.
 """
@@ -131,7 +142,7 @@ FAQS = {
     ),
     "Do I need to submit my code / methods in order to participate?": (
         "No, there are no requirements to submit code / methods and submitted predictions remain private. "
-        "We also have an optional field for including a short model description. "
+        "We have an optional field for including a short model description in the submission tab. "
         "Top performing participants will be requested to identify themselves at the end of the tournament. "
         "There will be one prize for the best open-source reproducible model, which will require code / methods to be available."
     ),
@@ -153,10 +164,13 @@ FAQS = {
         "We reserve the right to award the open-source prize to a predictor with competitive results for a subset of properties (e.g. a top polyreactivity model)."
     ),
     "How does the open-source prize work?": (
-        "Participants who open-source their training code and methods will be eligible for the open-source prize (as well as the other prizes)."
+        "Participants who train on GDPa1 and open-source their training code and methods and have reproducible results will be eligible for the open-source prize (as well as the other prizes)."
     ),
     "Can I use proprietary tools like AlphaFold3 for the open-source prize?": (
-        "Yes, using tools that have published their inference code under proprietary licenses is allowed (like AlphaFold3 and PROPERMAB), as long as code is available and fully reproducible."
+        "Yes, using tools that have published their inference code under proprietary licenses is allowed (like AlphaFold3 and PROPERMAB), as long as code is available and fully reproducible. Although fully open models (open to commercial use) are highly preferred though. For other prizes, you can use any private models/data you like."
+    ),
+    "Can I train on other public/private datasets?": (
+        "Yes, you can use any private models/data you like for the 5 main assay prizes, since all code/methods can be kept private and you just submit predictions. For the open-source prize, you must train on the GDPa1 dataset using cross-validation and must use all public models/data. Models with proprietary licenses but open code are allowed, but fully open models are highly preferred."
     ),
     "What do I need to submit?": (
         'There is a tab on the Hugging Face competition page to upload predictions for datasets - for each dataset participants need to submit a CSV containing a column for each property they would like to predict (e.g. called "HIC"), '
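
Step 3 of the updated ABOUT_TEXT asks participants to split GDPa1 by the `hierarchical_cluster_IgG_isotype_stratified_fold` column and write out all cross-validation predictions. A minimal sketch of that loop follows, using only the standard library. Only the fold column name comes from the text above; the rows are synthetic, the `antibody_name` identifier column and the output layout are assumptions, and the mean-of-training-targets "model" is a placeholder for a real predictor.

```python
import csv
import io
from statistics import mean

# Fold column name as given in the competition instructions.
FOLD_COL = "hierarchical_cluster_IgG_isotype_stratified_fold"

# Synthetic stand-in rows (not real GDPa1 data). "antibody_name" is an
# assumed identifier column; "HIC" is the example property from the FAQ.
rows = [
    {"antibody_name": "ab1", FOLD_COL: 0, "HIC": 2.0},
    {"antibody_name": "ab2", FOLD_COL: 0, "HIC": 4.0},
    {"antibody_name": "ab3", FOLD_COL: 1, "HIC": 6.0},
    {"antibody_name": "ab4", FOLD_COL: 1, "HIC": 8.0},
    {"antibody_name": "ab5", FOLD_COL: 2, "HIC": 10.0},
]


def cross_val_predict_by_fold(rows, target):
    """Predict every row using a model fitted only on the other folds."""
    predictions = {}
    for fold in sorted({r[FOLD_COL] for r in rows}):
        train = [r for r in rows if r[FOLD_COL] != fold]
        held_out = [r for r in rows if r[FOLD_COL] == fold]
        # Placeholder model: predict the mean of the training targets.
        fitted_mean = mean(r[target] for r in train)
        for r in held_out:
            predictions[r["antibody_name"]] = fitted_mean
    return predictions


def write_predictions_csv(rows, predictions, target, handle):
    """Write one row per antibody with its held-out prediction."""
    writer = csv.DictWriter(handle, fieldnames=["antibody_name", FOLD_COL, target])
    writer.writeheader()
    for r in rows:
        writer.writerow(
            {
                "antibody_name": r["antibody_name"],
                FOLD_COL: r[FOLD_COL],
                target: predictions[r["antibody_name"]],
            }
        )


cv_predictions = cross_val_predict_by_fold(rows, "HIC")
buffer = io.StringIO()  # swap for open("cv_predictions.csv", "w", newline="") to write a file
write_predictions_csv(rows, cv_predictions, "HIC", buffer)
print(buffer.getvalue())
```

Each prediction is produced by a model that never saw that row's fold, which is the property the cross-validation file is meant to demonstrate. The exact columns the submission form expects should be taken from the {SUBMIT_TAB_NAME} tab, not from this sketch.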

app.py CHANGED

@@ -120,7 +120,7 @@ with gr.Blocks(theme=gr.themes.Default(text_size=sizes.text_lg)) as demo:
         with gr.TabItem(ABOUT_TAB_NAME, elem_id="abdev-benchmark-tab-table"):
             gr.Markdown(ABOUT_INTRO)
             gr.Image(
-                value="./assets/prediction_explainer_cv.png",
+                value="./assets/prediction_explainer_v3.png",
                 show_label=False,
                 show_download_button=False,
                 show_share_button=False,

assets/prediction_explainer_cv.png DELETED

Git LFS Details

SHA256: 1028b5a4034bbeb403b6a015f831dd5715baaca4698ced2b4fff85da00116297
Pointer size: 130 Bytes
Size of remote file: 79.6 kB

assets/{prediction_explainer.png → prediction_explainer_v3.png} RENAMED

File without changes

constants.py CHANGED

@@ -44,6 +44,7 @@ REGISTRATION_CODE = os.environ.get("REGISTRATION_CODE")
 TERMS_URL = "https://euphsfcyogalqiqsawbo.supabase.co/storage/v1/object/public/gdpweb/pdfs/2025%20Ginkgo%20Antibody%20Developability%20Prediction%20Competition%202025-08-28-v2.pdf"
 SLACK_URL = "https://join.slack.com/t/bitsinbio/shared_invite/zt-3dqigle2b-e0dEkfPPzzWL055j_8N_eQ"
 TUTORIAL_URL = "https://huggingface.co/blog/ginkgo-datapoints/making-antibody-embeddings-and-predictions"
+GITHUB_URL = "https://github.com/ginkgobioworks/abdev-benchmark"
 # Input CSV file requirements
 REQUIRED_COLUMNS: list[str] = [