Commit d8b25f9
Parent(s): 9a87acd

Added more explainers

about.py CHANGED
@@ -42,7 +42,9 @@ Here we invite the community to submit and develop better predictors, which will
 #### 🏆 Prizes
 
 For each of the 5 properties in the competition, there is a prize for the model with the highest performance for that property on the private test set.
-There is also an 'open-source' prize for the best model trained on the GDPa1 dataset
+There is also an 'open-source' prize for the best reproducible model: one that is trained on the GDPa1 dataset (reporting cross-validation results) and assessed on the private test set where authors provide all training code and data.
+This will be judged by a panel (i.e. by default the model with the highest average Spearman correlation across all properties will be selected, but a really good model on just one property may be better for the community).
+
 For each of these 6 prizes, participants have the choice between
 - **$10 000 in data generation credits** with [Ginkgo Datapoints](https://datapoints.ginkgo.bio/), or
 - A **$2000 cash prize**.
@@ -124,7 +126,7 @@ FAQS = {
 "No, there are no requirements to submit code / methods and submitted predictions remain private. "
 "We also have an optional field for including a short model description. "
 "Top performing participants will be requested to identify themselves at the end of the tournament. "
-"There will be one prize for the best open-source model, which will require code / methods to be available."
+"There will be one prize for the best open-source reproducible model, which will require code / methods to be available."
 ),
 "How exactly can I evaluate my model?": (
 "You can easily calculate the Spearman correlation coefficient on the GDPa1 dataset yourself before uploading to the leaderboard. "
@@ -172,25 +174,28 @@ SUBMIT_INSTRUCTIONS = f"""
 You do **not** need to predict all 5 properties – each property has its own leaderboard and prize.
 
 ## Instructions
-1. **Upload
-- **GDPa1 Cross-Validation predictions** (using cross-validation folds)
-- **Private Test Set predictions** (final test submission)
+1. **Upload two CSV files**: one with GDPa1 cross-validation predictions, and one with private test set predictions
 2. Each CSV should contain `antibody_name` + one column per property you are predicting (e.g. `"antibody_name,Titer,PR_CHO"` if your model predicts Titer and Polyreactivity).
 - List of valid property names: `{', '.join(ASSAY_LIST)}`.
-
+- Include the `"hierarchical_cluster_IgG_isotype_stratified_fold"` column if submitting cross-validation predictions.
+3. You can resubmit as often as you like; only your latest submission will count for both the leaderboard and final test set scoring.
 
 The GDPa1 results should appear on the leaderboard within a minute, and can also be calculated manually using average Spearman rank correlation across the 5 folds.
 
 ## Cross-validation
 
-For the GDPa1 cross-validation predictions
-
-
+For the GDPa1 cross-validation predictions:
+1. Split the dataset using the `"hierarchical_cluster_IgG_isotype_stratified_fold"` column
+2. Train on 4 folds and predict on the held-out fold
+3. Collect held-out predictions for all 5 folds into one dataframe
+4. Write this dataframe to a .csv file and submit as your GDPa1 cross-validation predictions
+
+The leaderboard will show the average Spearman rank correlation across the 5 folds. For a code example, check out our tutorial on training an antibody developability prediction model with cross-validation [here]({TUTORIAL_URL}).
 
 ## Test set
 
-The **private test set
-🗓️
+The **private test set submissions will not be scored automatically**, to avoid test set hacking. They will be evaluated after submissions close to determine winners.
+🗓️ We will release one interim scoring of the latest private test set submissions on **October 13th**. Use this opportunity to see how your model is performing on the heldout test set and refine accordingly.
 
 Submissions close on **1 November 2025**.
 """
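The FAQ answer above notes that you can compute the Spearman correlation on GDPa1 yourself before uploading, and the leaderboard reports the average Spearman rank correlation across the 5 folds. A rough sketch of that self-check, assuming the measured values in `gdpa1.csv` use the same column names as the properties (an assumption, not something stated in this commit):

```python
# Self-check of a cross-validation submission: average Spearman per property
# across the 5 folds. Assumes measured values in "gdpa1.csv" share the property
# column names used in the submission file.
import pandas as pd
from scipy.stats import spearmanr

FOLD_COL = "hierarchical_cluster_IgG_isotype_stratified_fold"

truth = pd.read_csv("gdpa1.csv")
preds = pd.read_csv("gdpa1_cv_predictions.csv")
merged = preds.drop(columns=[FOLD_COL]).merge(
    truth, on="antibody_name", suffixes=("_pred", "_true")
)

for prop in ["Titer", "PR_CHO"]:        # properties present in your submission
    per_fold = [
        spearmanr(grp[f"{prop}_pred"], grp[f"{prop}_true"]).correlation
        for _, grp in merged.groupby(FOLD_COL)
    ]
    print(prop, sum(per_fold) / len(per_fold))   # average over the 5 folds
```

This mirrors the metric described in the submission instructions; the leaderboard computes the same quantity automatically when you upload.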