Rulin Shao
rulins
AI & ML interests
None yet
Recent Activity
updated a dataset 4 days ago: rl-rag/1_sample_toy_rag_survey
published a dataset 4 days ago: rl-rag/1_sample_toy_rag_survey
updated a dataset 6 days ago: rl-rag/1_sample_toy
Organizations
p-bench
- rulins/math-llama31-8b-n64-filtered
  Viewer • Updated • 20 • 16
- rulins/5math-qwen-math-n64-mv-filtered-10each
  Viewer • Updated • 40 • 27
- rulins/5math-5assessors-n64-filtered-20each
  Viewer • Updated • 80 • 32
- rulins/math-qwen25-math-7b-n64-filtered-10each
  Viewer • Updated • 40 • 30
rl-rag
- rulins/rl_rag_longform_rubrics_only_with_system_prompt
  Viewer • Updated • 1k • 17
- rulins/rl_rag_surveyqa_validation_longform_rubrics_only_with_system_prompt
  Viewer • Updated • 703 • 2
- rulins/rl_rag_surveyqa_validation_longform_averaged_outcome_with_system_prompt
  Viewer • Updated • 703 • 7
models
65
rulins/rar_cb_bs_16_rollout_8__1__1759453746_checkpoints_step_100
Updated
rulins/rar_cb_bs_16_rollout_8_revision_4_margin_0.2_true__1__1759453858_checkpoints_step_50
333k • Updated • 10
rulins/rar_cb_bs_16_rollout_8_revision_4_4models__1__1759460360_checkpoints_step_50
333k • Updated • 13
rulins/rar_cb_bs_16_rollout_8_revision_4__1__1759453858_checkpoints_step_50
333k • Updated • 10
rulins/rar_cb_bs_16_rollout_8_margin_0.2_true__1__1759453832_checkpoints_step_50
333k • Updated • 13
rulins/rar_cb_bs_16_rollout_8_margin_0.2_false__1__1759453832_checkpoints_step_50
333k • Updated • 12
rulins/rar_cb_bs_16_rollout_8_adaptive_rubric__1__1759453832_checkpoints_step_50
333k • Updated • 14
rulins/rar_cb_bs_16_rollout_8__1__1759453746_checkpoints_step_50
333k • Updated • 11
rulins/rar_cb_bs_16_rollout_8_revision_4_margin_0.2_true_adaptive__1__1759453902_checkpoints_step_25
333k • Updated • 11
rulins/rar_cb_bs_16_rollout_8_revision_4_margin_0.2_true__1__1759453858_checkpoints_step_25
333k • Updated • 26
datasets
111
rulins/multi_question_synthetic_single_source_asearcher_base_5q
Viewer • Updated • 2k • 26
rulins/multi_question_synthetic_single_source_asearcher_base_10q
Viewer • Updated • 2k • 19
rulins/multi_question_synthetic_single_source_asearcher_base_30q
Viewer • Updated • 2k • 29
rulins/multi_question_synthetic_single_source_2wiki_5q
Viewer • Updated • 3.06k • 2
rulins/multi_question_synthetic_single_source_2wiki_3q
Viewer • Updated • 5.1k • 9
rulins/multi_question_synthetic_single_source_tqa_5q
Viewer • Updated • 31.3k • 2
rulins/multi_question_synthetic_single_source_2wiki_15q
Viewer • Updated • 1.02k • 13
rulins/multi_question_synthetic_single_source_2wiki_10q
Viewer • Updated • 1.53k • 3
rulins/multi_question_synthetic_single_source_tqa_15q
Viewer • Updated • 10.4k • 32
rulins/multi_question_synthetic_single_source_tqa_10q
Viewer • Updated • 15.6k • 5