rlsamplingJF/Llama-3.2-3B-finemath_part1-rm-lr1e-6-constant-warmup_0.05-bs32-gc1.0-cc0.01-ls0-initial • 3B • Updated about 1 hour ago
rlsamplingJF/Llama-3.2-3B-finemath_part1-rm-lr1e-6-constant-warmup_0.05-bs32-gc1.0-cc0.01-ls0-step109 • 3B • Updated about 2 hours ago
rlsamplingJF/Llama-3.2-3B-finemath_part1-rm-lr1e-6-constant-warmup_0.05-bs16-gc1.0-initial • 3B • Updated about 3 hours ago
rlsamplingJF/Llama-3.2-3B-finemath_part1-rm-lr1e-6-constant-warmup_0.05-bs16-gc1.0-step220 • 3B • Updated about 10 hours ago • 9
rlsamplingJF/Qwen2.5-3B-Instruct-finemath-highquality-part1-seed2028-initial • 3B • Updated 25 days ago • 30
rlsamplingJF/myllama-1B-20BT-finemath-highquality-part1-seed2026-initial • 0.9B • Updated 25 days ago • 29
rlsamplingJF/Llama-3.2-3B-finemath-highquality-rm-run2-lr3e-5-cosine-bs32-gc1.0-initial • 3B • Updated Nov 1 • 19
rlsamplingJF/Llama-3.2-3B-finemath-highquality-rm-run2-lr3e-5-cosine-bs32-gc1.0 • 3B • Updated Nov 1 • 39
rlsamplingJF/posttraining_sentence_Qwen2.5-7B-Instruct-finemath-rm-run1-lr1e-6-constant-bs8-gc10.0-step84 • 7B • Updated Oct 16 • 8
rlsamplingJF/posttraining_sentence_Qwen2.5-7B-Instruct-finemath-rm-run1-lr1e-6-constant-bs8-gc10.0-step36 • 7B • Updated Oct 16 • 26
rlsamplingJF/posttraining_sentence_Qwen2.5-7B-Instruct-finemath-rm-run1-lr1e-6-constant-bs8-gc10.0-initial • 7B • Updated Sep 28 • 27
rlsamplingJF/posttraining_sentence_Qwen2.5-7B-Instruct-finemath-rm-run1-lr1e-6-constant-bs8-gc10.0-step96 • 7B • Updated Sep 24 • 7
rlsamplingJF/cpt_rm_training_8BT_filtered_Qwen2.5-3B-finemath-rm-run4-lr1e-4-cosine-bs32-gc1.0-step15 • 3B • Updated Sep 22 • 7
rlsamplingJF/cpt_rm_training_8BT_filtered_llama-3.2-3b-finemath-rm-run5-lr3e-5-cosine-bs32-gc1.0-step30 • 3B • Updated Sep 22 • 8