HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_4_steps_2_samples_step_100 4B • Updated 1 day ago • 24
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_vanilla_like_step_90 4B • Updated 8 days ago • 18
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_thinking_step_90 4B • Updated 8 days ago • 63
HerrHruby/acemath_rl_4b_inst_hard_2_steps_complete_16k_thinking_step_50 4B • Updated 10 days ago • 15
HerrHruby/2k_hard_5_05_8_steps_e2e_explore_small_boxed_all_bonus_pos_turn_num_2_mbzs_stable_230_steps 2B • Updated 28 days ago • 1.21k
HerrHruby/2k_hard_5_05_8_steps_e2e_explore_small_boxed_all_bonus_pos_turn_num_2_mbzs_stable_170_steps 2B • Updated 28 days ago • 300
HerrHruby/2k_hard_5_05_8_steps_e2e_explore_small_boxed_all_bonus_pos_turn_num_2_mbzs_stable_130_steps 2B • Updated 29 days ago • 27
HerrHruby/sft_qwen_3_1p7b_reasoning_cache_e2e_1p7b_deepscaler_16k_18k_2048_toks_560_steps 2B • Updated Sep 19 • 2
HerrHruby/sft_qwen_3_1p7b_reasoning_cache_e2e_deepscaler_16k_38k_2048_toks_2100_steps 2B • Updated Sep 15 • 2
HerrHruby/sft_qwen_3_1p7b_reasoning_cache_e2e_deepscaler_16k_2048_toks_560_steps 2B • Updated Sep 13 • 2
HerrHruby/sft_qwen_3_1p7b_reasoning_cache_e2e_deepscaler_16k_2048_toks_400_steps 2B • Updated Sep 13 • 2