asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-ngram-spec8 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-eagle3-spec6 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-eagle3-spec8 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-ngram-spec2 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-drgrpo-lora-ngram-spec5 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-rloo-lora-ngram-spec5 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-eagle3-spec5 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-drgrpo-lora-eagle3-spec5 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-rloo-lora-eagle3-spec5 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-ngram-spec4 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-eagle3-spec4 Reinforcement Learning • Updated 13 days ago
asatheesh/deepmath-qwen3-4b-instruct-grpo-lora-eagle3-spec2 Reinforcement Learning • Updated 13 days ago