RoadMa

RoadQAQ
AI & ML interests

None yet

Recent Activity

updated a dataset about 20 hours ago
RoadQAQ/sft_for_rl
published a dataset about 20 hours ago
RoadQAQ/sft_for_rl
updated a collection 3 months ago
Data for DataFlex

Organizations

OpenDCAI

RoadQAQ's collections

ReLIFT
ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT), achieving superior performance and efficiency compared to using RL or supervised fine-tuning (SFT) alone.
  • RoadQAQ/ReLIFT-Qwen2.5-7B-Zero

    Question Answering • 8B • Updated Jun 18, 2025 • 7 • 2
  • RoadQAQ/ReLIFT-Qwen2.5-Math-1.5B-Zero

    Question Answering • 2B • Updated Jun 12, 2025 • 166
  • RoadQAQ/ReLIFT-Qwen2.5-Math-7B-Zero

    Question Answering • 8B • Updated Aug 27, 2025 • 435
  • Elliott/Openr1-Math-46k-8192

    Viewer • Updated Apr 23, 2025 • 45.8k • 637 • 8