SPARK: Synergistic Policy And Reward Co-Evolving Framework Paper • 2509.22624 • Published Sep 26 • 17 • 2
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment Paper • 2502.10391 • Published Feb 14 • 34 • 5