A Competition Winning Deep Reinforcement Learning Agent in microRTS Paper • 2402.08112 • Published Feb 12, 2024
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming Paper • 2501.18837 • Published Jan 31, 2025