MercedeSnape's Collections
agent training
Updated about 22 hours ago
Don't Just Fine-tune the Agent, Tune the Environment
Paper • 2510.10197 • Published Oct 11 • 28
Note: Approaches post-training from the perspective of problem instances rather than SFT / RL methods.