RL Compositionality
Collection
From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones. https://huggingface.co/papers/2509.25123
5 items
The model after Stage 1 RFT.
Base model: meta-llama/Llama-3.1-8B