arxiv:2601.00747

The Reasoning-Creativity Trade-off: Toward Creativity-Driven Problem Solving

Published on Jan 2 · Submitted by Max Ruiz Luyten on Jan 5

Abstract

Large language model training methods that optimize for correctness can cause reasoning path diversity collapse, but a new variational framework provides principled solutions to maintain both accuracy and creativity.

AI-generated summary

State-of-the-art large language model (LLM) pipelines rely on bootstrapped reasoning loops: sampling diverse chains of thought and reinforcing the highest-scoring ones, mainly optimizing correctness. We analyze how this design choice is sensitive to the collapse of the model's distribution over reasoning paths, slashing semantic entropy and undermining creative problem-solving. To analyze this failure, we introduce Distributional Creative Reasoning (DCR), a unified variational objective that casts training as gradient flow through probability measures on solution traces. STaR, GRPO, and DPO, as well as entropy bonuses, and other methods, all constitute special cases of the same loss. The framework delivers three core results: (i) the diversity decay theorem, describing how correctness-based objectives lead to distinct modes of diversity decay for STaR, GRPO, and DPO; (ii) designs that ensure convergence to a stable and diverse policy, effectively preventing collapse; and (iii) simple, actionable recipes to achieve this in practice. DCR thus offers the first principled recipe for LLMs that remain both correct and creative.
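
As a rough illustration of why a correctness-only objective can erode diversity, consider the textbook replicator equation, which is the Shahshahani gradient flow of expected utility on a finite set of traces. This is a toy setting under stated assumptions, not the paper's theorem: the symbols $p_i$ (probability of trace $i$), $U_i$ (a fixed scalar score for trace $i$), and $\bar U$ (the mean score) are introduced here for illustration only.

```latex
\dot p_i = p_i\,\bigl(U_i - \bar U\bigr), \qquad \bar U = \sum_j p_j U_j
\quad\Longrightarrow\quad
\frac{d}{dt}\log\frac{p_i}{p_j} = U_i - U_j
\quad\Longrightarrow\quad
\frac{p_i(t)}{p_j(t)} = \frac{p_i(0)}{p_j(0)}\, e^{(U_i - U_j)\,t}.
```

In this toy flow, any trace that scores even slightly below the best one loses relative mass exponentially fast, so the distribution concentrates on a single top-scoring path even when several paths are correct.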

Community

Paper submitter

For those of you interested in RLVR, here is a paper that formally characterizes the mechanism behind "diversity collapse" in reasoning models trained with scalar-reward methods such as STaR, GRPO, and DPO.

The paper introduces a variational framework based on Shahshahani gradient flow to prove that optimizing solely for correctness inherently erodes the diversity of reasoning paths, leading to a "reasoning monoculture." To address this, they propose Distributional Creative Reasoning (DCR), which incorporates a diversity energy functional (using entropy and kernel-based novelty) into the objective, mathematically guaranteeing the maintenance of a diverse portfolio of successful reasoning strategies while still optimizing for utility.
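
The effect of adding an entropy term can be sketched numerically. The snippet below is a minimal toy, not the paper's DCR objective: it runs Euler steps of replicator dynamics for the assumed functional F(p) = E_p[U] + alpha * H(p) over four hypothetical reasoning traces, and it omits the kernel-based novelty term entirely. The utilities, step size, and the function names `entropy`, `replicator_step`, and `run` are all illustrative assumptions.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def replicator_step(p, utility, alpha, dt):
    """One Euler step of replicator dynamics for F(p) = E_p[U] + alpha * H(p)."""
    # Per-trace fitness: utility minus the entropy-gradient term.
    # (The constant from dH/dp_i cancels after mean subtraction.)
    g = [u - alpha * math.log(x) for u, x in zip(utility, p)]
    mean_g = sum(x * gi for x, gi in zip(p, g))
    new_p = [x * (1.0 + dt * (gi - mean_g)) for x, gi in zip(p, g)]
    # Clamp and renormalise to absorb floating-point drift.
    new_p = [max(x, 1e-300) for x in new_p]
    total = sum(new_p)
    return [x / total for x in new_p]


def run(utility, alpha, steps=20000, dt=0.01):
    """Integrate the flow from the uniform distribution."""
    n = len(utility)
    p = [1.0 / n] * n
    for _ in range(steps):
        p = replicator_step(p, utility, alpha, dt)
    return p


# Hypothetical scores: three correct traces of slightly different
# quality and one incorrect trace.
scores = [1.0, 0.9, 0.8, 0.0]

collapsed = run(scores, alpha=0.0)  # correctness only
diverse = run(scores, alpha=0.5)    # correctness plus entropy bonus

print("entropy without bonus:", entropy(collapsed))
print("entropy with bonus:   ", entropy(diverse))
```

With alpha = 0 the mass drifts onto the single highest-scoring trace, while for alpha > 0 the interior fixed point of this toy flow is the Gibbs distribution p_i ∝ exp(U_i / alpha), which keeps all correct traces in play while still favouring higher scores.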


arXiv lens breakdown of this paper 👉 https://arxivlens.com/PaperView/Details/the-reasoning-creativity-trade-off-toward-creativity-driven-problem-solving-5605-80c56a19

  • Executive Summary
  • Detailed Breakdown
  • Practical Applications
