DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models Paper • 2410.09344 • Published Oct 12, 2024 • 1
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs Paper • 2504.00993 • Published Apr 1, 2025 • 3
Token Hidden Reward: Steering Exploration-Exploitation in Group Relative Deep Reinforcement Learning Paper • 2510.03669 • Published Oct 4, 2025 • 2
When RAG Hurts: Diagnosing and Mitigating Attention Distraction in Retrieval-Augmented LVLMs Paper • 2602.00344 • Published 7 days ago • 3
On GRPO Collapse in Search-R1: The Lazy Likelihood-Displacement Death Spiral Paper • 2512.04220 • Published Dec 3, 2025 • 15
LLDS-Search Collection On GRPO Collapse in Search-R1: The Lazy Likelihood-Displacement Death Spiral • 12 items • Updated 21 days ago