wenlong deng

dwenlong

AI & ML interests

None yet

Recent Activity

authored a paper about 11 hours ago

DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models

authored a paper about 11 hours ago

MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

authored a paper about 12 hours ago

Token Hidden Reward: Steering Exploration-Exploitation in Group Relative Deep Reinforcement Learning

Organizations

authored 2 papers about 11 hours ago

DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models

Paper • 2410.09344 • Published Oct 12, 2024 • 1

MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

Paper • 2504.00993 • Published Apr 1, 2025 • 3

authored 2 papers about 12 hours ago

Token Hidden Reward: Steering Exploration-Exploitation in Group Relative Deep Reinforcement Learning

Paper • 2510.03669 • Published Oct 4, 2025 • 2

When RAG Hurts: Diagnosing and Mitigating Attention Distraction in Retrieval-Augmented LVLMs

Paper • 2602.00344 • Published 7 days ago • 3

upvoted 4 papers 1 day ago

DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models

Paper • 2410.09344 • Published Oct 12, 2024 • 1

MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs

Paper • 2504.00993 • Published Apr 1, 2025 • 3

Token Hidden Reward: Steering Exploration-Exploitation in Group Relative Deep Reinforcement Learning

Paper • 2510.03669 • Published Oct 4, 2025 • 2

When RAG Hurts: Diagnosing and Mitigating Attention Distraction in Retrieval-Augmented LVLMs

Paper • 2602.00344 • Published 7 days ago • 3

liked 2 models 5 days ago

mradermacher/LLDS-A-GRPO-Qwen2.5-7B-Base-i1-GGUF

8B • Updated 22 days ago • 7.15k • 2

SEGAgentRL/LLDS-A-GSPO-Qwen2.5-3B-Ins

Reinforcement Learning • 3B • Updated 23 days ago • 33 • 1

upvoted a paper 6 days ago

On GRPO Collapse in Search-R1: The Lazy Likelihood-Displacement Death Spiral

Paper • 2512.04220 • Published Dec 3, 2025 • 15

updated a collection 21 days ago

LLDS-Search

Collection

On GRPO Collapse in Search-R1: The Lazy Likelihood-Displacement Death Spiral • 12 items • Updated 21 days ago

updated a model 21 days ago

dwenlong/skewr-entropy

2B • Updated 21 days ago • 5

published a model 21 days ago

dwenlong/skewr-entropy

2B • Updated 21 days ago • 5

updated a model 21 days ago

dwenlong/skewr-entropy-05

2B • Updated 21 days ago • 10

published a model 21 days ago

dwenlong/skewr-entropy-05

2B • Updated 21 days ago • 10

updated a model 21 days ago

dwenlong/skewr-entropy-01

2B • Updated 21 days ago • 7

published a model 21 days ago

dwenlong/skewr-entropy-01

2B • Updated 21 days ago • 7

updated a collection 21 days ago

LLDS-Search

Collection

On GRPO Collapse in Search-R1: The Lazy Likelihood-Displacement Death Spiral • 12 items • Updated 21 days ago
