Yufeng Zhao

epsilondylan

AI & ML interests

LLM Reasoning

Recent Activity

upvoted a paper about 1 month ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published Nov 17, 2025 • 134

upvoted a paper 3 months ago

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 114

upvoted 3 papers 4 months ago

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 190

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Paper • 2509.09674 • Published Sep 11, 2025 • 80

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Paper • 2509.07894 • Published Sep 9, 2025 • 31

upvoted a collection 4 months ago

CompassVerifier

Collection

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward • 5 items • Updated Aug 31, 2025 • 7

updated a dataset 4 months ago

opencompass/ReasonZoo

Updated Aug 27, 2025 • 129

published a dataset 4 months ago

opencompass/ReasonZoo

Updated Aug 27, 2025 • 129

upvoted a paper 4 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21, 2025 • 259

authored a paper 4 months ago

Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis

Paper • 2508.15754 • Published Aug 21, 2025 • 4

upvoted a paper 4 months ago

Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis

Paper • 2508.15754 • Published Aug 21, 2025 • 4

upvoted a paper 5 months ago

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

Paper • 2508.03686 • Published Aug 5, 2025 • 37

upvoted 3 papers 6 months ago

upvoted a paper 7 months ago

Deciphering Trajectory-Aided LLM Reasoning: An Optimization Perspective

Paper • 2505.19815 • Published May 26, 2025 • 36
