Ximing Lu
Ximing
AI & ML interests
None yet
Recent Activity
submitted a paper 2 days ago
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
authored a paper 26 days ago
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization