arxiv:2601.22975
Jian Hu
chuyi777
AI & ML interests: Reinforcement Learning
Recent Activity
updated a dataset about 3 hours ago: OpenRLHF/aime-2024
updated a dataset about 3 hours ago: OpenRLHF/dapo-math-17k
authored a paper 2 days ago: Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text