1 16 4

zuijiang

AI & ML interests

None yet

Recent Activity

upvoted a paper about 17 hours ago

CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs

upvoted a paper 1 day ago

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

upvoted a paper 1 day ago

Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Papers 5

arxiv:2504.00502

arxiv:2503.18034

arxiv:2502.04675

arxiv:2502.02458

Models 1

zuijiang/llava-qwen1.5-14B-chat

Text Generation • 15B • Updated Jul 1, 2024

Datasets 3

zuijiang/alpaca-alpaca-clean

Viewer • Updated Aug 26, 2024 • 51.8k • 3

zuijiang/mistral-alpaca-clean

Viewer • Updated Aug 25, 2024 • 51.8k • 53

zuijiang/ocr_vqa

Viewer • Updated May 30, 2024 • 208k • 47