arxiv:2601.21244
YijuGuo
AI & ML interests
LLM Alignment
Recent Activity
upvoted a paper about 14 hours ago
Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation
upvoted a paper 4 days ago
AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research
authored a paper 9 days ago
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment