JacobHicks's Collections read later
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
Paper
• 2509.08721
• Published
• 662
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
Paper
• 2508.18106
• Published
• 349
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
Paper
• 2509.09372
• Published
• 246
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper
• 2509.02547
• Published
• 232
A Survey of Reinforcement Learning for Large Reasoning Models
Paper
• 2509.08827
• Published
• 190
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper
• 2509.03867
• Published
• 211
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper
• 2508.05748
• Published
• 141
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
Paper
• 2509.13313
• Published
• 80
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning
Paper
• 2509.13305
• Published
• 91
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
Paper
• 2509.22647
• Published
• 33
Scaling Agents via Continual Pre-training
Paper
• 2509.13310
• Published
• 117
PaddleOCR 3.0 Technical Report
Paper
• 2507.05595
• Published
• 22
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents
Paper
• 2509.06917
• Published
• 43
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper
• 2508.16279
• Published
• 53
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper
• 2403.13372
• Published
• 179