read later - a JacobHicks Collection

JacobHicks 's Collections

read later

updated Oct 2, 2025

Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Paper • 2509.08721 • Published Sep 10, 2025 • 662
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

Paper • 2508.18106 • Published Aug 25, 2025 • 349
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model

Paper • 2509.09372 • Published Sep 11, 2025 • 246
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 232
A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 190
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4, 2025 • 211
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7, 2025 • 141
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization

Paper • 2509.13313 • Published Sep 16, 2025 • 80
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning

Paper • 2509.13305 • Published Sep 16, 2025 • 91
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning

Paper • 2509.22647 • Published Sep 26, 2025 • 33
Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16, 2025 • 117
PaddleOCR 3.0 Technical Report

Paper • 2507.05595 • Published Jul 8, 2025 • 22
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents

Paper • 2509.06917 • Published Sep 8, 2025 • 43
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22, 2025 • 53
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20, 2024 • 179