Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents Paper • 2509.06917 • Published Sep 8, 2025 • 43
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework Paper • 2308.08155 • Published Aug 16, 2023 • 11
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling Paper • 2401.16380 • Published Jan 29, 2024 • 51
The Smol Training Playbook 📚: The secrets to building world-class LLMs • 2.98k
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models Paper • 2402.14207 • Published Feb 22, 2024 • 10
VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models Paper • 2509.19803 • Published Sep 24, 2025 • 120
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents Paper • 2503.01935 • Published Mar 3, 2025 • 30
You Have Thirteen Hours in Which to Solve the Labyrinth: Enhancing AI Game Masters with Function Calling Paper • 2409.06949 • Published Sep 11, 2024 • 1
Instruction-Driven Game Engine: A Poker Case Study Paper • 2410.13441 • Published Oct 17, 2024 • 2
Generative Agents: Interactive Simulacra of Human Behavior Paper • 2304.03442 • Published Apr 7, 2023 • 14
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents Paper • 2407.18901 • Published Jul 26, 2024 • 35
WebGames: Challenging General-Purpose Web-Browsing AI Agents Paper • 2502.18356 • Published Feb 25, 2025 • 14
SPIN-Bench: How Well Do LLMs Plan Strategically and Reason Socially? Paper • 2503.12349 • Published Mar 16, 2025 • 44