Agent Eval - an alexngai Collection

Agent Eval

updated about 12 hours ago

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 95

OAgents: An Empirical Study of Building Effective Agents

Paper • 2506.15741 • Published Jun 17 • 35