DramaBench: A Six-Dimensional Evaluation Framework for Drama Script Continuation Paper • 2512.19012 • Published 10 days ago • 16
Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows Paper • 2512.13168 • Published 17 days ago • 49
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published 28 days ago • 167
DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation Paper • 2511.23127 • Published Nov 28, 2025 • 43
PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published Oct 20, 2025 • 62
TrajSelector: Harnessing Latent Representations for Efficient and Effective Best-of-N in Large Reasoning Model Paper • 2510.16449 • Published Oct 18, 2025 • 34
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs Paper • 2509.22220 • Published Sep 26, 2025 • 65
SWE-QA: Can Language Models Answer Repository-level Code Questions? Paper • 2509.14635 • Published Sep 18, 2025 • 34
RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation Paper • 2509.16198 • Published Sep 19, 2025 • 126
MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML Paper • 2509.06806 • Published Sep 8, 2025 • 63
Sherlock: Self-Correcting Reasoning in Vision-Language Models Paper • 2505.22651 • Published May 28, 2025 • 48
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data Paper • 2505.18445 • Published May 24, 2025 • 63
PixelHacker: Image Inpainting with Structural and Semantic Consistency Paper • 2504.20438 • Published Apr 29, 2025 • 43
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 191
Atla Selene Mini: A General Purpose Evaluation Model Paper • 2501.17195 • Published Jan 27, 2025 • 35
MatAnyone: Stable Video Matting with Consistent Memory Propagation Paper • 2501.14677 • Published Jan 24, 2025 • 34
ACECODER: Acing Coder RL via Automated Test-Case Synthesis Paper • 2502.01718 • Published Feb 3, 2025 • 28
TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation Paper • 2502.07870 • Published Feb 11, 2025 • 45