DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories Paper • 2602.10809 • Published 13 days ago • 51
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training Paper • 2508.00414 • Published Aug 1, 2025 • 94
RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents Paper • 2507.03112 • Published Jul 3, 2025 • 33 • 2
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers Paper • 2506.07986 • Published Jun 9, 2025 • 19
SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning Paper • 2505.19099 • Published May 25, 2025 • 7 • 3
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models Paper • 2505.02847 • Published May 1, 2025 • 29
Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting Paper • 2012.04529 • Published Dec 8, 2020
Efficient Crowd Counting via Structured Knowledge Transfer Paper • 2003.10120 • Published Mar 23, 2020
Affordances-Oriented Planning using Foundation Models for Continuous Vision-Language Navigation Paper • 2407.05890 • Published Jul 8, 2024
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning Paper • 2504.19162 • Published Apr 27, 2025 • 18 • 2
S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning Paper • 2502.12853 • Published Feb 18, 2025 • 29
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression Paper • 2212.02746 • Published Dec 6, 2022