DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper • 2510.16872 • Published Oct 19 • 104 • 4
Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model Paper • 2506.13642 • Published Jun 16 • 26 • 2
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published Jan 7 • 52 • 4