kotoba-tech/kotoba-whisper-v2.0 Automatic Speech Recognition • 0.8B • Updated Oct 23, 2024 • 17.4k • 88
facebook/dinov3-convnext-small-pretrain-lvd1689m Image Feature Extraction • 49.5M • Updated Aug 19, 2025 • 34.8k • 22
EasyControl Ghibli 🦀 Featured • Configuration error • 1.45k • New Ghibli EasyControl model is now released!!
Chat with Kimi-VL-A3B-Thinking-2506 🤔 Featured • Running on Zero • 194 • Chat with Kimi-VL: respond to text, images, video, PDFs
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 205
AppAgentX: Evolving GUI Agents as Proficient Smartphone Users Paper • 2503.02268 • Published Mar 4, 2025 • 11
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated Dec 10, 2025 • 334k • 1.58k