-
microsoft/OmniParser
Image-Text-to-Text • Updated • 333 • 1.71k -
Maya: An Instruction Finetuned Multilingual Multimodal Model
Paper • 2412.07112 • Published • 28 -
Visual Representation Alignment for Multimodal Large Language Models
Paper • 2509.07979 • Published • 84 -
ModernVBERT: Towards Smaller Visual Document Retrievers
Paper • 2510.01149 • Published • 32
Frank Sommers PRO
fsommers
AI & ML interests
None yet
Recent Activity
upvoted an article 12 days ago
Nemotron ColEmbed V2: Raising the Bar for Multimodal Retrieval with ViDoRe V3’s Top Model upvoted an article 21 days ago
Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders liked
a model 22 days ago
zai-org/GLM-OCR