GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation Paper • 2512.17495 • Published 10 days ago • 19
Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers Paper • 2512.16615 • Published 10 days ago • 4
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27 • 215
Taming Generative Synthetic Data for X-ray Prohibited Item Detection Paper • 2511.15299 • Published Nov 19 • 2