Hongxu Yin

yinhongxu

Recent Activity

authored 12 papers 1 day ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 93

Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10 • 159

NaVILA: Legged Robot Vision-Language-Action Model for Navigation

Paper • 2412.04453 • Published Dec 5, 2024

EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos

Paper • 2507.12440 • Published Jul 16

3D Aware Region Prompted Vision Language Model

Paper • 2509.13317 • Published Sep 16 • 14

Test-Time Scaling Strategies for Generative Retrieval in Multimodal Conversational Recommendations

Paper • 2508.18132 • Published Aug 25

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published Oct 13 • 176

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17 • 89

DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning

Paper • 2510.15110 • Published Oct 16 • 15

SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models

Paper • 2406.01584 • Published Jun 3, 2024

WorldModelBench: Judging Video Generation Models As World Models

Paper • 2502.20694 • Published Feb 28

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published 14 days ago • 100

upvoted a paper 1 day ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published 14 days ago • 100

upvoted a paper about 2 months ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17 • 89

authored 6 papers 9 months ago

FasterViT: Fast Vision Transformers with Hierarchical Attention

Paper • 2306.06189 • Published Jun 9, 2023 • 31

Adaptive Sharpness-Aware Pruning for Robust Sparse Networks

Paper • 2306.14306 • Published Jun 25, 2023

Global Vision Transformer Pruning with Hessian-Aware Saliency

Paper • 2110.04869 • Published Oct 10, 2021

DoRA: Weight-Decomposed Low-Rank Adaptation

Paper • 2402.09353 • Published Feb 14, 2024 • 30

RegionGPT: Towards Region Understanding Vision Language Model

Paper • 2403.02330 • Published Mar 4, 2024 • 2

Global Context Vision Transformers

Paper • 2206.09959 • Published Jun 20, 2022