Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published 4 days ago • 173
PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning Paper • 2601.05593 • Published Jan 9 • 84
Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix Nov 3, 2025 • 58
MobileLLM-R1 Collection MobileLLM-R1, a series of sub-billion-parameter reasoning models • 10 items • Updated Nov 21, 2025 • 27
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25, 2025 • 213
Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 11 days ago • 96
NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale Paper • 2508.10711 • Published Aug 14, 2025 • 145
SmolDocling datasets Collection Datasets used to train SmolDocling • 6 items • Updated Jul 31, 2025 • 31
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving Paper • 2507.23726 • Published Jul 31, 2025 • 115
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published Jul 30, 2025 • 70
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning Paper • 2507.16812 • Published Jul 22, 2025 • 63
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional at agentic intelligence • 5 items • Updated 19 days ago • 172
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16, 2025 • 273