Models

Active filters: text-generation-inference

nvidia/Nemotron-Orchestrator-8B

Text Generation • 8B • Updated 6 days ago • 2.83k • 371

microsoft/Fara-7B

Image-Text-to-Text • 8B • Updated 7 days ago • 31.5k • 425

meta-llama/Llama-3.1-8B-Instruct

Text Generation • 8B • Updated Sep 25, 2024 • 5.19M • 5.09k

open-thoughts/OpenThinker-Agent-v1

Text Generation • 8B • Updated 1 day ago • 95 • 38

maya-research/maya1

Text-to-Speech • 3B • Updated 26 days ago • 68.4k • 805

aquif-ai/aquif-3.5-Max-1205

Text Generation • 42B • Updated 1 day ago • 127 • 23

google/gemma-3-4b-it

Image-Text-to-Text • 4B • Updated Mar 21 • 1.03M • 1.02k

Qwen/Qwen3-4B-Instruct-2507

Text Generation • 4B • Updated Sep 17 • 6.21M • 528

Qwen/Qwen3-0.6B

Text Generation • 0.8B • Updated Jul 26 • 7.59M • 849

Qwen/Qwen2.5-7B-Instruct

Text Generation • 8B • Updated Jan 12 • 7.25M • 932

Qwen/Qwen3-8B

Text Generation • 8B • Updated Jul 26 • 4.76M • 792

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Apr 6 • 3.34M • 1.38k

Qwen/Qwen3-Embedding-8B

Feature Extraction • 8B • Updated Jul 7 • 830k • 475

meta-llama/Llama-3.1-8B

Text Generation • 8B • Updated Oct 16, 2024 • 731k • 1.96k

meta-llama/Meta-Llama-3-8B-Instruct

Text Generation • 8B • Updated Jun 18 • 1.23M • 4.31k

meta-llama/Llama-3.2-3B-Instruct

Text Generation • 3B • Updated Oct 24, 2024 • 1.72M • 1.86k

deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27 • 1.21M • 12.9k

google/gemma-3-27b-it

Image-Text-to-Text • 27B • Updated Mar 21 • 1.42M • 1.73k

TinyLlama/TinyLlama-1.1B-Chat-v1.0

Text Generation • 1B • Updated Mar 17, 2024 • 1.85M • 1.47k

meta-llama/Llama-3.3-70B-Instruct

Text Generation • 71B • Updated Dec 21, 2024 • 412k • 2.59k

nvidia/Cosmos-Reason1-7B

Image-Text-to-Text • 8B • Updated Aug 14 • 108k • 217

dphn/Dolphin-Mistral-24B-Venice-Edition

Text Generation • 24B • Updated Sep 8 • 9.67k • 324

Genius-Society/hoyoMusic

Updated Oct 30 • 28

google/gemma-2-2b-it

Text Generation • 3B • Updated Aug 27, 2024 • 759k • 1.24k

Qwen/Qwen3-1.7B

Text Generation • 2B • Updated Jul 26 • 4.14M • 345

Qwen/Qwen3-Embedding-0.6B

Feature Extraction • 0.6B • Updated Jun 20 • 4.44M • 767

WeiboAI/VibeThinker-1.5B

Text Generation • 2B • Updated 14 days ago • 27.9k • 498

thu-pacman/PCMind-2.1-Kaiyuan-2B

Text Generation • 2B • Updated 18 minutes ago • 185 • 9

Qwen/Qwen2.5-0.5B-Instruct

Text Generation • 0.5B • Updated Sep 25, 2024 • 2.14M • 403

meta-llama/Llama-3.2-1B

Text Generation • 1B • Updated Oct 24, 2024 • 3.16M • 2.21k