On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models Paper • 2512.07783 • Published 1 day ago • 22
Quark Quantized PTPC FP8 Models Collection PTPC models quantized by Quark • 7 items • Updated about 3 hours ago
Instella ✨ Collection Announcing Instella, a series of 3-billion-parameter language models developed by AMD, trained from scratch on 128 Instinct MI300X GPUs. • 13 items • Updated 5 days ago • 10
Instella: Fully Open Language Models with Stellar Performance Paper • 2511.10628 • Published 27 days ago • 4 • 2