T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground Paper • 2512.10430 • Published 25 days ago • 113
Enhancing Vision-Language Model Training with Reinforcement Learning in Synthetic Worlds for Real-World Success Paper • 2508.04280 • Published Aug 6, 2025 • 35
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space Paper • 2504.13835 • Published Apr 18, 2025 • 38
CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom Paper • 2503.01836 • Published Mar 3, 2025 • 16
System Message Generation for User Preferences using Open-Source Models Paper • 2502.11330 • Published Feb 17, 2025 • 15
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models Paper • 2502.03032 • Published Feb 5, 2025 • 60
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published Feb 3, 2025 • 113
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20, 2024 • 95
Vikhr: The Family of Open-Source Instruction-Tuned Large Language Models for Russian Paper • 2405.13929 • Published May 22, 2024 • 54