Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs Paper • 2506.17080 • Published Jun 20, 2025 • 7
view article Article Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers? Apr 4, 2025 • 16
view article Article Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) Jan 19, 2025 • 41