Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular • 14 days ago • 90
TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior Paper • 2512.20757 • Published 8 days ago • 16
EasyV2V: A High-quality Instruction-based Video Editing Framework Paper • 2512.16920 • Published 13 days ago • 17
supertoken Collection The initial checkpoints for the token comparison research. • 20 items • Updated May 22 • 2
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5 • 59