arxiv:2505.14045
Yingli Shen
ylshen
AI & ML interests
Postdoctoral Researcher @ THUNLP, Tsinghua University.
Researching Multilingual Large Language Models.
Recent Activity
updated a dataset 3 days ago
openbmb/DCAD-2000
authored a paper about 2 months ago
DCAD-2000: A Multilingual Dataset across 2000+ Languages with Data Cleaning as Anomaly Detection
authored a paper about 2 months ago
From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora