Project: Turkish Embeddings from Scratch and CPT Decoders
Infrastructure: MareNostrum 5 (BSC)
AI & ML interests
Where data finds its mind
Papers
Parrot: Persuasion and Agreement Robustness Rating of Output Truth -- A Sycophancy Robustness Benchmark for LLMs
TurkColBERT: A Benchmark of Dense and Late-Interaction Models for Turkish Information Retrieval
Funding: EuroHPC JU Benchmark Access Grant No. EHPC-BEN-2024B11-003
Infrastructure: IT4Innovations National Supercomputing Center (Karolina)
A Large-Scale Benchmark for Legal Text Embeddings
FP8 rowwise and BF16 tensorwise models, with recipes optimized for large-scale training efficiency and convergence stability.
- newmindai/Llama-3.1-8B-Instruct-w16a16-tw
  Text Generation • 8B • Updated • 2
- newmindai/Llama-3.1-8B-Instruct-w16a8-rw
  Text Generation • 8B • Updated • 1
- newmindai/Llama-3.1-8B-Instruct_w16a8_rw_with_gw_hp
  Text Generation • 8B • Updated • 14
- newmindai/Llama-3.1-8B-Instruct-w16a8-mxtw
  Text Generation • 8B • Updated • 39
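The collection description contrasts rowwise and tensorwise variants. As a general illustration of what those two scaling granularities mean, the sketch below computes quantization scales both ways for a small matrix. It is not the recipe used for the models listed here: the function names and sample values are hypothetical, and it assumes the FP8 E4M3 format, whose largest finite value is 448.

```python
# Illustrative sketch only: function names and sample data are hypothetical
# and do not reproduce the training recipe of the listed models.
FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def tensorwise_scales(matrix):
    """Return one shared scale, repeated per row, from the global abs max."""
    amax = max(abs(x) for row in matrix for x in row)
    scale = FP8_E4M3_MAX / amax if amax > 0 else 1.0
    return [scale] * len(matrix)


def rowwise_scales(matrix):
    """Return one scale per row, each from that row's own abs max."""
    scales = []
    for row in matrix:
        amax = max(abs(x) for x in row)
        scales.append(FP8_E4M3_MAX / amax if amax > 0 else 1.0)
    return scales


# A small-magnitude row next to a large-magnitude row.
activations = [
    [0.5, -1.0, 0.25],
    [100.0, -50.0, 25.0],
]

tw = tensorwise_scales(activations)
rw = rowwise_scales(activations)
```

With a single tensorwise scale, the large row dictates the scale for every row, so the small row uses only a narrow slice of the representable range. Rowwise scaling gives the small row its own, larger scale, at the cost of storing one scale per row.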
- newmindai/ModernBERT-tr-uncased-stsb-HD
  Token Classification • 0.1B • Updated • 6 • 1
- newmindai/TurkEmbed4STS-HD
  Token Classification • 0.3B • Updated • 20 • 1
- newmindai/lettucedect-210m-eurobert-tr-v1
  Token Classification • 0.2B • Updated • 33 • 1
- Turk-LettuceDetect: A Hallucination Detection Models for Turkish RAG Applications
  Paper • 2509.17671 • Published • 10