metadata
license: mit
tags:
  - generated_from_trainer
model-index:
  - name: afroxlmr-social
    results: []
language:
  - am
  - so
  - sw
  - zu
  - ha
  - yo
  - ti
  - ts
  - om
  - tw
  - pcm

AfroXLMR-Social

AfroXLMR-Social is a language model specialized for the social domain (social media text), covering 19 African languages.
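
A minimal loading sketch, assuming the checkpoint is published as a masked-language model and that the `transformers` library is installed. The repository id `Tadesse/AfroXLMR-Social` is an assumption inferred from the namespace of the dataset linked in this card, and `load_fill_mask` and `mask_last_word` are hypothetical helper names introduced only for illustration.

```python
# Assumption: this repository id is inferred from the dataset's namespace
# and is not confirmed by this card. Adjust it to the actual model path.
MODEL_ID = "Tadesse/AfroXLMR-Social"


def load_fill_mask(model_id: str = MODEL_ID):
    """Build a fill-mask pipeline; the checkpoint is downloaded on first call."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def mask_last_word(text: str, mask_token: str) -> str:
    """Replace the final whitespace-separated word of `text` with the mask token."""
    head, _, _ = text.rpartition(" ")
    return f"{head} {mask_token}" if head else mask_token
```

To try it, call `fill = load_fill_mask()` and then `fill(mask_last_word("Habari ya asubuhi", fill.tokenizer.mask_token))`. Reading the mask token from the tokenizer avoids hard-coding it. If the checkpoint turns out not to expose a masked-language-modeling head, load it with the task class that matches your fine-tuning setup instead.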

Pre-training corpus

The AfriSocial corpus is available at: https://huggingface.co/datasets/Tadesse/AfriSocial

Languages

  • Amharic (amh)
  • Somali (som)
  • Afrikaans (afr)
  • Hausa (hau)
  • Igbo (ibo)
  • Yorùbá (yor)
  • Kinyarwanda (kin)
  • Tigrinya (tir)
  • Oromo (orm)
  • Twi (twi)
  • Nigerian Pidgin (pcm)
  • Algerian Arabic (arq)
  • Moroccan Arabic (ary)
  • Mozambican Portuguese (ptMZ)
  • Swahili (swa)
  • Makhuwa (vmw)
  • Xitsonga (tso)
  • Xhosa (xho)
  • Zulu (zul)
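
The language list above can be kept as a small lookup table, for example to check whether an input's language is covered before running the model. This is only a sketch: the names `AFRISOCIAL_LANGUAGES` and `is_covered` are hypothetical, and the codes are copied as written in this card (`ptMZ` is the card's own label, not a standard three-letter code).

```python
# Codes and names copied from the language list in this card.
AFRISOCIAL_LANGUAGES = {
    "amh": "Amharic",
    "som": "Somali",
    "afr": "Afrikaans",
    "hau": "Hausa",
    "ibo": "Igbo",
    "yor": "Yorùbá",
    "kin": "Kinyarwanda",
    "tir": "Tigrinya",
    "orm": "Oromo",
    "twi": "Twi",
    "pcm": "Nigerian Pidgin",
    "arq": "Algerian Arabic",
    "ary": "Moroccan Arabic",
    "ptMZ": "Mozambican Portuguese",  # card-specific label, kept as written
    "swa": "Swahili",
    "vmw": "Makhuwa",
    "tso": "Xitsonga",
    "xho": "Xhosa",
    "zul": "Zulu",
}


def is_covered(code: str) -> bool:
    """Return True if `code` appears in the card's language list (case-sensitive)."""
    return code in AFRISOCIAL_LANGUAGES
```

For instance, `is_covered("amh")` is true, while `is_covered("eng")` is false because English is not in the list.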

Acknowledgment

BibTeX entry and citation info.

@misc{belay2025afroxlmrsocialadaptingpretrainedlanguage,
      title={AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text}, 
      author={Tadesse Destaw Belay and Israel Abebe Azime and Ibrahim Said Ahmad and David Ifeoluwa Adelani and Idris Abdulmumin and Abinew Ali Ayele and Shamsuddeen Hassan Muhammad and Seid Muhie Yimam},
      year={2025},
      eprint={2503.18247},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.18247}, 
}