We’re excited to share our pre-trained ALBERT models for Arabic!
Trained on a corpus of ~4.4B words, these models are the first for Arabic to use SentencePiece tokenization.
Thanks to Google for providing the TPUs used for training, and to Hugging Face for hosting the models.
Read the blog post for details.