Production-ready project (October 2022 - Ongoing)
An Open-Source Massively Multilingual Large Language Model
LOLA is a massively multilingual large language model trained on more than 160 languages using a sparse Mixture-of-Experts Transformer architecture. Our architecture addresses the challenge of harnessing linguistic diversity efficiently while avoiding the common pitfalls of multilinguality. LOLA demonstrates competitive performance in natural language generation and understanding tasks. Its expert-routing mechanism leverages implicit phylogenetic linguistic patterns, potentially alleviating the curse of multilinguality. As an open-source model, LOLA promotes reproducibility and serves as a robust foundation for future research. Our findings pave the way for compute-efficient multilingual models with scalable, strong performance across languages.
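As an illustration of the sparse Mixture-of-Experts idea described above, below is a minimal sketch of a top-k routed expert layer in PyTorch. The dimensions, expert count, routing depth, and class names are assumptions chosen for readability; they do not reproduce LOLA's actual architecture or training setup.

```python
# Minimal sketch of top-k expert routing in a sparse Mixture-of-Experts layer.
# Illustrative only: sizes, expert count and top_k are assumptions, not LOLA's
# actual configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=1):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)   # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                               # x: (tokens, d_model)
        gate_logits = self.router(x)
        # Each token is routed to its top_k experts, weighted by the gate.
        weights, chosen = torch.topk(F.softmax(gate_logits, dim=-1), self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, k] == e                # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out
```

Because only the selected experts run for each token, the layer adds parameters without a proportional increase in per-token compute, which is the property the project relies on for compute-efficient multilinguality.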
To learn more about LOLA, read our preprint: https://arxiv.org/abs/2409.11272
HuggingFace: hf.co/dice-research/lola_v1
GitHub: github.com/dice-group/LOLA
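Below is a minimal sketch of loading the checkpoint referenced above with the Hugging Face transformers library and generating text. The use of trust_remote_code and the generation settings are assumptions made for this example; the model card on the Hub is the authoritative reference.

```python
# Minimal sketch of loading LOLA from the Hugging Face Hub for text generation.
# trust_remote_code and the generation parameters are assumptions; consult the
# model card at hf.co/dice-research/lola_v1 for authoritative usage.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dice-research/lola_v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "The most widely spoken languages in the world are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```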