How to Train Your Large Language Model (HTYLLM)

Project Group Master

The DICE Group has been actively involved in the development and application of Large Language Models (LLMs) across various fields. Following the successful publication of our first massively multilingual LLM, LOLA, we are now aiming to scale our research to cover even more languages and modalities. With this goal in mind, we offer students a unique opportunity to collaborate on developing the next generation of multilingual and multimodal language models. This project will not only push the boundaries of current LLM capabilities but also provide hands-on experience in cutting-edge Natural Language Processing (NLP) and Machine Learning (ML) techniques.

Project Goal

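Our project group aims to train a large, open-source multilingual language model and address the challenges posed by the curse of multilinguality, that is, the tendency of per-language performance to degrade as more languages share a fixed model capacity. Specifically, our goals include: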

  • Support 500+ Languages: Ensure the model can handle a wide range of languages from different language families.
  • Ensure Computational Efficiency: Optimize the model to run efficiently by exploring sparse architectures.
  • Enable Multimodal Capabilities: Integrate support for multiple modalities such as text, images, and audio.
  • Maintain Linguistic Extensibility: Design the model to be easily adaptable to new languages and linguistic features.

For more information, check out the slides: HTYLLM_PG_SoSe_25.pdf.

FAQs

Q: What is the selection process for this project?
A: Candidates will need to submit an assignment and undergo an interview as part of the selection process.

Q: Is there a seminar connected to this PG?
A: No.

Q: What are the prerequisites for this PG?
A: The ideal candidate should possess foundational knowledge in NLP and ML, along with strong programming skills in Python and shell scripting. Additionally, proficiency in Linux is essential. The ability to learn quickly and adapt to new technologies and methodologies is also critical, as the PG domain is expected to have a steep learning curve.

In case you have further questions, feel free to contact Nikit Srivastava.

Course in PAUL

L.079.07063 Project Group: How to Train Your Large Language Model (in English)