Welcome to the official GitHub repository for the Arctic Large Language Models (LLMs) Workshop hosted by the Bio-AI Lab on the 27th and 28th of October, 2023. This repository serves as a comprehensive guide and resource center for all attendees and those interested in the world of LLMs.
This two-day workshop combined hands-on sessions with lectures aimed at unraveling the complexities of LLMs. Our approach was immersive and practical, so participants both gained knowledge and put it to use right away. The workshop covered the following topics:
- Data Mastery:
  - Detailed exploration of data preparation for LLMs.
  - Hands-on language translation challenges for low-resource languages, including Norwegian (Bokmål and Nynorsk).
- Ultra-Low Resource Languages:
  - Insightful discussions on handling languages like Sami with its 8 varieties.
- Building Foundations:
  - Foundational model training from the ground up, covering requirements and intricacies.
- Diverse Tasks:
  - Diverse data preparation strategies for LLM training across multiple tasks.
- Fine-Tuning Simplified:
  - A straightforward approach to LLM fine-tuning.
- Cutting-Edge Adaptation:
  - Introduction to LoRA-based adaptation for parameter-efficient fine-tuning.
- Innovation in Action:
  - Real-world LLM application development using the OpenAI API, LangChain, and Hugging Face resources.
- Privacy Matters:
  - Presentation of PrivateGPT++ for private LLM-based application development, supported by our university's IT division.
- Supercharged Training:
  - Strategies for accelerating foundational model training using distributed servers with Nvidia and AMD GPUs.
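For readers new to the LoRA topic listed above, the following is a minimal, illustrative sketch of the idea behind parameter-efficient fine-tuning. It is not the workshop's code: the function names, matrix sizes, and the `alpha` scaling value are assumptions chosen for illustration. It shows the standard LoRA formulation, where a frozen weight matrix `W` is adapted by adding a scaled low-rank product `B @ A`, and compares the number of trainable parameters with full fine-tuning.

```python
# Illustrative sketch only; all names and sizes here are hypothetical.
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the weight used after LoRA adaptation."""
    delta = matmul(b, a)  # (d x r) @ (r x k) -> (d x k)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d, k, r):
    """Compare parameter counts: full fine-tuning (d*k) vs. LoRA (r*(d+k))."""
    return {"full": d * k, "lora": r * (d + k)}


d, k, r, alpha = 6, 4, 2, 4  # tiny sizes so the example runs instantly
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(d)]     # frozen pretrained weight
A = [[rng.gauss(0, 0.01) for _ in range(k)] for _ in range(r)]  # trainable, small random init
B = [[0.0] * r for _ in range(d)]                               # trainable, zero init

# With B initialised to zero, the adapted weight equals the pretrained weight,
# so training starts from the original model's behaviour.
assert lora_effective_weight(W, A, B, alpha, r) == W

print(trainable_params(4096, 4096, 8))  # {'full': 16777216, 'lora': 65536}
```

For a 4096 x 4096 layer with rank 8, this sketch gives 256 times fewer trainable parameters than updating the full matrix. In practice a library such as Hugging Face PEFT would handle the adapter wiring; the pure-Python version here is only meant to make the arithmetic visible.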
The repository is structured as follows:
- /lectures: A dedicated webpage contains all the slides used during the lectures: bioailab.org/arcticllmworkshop2023.
- /hands-on: Includes the code and datasets used in the hands-on sessions.
- /privateGPT++: Dedicated resources for setting up and using PrivateGPT++.
To get started, clone this repository using:
git clone https://github.com/Bio-AI-Lab/Arctic-LLM-Workshop.git
Then navigate to the respective directories to access content related to lectures, hands-on sessions, and more.
We welcome contributions from participants and the community. If you have any suggestions or improvements, please feel free to fork the repository, make your changes, and submit a pull request.
For any queries or technical support, please open an issue on this repository, and one of our lab members will get back to you.
This repository is licensed under the MIT License.
We are excited to share these resources and hope they empower you to continue learning and innovating in the field of LLMs. Happy learning!