
LLM-on-Kunpeng920

LLM-on-Kunpeng920 is a comprehensive guide and toolkit for deploying and optimizing Large Language Models (LLMs) on the Huawei Kunpeng 920 platform. This project aims to provide developers and researchers with the necessary tools and knowledge to effectively utilize the Kunpeng 920's ARM-based architecture for LLM inference.

Table of Contents

  1. Introduction
  2. Features
  3. Getting Started
  4. Documentation
  5. Contributing
  6. License
  7. Acknowledgements
  8. Citation

Introduction

The Huawei Kunpeng 920 is a high-performance ARM-based server processor designed for data center workloads. This toolkit provides optimized methods for deploying various LLMs, including but not limited to ChatGLM, Baichuan, and Qwen, on this platform. Our goal is to maximize LLM inference performance while maintaining model accuracy.

Demo

Features

  • Detailed guides for system environment preparation
  • Instructions for model conversion and quantization
  • Optimized compilation procedures for various inference engines
  • Deployment strategies and best practices
  • Inference optimization techniques specific to Kunpeng 920
  • Performance monitoring and troubleshooting guides
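As an illustration of the compilation point above, the Kunpeng 920's TaiShan v110 core can be targeted directly when building an inference engine with GCC 9 or later via `-mcpu=tsv110`. The sketch below is a hedged example, not this repo's exact build script; it detects the host architecture and prints (rather than runs) a plausible build command:

```shell
# Pick compiler flags for the host CPU. -mcpu=tsv110 targets the
# Kunpeng 920's TaiShan v110 core (supported in GCC >= 9); the
# -march=native fallback keeps the sketch runnable on other machines.
ARCH=$(uname -m)
THREADS=$(nproc)

if [ "$ARCH" = "aarch64" ]; then
    CFLAGS="-O3 -mcpu=tsv110"
else
    CFLAGS="-O3 -march=native"
fi

# Dry run: print the build command instead of executing it here.
echo "make -j${THREADS} CFLAGS=\"${CFLAGS}\""
```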

Getting Started

To get started with LLM-on-Kunpeng920, follow these steps:

  1. Clone the repository:

    git clone https://github.com/pariskang/LLM-on-Kunpeng920.git
    cd LLM-on-Kunpeng920
    
  2. Follow the guides in the docs folder for detailed instructions on each step of the process.
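Before following the guides, a quick sanity check of the host can save time (a minimal sketch; on a Kunpeng 920 server, `uname -m` should report `aarch64`):

```shell
# Print the basics the guides assume: architecture, core count, compiler.
echo "Architecture : $(uname -m)"
echo "CPU cores    : $(nproc)"
echo "GCC version  : $(gcc -dumpversion 2>/dev/null || echo 'not found')"
```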

Documentation

Our documentation is divided into several key sections:

  1. System Environment Preparation
  2. Necessary Dependencies and Tools
  3. Setting Up Model Repositories
  4. Model Conversion and Quantization
  5. Inference
  6. Demo and Ngrok

Each document provides step-by-step instructions and best practices for its respective topic.
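To give a flavor of the conversion and quantization step, the pipeline often looks like the following when using llama.cpp's tooling (the choice of engine, the model paths, and the script names are assumptions for this sketch, which only prints the commands rather than running them):

```shell
MODEL_DIR=./models/chatglm3-6b          # hypothetical local model checkout
F16=./models/chatglm3-6b.f16.gguf       # full-precision intermediate
Q4=./models/chatglm3-6b.q4_0.gguf       # 4-bit quantized output

# Dry run: echo each step instead of executing it (the conversion script
# and quantize binary ship with llama.cpp; names vary between versions).
for cmd in \
    "python convert_hf_to_gguf.py ${MODEL_DIR} --outfile ${F16}" \
    "./llama-quantize ${F16} ${Q4} q4_0"
do
    echo "+ ${cmd}"
done
```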

Contributing

We welcome contributions to LLM-on-Kunpeng920! If you have suggestions for improvements or encounter any issues, please feel free to open an issue or submit a pull request.

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Acknowledgements

  • Huawei for the Kunpeng 920 platform
  • The developers of ChatGLM, Baichuan, and Qwen for their excellent LLM implementations
  • Peng Cheng Laboratory for providing resources and support
  • All contributors and users of this toolkit

Citation

If you use LLM-on-Kunpeng920 in your research or project, please cite it as follows:

@misc{LLM-on-Kunpeng920,
  author = {Yanlan Kang},
  title = {LLM-on-Kunpeng920: Optimizing Large Language Models on Huawei Kunpeng 920},
  year = {2024},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/pariskang/LLM-on-Kunpeng920}}
}
