Tested on a MacBook Pro M1 Max with PyTorch nightly.
This repository is intended as a minimal, hackable, and readable example to load LLaMA ([arXiv](https://arxiv.org/abs/2302.13971)) models and run inference. To download the checkpoints and tokenizer, fill out this Google form.
In a conda env with PyTorch / CUDA available, run:

```
pip install -r requirements.txt
```
Then, in this repository:

```
pip install -e .
```
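As a quick sanity check that the editable install worked (assuming the package is exposed under the name `llama`, as in the upstream repository):

```
python -c "import llama; print('ok')"
```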
If you are running on the MPS backend, set these environment variables to disable the MPS memory limit and enable CPU fallback for operations not yet supported on MPS:

```
export PYTORCH_ENABLE_MPS_FALLBACK=1
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0
```
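Before running, you can confirm that your PyTorch build actually sees the MPS device, using the standard `torch.backends.mps` API:

```
python -c "import torch; print(torch.backends.mps.is_available(), torch.backends.mps.is_built())"
```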
Then run the example:

```
torchrun --nproc_per_node 1 example.py --ckpt_dir $TARGET_FOLDER/model_size --tokenizer_path $TARGET_FOLDER/tokenizer.model
```
Different models require different model-parallel (MP) values:
| Model | MP |
|-------|----|
| 7B    | 1  |
| 13B   | 2  |
| 33B   | 4  |
| 65B   | 8  |
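For example, to run a larger checkpoint, set `--nproc_per_node` to the matching MP value (a sketch, assuming the 13B weights were downloaded to `$TARGET_FOLDER/13B`):

```
torchrun --nproc_per_node 2 example.py --ckpt_dir $TARGET_FOLDER/13B --tokenizer_path $TARGET_FOLDER/tokenizer.model
```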
See MODEL_CARD.md for model details.
See the LICENSE file.