A curated list of LLMs and related studies targeted at mobile and embedded hardware
Last update: 10th April 2024
If your publication/work is not included - and you think it should - please open an issue or reach out directly to @stevelaskaridis.
Let's try to make this list as useful as possible to researchers, engineers and practitioners all around the world.
## Contents

- Mobile-First LLMs
- Infrastructure / Deployment of LLMs on Device
- Benchmarking LLMs on Device
- Applications
- Multimodal LLMs
- Surveys on Efficient LLMs
- Training LLMs on Device
- Mobile-Related Use-Cases
- Related Awesome Repositories
## Mobile-First LLMs

The following table lists sub-3B models designed for on-device deployment, sorted by year.
Name | Year | Sizes | Primary Group/Affiliation | Publication | Code Repository | HF Repository |
---|---|---|---|---|---|---|
MobileLLM | 2024 | 125M, 250M | Meta | paper | - | - |
Gemma | 2024 | 2B, ... | Google | website | code, gemma.cpp | huggingface |
MobiLlama | 2024 | 0.5B, 1B | MBZUAI | paper | code | huggingface |
TinyLlama | 2024 | 1.1B | Singapore University of Technology and Design | paper | code | huggingface |
Gemini Nano | 2024 | 1.8B, 3.25B | Google | paper | - | - |
Phi-2 | 2023 | 2.7B | Microsoft | website | - | huggingface |
Phi-1.5 | 2023 | 1.3B | Microsoft | paper | - | huggingface |
Phi-1 | 2023 | 1.3B | Microsoft | paper | - | huggingface |
RWKV | 2023 | 169M, 430M, 1.5B, 3B, ... | EleutherAI | paper | code | huggingface |
Cerebras-GPT | 2023 | 111M, 256M, 590M, 1.3B, 2.7B ... | Cerebras | paper | code | huggingface |
OPT | 2022 | 125M, 350M, 1.3B, 2.7B, ... | Meta | paper | code | huggingface |
LaMini-LM | 2023 | 61M, 77M, 111M, 124M, 223M, 248M, 256M, 590M, 774M, 738M, 783M, 1.3B, 1.5B, ... | MBZUAI | paper | code | huggingface |
Pythia | 2023 | 70M, 160M, 410M, 1B, 1.4B, 2.8B, ... | EleutherAI | paper | code | huggingface |
Galactica | 2022 | 125M, 1.3B, ... | Meta | paper | code | huggingface |
BLOOM | 2022 | 560M, 1.1B, 1.7B, 3B, ... | BigScience | paper | code | huggingface |
XGLM | 2021 | 564M, 1.7B, 2.9B, ... | Meta | paper | code | huggingface |
GPT-Neo | 2021 | 125M, 350M, 1.3B, 2.7B | EleutherAI | - | code, gpt-neox | huggingface |
MobileBERT | 2020 | 15.1M, 25.3M | CMU, Google | paper | code | huggingface |
BART | 2019 | 140M, 400M | Meta | paper | code | huggingface |
DistilBERT | 2019 | 66M | HuggingFace | paper | code | huggingface |
T5 | 2019 | 60M, 220M, 770M, 3B, ... | Google | paper | code | huggingface |
TinyBERT | 2019 | 14.5M | Huawei | paper | code | huggingface |
Megatron-LM | 2019 | 336M, 1.3B, ... | Nvidia | paper | code | - |
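A quick back-of-the-envelope check for whether a model in the table fits on a phone: weight memory is roughly parameter count × bits per weight. A minimal sketch (the 1.1B figure matches TinyLlama-class models; KV cache, activations, and runtime overhead come on top, so treat these as lower bounds):

```python
def weight_memory_gb(params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the model weights alone.

    Ignores KV cache, activations, and runtime overhead, which add more.
    """
    return params * bits_per_weight / 8 / 1e9

# A 1.1B-parameter model at different quantization levels:
for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(1.1e9, bits):.2f} GB")
# 16-bit: 2.20 GB
# 8-bit: 1.10 GB
# 4-bit: 0.55 GB
```

This is why 4-bit quantization features so prominently in the deployment frameworks below: it is often the difference between fitting and not fitting in a phone's memory budget.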
## Infrastructure / Deployment of LLMs on Device

This section showcases frameworks and contributions supporting LLM inference on mobile and edge devices.
- llama.cpp
- LLMFarm: iOS frontend for llama.cpp
- Sherpa: Android frontend for llama.cpp
- dusty-nv's llama.cpp: Containers for Jetson deployment of llama.cpp
- MLC-LLM
- Android App: MLC Android app
- iOS App: MLC iOS app
- dusty-nv's MLC: Containers for Jetson deployment of MLC
- Google MediaPipe
- Apple MLX
- Alibaba MNN
- llama2.c (More educational, see here for android port)
- tinygrad
- TinyChatEngine (Targeted at Nvidia, Apple M1 and RPi)
- [MobiCom'24] Mobile Foundation Model as Firmware (paper, code)
- Merino: Entropy-driven Design for Generative Language Models on IoT Devices (paper)
- LLM as a System Service on Mobile Devices (paper)
- LLMCad: Fast and Scalable On-device Large Language Model Inference (paper)
- EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models (paper)
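Most of the runtimes above implement the same two-phase autoregressive loop: a one-shot prompt prefill followed by token-by-token decoding against a growing KV cache. A minimal sketch with a stub model (all names here are hypothetical, not any framework's actual API):

```python
from typing import List

class StubLM:
    """Toy stand-in for an on-device LLM runtime (hypothetical API).

    Real engines (llama.cpp, MLC-LLM, MNN, ...) expose similar
    prefill/decode phases, carrying per-layer K/V tensors between steps.
    """
    def __init__(self):
        self.kv_cache: List[int] = []  # stands in for cached K/V tensors

    def prefill(self, prompt_tokens: List[int]) -> int:
        # Process the whole prompt once; the cache grows by its length.
        self.kv_cache.extend(prompt_tokens)
        return prompt_tokens[-1] + 1  # fake "next token"

    def decode_step(self, token: int) -> int:
        # Each step attends over the cache and appends one entry.
        self.kv_cache.append(token)
        return token + 1

def generate(model: StubLM, prompt: List[int], max_new: int) -> List[int]:
    out = [model.prefill(prompt)]
    while len(out) < max_new:
        out.append(model.decode_step(out[-1]))
    return out

print(generate(StubLM(), [1, 2, 3], 4))  # -> [4, 5, 6, 7]
```

The split matters on device because prefill is compute-bound while decode is memory-bandwidth-bound, which is why papers above (e.g. LLMCad, EdgeMoE) attack the decode phase specifically.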
## Benchmarking LLMs on Device

This section focuses on measurement and benchmarking efforts for assessing LLM performance when deployed on device.
- MELTing point: Mobile Evaluation of Language Transformers (paper)
## Applications

- Octopus v2: On-device language model for super agent (paper)
- Towards an On-device Agent for Text Rewriting (paper)
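Whether profiling a raw model or an on-device agent, the two numbers benchmarks typically report are time-to-first-token (prefill latency) and decode throughput. A minimal harness sketch, with a stub generator standing in for a real runtime (the 1 ms step delay is an arbitrary illustration, not a measured figure):

```python
import time

def benchmark(generate_fn, prompt, max_new_tokens=32):
    """Measure time-to-first-token and decode throughput of a token stream.

    `generate_fn` is any iterable-of-tokens producer (hypothetical interface).
    """
    start = time.perf_counter()
    first_token_s = None
    n = 0
    for _ in generate_fn(prompt, max_new_tokens):
        n += 1
        if first_token_s is None:
            first_token_s = time.perf_counter() - start
    total_s = time.perf_counter() - start
    decode_s = total_s - first_token_s
    tok_per_s = (n - 1) / decode_s if n > 1 and decode_s > 0 else float("inf")
    return {"ttft_s": first_token_s, "tokens": n, "decode_tok_per_s": tok_per_s}

def stub_generate(prompt, max_new_tokens):
    # Stand-in for a real on-device runtime's token stream.
    for i in range(max_new_tokens):
        time.sleep(0.001)  # pretend each decode step takes ~1 ms
        yield i

stats = benchmark(stub_generate, "hello", max_new_tokens=8)
print(stats["tokens"])  # 8
```

On real hardware one would also pin thermal state and repeat runs, since mobile SoCs throttle; see the benchmarking papers above for methodology.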
## Multimodal LLMs

This section refers to multimodal LLMs, which integrate vision or other modalities into their tasks.
- TinyLLaVA: A Framework of Small-scale Large Multimodal Models (paper, code)
- MobileVLM V2: Faster and Stronger Baseline for Vision Language Model (paper, code)
## Surveys on Efficient LLMs

This section includes survey papers on LLM efficiency, a topic closely related to deployment on constrained devices.
- A Survey of Resource-efficient LLM and Multimodal Foundation Models (paper)
- Efficient Large Language Models: A Survey (paper, code)
- Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems (paper)
- A Survey on Model Compression for Large Language Models (paper)
## Training LLMs on Device

This section covers papers on training or fine-tuning LLMs on device, in a standalone or federated manner.
- [MobiCom'23] Federated Few-Shot Learning for Mobile NLP (paper, code)
- FwdLLM: Efficient FedLLM using Forward Gradient (paper, code)
- [Electronics'24] Forward Learning of Large Language Models by Consumer Devices (paper)
- Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly (paper)
- Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes (paper, code)
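The federated approaches above all build on some form of server-side aggregation. As a reference point, vanilla FedAvg weighted averaging can be sketched as follows (this is the generic baseline, not the specific method of any paper listed):

```python
def fedavg(client_updates):
    """Weighted average of client parameter vectors (vanilla FedAvg).

    client_updates: list of (num_examples, params) pairs, where params is a
    flat list of floats. Federated LLM tuning usually averages only a small
    trainable subset (e.g. adapter/LoRA weights) to keep communication low.
    """
    total = sum(n for n, _ in client_updates)
    dim = len(client_updates[0][1])
    avg = [0.0] * dim
    for n, params in client_updates:
        weight = n / total
        for i, p in enumerate(params):
            avg[i] += weight * p
    return avg

# Two clients weighted 1:3 by their number of local examples:
print(fedavg([(10, [0.0, 4.0]), (30, [4.0, 0.0])]))  # -> [3.0, 1.0]
```

The communication-cost results above (e.g. sub-18 KB full-parameter tuning) come precisely from shrinking or compressing what each client ships into this averaging step.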
## Mobile-Related Use-Cases

This section includes papers that are mobile-related, but do not necessarily run on device.
- Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs (paper)
- Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception (paper, code)
- [NeurIPS'23] AndroidInTheWild: A Large-Scale Dataset For Android Device Control (paper, code)
- GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation (paper, code)
- [ACL'20] Mapping Natural Language Instructions to Mobile UI Action Sequences (paper)
## Related Awesome Repositories

If you want to read more about related topics, here are some tangential awesome repositories to visit:
- Hannibal046/Awesome-LLM on Large Language Models
- KennethanCeyer/awesome-llm on Large Language Models
- HuangOwen/Awesome-LLM-Compression on Large Language Model Compression
- csarron/awesome-emdl on Embedded and Mobile Deep Learning
Contributions welcome! Read the contribution guidelines first.