New paper: BoRA: Bayesian Hierarchical Low-Rank Adaption for Multi-task Large Language Models #15

Open
maykcaldas opened this issue Jul 25, 2024 · 0 comments

@maykcaldas (Collaborator)

Paper: BoRA: Bayesian Hierarchical Low-Rank Adaption for Multi-task Large Language Models

Authors: Simen Eide, Arnoldo Frigessi

Abstract: This paper introduces Bayesian Hierarchical Low-Rank Adaption (BoRA), a novel method for finetuning multi-task Large Language Models (LLMs). Current finetuning approaches, such as Low-Rank Adaption (LoRA), perform exceptionally well in reducing training parameters and memory usage but face limitations when applied to multiple similar tasks. Practitioners usually have to choose between training separate models for each task or a single model for all tasks, both of which come with trade-offs in specialization and data utilization. BoRA addresses these trade-offs by leveraging a Bayesian hierarchical model that allows tasks to share information through global hierarchical priors. This enables tasks with limited data to benefit from the overall structure derived from related tasks while allowing tasks with more data to specialize. Our experimental results show that BoRA outperforms both individual and unified model approaches, achieving lower perplexity and better generalization across tasks. This method provides a scalable and efficient solution for multi-task LLM finetuning, with significant practical implications for diverse applications.

Link: https://arxiv.org/abs/2407.15857
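
To make the idea in the abstract concrete, here is a minimal sketch (not the authors' code) of a Gaussian hierarchical prior over per-task LoRA adapters: each task keeps its own low-rank matrices, and a penalty pulls them toward shared global matrices so that low-data tasks can borrow strength from related ones. All module names, shapes, and hyperparameters (`rank`, `tau`) are illustrative assumptions rather than the paper's exact formulation.

```python
# Illustrative sketch only: hierarchical prior over per-task LoRA adapters.
import torch
import torch.nn as nn

class LoRAAdapter(nn.Module):
    """A single low-rank update: x @ A @ B added on top of a frozen base projection."""
    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.A = nn.Parameter(torch.randn(d_in, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.A @ self.B

class HierarchicalLoRA(nn.Module):
    """One adapter per task plus shared global parameters acting as the prior mean."""
    def __init__(self, num_tasks: int, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.task_adapters = nn.ModuleList(
            LoRAAdapter(d_in, d_out, rank) for _ in range(num_tasks)
        )
        self.global_adapter = LoRAAdapter(d_in, d_out, rank)

    def prior_penalty(self, tau: float = 1.0) -> torch.Tensor:
        # Gaussian hierarchical prior: penalize each task adapter's squared
        # distance from the global adapter; tau controls how strongly tasks share.
        penalty = torch.zeros(())
        for adapter in self.task_adapters:
            penalty = penalty + ((adapter.A - self.global_adapter.A) ** 2).sum()
            penalty = penalty + ((adapter.B - self.global_adapter.B) ** 2).sum()
        return penalty / (2.0 * tau ** 2)

# Usage sketch: combine a per-task loss with the prior term (dummy loss shown here).
model = HierarchicalLoRA(num_tasks=4, d_in=768, d_out=768, rank=8)
x = torch.randn(2, 768)                               # dummy hidden states for task 0
task_loss = model.task_adapters[0](x).pow(2).mean()   # stand-in for the real LM loss
loss = task_loss + model.prior_penalty(tau=0.5)
loss.backward()
```

In this sketch, `tau` plays the role of the prior scale: a small value ties the task adapters tightly to the global ones (approaching a single shared model), while a large value lets each task specialize (approaching independent per-task models).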

Reasoning: Let's think step by step in order to produce the is_lm_paper field. We start by examining the title and abstract. The title mentions "Multi-task Large Language Models (LLMs)", and the abstract discusses a method for finetuning these models. The focus is on improving the performance of large language models through a novel Bayesian hierarchical approach. Given that the paper is centered on methods for enhancing language models, it is clear that the primary subject is language models.
