
feat: support third-party oneflow device extension #549

Merged
merged 5 commits into llama_device from zqh/llama_device on Sep 4, 2024

Conversation

@0x404 (Contributor) commented Sep 4, 2024

Changes made in this PR:

  1. In `_init_distributed_env`, import oneflow's third-party extension library (i.e., oneflow-npu or oneflow-xpu) depending on the device actually in use. Since `_DistributeUtil` is instantiated only once, the required third-party library is imported only once (see the sketch after this list).
  2. Slightly modify the initialization logic of `BasePipeline`: if the user provides the `model_path` parameter, `BasePipeline` loads the model from that path, which takes priority over `model.cfg.pretrained_model_path` in the config. If `model_path` is not provided, `model.cfg.pretrained_model_path` is used by default.
  3. If `tokenization.tokenizer` in the config has no `pretrained_model_path` set, default to the file `tokenizer.model` under `model.cfg.pretrained_model_path` (this is usually correct, as it is the default storage location and file name used by Hugging Face). A path-resolution sketch also follows this list.
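
A minimal sketch of the device-based conditional import described in change 1; the module names `oneflow_npu` / `oneflow_xpu` and the `device` string values are assumptions, not necessarily the extensions' real import names:

```python
import importlib

# Assumed mapping from device type to extension import name; the real
# oneflow-npu / oneflow-xpu packages may expose different module names.
_DEVICE_EXTENSIONS = {
    "npu": "oneflow_npu",
    "xpu": "oneflow_xpu",
}

def _import_device_extension(device: str) -> None:
    """Import the oneflow third-party extension matching the active device.

    Called from _init_distributed_env; since _DistributeUtil is built only
    once, this import also runs only once per process.
    """
    module_name = _DEVICE_EXTENSIONS.get(device)
    if module_name is not None:
        importlib.import_module(module_name)
```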
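
And a hypothetical sketch of the path resolution from changes 2 and 3; the `cfg` layout mirrors the description above, but the function name and signature are illustrative only:

```python
import os

def resolve_model_and_tokenizer_paths(cfg, model_path=None):
    # Change 2: an explicit model_path argument takes priority over the
    # model.cfg.pretrained_model_path value in the config.
    resolved_model = model_path or cfg.model.cfg.pretrained_model_path

    # Change 3: if the tokenizer has no pretrained_model_path of its own,
    # fall back to the Hugging Face default file name under the model directory.
    tokenizer_path = getattr(cfg.tokenization.tokenizer, "pretrained_model_path", None)
    if tokenizer_path is None:
        tokenizer_path = os.path.join(resolved_model, "tokenizer.model")

    return resolved_model, tokenizer_path
```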

@0x404 0x404 requested a review from ShawnXuan September 4, 2024 07:43
@ShawnXuan ShawnXuan merged commit 593937f into llama_device Sep 4, 2024
2 checks passed
@ShawnXuan ShawnXuan deleted the zqh/llama_device branch September 4, 2024 08:41
ShawnXuan added a commit that referenced this pull request Sep 5, 2024
* update llama for multi devices

* xpu and npu config files

* update device for inference

* update

* update

* update README

* update

* format

* format

* fix

* feat: support third-party oneflow device extension (#549)

* feat: support third-party device oneflow extensions

also, refactor the build process of model and tokenizer using the
pretrained_model_path config

* refactor: remove unnecessary config and warnings

* docs: update readme for commands to run llama on npu and xpu

* fix import order

* update

* update

* fix: skip lint on oneflow third-party imports

---------

Co-authored-by: Qunhong Zeng <[email protected]>