
[pull] main from pytorch:main #12

Closed
wants to merge 11 commits into from

Conversation

@pull pull bot commented Feb 24, 2024

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

shoumikhin and others added 2 commits February 23, 2024 19:26
Summary: .

Reviewed By: kirklandsign

Differential Revision: D54135699

fbshipit-source-id: c9b160140e5a8f86d9ca04d983dcf3c042532d45
Summary:
## Context

Move Vulkan graph runtime from PyTorch directory to ExecuTorch directory to improve development logistics:

* ExecuTorch delegate changes will no longer require export to PyTorch directory
* Makes it much easier to enable OSS build for Vulkan delegate

bypass-github-export-checks

Reviewed By: shoumikhin

Differential Revision: D54133350

fbshipit-source-id: 7c14d6ffbfc9a49bbabae476d2373986fc8acad3
@pull pull bot added the ⤵️ pull label Feb 24, 2024
kimishpatel and others added 9 commits February 24, 2024 07:53
Summary:
Pull Request resolved: #2055

Enables fast perf benchmarking for per-channel 4-bit dynamic quant of linear layers.

Since the XNNPACK delegate now supports 4-bit dqlinear, it should get lowered as well.
ghstack-source-id: 216323964
exported-using-ghexport

bypass-github-export-checks
bypass-github-pytorch-ci-checks
bypass-github-executorch-ci-checks

Reviewed By: digantdesai, shoumikhin, kirklandsign

Differential Revision: D54120422

fbshipit-source-id: 63050ef2fe262224114986a976139d07690d4b42
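
For readers unfamiliar with the scheme being benchmarked, here is a minimal sketch of symmetric per-output-channel 4-bit weight quantization of a linear layer in plain PyTorch. It only illustrates the quantization convention implied by the summary; it is not the XNNPACK dqlinear lowering path, and the helper names are assumptions.

```python
import torch

def quantize_linear_weight_4bit_per_channel(weight: torch.Tensor):
    """Sketch: symmetric per-output-channel 4-bit quantization.

    weight: [out_features, in_features] float tensor.
    Returns int8 storage holding values in [-8, 7] plus per-channel scales.
    """
    qmin, qmax = -8, 7                                   # signed 4-bit range
    max_abs = weight.abs().amax(dim=1, keepdim=True)     # one scale per output channel
    scale = (max_abs / qmax).clamp(min=1e-9)
    q = torch.clamp(torch.round(weight / scale), qmin, qmax).to(torch.int8)
    return q, scale.squeeze(1)

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Dynamic quant would also quantize activations at runtime; this sketch
    # only shows the weight round trip.
    return q.to(torch.float32) * scale.unsqueeze(1)

w = torch.randn(16, 32)
q, s = quantize_linear_weight_4bit_per_channel(w)
print((w - dequantize(q, s)).abs().max())                # quantization error
```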
Summary:
Pull Request resolved: #2088

Llama2 7b checkpoints are in bfloat16, and the current code fails at the sdpa op due to a type mismatch.
bypass-github-export-checks
ghstack-source-id: 216296241
exported-using-ghexport

Reviewed By: larryliu0820

Differential Revision: D54137104

fbshipit-source-id: ea1fe9b5b696e5139e367f01c5c07fef6501fef8
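
As a rough illustration of the dtype mismatch described above, one common workaround is to cast a bfloat16 checkpoint to float32 before export so the sdpa op sees a single dtype. The snippet below is an assumption about how such a cast could look, using a stand-in state dict; it is not the actual change in this diff.

```python
import torch

# Stand-in for loading a bfloat16 checkpoint; the real Llama2 7b checkpoint
# path and layout are not part of this sketch.
state_dict = {"tok_embeddings.weight": torch.randn(8, 4, dtype=torch.bfloat16)}

# Cast every bfloat16 tensor to float32 so downstream ops see one dtype.
state_dict = {
    k: v.to(torch.float32)
    if isinstance(v, torch.Tensor) and v.dtype == torch.bfloat16
    else v
    for k, v in state_dict.items()
}
```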
Summary: pow(n,2) => n*n

Reviewed By: manuelcandales

Differential Revision: D54142779

fbshipit-source-id: 91185e07e9fe040df090756d78be2cf5e87f6e1d
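
The `pow(n,2) => n*n` rewrite is a standard micro-optimization: squaring by multiplication avoids the general-purpose power routine. A tiny illustration in PyTorch terms (the kernel touched by this diff is not shown on this PR page):

```python
import torch

x = torch.randn(1024)

y_slow = torch.pow(x, 2)   # general-purpose power op
y_fast = x * x             # plain multiplication, same result

assert torch.allclose(y_slow, y_fast)
```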
Summary: Add CLI options for embedding quantization: bitwidth and group_size.

Reviewed By: mavlyutovr

Differential Revision: D54159472

fbshipit-source-id: 25b0b560667c3875c911b878ee0f28fd042ff713
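
A hypothetical sketch of how such CLI options could be wired up with argparse. The flag names `--embedding_bitwidth` and `--embedding_group_size` and their defaults are assumptions for illustration only; the actual flags added in this diff may be named differently.

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical flag names and defaults; the real CLI may differ.
parser.add_argument("--embedding_bitwidth", type=int, default=8, choices=[2, 4, 8],
                    help="Bit width used to quantize the embedding table")
parser.add_argument("--embedding_group_size", type=int, default=32,
                    help="Group size for embedding quantization")
args = parser.parse_args()
print(args.embedding_bitwidth, args.embedding_group_size)
```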
…loat

Summary:
Resolve recurring errors where query is c10::Half while key and value are float. This should ideally work from first principles, but somehow it does not.

We need to fix this properly, but in the meantime this ugly hack will enable us to proceed and allow others to debug other aspects of ET lowering.

Reviewed By: mavlyutovr

Differential Revision: D54167581

fbshipit-source-id: 6cc4e76e3abbf107014b5b9da00e817ee3b2ab03
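
A minimal sketch of the kind of workaround described: cast key and value to the query's dtype before calling scaled dot-product attention. The wrapper name is made up, and the exact form and location of the fix in this diff are not shown here.

```python
import torch
import torch.nn.functional as F

def sdpa_with_dtype_workaround(q, k, v, **kwargs):
    # Workaround for mixed dtypes (in the diff's scenario, a c10::Half query
    # with float key/value): cast key and value to the query's dtype.
    if k.dtype != q.dtype:
        k = k.to(q.dtype)
    if v.dtype != q.dtype:
        v = v.to(q.dtype)
    return F.scaled_dot_product_attention(q, k, v, **kwargs)

q = torch.randn(1, 8, 16, 64)                         # float32 query
k = torch.randn(1, 8, 16, 64, dtype=torch.bfloat16)   # mismatched dtype on purpose
v = torch.randn(1, 8, 16, 64, dtype=torch.bfloat16)
out = sdpa_with_dtype_workaround(q, k, v)
```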
Summary:
Earlier changes to the working 4-bit diff broke 4-bit support.

This diff restores it and avoids using min/max, which would also have interfered with 8-bit quant, which expects a symmetric min/max unlike 4-bit.

Reviewed By: digantdesai

Differential Revision: D54198222

fbshipit-source-id: 035d34bbd7f87f8eb7fa61ac6b938ecac651cb00
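
To make the 4-bit vs. 8-bit distinction in the summary concrete, the sketch below contrasts an 8-bit symmetric scale derived from the observed min/max with a 4-bit scale taken from the absolute maximum against the fixed [-8, 7] range. This is a generic illustration of the two conventions, not the code changed in this diff.

```python
import torch

def symmetric_8bit_scale(x: torch.Tensor) -> torch.Tensor:
    # 8-bit symmetric quant: the observed min/max is folded into a symmetric
    # range and mapped onto [-127, 127].
    max_abs = torch.maximum(x.min().abs(), x.max().abs())
    return max_abs / 127.0

def signed_4bit_scale(x: torch.Tensor) -> torch.Tensor:
    # Signed 4-bit quant: the integer range [-8, 7] is not symmetric, so the
    # scale is taken from abs().max() against qmax = 7 rather than min/max.
    return x.abs().max() / 7.0

x = torch.randn(256)
print(symmetric_8bit_scale(x), signed_4bit_scale(x))
```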
Summary:
Currently it uses a magic formula.

However, there are cases where we need to set it explicitly.

For now this should be a no-op; existing models and CI won't break.

Reviewed By: kimishpatel, iseeyuan, Jack-Khuu, shreydesai

Differential Revision: D54121131

fbshipit-source-id: 996534456c9e83e10e7fdaa6973c754d37e734d4
Summary:
Pull Request resolved: #2101

In [PR #2062](#2062), we introduced the partitioner and removed this failing test. The test failed because we were using the wrong op name; we fix it to use the name from [PR #1737](#1737).
ghstack-source-id: 216477874
exported-using-ghexport

Reviewed By: SS-JIA

Differential Revision: D54206402

fbshipit-source-id: 0c9ae2af9a380e8aa0e28b107a33ccd36a89033e
Summary:
Pull Request resolved: #2102

Derived classes of `OpNode` are currently used only for prepack or execute, never both. This means they need not expose both APIs.

Inspired by [Stephen's comment](https://www.internalfb.com/diff/D53982441?dst_version_fbid=370105355800543&transaction_fbid=940226924354801), we will build on `ExecuteNode` to be initialized with member functions for `create_params_block()`, `get_shader()`, etc. `PrepackNode` doesn't need these members. Hence, it makes sense to separate the classes.

Feel free to suggest better names. I don't really like mine.
ghstack-source-id: 216481872
exported-using-ghexport

Reviewed By: SS-JIA

Differential Revision: D54042646

fbshipit-source-id: a8965d69c94cdb3fe9837e9b83b6db8a877949f0
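
The class separation described above can be pictured schematically. The real classes live in the C++ Vulkan graph runtime, so the Python sketch below is only an analogy; every member beyond the names quoted in the summary (`create_params_block()`, `get_shader()`) is an assumption.

```python
from typing import Callable

class PrepackNode:
    """Sketch: a node used only to stage (prepack) tensor data; it carries no
    shader or parameter-block hooks."""
    def __init__(self, packing_fn: Callable[[], None]):
        self._packing_fn = packing_fn

    def encode(self) -> None:
        self._packing_fn()

class ExecuteNode:
    """Sketch: a node used only at execution time, initialized with the member
    functions named in the summary."""
    def __init__(self, get_shader: Callable[[], str],
                 create_params_block: Callable[[], bytes]):
        self._get_shader = get_shader
        self._create_params_block = create_params_block

    def encode(self) -> None:
        shader = self._get_shader()
        params = self._create_params_block()
        # A real runtime would record a dispatch of `shader` with `params`.
        print(f"dispatch {shader} with {len(params)} bytes of params")

node = ExecuteNode(get_shader=lambda: "binary_add",
                   create_params_block=lambda: b"\x00" * 16)
node.encode()
```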
kirklandsign pushed a commit that referenced this pull request Oct 22, 2024
…ytorch#6407)

* Main backup (#12)

* Add nnlib as submodule

* Adding nnlib submodule

* Integrated nnlib API under backends/cadence/hifi

* Fix review comments on PR#3

* Add nnlib as submodule

* Adding nnlib submodule

* Integrated nnlib API under backends/cadence/hifi

* Fix review comments on PR#3

* Incorporated feedback from Meta team.

* lint errors fixed

* Adding Sub operator optimized version

* Add optimization for add, mul operators

* Adding Div operator

* Modified div mod to cover truncate and floor modes

---------

Co-authored-by: cad-audio <[email protected]>
Co-authored-by: cad-audio <[email protected]>

* Adding sigmoid optimizations

* Adding tanh optimizations

* Fixing review comments in 5483

* Adding cflags to prevent compilation halts

* Adding cflags to prevent compilation halts

* Changing namespace of optimized ops; remove unused ops from file

* fixed lint issues.

* Namespace updates for cadence ops, adding 6 optimized ops

---------

Co-authored-by: cad-audio <[email protected]>
Co-authored-by: cad-audio <[email protected]>
kirklandsign pushed a commit that referenced this pull request Nov 21, 2024
* Main backup (#12)

* Add nnlib as submodule

* Adding nnlib submodule

* Integrated nnlib API under backends/cadence/hifi

* Fix review comments on PR#3

* Add nnlib as submodule

* Adding nnlib submodule

* Integrated nnlib API under backends/cadence/hifi

* Fix review comments on PR#3

* Incorporated feedback from Meta team.

* lint errors fixed

* Adding Sub operator optimized version

* Add optimization for add, mul operators

* Adding Div operator

* Modified div mod to cover truncate and floor modes

---------

Co-authored-by: cad-audio <[email protected]>
Co-authored-by: cad-audio <[email protected]>

* Adding sigmoid optimizations

* Adding tanh optimizations

* Fixing review comments in 5483

* Adding cflags to prevent compilation halts

* Adding cflags to prevent compilation halts

* Changing namespace of optimized ops; remove unused ops from file

* fixed lint issues.

* Namespace updates for cadence ops, adding 6 optimized ops

* Updating add op with namespace change

* Updating add op with namespace change, fix build issue

---------

Co-authored-by: cad-audio <[email protected]>
Co-authored-by: cad-audio <[email protected]>