Add llama and resnet emitc tests #2144

Merged: 1 commit into main on Feb 16, 2025

Conversation

svuckovicTT (Contributor):

Ticket

Closes #2143

Problem description

Add llama and resnet tests for the EmitC path.

What's changed

  • For llama, several issues needed to be tackled:
    • Added ttnn::repeat_interleave support through EmitC (a sketch of the emitted call follows this list)
    • The --ttnn-create-input-gens pass wasn't putting tensors on device when their layout descriptors assumed they were on device; fixed that
  • Added the resnet test; it is currently unsupported, awaiting further op support
  • Changed the dylib compilation script ci_compile_dylib.py to not remove ttnn-dylib.cpp, as it is useful for debugging; this is okay since the file is already gitignored
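For context, a minimal sketch of the kind of C++ call the EmitC path ends up generating for this op. The `forward` wrapper and the exact argument list of `ttnn::repeat_interleave` are assumptions for illustration, not code copied from this PR:

```cpp
// Hypothetical excerpt in the style of the generated ttnn-dylib.cpp
// (ttnn includes omitted; the argument list of ttnn::repeat_interleave
// is assumed here, not taken from the repository).
ttnn::Tensor forward(const ttnn::Tensor &input) {
  // Repeat each element twice along dimension 1; the memory config is
  // left at its default when the op carried none.
  return ttnn::repeat_interleave(input, /*repeats=*/2, /*dim=*/1);
}
```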

@svuckovicTT (Contributor, Author):

One weird thing here though... in the llama test, generating the flatbuffer takes ~75 seconds for some reason, while the rest of the compile pipeline is much faster (1-2 seconds total).

We should look into what is so slow in the --ttnn-to-flatbuffer translation pass.

fyi @nsmithtt

@svuckovicTT svuckovicTT self-assigned this Feb 7, 2025
@svuckovicTT svuckovicTT added the EmitC label (EmitC related work) Feb 7, 2025
@azecevicTT (Contributor):

@svuckovicTT You might want to try this: #1782.

@svuckovicTT svuckovicTT force-pushed the svuckovic/add-llama-resnet-emitc-tests branch from e640031 to fb0972a on February 7, 2025 15:09
@sdjordjevicTT (Contributor) left a comment:

Looks good. Thanks!

lib/Conversion/TTNNToEmitC/TTNNToEmitC.cpp (outdated, resolved)

```cpp
ArrayAttr arrayAttrs = rewriter.getArrayAttr({
    rewriter.getIndexAttr(0), // input tensor
    repeatInterleaveOp.getRepeatsAttr(), repeatInterleaveOp.getDimAttr(),
    repeatInterleaveOp.getMemoryConfig().has_value()
```
Contributor:

Nit: `has_value()` is not needed.

svuckovicTT (Author):

Sure, but I prefer it this way, as it makes it obvious what is being checked.
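As a side note, a tiny standalone sketch of why the nit holds: `std::optional` converts to bool in boolean contexts, so the two spellings are equivalent (the `int` payload and variable name here are illustrative only):

```cpp
#include <iostream>
#include <optional>

int main() {
  std::optional<int> memoryConfig; // empty, like an unset attribute

  // Equivalent checks: optional's explicit operator bool applies in
  // conditions, so has_value() is redundant but arguably more legible.
  if (memoryConfig.has_value())
    std::cout << "set (explicit has_value)\n";
  if (memoryConfig)
    std::cout << "set (contextual conversion)\n";
  return 0;
}
```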

```cpp
        repeatInterleaveOp.getMemoryConfigAttr(),
        repeatInterleaveOp.getLoc())
        ->getResult(0)),
    mlir::cast<Attribute>(rewriter.getIndexAttr(1)))
```
@azecevicTT (Contributor) commented on Feb 7, 2025:

Is this cast really needed?

Also, I struggle to follow the logic here; you could maybe hoist the work related to the ?: operator out before the ArrayAttr creation.

svuckovicTT (Author):

The cast is indeed needed; otherwise the error is `Incompatible operand types ('IntegerAttr' and 'emitc::OpaqueAttr')`. I remember that you and I actually looked into this while working on the OnesOp conversion, haha!

> you could maybe hoist the work related to the ?: operator out before the ArrayAttr creation

I'll leave it like this, since it's "closer" to the automated approach we discussed, with the idea that this op (and OnesOp, which has a same-style implementation) will be among the first to be reworked into that automated approach.
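To make the compiler error concrete: the ?: operator must find a common type for its two branches before any conversion to the destination type happens, and sibling attribute classes don't convert to each other. A minimal sketch of the pattern, with `StringAttr` standing in for the opaque attribute and `pickAttr`/`hasConfig` as hypothetical names (illustrative, not the PR's exact code):

```cpp
#include "mlir/IR/Builders.h"
#include "mlir/IR/BuiltinAttributes.h"

// Without the cast,
//   hasConfig ? opaque : builder.getIndexAttr(1)
// fails with "incompatible operand types", even though both
// StringAttr and IntegerAttr convert to mlir::Attribute. Casting one
// branch up to the common base lets the other convert to it too.
mlir::Attribute pickAttr(mlir::OpBuilder &builder, bool hasConfig,
                         mlir::StringAttr opaque) {
  return hasConfig ? mlir::cast<mlir::Attribute>(opaque)
                   : builder.getIndexAttr(1);
}
```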

azecevicTT added a commit that referenced this pull request Feb 8, 2025

- added `assumeVerified()` to the `printFlags` that are used for printing ops
- greatly lowers TTNN->Flatbuffer compile time (on the LlamaPrefill example from #2144, compile time was lowered from ~70s to ~1.5s)
azecevicTT added a commit that referenced this pull request Feb 10, 2025
### Ticket
#2156

### Problem description
As described in #1782, `getOpDebugString` was taking more than 99% of the time of `ttmlir-translate -ttnn-to-flatbuffer` execution. Of that time, 99% was spent on op verification.

### What's changed
- added `assumeVerified()` to the `printFlags` that are used for printing ops (given the way we run our pipelines, I believe this is a safe assumption)
- greatly lowers TTNN->Flatbuffer compile time (on the LlamaPrefill example from #2144, compile time was lowered from ~70s to ~1.5s; around 0.8s of that is spent on op printing)
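For reference, a sketch of how such a helper can opt out of verification while printing. `OpPrintingFlags::assumeVerified()` and `Operation::print(raw_ostream &, const OpPrintingFlags &)` are real MLIR APIs; the surrounding shape of `getOpDebugString` is assumed, not copied from the repository:

```cpp
#include <string>
#include "mlir/IR/Operation.h"
#include "mlir/IR/OperationSupport.h"
#include "llvm/Support/raw_ostream.h"

// Print an op's textual form without re-running its verifier, which
// is what dominated the ttnn-to-flatbuffer translation time above.
std::string getOpDebugString(mlir::Operation *op) {
  std::string str;
  llvm::raw_string_ostream os(str);
  mlir::OpPrintingFlags flags;
  flags.assumeVerified(); // safe if the pipeline already verified ops
  op->print(os, flags);
  return str;
}
```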
@svuckovicTT svuckovicTT force-pushed the svuckovic/add-llama-resnet-emitc-tests branch from fb0972a to a07e6b2 on February 16, 2025 19:03
@svuckovicTT svuckovicTT merged commit f80099b into main Feb 16, 2025
32 checks passed
@svuckovicTT svuckovicTT deleted the svuckovic/add-llama-resnet-emitc-tests branch February 16, 2025 20:50