
Add Support for ConvTranspose Layers (1D and 2D) #644

Conversation

Jonathan-Shoemaker
Contributor

Description

This adds support for ConvTranspose layers. Specifically, it adds support for both io_stream and io_parallel compilation of Conv1DTranspose and Conv2DTranspose (as of now, only converted from Keras).

The strategy roughly follows that of the non-transposed convolution layers. We treat a conv transpose as a group of stride_width * stride_height convolutions, with their outputs interlaced. Thus, we essentially do a normal conv implementation in which each kernel produces stride_width * stride_height outputs. Perhaps the most unintuitive part of the current setup is that the weight matrix is transformed substantially (in the Python code): it is split up into what amounts to the stride_width * stride_height smaller kernels.
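
To make the interlacing concrete, here is a minimal NumPy sketch of the decomposition in 1D (illustrative only; this is not the PR's HLS code, and the function names are made up): a transposed convolution computed first by the usual scatter-add definition, then as `stride` small convolutions whose outputs are interleaved.

```python
import numpy as np

def conv1d_transpose_ref(x, w, stride):
    """Reference: scatter-add definition of a 1D transposed convolution."""
    n, k = len(x), len(w)
    y = np.zeros((n - 1) * stride + k)
    for i in range(n):
        y[i * stride : i * stride + k] += x[i] * w
    return y

def conv1d_transpose_phased(x, w, stride):
    """Same result, computed as `stride` interlaced convolutions: output
    phase p = j % stride only ever uses the sub-kernel w[p::stride]."""
    n, k = len(x), len(w)
    y = np.zeros((n - 1) * stride + k)
    for p in range(stride):                 # one small kernel per phase
        for j in range(p, len(y), stride):  # outputs belonging to this phase
            acc = 0.0
            for m in range(p, k, stride):   # taps of this phase's sub-kernel
                i = (j - m) // stride       # exact division: j ≡ m (mod stride)
                if 0 <= i < n:
                    acc += x[i] * w[m]
            y[j] = acc
    return y

x, w = np.random.rand(8), np.random.rand(5)
assert np.allclose(conv1d_transpose_ref(x, w, 2), conv1d_transpose_phased(x, w, 2))
```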

The draft PR depends on PR #600 due to use of that PR's new implementation of io_parallel conv layers. Thus, all of the changes from that PR are currently included in this draft (will change once it is merged).

As of now, both the parallel and stream 1D implementations seem to be working well, with performance matching that of the non-transposed layers. The 2D implementations show slight latency increases that may still need to be worked out; in both cases the writing of output data is a bit slow. In parallel, the implementation seems to have trouble writing to the output in the order it wants to, and in stream, multiple writes often get queued up, which costs extra cycles.

Type of change

  • New feature (non-breaking change which adds functionality)

Tests

Still have to add tests to this PR.

Test Configuration:

Testing was done by compiling models consisting of a single ConvTranspose layer and comparing the performance of that layer to an analogous Conv layer (i.e., a Conv layer that maps the ConvTranspose output back to its input).
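
For illustration, here is a sketch of what such a single-layer test could look like with hls4ml's Keras converter (the layer sizes, output_dir, and tolerance are assumptions made for the sketch, not values from this PR):

```python
import numpy as np
from tensorflow.keras.layers import Conv1DTranspose
from tensorflow.keras.models import Sequential

import hls4ml

# Single-layer model, as in the test configuration described above.
model = Sequential([Conv1DTranspose(4, kernel_size=3, strides=2, input_shape=(8, 2))])

# Model-level config; precision and reuse factor left at defaults here.
config = hls4ml.utils.config_from_keras_model(model, granularity='model')
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    io_type='io_stream',             # or 'io_parallel'
    output_dir='hls_conv1dtr_test',  # hypothetical path
)
hls_model.compile()

x = np.random.rand(10, 8, 2)
y_keras = model.predict(x)
y_hls = hls_model.predict(x).reshape(y_keras.shape)
# The tolerance depends on the fixed-point precision chosen in the config.
np.testing.assert_allclose(y_hls, y_keras, rtol=0, atol=0.05)
```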

Checklist

  • I have read the guidelines for contributing.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have added tests that prove my fix is effective or that my feature works.

@jmitrevs
Contributor

For the non-default project name handling, it may be good to rebase onto the current main branch. I think those things have been solved, though of course there are no guarantees.

@vloncar
Contributor

vloncar commented Oct 21, 2022

@Jonathan-Shoemaker since #600 has been merged, can you rebase?

@jmitrevs
Contributor

I was wondering about the status of this PR. We'll talk about the code status and release schedule this Friday, and the conv transpose layer is an important layer for us to support.

@Jonathan-Shoemaker
Contributor Author

> I was wondering about the status of this PR. We'll talk about the code status and release schedule this Friday, and the conv transpose layer is an important layer for us to support.

The PR is no longer waiting on any others. There are still slight optimization issues in the 2D transpose. I can clean it up a little bit, rebase, etc.

@jmduarte jmduarte marked this pull request as ready for review March 18, 2023 13:48
@jmduarte jmduarte added this to the v0.8.0 milestone Mar 18, 2023
@jmduarte jmduarte added the please test Trigger testing by creating local PR branch label Mar 18, 2023
Jonathan-Shoemaker and others added 5 commits March 18, 2023 08:07
  • add new files for conv1dtranspose resource
  • clean up so that conv code is reached; still need to get the actual implementation matching keras
  • implement conv1dtranspose super inefficiently (gets correct answer though)
  • try to fix indices to make code work
  • make the C code work for conv1dtranspose
  • reduce weight dimensions to properly reflect transposed kernel size
  • clean up so that transpose filter width is passed around from config
  • fix code such that simple transpose layer gets synthesized
  • move variables out of loops, optimize slightly, and add an alternative method of computation by kernel (that option is not optimized as of now)
  • add conv1d transpose linebuffer format code; seems to work, unsure if it is optimized yet
  • trying to fix stream behavior
  • get transpose compilation working mostly as expected; weird jump in latency from reuse 1 to 2 still exists
  • initial conv2dtranspose addition; output is permuted as of now
  • output in correct order, though using a large array to buffer output
  • fix up conv1dtranspose a bit to pad correctly; fix up stream instructions for both 1d and 2d transposes
  • fix allowed reuse factors for transpose layers
  • update to new conv methods for io_parallel; still some issues with multiple filters as well as some padding issues
  • clean up error with multiple filters and larger kernels
  • optimize conv transpose resource to get it working reasonably well; may still have slight optimization left
  • fix output of conv1d transpose resource
  • add conv2dtranspose io_parallel implementation; can still be optimized
  • small changeup to data storage in conv1d parallel
  • fix zero padding pass addition for transpose stream layers
  • move transposing of weight matrix to resource_strategy for transpose layers
  • change how stream loads in weights to be like parallel for conv transposes; unroll all stride steps completely
  • fix output of 1d transpose parallel to be faster
  • change 1d transpose weight input to be 2-dimensional (passed from python code)
  • change 2d transpose weight input to be 3-dimensional (passed from python code)
  • small changes to transposes
  • Revert "fix nondefault project name handling (fastmachinelearning#626)". The commit breaks the Vivado Accelerator workflow, and the fix is unclear to me right now. (This reverts commit e8f048a.)
  • steps towards getting integer inputs to work
@jmduarte
Member

Hi @Jonathan-Shoemaker, I squashed your commits + rebased to main and tried to decouple the unrelated changes on my branch, diff here: main...jmduarte:hls4ml:conv_tr_parallel

Can I push it here and we can proceed to review it?

@Jonathan-Shoemaker
Contributor Author

> Hi @Jonathan-Shoemaker, I squashed your commits + rebased to main and tried to decouple the unrelated changes on my branch, diff here: main...jmduarte:hls4ml:conv_tr_parallel
>
> Can I push it here and we can proceed to review it?

Sounds good to me. I can work on adding tests.

@jmduarte jmduarte self-requested a review March 19, 2023 16:30
@jmduarte jmduarte added please test Trigger testing by creating local PR branch and removed please test Trigger testing by creating local PR branch labels Mar 19, 2023
@jmduarte jmduarte requested a review from vloncar March 19, 2023 16:30
@jmduarte
Member

jmduarte commented Mar 19, 2023

Great! Sounds good. I'll also review what's here soon; I have some minor comments/questions.

Also don't worry about running pre-commit yet, we can run that at the end after we're done reviewing (to not introduce large diffs).

@jmitrevs
Contributor

jmitrevs commented Aug 2, 2023

I think we want to support this for version 0.8. I will try rebasing it on the current main.

@jmitrevs
Contributor

jmitrevs commented Aug 2, 2023

What is the meaning of "keep_dims"?

@jmitrevs
Contributor

jmitrevs commented Aug 3, 2023

The rebase is at https://github.com/fastmachinelearning/hls4ml/tree/conv_tr_rebase. There were lots of merge conflicts so please take a look. We can replace this PR with that one, or force push it.

@jmitrevs
Contributor

jmitrevs commented Aug 3, 2023

#844 is the version of this PR based on my rebase attempt. I wanted to make the PR to see how the tests go.

@Jonathan-Shoemaker
Contributor Author

> What is the meaning of "keep_dims"?

keep_dims keeps the weight matrix from being entirely flattened: the first keep_dims dimensions are kept as-is, and the matrix is flattened along all remaining dimensions.

The reason for this is that the ConvTranspose is computed as the interleaving of "stride" number of Conv layers. The dimensions kept are for indexing into these different Conv layers. The idea was that the weight matrix of a ConvTranspose layer can be thought of as a disjoint set of weight matrices for Conv layers and treating it as such was easier.
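
As a small illustration of the idea (the axis layout here is hypothetical, not necessarily the one hls4ml actually uses):

```python
import numpy as np

def flatten_keep_dims(weights, keep_dims):
    """Flatten `weights` while preserving its first `keep_dims` axes."""
    return weights.reshape(weights.shape[:keep_dims] + (-1,))

# Hypothetical layout: (stride_h, stride_w, kh, kw, n_in, n_out).
# Each (stride_h, stride_w) index then selects the flattened weights
# of one of the interleaved sub-convolutions.
w = np.random.rand(2, 2, 3, 3, 4, 8)
print(flatten_keep_dims(w, 2).shape)  # -> (2, 2, 288)
```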

@MODISENEISO

Hello, does this mean Conv2DTranspose from Keras is currently not supported in hls4ml?

@jmitrevs
Contributor

jmitrevs commented Aug 9, 2023

Not yet, but hopefully in a few days (in the main branch, not in a release).

@jmitrevs
Contributor

jmitrevs commented Aug 9, 2023

I think we moved to the rebased pull request (#844), so I will close this one.

@jmitrevs jmitrevs closed this Aug 9, 2023
@MODISENEISO

Good day

How do I get the Conv2DTranspose changes into hls4ml? I tried updating the version, but the changes still don't take effect. I am not familiar with GitHub commits and branches.
Could you please share a guide?
[screenshot of the user's conda environment]

@jmduarte
Member

Hi @MODISENEISO. It looks like you're using a conda environment. Did you do `pip install hls4ml`?

You can also install any branch of hls4ml as follows:

```
pip install git+https://github.com/fastmachinelearning/hls4ml@conv_tr_rebase
```

In this case, conv_tr_rebase is the name of the branch for the updated pull request with this new feature (#844).

Thanks,
Javier
