"level" parameter in waveletSmooth function #9

Open
danizil opened this issue Jun 3, 2019 · 11 comments
@danizil

danizil commented Jun 3, 2019

Hi Timothy.
In reviewing your code I ran into issues using the waveletSmooth function (in the directory subrepos/models/wavelet). I think it might be the difference in our pywt versions, but the function was doing the wavelet decomposition along the features axis and not separately for each feature along its time series.
After fixing it, I noticed that the "level" parameter only controls thresholding: every detail coefficient is thresholded using the median of the "level"-th detail coefficients.
I'm hardly a wavelet expert, and have learned it only now for this algorithm, but I changed your code to threshold coefficients according to their level's median, because that was done in all denoising sources I have seen.
Could you explain your consideration in choosing one level for all cD thresholding?
cheers!
Danny
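
To make the axis point concrete, here is a minimal sketch that uses a hand-rolled one-level Haar step instead of pywt, so nothing here depends on a package version. The helper names `haar_step` and `decompose_along_time` and the toy numbers are made up for illustration. The data is a list of rows, one row per time step and one column per feature, and each feature's column is decomposed on its own.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (approximation, detail) coefficient lists, each half as long.
    """
    s = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    cA = [(a + b) / s for a, b in pairs]
    cD = [(a - b) / s for a, b in pairs]
    return cA, cD

def decompose_along_time(rows):
    """rows[t][f] is feature f at time t; decompose each feature's time series."""
    columns = list(zip(*rows))  # one tuple per feature, ordered by time
    return [haar_step(list(col)) for col in columns]

# Toy data: 4 time steps, 2 features (values are invented).
rows = [[1.0, 10.0], [3.0, 10.0], [5.0, 20.0], [5.0, 40.0]]
per_feature = decompose_along_time(rows)
```

Calling `haar_step` on a single row instead would pair up different features at one time step, which is the behaviour described above.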

@timothyyu
Owner

@danizil the waveletSmooth function in the subrepos/models/wavelet directory is from DeepLearning_Financial, a previous attempt to replicate the results of the paper: (https://github.com/mlpanda/DeepLearning_Financial)

I am currently using a modified implementation of that formula, seen here:
https://github.com/timothyyu/wsae-lstm/blob/master/wsae_lstm/models/wavelet.py#L27

@timothyyu
Owner

image

@timothyyu
Owner

timothyyu commented Jun 10, 2019

level is 2 as defined in the original paper; as for the axis, I am still looking into how that specific step is tied to the next level of the model (the stacked autoencoder stage). I am fairly confident that my implementation is on the right track axis wise, but I am not infallible. I do recall the feature set being seemingly incorrectly oriented when using axis=1, but it is something that I will have to double check

Related/relevant:
#7
#6
https://github.com/timothyyu/wsae-lstm/blob/master/reports/csi300%20index%20data%20tvt%20split%20scale%20denoise%20visual.pdf

@danizil
Author

danizil commented Jun 11, 2019

Hi @timothyyu

  1. What puzzles me is not the decomposition level, which can be toggled, but the fact that we take the threshold from the "level" level's median coefficient and apply it to all other levels. That way did work better at reconstruction though so I went on without further exploration.

  2. Regarding the axis, the way I understood it is that we're supposed to compress nineteen indicators into ten features, each with its own time series (i.e. compress along the indicator axis and not the time axis), and then run the compressed features through an LSTM. I imagine that whichever denoising process transforms the dataframe into 19X(DecLvl + 1) is performed on the correct axis (what I mean is that spectral decomposition is relevant only on the time axis). I'll add my own code underneath; it could be a package version thing.

  3. After exploring for a bit, and not being able to get the AE to converge, I think I'll try a new scaling process which compresses the data to the range (0, 1). The reason is mainly that I wasn't able to figure out how to recreate the scaled original signal with ReLUs or tanhs without using another linear transformation at the end, thereby breaking the symmetry of the AE. Other reasons are that this will make it possible to use sigmoids as in the paper, and I also saw it used in AE tutorials:
    https://www.youtube.com/watch?v=582irhtQOhw
    on minute 11:39
    https://medium.com/datadriveninvestor/deep-autoencoder-using-keras-b77cd3e8be95
    I think this is the subject for another issue though, and will update on it.
    Here is my code for the wavelet function (besides my comments, that I think explain the process though are verbose, there are the two options and a transpose on X):
    Cheers!

```python
import numpy as np
import pywt
from statsmodels.robust import mad  # assumed source of mad(), as in DeepLearning_Financial

def waveletSmooth(x, wavelet="db4", level=1, DecLvl=2, per_level=False):
    # x is a 2-D array shaped (time, features); x.T puts time on the last axis,
    # which is the axis pywt.wavedec decomposes by default
    # danny: coeffs is (DecLvl + 1) arrays: one approximation coefficients (cA) array (lowest frequencies)
    # and then DecLvl number of detail coefficients (cD)
    coeffs = pywt.wavedec(x.T, wavelet, mode="per", level=DecLvl)

    # calculate a threshold
    # danny: mad is median absolute deviation (not standard deviation)
    if per_level:
        # danny: option 2 - scale each cD by its own median
        sigmas = [mad(cD, axis=-1) for cD in coeffs[1:]]
    else:
        # original behaviour: sigma from the "level" cD, reused for every cD
        sigmas = [mad(coeffs[-level], axis=-1)] * (len(coeffs) - 1)

    # changing this threshold also changes the behavior,
    # but I have not played with this very much
    # danny: uthresh is universal threshold - a formula appearing in articles (haven't gone into it)
    n = x.shape[0]  # number of time steps
    # danny: we take [1:] because these are the detail coefficients and we denoise them
    for i, sigma in enumerate(sigmas, start=1):
        uthresh = sigma * np.sqrt(2 * np.log(n))  # one threshold per feature
        coeffs[i] = np.array([pywt.threshold(row, value=t, mode="soft")
                              for row, t in zip(coeffs[i], uthresh)])
    # reconstruct the signal using the thresholded coefficients
    y = pywt.waverec(coeffs, wavelet, mode="per")  # shaped (features, time), like x.T
    return y
```
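
The two thresholding choices from point 1 can also be compared without any wavelet library. The sketch below assumes the detail coefficients are already available as plain lists ordered coarse to fine, like `coeffs[1:]` from pywt (the numbers are invented). It estimates sigma as the median absolute deviation around the median divided by 0.6745, which is the usual Gaussian consistency constant and an assumption here, then applies the universal threshold sigma * sqrt(2 * ln n). All function names are illustrative.

```python
import math
from statistics import median

def mad_sigma(values):
    """Robust noise estimate: median absolute deviation scaled by 0.6745."""
    med = median(values)
    return median(abs(v - med) for v in values) / 0.6745

def universal_threshold(sigma, n):
    """Universal threshold for a signal of n samples."""
    return sigma * math.sqrt(2.0 * math.log(n))

def soft_threshold(values, t):
    """Shrink every coefficient toward zero by t; anything below t becomes 0."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in values]

def denoise_details(details, n, level=None):
    """Threshold each list of detail coefficients.

    level=None : each level uses a threshold from its own coefficients.
    level=k    : every level reuses the threshold derived from details[-k].
    Returns (thresholded_details, thresholds).
    """
    if level is not None:
        shared = universal_threshold(mad_sigma(details[-level]), n)
        thresholds = [shared] * len(details)
    else:
        thresholds = [universal_threshold(mad_sigma(d), n) for d in details]
    shrunk = [soft_threshold(d, t) for d, t in zip(details, thresholds)]
    return shrunk, thresholds

# Invented coefficients for a signal of n = 8 samples and two levels.
details = [[4.0, -3.0], [0.5, -0.2, 0.1, 6.0]]
_, shared_thresholds = denoise_details(details, n=8, level=1)
_, own_thresholds = denoise_details(details, n=8)
```

With `level=1` both levels share the threshold derived from the finest detail coefficients; with `level=None` the coarse level gets its own, different threshold.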

@timothyyu timothyyu self-assigned this Jul 11, 2019
@timothyyu
Owner

related closed issue (duplicate):
#12

@timothyyu
Owner

@danizil make sure the wavelet type in your code is haar, not db4. The WSAE-LSTM authors explicitly specify haar; the previous attempt by mlpanda to implement the wavelet stage of this model uses db4, but that is incorrect.

I am still examining the level median application/decomposition (#1) + the axis orientation (#2); one of the main issues I'm running into is that the authors of the model were not very specific about particular aspects of its implementation (see #6 and #7 for relevant discussion).

Basically beyond a certain point, the highest academic judgement/practice should be used to fill in the gaps in the implementation of the model + correction of errors.

@timothyyu
Owner

3. After exploring for a bit, and not being able to get the AE to converge, I think I'll try a new scaling process which compresses the data to the range (0, 1). The reason is mainly that I wasn't able to figure out how to recreate the scaled original signal with ReLUs or tanhs without using another linear transformation at the end, thereby breaking the symmetry of the AE.

That is one of the fundamental issues that I am looking into - Whether the scaling and denoising with the wavelet transform is reversed at some stage before/after the LSTM layer, used to make predictions one timestep ahead:
image

I can't say I have a definitive answer/solution yet, but I'm going to be trying more than one method/approach. Unfortunately, the actual journal does not explicitly detail this component/issue in defining the model & the model pipeline for the price data + technical indicator data.
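
Whichever stage ends up undoing the scaling, one way to keep it reversible is to store the parameters fitted on the training split and reuse them for the inverse. Below is a minimal sketch of (0, 1) min-max scaling; the `MinMax` helper is hypothetical and not part of this repository, and the numbers are invented.

```python
class MinMax:
    """Min-max scaler that remembers the training range so it can be inverted."""

    def fit(self, train):
        self.lo, self.hi = min(train), max(train)
        if self.hi == self.lo:
            raise ValueError("training data is constant; range is zero")
        return self

    def transform(self, values):
        span = self.hi - self.lo
        return [(v - self.lo) / span for v in values]

    def inverse(self, scaled):
        span = self.hi - self.lo
        return [s * span + self.lo for s in scaled]

# Fit on the training split only, then apply to later data.
scaler = MinMax().fit([10.0, 20.0, 30.0])
train_scaled = scaler.transform([10.0, 20.0, 30.0])
later_scaled = scaler.transform([40.0])       # outside the training range
restored = scaler.inverse(train_scaled)
```

Note that later values outside the training range map outside (0, 1), which matters if the output layer is a sigmoid.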

@timothyyu
Owner

  • Regarding the axis, the way I understood it is that we're supposed to compress nineteen indicators into ten features, each with its own time series (i.e. compress along the indicator axis and not the time axis), and then run the compressed features through an LSTM. I imagine that whichever denoising process transforms the dataframe into 19X(DecLvl + 1) is performed on the correct axis (what I mean is that spectral decomposition is relevant only on the time axis). I'll add my own code underneath; it could be a package version thing.

Axis decomposition check started; see 707dfb5:
https://github.com/timothyyu/wsae-lstm/blob/master/notebooks/6a_wavelet_axis_decomp_check.ipynb

image
image

@timothyyu timothyyu assigned timothyyu and unassigned timothyyu Jul 12, 2019
@timothyyu
Owner

axis = 1 wavelet has an extra y axis column that has to be removed to be accurate feature wise:
image
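
A possible explanation for the extra column, offered as an assumption rather than something confirmed in this thread: 19 is odd, and a dyadic transform works on pairs, so the input has to be padded to 20 samples before it is split, and the reconstruction then comes back with 20 values. As far as I know, PyWavelets' periodization mode behaves this way for odd-length input. The sketch below reproduces the effect with a hand-rolled one-level Haar transform (`haar_forward` and `haar_inverse` are illustrative helpers, not pywt functions).

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """One-level Haar transform; odd-length input is padded by repeating the last sample."""
    x = list(signal)
    if len(x) % 2:
        x.append(x[-1])
    cA = [(a + b) / SQRT2 for a, b in zip(x[0::2], x[1::2])]
    cD = [(a - b) / SQRT2 for a, b in zip(x[0::2], x[1::2])]
    return cA, cD

def haar_inverse(cA, cD):
    """Invert haar_forward; the output length is always even."""
    out = []
    for a, d in zip(cA, cD):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

# 19 values, standing in for 19 indicators at a single time step.
row = [float(i) for i in range(19)]
cA, cD = haar_forward(row)
rebuilt = haar_inverse(cA, cD)   # 20 values: one more than the input
trimmed = rebuilt[:len(row)]     # dropping the padded sample restores the row
```

If this is what is happening, trimming the reconstruction back to the original length is equivalent to removing the extra column.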

@timothyyu
Owner

timothyyu commented Sep 13, 2019

will have to double check but i believe i was correct initially with axis=0 - still may be worth running in parallel with axis=1 with the extra column chopped off as an A/B control or test

image

Regarding the axis, the way I understood it is that we're supposed to compress nineteen indicators into ten features, each with its own time series (i.e. compress along the indicator axis and not the time axis), and then run the compressed features through an LSTM. I imagine that whichever denoising process transforms the dataframe into 19X(DecLvl + 1) is performed on the correct axis (what I mean is that spectral decomposition is relevant only on the time axis). I'll add my own code underneath; it could be a package version thing.

The authors of the original paper were not explicitly clear or detailed for this aspect of the model - will also have to take a closer look at/reevaluate the autoencoder stage in how it is supposed to work on the transformed data (19X DecLvl+1) before LSTM input
