
some questions about preprocess_elect.py and data_loader.py #26

Open
mw66 opened this issue Apr 15, 2023 · 2 comments

Comments


mw66 commented Apr 15, 2023

Hi,

I have a few questions about preprocess_elect.py:

  1. In prep_data(), v_input[:, 1] is never used (neither read nor written), so why is this 2nd column needed?
    https://github.com/ant-research/Pyraformer/blob/master/preprocess_elect.py#L35

  2. about x_input:
    https://github.com/ant-research/Pyraformer/blob/master/preprocess_elect.py#L58
    From index 1 onward, x_input[count, 1:, 0] contains the real raw input data, but x_input[count, 0, 0] is never assigned, so it remains all 0s, which means it does not contain any real raw input data.
    (On https://github.com/ant-research/Pyraformer/blob/master/preprocess_elect.py#L67, the line for x_input[count, 0, 0] also leaves it zero.)
    Why not just drop all such x_input[:, 0, :], since they are wrong training data? And why save them in the final train npy file?
    i.e. change https://github.com/ant-research/Pyraformer/blob/master/preprocess_elect.py#L72-L74 to

    np.save(prefix+'data_'+save_name, x_input[:, 1:, :])
    np.save(prefix+'v_'+save_name, v_input[1:, :])
    np.save(prefix+'label_'+save_name, label[1:, :])

I also inspected the saved train data, and it confirms that these entries are all 0s:

>>> import numpy as np
>>> t = np.load("data/elect/train_data_elect.npy")
>>> np.max(t[:, 0, 0])
0.0
>>> np.min(t[:, 0, 0])
0.0
>>>
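The zero first timestep can be reproduced with a minimal numpy sketch (toy shapes and hypothetical sizes, not the repo's real ones), mirroring the assignment pattern at preprocess_elect.py#L58:

```python
import numpy as np

# Toy stand-ins for the real arrays (hypothetical sizes).
window_size, num_covariates = 5, 3
data = np.arange(1.0, 1.0 + window_size)  # pretend raw series, all non-zero

# Mirrors: x_input = np.zeros((windows, window_size, 1 + num_covariates))
x_input = np.zeros((1, window_size, 1 + num_covariates))

# Mirrors: x_input[count, 1:, 0] = data[window_start:window_end-1, series]
x_input[0, 1:, 0] = data[:window_size - 1]

# x_input[0, 0, 0] was never assigned, so it keeps its zero initialization.
print(x_input[0, :, 0])  # first entry is 0.0, the rest hold real data
```

Only the slice starting at index 1 is ever written, so the first timestep of channel 0 stays at its np.zeros initialization, exactly as the np.load inspection above shows.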

mw66 commented Apr 15, 2023

@Zhazhan

my 3rd question:

https://github.com/ant-research/Pyraformer/blob/master/preprocess_elect.py#L58

            x_input[count, 1:, 0] = data[window_start:window_end-1, series]

so x_input[:, :, 0] holds the raw input sequence data,

but in:
https://github.com/ant-research/Pyraformer/blob/master/data_loader.py#L440-L445

        cov = all_data[:, :, 2:]   # the raw input sequence data is dropped here?

        split_start = len(label[0]) - self.pred_length + 1
        data, label = split(split_start, label, cov, self.pred_length)

        return data, label

Is the raw input sequence dropped from the training data here?

This is the same question I have here: #25 (comment)

So the previous values of the raw input sequence are not used at all in training?
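The slicing in question can be illustrated with a toy array (shapes and the channel layout here are made up for illustration): all_data[:, :, 2:] keeps only the later channels, so whatever sits in channels 0 and 1 never reaches cov.

```python
import numpy as np

# Toy tensor: (batch, time, channels); channel 0 plays the role of the
# raw series, later channels the covariates (hypothetical layout).
batch, time, channels = 2, 4, 5
all_data = np.random.rand(batch, time, channels)
all_data[:, :, 0] = 99.0  # mark the "raw series" channel with a sentinel

cov = all_data[:, :, 2:]  # same slice as data_loader.py#L440

# The marked channel is gone: cov only carries channels 2..4.
print(cov.shape)            # (2, 4, 3)
print((cov == 99.0).any())  # False: the sentinel channel was sliced away
```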


mw66 commented Apr 16, 2023

ok, for my question 3), I found:

https://github.com/ant-research/Pyraformer/blob/master/data_loader.py#L443

        data, label = split(split_start, label, cov, self.pred_length)

which, at
https://github.com/ant-research/Pyraformer/blob/master/data_loader.py#L398-L403

            single_data = batch_label[i:(split_start+i)].clone().unsqueeze(1)
            single_data[-1] = -1
            single_cov = cov[batch_idx, i:(split_start+i), :].clone()
            temp_data = [single_data, single_cov]
            single_data = torch.cat(temp_data, dim=1)
            all_data.append(single_data)

inserts the label (as previous values in the window) back into all_data. This is confusing; why did you choose to do it this way?
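A numpy analogue of what those lines appear to do (shapes and values here are illustrative, not the repo's real ones): the label window is cloned, its last step is masked to -1, and it is concatenated in front of the covariates, so the previous raw values do re-enter the model input as channel 0.

```python
import numpy as np

split_start, num_covariates = 4, 3
label = np.array([10., 11., 12., 13., 14.])       # toy per-series labels
cov = np.random.rand(len(label), num_covariates)  # toy covariates

i = 0
# Mirrors: single_data = batch_label[i:(split_start+i)].clone().unsqueeze(1)
single_data = label[i:split_start + i].copy()[:, None]
single_data[-1] = -1                      # last step masked, as in the repo
single_cov = cov[i:split_start + i, :].copy()
# Mirrors: single_data = torch.cat([single_data, single_cov], dim=1)
window = np.concatenate([single_data, single_cov], axis=1)

print(window[:, 0])  # [10. 11. 12. -1.] : past labels back in channel 0
```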

Also, the implementation of electTrainDataset.__getitem__
https://github.com/ant-research/Pyraformer/blob/master/data_loader.py#L432

is so different from electTestDataset.__getitem__
https://github.com/ant-research/Pyraformer/blob/master/data_loader.py#L460

in particular
https://github.com/ant-research/Pyraformer/blob/master/data_loader.py#L473-L477

            single_data = data[i:(split_start+i)].clone().unsqueeze(1)
            single_data[-1] = -1
            single_cov = cov[i:(split_start+i), :].clone()
            single_data = torch.cat([single_data, single_cov], dim=1)
            all_data.append(single_data)

Here, you don't do the same insertion of the label (as previous values in the window) back into all_data. Why is there such a difference?

@mw66 mw66 changed the title some questions about preprocess_elect.py some questions about preprocess_elect.py and data_loader.py Apr 16, 2023