Incorrect behavior of the tsfresh.utilities.dataframe_functions.roll_time_series function #1108

Open
maresin opened this issue Feb 9, 2025 · 0 comments
maresin commented Feb 9, 2025

The problem:
By default, the tsfresh.utilities.dataframe_functions.roll_time_series function should move the window forward along the time axis. In practice, however, it slices windows starting from the end of the series. The error is hard to notice when the rolling_direction parameter is left at its default value, so the example uses a larger one.
Here is an example:

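import pandas as pd
from tsfresh.utilities.dataframe_functions import roll_time_series

# One time series (id=1) with 10 time steps; x is simply time + 10.
a = list(range(10))
df = pd.DataFrame({
    "id": [1] * len(a),
    "time": a,
    "x": [t + 10 for t in a],
})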
   id  time   x
0   1     0  10
1   1     1  11
2   1     2  12
3   1     3  13
4   1     4  14
5   1     5  15
6   1     6  16
7   1     7  17
8   1     8  18
9   1     9  19
roll_time_series(df, column_id="id", column_sort="time",
                 max_timeshift=2, min_timeshift=0, rolling_direction=4)
       id  time   x
0  (1, 1)     0  10
1  (1, 1)     1  11
2  (1, 5)     3  13
3  (1, 5)     4  14
4  (1, 5)     5  15
5  (1, 9)     7  17
6  (1, 9)     8  18
7  (1, 9)     9  19

Notice that the window moves from the end of the time series toward the beginning: the incomplete window appears at the start of the series. Also note the second value in the identifier tuple, for example id=(1, 5). It indicates where the window construction begins, and we can see that the window contains earlier time series values that should have been skipped.
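
The relationship between the window identifiers and the rows they contain can be checked directly on the reported output. The following is a minimal plain-Python sketch that copies the rolled rows by hand (it does not call tsfresh) and summarises the first and last time value of each window; the variable names are illustrative only.

```python
from collections import defaultdict

# Rows copied by hand from the roll_time_series output reported in this issue:
# (window id, time, x)
rolled = [
    ((1, 1), 0, 10), ((1, 1), 1, 11),
    ((1, 5), 3, 13), ((1, 5), 4, 14), ((1, 5), 5, 15),
    ((1, 9), 7, 17), ((1, 9), 8, 18), ((1, 9), 9, 19),
]

# Group the time values by window id.
times_by_window = defaultdict(list)
for window_id, time, _x in rolled:
    times_by_window[window_id].append(time)

# Summarise each window as (first time, last time, number of rows).
summary = {
    window_id: (min(times), max(times), len(times))
    for window_id, times in times_by_window.items()
}

for window_id, (first, last, size) in summary.items():
    print(window_id, "first:", first, "last:", last, "rows:", size)
```

In this output, the second element of each identifier equals the last time value in its window, and the only short window, (1, 1), sits at the start of the series.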

In addition, the parameter names do not match their meaning well. max_timeshift would be clearer as window_size, specifying the window width explicitly (without the implicit +1); the same applies to min_timeshift. And rolling_direction should be renamed to `shift`.
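
To make the proposed naming concrete, here is a minimal plain-Python sketch of one possible reading of the expected behaviour: windows of at most window_size rows that start at the first time step and advance forward by shift rows. The function forward_windows is hypothetical and is not part of tsfresh; starting at the first row and labelling each window by its start time are assumptions made for illustration.

```python
def forward_windows(rows, window_size, shift):
    """Yield (start_time, window_rows) while moving forward through rows.

    rows is a list of (time, x) tuples already sorted by time.
    Hypothetical helper for illustration; not a tsfresh API.
    """
    if window_size < 1 or shift < 1:
        raise ValueError("window_size and shift must be positive")
    for start in range(0, len(rows), shift):
        window = rows[start:start + window_size]
        # A window at the end of the series may be shorter than window_size.
        yield window[0][0], window


# Same data as in the issue: time 0..9, x = time + 10.
rows = [(t, t + 10) for t in range(10)]
windows = list(forward_windows(rows, window_size=3, shift=4))
for start_time, window in windows:
    print(start_time, [t for t, _x in window])
```

With these inputs the incomplete window falls at the end of the series rather than at the start, and each window contains only time values at or after its labelled start.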

maresin added the bug label Feb 9, 2025