Anomalize fails on non-time-series grouped data #55

Open
larry77 opened this issue Jul 30, 2020 · 0 comments

larry77 commented Jul 30, 2020

Dear All,
Hopefully the reprex below is self-explanatory.
I plan to use anomalize on non-time-series data.
According to the documentation, anomalize() should still work in that case (without the time series decomposition). It does on ungrouped data, but on grouped non-time-series data it fails with an error from prep_tbl_time().
Any ideas?

library(tidyverse)

library(anomalize)
#> ══ Use anomalize to improve your Forecasts by 50%! ═════════════════════════════
#> Business Science offers a 1-hour course - Lab #18: Time Series Anomaly Detection!
#> </> Learn more at: https://university.business-science.io/p/learning-labs-pro </>

test1 <- tidyverse_cran_downloads %>%
    time_decompose(count) %>%
    anomalize(remainder)
#> Registered S3 method overwritten by 'quantmod':
#>   method            from
#>   as.zoo.data.frame zoo

print(test1)  ##and this works fine
#> # A time tibble: 6,375 x 9
#> # Index:  date
#> # Groups: package [15]
#>    package date       observed season trend remainder remainder_l1 remainder_l2
#>    <chr>   <date>        <dbl>  <dbl> <dbl>     <dbl>        <dbl>        <dbl>
#>  1 broom   2017-01-01    1053. -1007. 1708.    352.         -1725.        1704.
#>  2 broom   2017-01-02    1481    340. 1731.   -589.         -1725.        1704.
#>  3 broom   2017-01-03    1851    563. 1753.   -465.         -1725.        1704.
#>  4 broom   2017-01-04    1947    526. 1775.   -354.         -1725.        1704.
#>  5 broom   2017-01-05    1927    430. 1798.   -301.         -1725.        1704.
#>  6 broom   2017-01-06    1948    136. 1820.     -8.11       -1725.        1704.
#>  7 broom   2017-01-07    1542   -988. 1842.    688.         -1725.        1704.
#>  8 broom   2017-01-08    1479. -1007. 1864.    622.         -1725.        1704.
#>  9 broom   2017-01-09    2057    340. 1887.   -169.         -1725.        1704.
#> 10 broom   2017-01-10    2278    563. 1909.   -194.         -1725.        1704.
#> # … with 6,365 more rows, and 1 more variable: anomaly <chr>




test2 <- tidyverse_cran_downloads %>%
    group_by(package) %>% 
    time_decompose(count) %>%
    anomalize(remainder)

print(test2)  ##and also this works fine
#> # A time tibble: 6,375 x 9
#> # Index:  date
#> # Groups: package [15]
#>    package date       observed season trend remainder remainder_l1 remainder_l2
#>    <chr>   <date>        <dbl>  <dbl> <dbl>     <dbl>        <dbl>        <dbl>
#>  1 broom   2017-01-01    1053. -1007. 1708.    352.         -1725.        1704.
#>  2 broom   2017-01-02    1481    340. 1731.   -589.         -1725.        1704.
#>  3 broom   2017-01-03    1851    563. 1753.   -465.         -1725.        1704.
#>  4 broom   2017-01-04    1947    526. 1775.   -354.         -1725.        1704.
#>  5 broom   2017-01-05    1927    430. 1798.   -301.         -1725.        1704.
#>  6 broom   2017-01-06    1948    136. 1820.     -8.11       -1725.        1704.
#>  7 broom   2017-01-07    1542   -988. 1842.    688.         -1725.        1704.
#>  8 broom   2017-01-08    1479. -1007. 1864.    622.         -1725.        1704.
#>  9 broom   2017-01-09    2057    340. 1887.   -169.         -1725.        1704.
#> 10 broom   2017-01-10    2278    563. 1909.   -194.         -1725.        1704.
#> # … with 6,365 more rows, and 1 more variable: anomaly <chr>


## From the documentation:
## For non-time series data (data without trend), the anomalize()
## function can be used without time series decomposition.

test3 <- tidyverse_cran_downloads %>%
    select(-date) %>%
    filter(package=="broom") %>% 
    anomalize(count)


print(test3) ## OK!
#> # A tibble: 425 x 5
#>    count package count_l1 count_l2 anomaly
#>    <dbl> <chr>      <dbl>    <dbl> <chr>  
#>  1  1053 broom     -2535.    7965. No     
#>  2  1481 broom     -2535.    7965. No     
#>  3  1851 broom     -2535.    7965. No     
#>  4  1947 broom     -2535.    7965. No     
#>  5  1927 broom     -2535.    7965. No     
#>  6  1948 broom     -2535.    7965. No     
#>  7  1542 broom     -2535.    7965. No     
#>  8  1479 broom     -2535.    7965. No     
#>  9  2057 broom     -2535.    7965. No     
#> 10  2278 broom     -2535.    7965. No     
#> # … with 415 more rows



### Now let us try this on grouped data

test4 <- tidyverse_cran_downloads %>%
    select(-date) %>% 
    group_by(package) %>% 
    anomalize(count)
#> Error in value[[3L]](cond): Error in prep_tbl_time(): No date or datetime column found.

print(test4)  ## and now an error, since test4 was never created; what to do?
#> Error in print(test4): object 'test4' not found

Created on 2020-07-30 by the reprex package (v0.3.0)
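
For anyone who needs per-group flags while the grouped call fails, here is a minimal sketch of a possible stopgap using dplyr only. It is not anomalize's implementation. The helper name `flag_iqr`, the toy data, and the multiplier `k = 3` are assumptions made for illustration; the multiplier was picked only because it is consistent with the `count_l1`/`count_l2` limits printed for test3, not because it was read from the package source.

```r
library(dplyr)

# Hypothetical helper (not part of anomalize): flag values that fall
# outside Q1 - k * IQR and Q3 + k * IQR.
# k = 3 is an assumption, not a value taken from the anomalize source.
flag_iqr <- function(x, k = 3) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - k * iqr
  upper <- q[2] + k * iqr
  ifelse(x < lower | x > upper, "Yes", "No")
}

# Toy data standing in for tidyverse_cran_downloads without the date column.
toy <- tibble(
  package = rep(c("a", "b"), each = 9),
  count   = c(10, 11, 12, 13, 14, 15, 16, 17, 500,
              100, 101, 102, 103, 104, 105, 106, 107, 108)
)

# The limits are computed separately within each package.
result <- toy %>%
  group_by(package) %>%
  mutate(anomaly = flag_iqr(count)) %>%
  ungroup()

print(result)
```

On the real data the same pattern would be `group_by(package) %>% mutate(anomaly = flag_iqr(count))` after dropping `date`. It only mimics the idea of group-wise IQR limits and does not return the `_l1`/`_l2` columns that anomalize() adds.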
