Replies: 1 comment
This is certainly a tradeoff. The issue is also discussed here: https://elib.dlr.de/126424/1/Runge_Chaos_2018.pdf There are two different things here: down-sampling, to me, means skipping every other sample, which I would call sub-sampling; time aggregation, on the other hand, means taking time averages. I would, as always, come up with a realistic toy model that reproduces your challenges, and then make a choice. Btw, if you aggregate/time-average the data more, lagged causal links will likely become contemporaneous, and then you can use PCMCIplus. We are also working on causal discovery methods that are less lag-specific to circumvent this problem; something should come out this year!
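For reference, a minimal sketch of the aggregate-then-PCMCIplus route in tigramite. The data array, variable names, and pc_alpha value below are illustrative assumptions (not taken from the discussion), and the ParCorr import path differs between tigramite versions:

```python
import numpy as np
from tigramite import data_processing as pp
from tigramite.pcmci import PCMCI
from tigramite.independence_tests.parcorr import ParCorr  # older versions: tigramite.independence_tests

# Placeholder for the aggregated observations: an (n_times, n_vars) array
# of 10-minute time averages; replace with your own data.
agg_data = np.random.randn(500, 4)
var_names = [f"X{i}" for i in range(agg_data.shape[1])]

dataframe = pp.DataFrame(agg_data, var_names=var_names)
pcmci = PCMCI(dataframe=dataframe, cond_ind_test=ParCorr())

# With 10-minute aggregation, a 1-hour maximum delay corresponds to tau_max=6.
# PCMCIplus also tests contemporaneous (lag-0) links, which matters if
# aggregation pushes formerly lagged effects into the same time step.
results = pcmci.run_pcmciplus(tau_min=0, tau_max=6, pc_alpha=0.01)
print(results["graph"])      # link types per (i, j, tau)
print(results["p_matrix"])   # p-values per (i, j, tau)
```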
-
Hi there, I am working on applications of causal discovery in epidemiology, applying PCMCI+ to large time series, and I am struggling to decide on a maximum lag length (tau_max, as it is called in the implementation of the algorithm).
To put my problem in context: I have minute-resolution data and expect a maximum delay between cause and effect of 1 hour, i.e. I would set tau_max to 60. However, I am worried about multiple-testing problems from considering so many lags. I have considered down-sampling the time series, for example by averaging over 10-minute windows, in which case I could set tau_max to 6. However, I wonder whether this approach might miss links or even introduce spurious ones. The idea of using down-sampled data comes from Runge et al. 2015 (http://doi.org/10.1038/ncomms9502), where the authors use a weekly time resolution to balance resolving causal directionality against the multiple-testing problem.
I was wondering if anyone has advice on dealing with multiple-testing problems in this context, and whether you think the down-sampling approach can yield trustworthy results. Thanks in advance.
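For concreteness, a minimal sketch of the aggregation step described above, assuming the minute-level data sit in a pandas DataFrame with a DatetimeIndex; the column names and index here are placeholders, not from the original post:

```python
import pandas as pd

# Illustrative minute-resolution data with a DatetimeIndex.
rng = pd.date_range("2023-01-01", periods=600, freq="min")
df = pd.DataFrame({"exposure": range(600), "outcome": range(600)}, index=rng)

# Time aggregation (10-minute averages), as opposed to sub-sampling
# (df.iloc[::10], which simply skips samples).
df_10min = df.resample("10min").mean()

# A 1-hour maximum cause-effect delay then corresponds to tau_max = 60 // 10 = 6.
tau_max = 6
```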