-
Recently I have been using the PCMCI algorithm to discover causal relations in binary time series. Specifically, I chose a small dataset with T = 7000 and N = 30. The process configuration is the following.
However, it takes my MacBook nearly 3 hours to finish the discovery process. Any ideas about the bottleneck and how to speed up the process?
Replies: 7 comments 7 replies
-
Hello Wang70880,
(i) Have you looked at the mpi4py parallelization script for PCMCI: https://github.com/jakobrunge/tigramite/blob/developer/run_pcmci_parallel.py ? If you adapt this script, you can use all the cores of your CPU. This requires an MPI library and its headers (e.g. Open MPI), plus the Python module "mpi4py" (available on either Conda or Pip).
(ii) Use version 5.x of Tigramite and ensure that you have Numba installed (Python JIT compilation). Compiled code is much faster than interpreted code.
(iii) If you also use Gaussian processes, ensure that the Torch library and the Python module "GPyTorch" are installed and working with your GPU.
See "Optional packages depending on used functions" in the Tigramite [README.md](https://github.com/jakobrunge/tigramite/tree/developer#readme).
Best regards,
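For reference, a typical invocation of such an MPI-parallel script might look like this (a hypothetical sketch; the rank count and script path depend on your setup):

```shell
# Hypothetical invocation (adjust -np to your CPU core count and the
# path to wherever you placed run_pcmci_parallel.py).
mpirun -np 8 python run_pcmci_parallel.py
```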
-
Hi Nick, Thanks a lot for your advice. I will try it and see if it works. :-)
-
Note: when using the mpi4py script for PCMCI, I found that I had to specify
otherwise I could not plot the causal graph from the results.
-
I'm currently learning how to use this myself, and these notes may be helpful to others.
Using prior knowledge to keep the causal search tractable: N.B. the number of _links to be tested_ scales with the square of the number of variables. The case of 30 variables with 'ParCorr' independence tests should be tractable on current computers, but other Tigramite users may have many more variables. The computational cost can be substantially reduced if you have prior knowledge that excludes some links. You can tell Tigramite functions to consider only certain links by providing "selected_links", a Python dict of possible edges per variable. The structure of the dict is:
e.g. this would be a fully-connected causal graph for 3 variables, tau_min=1, and tau_max=2:
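A minimal sketch of such a dict, assuming Tigramite's convention that each target variable j maps to a list of (i, -tau) parent tuples:

```python
# Hypothetical sketch: a fully connected selected_links dict for
# N = 3 variables, tau_min = 1, tau_max = 2.
N, tau_min, tau_max = 3, 1, 2
selected_links = {
    j: [(i, -tau) for i in range(N) for tau in range(tau_min, tau_max + 1)]
    for j in range(N)
}
# Each target variable j is tested against every (variable, lag) pair,
# i.e. 3 variables x 2 lags = 6 candidate links per target here.
```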
Any elements that you can omit from this dict will reduce the amount of computation required. Search the Tigramite documentation for functions that have a "selected_links" parameter. This technique is also used in the Tigramite parallelization scripts to send different dicts of links to different MPI ranks on different CPU cores; see:
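The per-rank splitting mentioned above can be sketched as follows (a hypothetical helper for illustration; `split_targets` is an invented name, not the actual script's code):

```python
# Hypothetical sketch: distribute target variables among MPI ranks
# round-robin, so each rank tests links into its own subset of targets.
def split_targets(n_vars, n_ranks):
    """Map each rank to the list of target-variable indices it handles."""
    return {
        rank: [j for j in range(n_vars) if j % n_ranks == rank]
        for rank in range(n_ranks)
    }
```

Each rank would then build its own selected_links dict containing only its assigned targets and run the conditional-independence tests for those.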
-
Hi Nick, How do you use the Numba speedup? For which modules (or functions) do you enforce compilation?
-
It turns out that the benefit of Numba is very limited here. Any incompatibility (e.g., in the inference of data types) will cause Numba to fall back to "object mode". I tested the speed and found that when working in object mode, the execution time actually increases by about 10 to 20%. As a result, I don't suggest using Numba unless you can identify functions that use NumPy only.
-
Sorry for the late reply. We have made quite a few improvements lately; does the issue still persist?