Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH; WIP] Implementation of FCI algorithm that leverages the base classes used in PC #32

Closed
wants to merge 14 commits into from

Conversation

adam2392
Copy link
Collaborator

@adam2392 adam2392 commented Aug 30, 2022

Changes proposed in this pull request:

  • implements the FCI algorithm
  • implements a subclass of LearnSkeleton for learning skeleton graphs using the idea of "possibly-d-separating" sets and "PDS-Path" sets (see RFCI paper)
  • implements unit tests for skeleton and FCI procedure

Next PR:

  • implements an example demonstrating how FCI is different

This relies on Pywhy-graph changes in: py-why/pywhy-graphs#10
Also this will be downstream of #30 .

Order of merging:

  1. [ENH] PC algorithm with updated example using dowhy.gcm for the SCM module #30
  2. this PR

Before submitting

  • I've read and followed all steps in the Making a pull request
    section of the CONTRIBUTING docs.
  • I've updated or added any relevant docstrings following the syntax described in the
    Writing docstrings section of the CONTRIBUTING docs.
  • If this PR fixes a bug, I've added a test that will fail without my fix.
  • If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.

After submitting

  • All GitHub Actions jobs for my pull request have passed.

Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
@adam2392 adam2392 changed the title [ENH] Implementation of FCI algorithm that leverages the base classes used in PC [ENH; WIP] Implementation of FCI algorithm that leverages the base classes used in PC Aug 30, 2022
@adam2392 adam2392 marked this pull request as draft August 30, 2022 16:31
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
Signed-off-by: Adam Li <[email protected]>
@bloebp
Copy link
Member

bloebp commented Aug 31, 2022

Just want to mention that if we start supporting causal-learn (which we plan to), we should probably not have another implementation of the most common algorithms. However, I see the benefit here to demonstrate the API and graph usages.

That being said, maybe we should look into having wrappers to support calling the causal-learn functions. This could be as simple as:

def pc(data):
  adjacency_matrix = causal_learn.pc_algorihtm(data)
  
  cpdag = OurCPDAGImplementation(adjacency_matrix)

  return cpdag

There is no need to have the graph definitions deeply integrated in the algorithms.

@adam2392
Copy link
Collaborator Author

Just want to mention that if we start supporting causal-learn (which we plan to), we should probably not have another implementation of the most common algorithms. However, I see the benefit here to demonstrate the API and graph usages.

That sounds good. I'm motivated to implement at least the core constraint-based algos. in this repo and codebase because of a few reasons:

i) extensibility: rn I am trying to get an implementation of rFCI working w/o having to use the R package pcalg.
ii) maintainability: I don't see the rules 5-10 of Zhang 2008 in the FCI function. I found it easier to just draft up an implementation that teases apart the semantics

In the proposed constraint-based API, I think it's important to modularize and pull apart each aspect of the core algorithms to enable future improvements. E.g. rFCI, or even conservativeFCI, or maxvoteFCI is trivially implementable by subclassing class FCI here.

Of course... this all relies on me being invested in constraint-based learning :p. I think if there is developer consensus on the API, then re-implementing will be helpful in the long run to consolidate the API across the entire pywhy. With that being said, I am less motivated to do the score-based algorithms, which I think then implementing a wrapper sounds like a great way of getting the functionality out right away. E.g. #29

@adam2392
Copy link
Collaborator Author

def pc(data):
  adjacency_matrix = causal_learn.pc_algorihtm(data)
  
  cpdag = OurCPDAGImplementation(adjacency_matrix)

  return cpdag

There is no need to have the graph definitions deeply integrated in the algorithms.

For this comment: are you referring to the lines here

def convert_skeleton_graph(self, graph: nx.Graph) -> EquivalenceClassProtocol:
raise NotImplementedError(
"All constraint discovery algorithms need to implement a function to convert "
"the skeleton graph to a causal graph."
)

or

if graph.has_edge(v_i, u, graph.undirected_edge_name):
graph.orient_uncertain_edge(v_i, u)
if graph.has_edge(v_j, u, graph.undirected_edge_name):
graph.orient_uncertain_edge(v_j, u)
?

If the former, then agreed it's quite hidden. Rn I convert to the class, but that's because I need the explicit object to "orient edges" in the PC/FCI orientation phase. I think we've discussed the issue of assuming a numpy array vs an actual object in the internals of the algo. I think there are equally the same issues of converting any graph object to numpy array as converting a numpy array to a graph object before running the algos. But if we eventually can push MixedEdgeGraph into networkx, why not go with an explicit graph object instead of numpy array? Perhaps something to discuss concretely at the next meeting(?)

I think one way to make this more transparent and modularizable is to allow the user to optionally pass in a graph function that allows def convert_skeleton_graph(self, graph: nx.Graph) -> EquivalenceClassProtocol: to define any graph object that meets the Protocol defined.

@bloebp
Copy link
Member

bloebp commented Aug 31, 2022

I was rather referring to providing wrappers instead of re-implementing the algorithms (as discussed offline). That was rather meant as a pseudo-code to show that we can simply call external libraries and convert their result into our objects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants