Replace JGraphT with scala-graph #3297

Closed
wants to merge 13 commits into from

Collaborator

@shengquan-ni shengquan-ni commented Mar 3, 2025

This PR replaces JGraphT with scala-graph, a library released under the Apache 2.0 license.

I evaluated whether scala-graph supports our JGraphT use cases:

  1. (Supported) Create a DAG from both a logical and a physical plan. The DAG must allow multiple edges between two vertices.

  2. (Not supported) Find bridges in a DAG. A bridge is an edge whose removal increases the number of weakly connected components.

  3. Compute maxChains, which requires:

      1. (Not supported) Enumerating all paths between two vertices.
      2. Keeping only chains. Apart from the head and tail, every operator in a chain must have exactly one input edge and one output edge.
      3. Finding the dominant chains that cover sub-chains.
  4. (Supported) Find weakly connected components.
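
To illustrate items 2 and 4, here is a minimal sketch in plain Scala collections. It is not the code from this PR and does not use scala-graph's API. `Edge`, `BridgeSketch`, and the string vertex ids are hypothetical names chosen for the example. It applies the bridge definition literally (remove each edge and recount the weakly connected components), so it is a brute-force approach rather than an efficient one.

```scala
// Hypothetical edge type: `id` distinguishes parallel edges between the same pair of vertices.
final case class Edge(id: Int, from: String, to: String)

object BridgeSketch {
  // Weakly connected components: treat every edge as undirected and flood-fill.
  def weakComponents(vertices: Set[String], edges: Seq[Edge]): Set[Set[String]] = {
    val neighbors: Map[String, Set[String]] =
      edges
        .flatMap(e => Seq(e.from -> e.to, e.to -> e.from))
        .groupBy(_._1)
        .map { case (v, pairs) => v -> pairs.map(_._2).toSet }

    @annotation.tailrec
    def flood(frontier: List[String], seen: Set[String]): Set[String] = frontier match {
      case Nil => seen
      case v :: rest =>
        val fresh = neighbors.getOrElse(v, Set.empty[String]) -- seen
        flood(fresh.toList ::: rest, seen ++ fresh)
    }

    vertices.foldLeft(Set.empty[Set[String]]) { (components, v) =>
      if (components.exists(_.contains(v))) components
      else components + flood(List(v), Set(v))
    }
  }

  // An edge is a bridge if removing it increases the number of weak components.
  // Brute force: one component count per edge.
  def bridges(vertices: Set[String], edges: Seq[Edge]): Seq[Edge] = {
    val baseline = weakComponents(vertices, edges).size
    edges.filter(e => weakComponents(vertices, edges.filterNot(_ == e)).size > baseline)
  }
}
```

Because parallel edges carry distinct ids, removing one of two edges between the same vertices leaves the other in place, so neither is reported as a bridge.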

I re-implemented the unsupported parts myself.
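
For reference, the maxChains steps listed above could be sketched roughly as follows. This is an illustrative sketch in plain Scala, not the implementation in this PR. `SketchLink`, `ChainSketch`, and the string operator ids are hypothetical, and reading "dominant" as "not a contiguous sub-sequence of another chain" is an assumption.

```scala
// Hypothetical link type; `id` keeps parallel links between the same operators distinct.
final case class SketchLink(id: Int, from: String, to: String)

object ChainSketch {
  // Enumerate every path (as a list of links) from `src` to `dst` by depth-first search.
  // Terminates only because the input is assumed to be acyclic.
  def allPaths(links: Seq[SketchLink], src: String, dst: String): List[List[SketchLink]] =
    if (src == dst) List(Nil)
    else
      links.toList
        .filter(_.from == src)
        .flatMap(l => allPaths(links, l.to, dst).map(l :: _))

  // A path is a chain when every interior operator has exactly one input link
  // and one output link in the whole graph.
  def isChain(links: Seq[SketchLink], path: List[SketchLink]): Boolean =
    path.drop(1).map(_.from).forall { op =>
      links.count(_.to == op) == 1 && links.count(_.from == op) == 1
    }

  // Assumption: a chain is "dominant" if it is not a contiguous sub-sequence of another chain.
  def maximalChains(chains: Seq[List[SketchLink]]): Seq[List[SketchLink]] =
    chains.filterNot(c => chains.exists(other => other != c && other.containsSlice(c)))
}
```

The number of paths between two vertices can grow exponentially with the size of the DAG, so this kind of enumeration is only practical for small plans.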

@shengquan-ni shengquan-ni self-assigned this Mar 3, 2025
@shengquan-ni shengquan-ni added the dependencies and refactoring labels Mar 3, 2025
Collaborator

@Xiao-zhen-Liu Xiao-zhen-Liu left a comment

Thanks for finding a new library and implementing those graph algorithms. However, some serious issues remain. Please see the inline comments.

import edu.uci.ics.amber.core.workflow.PhysicalLink
import org.jgrapht.alg.connectivity.BiconnectivityInspector
import edu.uci.ics.amber.core.workflow.{PhysicalLink, PhysicalPlan, WorkflowContext}
import edu.uci.ics.amber.engine.common.{AmberConfig, AmberLogging}
import org.jgrapht.graph.{DirectedAcyclicGraph, DirectedPseudograph}
Collaborator

JGraphT is still used in many other places in the amber module. Please do a thorough check.

operatorList: List[LogicalOp],
links: List[LogicalLink]
): DirectedAcyclicGraph[OperatorIdentity, LogicalLink] = {
Collaborator

@Xiao-zhen-Liu Xiao-zhen-Liu Mar 4, 2025

Does the Graph class ensure that the plan is a DAG? With JGraphT, an error is thrown if the plan is not a DAG. If Graph does not enforce acyclicity, do we implement that check ourselves? Acyclicity is crucial for scheduling (RegionDAG), which is still implemented as a JGraphT DAG.
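
If the check has to be done by hand, one library-independent option is Kahn's algorithm. The sketch below uses plain Scala collections. `AcyclicitySketch`, `isAcyclic`, and `requireDag` are hypothetical names, and nothing here is scala-graph's or JGraphT's API.

```scala
object AcyclicitySketch {
  // Kahn's algorithm: repeatedly remove vertices that have no incoming edges.
  // The graph is acyclic iff every vertex gets removed.
  // Edges are (from, to) pairs; parallel edges simply appear more than once.
  def isAcyclic(vertices: Set[String], edges: Seq[(String, String)]): Boolean = {
    // Include edge endpoints in case the caller forgot to list them as vertices.
    val all: Set[String] = vertices ++ edges.flatMap(e => Seq(e._1, e._2))
    val inDegree0: Map[String, Int] =
      all.map(_ -> 0).toMap ++ edges.groupBy(_._2).map { case (v, es) => v -> es.size }

    @annotation.tailrec
    def loop(ready: List[String], inDegree: Map[String, Int], removed: Int): Boolean =
      ready match {
        case Nil => removed == all.size
        case v :: rest =>
          val targets = edges.filter(_._1 == v).map(_._2)
          val updated = targets.foldLeft(inDegree)((m, t) => m.updated(t, m(t) - 1))
          val newlyReady = targets.distinct.filter(updated(_) == 0).toList
          loop(newlyReady ::: rest, updated, removed + 1)
      }

    loop(inDegree0.collect { case (v, 0) => v }.toList, inDegree0, 0)
  }

  // Fail fast when the plan contains a cycle.
  def requireDag(vertices: Set[String], edges: Seq[(String, String)]): Unit =
    require(isAcyclic(vertices, edges), "plan contains a cycle")
}
```

Calling `requireDag` once after building the plan would reject cyclic input up front, at the cost of one linear pass over the vertices and edges.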

@transient private lazy val operatorMap: Map[PhysicalOpIdentity, PhysicalOp] =
operators.map(o => (o.id, o)).toMap

// the dag will be re-computed again once it reaches the coordinator.
@transient lazy val dag: DirectedAcyclicGraph[PhysicalOpIdentity, PhysicalLink] = {
Collaborator

Ditto

@shengquan-ni
Collaborator Author

shengquan-ni commented Mar 4, 2025

Since JGraphT is dual-licensed, with EPL as one of the options, and EPL is ok-ish for an Apache project, we decided to keep JGraphT.
Two reasons:

  1. JGraphT has better support for DAGs and related graph algorithms.
  2. The migration is far more work than I expected.

Labels
dependencies Pull requests that update a dependency file refactoring Refactor the code
2 participants