Merge pull request #90 from inzva/muratbiberoglu-patch

related topics to ap-24-25-fall week 5 migrated
inzva · Nov 14, 2024 · dbc10fc · dbc10fc
2 parents ff2b597 + 5651f75
commit dbc10fc

Showing 12 changed files with 376 additions and 19 deletions.
diff --git a/docs/data-structures/index.md b/docs/data-structures/index.md
@@ -29,6 +29,10 @@ Bilgisayar biliminde veri yapıları, belirli bir eleman kümesi üzerinde verim
 ### [Sparse Table](sparse-table.md)
 ### [SQRT Decomposition](sqrt-decomposition.md)
 
+## Common Problems
+
+### [LCA](lowest-common-ancestor.md)
+
 ## Example Problems
 
 Recommended problems for practicing data structures:

diff --git a/docs/data-structures/lowest-common-ancestor.md b/docs/data-structures/lowest-common-ancestor.md
@@ -0,0 +1,32 @@
+---
+title: Lowest Common Ancestors
+tags:
+    - Tree
+    - LCA
+    - Lowest Common Ancestors
+    - Binary Lifting
+---
+
+This problem consists of queries LCA(x, y), each asking for the common ancestor of x and y whose depth is maximum. We will use an algorithm similar to the jump pointer algorithm, together with an implementation.
+
+## Initialization
+
+As we did in the Jump Pointer Method, we will calculate every $2^i$-th ancestor of each node, if it exists. L[x][y] corresponds to x's $2^y$-th ancestor. Hence L[x][0] is simply the parent of x.
+
+```cpp
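+// Assumes n, logN, parent[] and L[][] are globals defined elsewhere,
+// with parent[root] = 0 and row L[0][*] left at 0 as a "no ancestor" sentinel.
+void init() {
+    // The 2^0-th ancestor of x is its parent.
+    for(int x=1 ; x<=n ; x++)
+        L[x][0] = parent[x];
+
+    // The 2^y-th ancestor is the 2^(y-1)-th ancestor of the 2^(y-1)-th ancestor.
+    for(int y=1 ; y<=logN ; y++)
+        for(int x=1 ; x<=n ; x++)
+            L[x][y] = L[L[x][y-1]][y-1];
+}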
+```
+Note that we have used the fact that x's $2^y$-th ancestor is the $2^{y-1}$-th ancestor of x's $2^{y-1}$-th ancestor.
+
+## Queries-Binary Lifting
+
+Given a query LCA(x, y), we calculate the answer as follows:
+
+First, make sure that x and y are at the same depth. If they are not, lift the deeper node up to the depth of the other one. Then check whether x and y are equal. If they are equal, the lowest common ancestor is x. Otherwise, starting from i = log(N) and going down to i = 0, check whether x's $2^i$-th ancestor is equal to y's $2^i$-th ancestor. If they are not equal, the LCA is somewhere above the $2^i$-th ancestors of x and y, so we replace x and y with these ancestors and continue, since LCA(L[x][i], L[y][i]) is the same as LCA(x, y). If they are equal, that node is a common ancestor but not necessarily the lowest one, so we do not jump and move on to the next smaller i. Notice that after step i, the depth difference between the LCA and both x and y is no larger than $2^i$. Once we have applied this procedure down to i = 0, we are left with x and y such that the parent of x is the LCA. Of course, the parent of y is also the LCA.
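+
+The query procedure described above could be sketched as follows. This is a minimal illustration, not the original implementation: the `LcaTable` struct, its `lca` method, and the convention that `parent[root] = 0` with nodes numbered from 1 are all assumptions made for this example.
+
+```cpp
+#include <algorithm>
+#include <vector>
+
+// Hypothetical helper: builds the table L and answers LCA queries.
+// parent[x] is the parent of node x for x = 1..n; parent[root] = 0 (sentinel).
+struct LcaTable {
+    int n, logN;
+    std::vector<int> depth;            // depth[root] = 1, depth[0] = 0
+    std::vector<std::vector<int>> L;   // L[x][y] = 2^y-th ancestor of x, 0 if none
+
+    explicit LcaTable(const std::vector<int>& parent)
+        : n((int)parent.size() - 1), logN(1), depth(parent.size(), 0) {
+        while ((1 << logN) < n) logN++;
+        L.assign(n + 1, std::vector<int>(logN + 1, 0));
+        std::vector<std::vector<int>> children(n + 1);
+        for (int x = 1; x <= n; x++) {
+            L[x][0] = parent[x];
+            children[parent[x]].push_back(x);
+        }
+        for (int y = 1; y <= logN; y++)
+            for (int x = 1; x <= n; x++)
+                L[x][y] = L[L[x][y - 1]][y - 1];
+        // Breadth-first pass from the sentinel 0 to fill in the depths.
+        std::vector<int> order = {0};
+        for (std::size_t i = 0; i < order.size(); i++)
+            for (int c : children[order[i]]) {
+                depth[c] = depth[order[i]] + 1;
+                order.push_back(c);
+            }
+    }
+
+    int lca(int x, int y) const {
+        if (depth[x] < depth[y]) std::swap(x, y);
+        // Lift the deeper node x up to the depth of y.
+        for (int i = logN; i >= 0; i--)
+            if (depth[x] - (1 << i) >= depth[y]) x = L[x][i];
+        if (x == y) return x;
+        // Jump both nodes while their 2^i-th ancestors differ.
+        for (int i = logN; i >= 0; i--)
+            if (L[x][i] != L[y][i]) { x = L[x][i]; y = L[y][i]; }
+        return L[x][0];   // the parent of x (and of y) is the LCA
+    }
+};
+```
+
+Building the table takes $O(N \log N)$ time and each query takes $O(\log N)$.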
diff --git a/docs/graph/cycle-finding.md b/docs/graph/cycle-finding.md
@@ -0,0 +1,37 @@
+---
+title: Cycle Finding
+tags:
+    - Graph
+    - Cycle
+---
+
+**Cycle**: A sequence of nodes that returns to the starting node while visiting each node at most once and contains at least two nodes. 
+
+We can use DFS order to determine whether a graph has a cycle.
+
+If we find a back edge while traversing the graph, then the graph has a cycle, because a **back edge** connects a node to one of its ancestors in the DFS tree and thereby closes a cycle.
+
+The algorithm that we are going to use to detect a cycle in a directed graph:
+
+- Traverse the graph in DFS order.
+- When you come to a node, color it gray and start visiting its neighbors.
+- If one of the current node's neighbors is gray, then there is a cycle in the graph, because a gray node is an ancestor of the current node, and an edge to an ancestor is a back edge.
+- Once you're done visiting the neighbors, color the node black.
+
+```cpp
+bool dfs(int node){
+    // The color array holds the color of each node.
+    // 0 represents white, 1 represents gray, and 2 represents black.
+    color[node] = 1;
+    for(int i = 0; i < g[node].size(); i++){
+        int child = g[node][i];
+        if(color[child] == 1)
+            return true;
+        if(!color[child])
+            if(dfs(child))
+                return true;
+    }
+    color[node] = 2;
+    return false;
+}
+```
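+
+A single call to `dfs` only explores the nodes reachable from its starting node, so a full check has to start a search from every node that is still white. A minimal self-contained sketch, assuming nodes are numbered from 0 and the graph is passed as adjacency lists; the `has_cycle` name and the explicit parameters are choices made for this example:
+
+```cpp
+#include <vector>
+
+// Same coloring scheme as above: 0 = white, 1 = gray, 2 = black.
+bool dfs(int node, const std::vector<std::vector<int>>& g, std::vector<int>& color) {
+    color[node] = 1;
+    for (int child : g[node]) {
+        if (color[child] == 1) return true;   // back edge to a gray ancestor
+        if (color[child] == 0 && dfs(child, g, color)) return true;
+    }
+    color[node] = 2;
+    return false;
+}
+
+// Returns true if the directed graph g contains a cycle.
+bool has_cycle(const std::vector<std::vector<int>>& g) {
+    std::vector<int> color(g.size(), 0);
+    for (int start = 0; start < (int)g.size(); start++)
+        if (color[start] == 0 && dfs(start, g, color)) return true;
+    return false;
+}
+```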
diff --git a/docs/graph/definitions.md b/docs/graph/definitions.md
@@ -0,0 +1,34 @@
+---
+title: Graph Definitions
+tags:
+    - Graph
+---
+
+## Definitions of Common Terms
+
+- **Node** - An individual data element of a graph. A node is also called a vertex.
+- **Edge** - A connecting link between two nodes, represented as e = {a,b}. An edge is also called an arc.
+- **Adjacent** - Two vertices are adjacent if they are connected by an edge.
+- **Degree** - The degree of a node is the number of edges incident to it.
+- **Undirected Graphs** - Undirected graphs have edges that do not have a direction. The edges indicate a two-way relationship, in that each edge can be traversed in both directions.
+- **Directed Graphs** - Directed graphs have edges with direction. The edges indicate a one-way relationship, in that each edge can only be traversed in a single direction.
+- **Weighted Edges** - If each edge of a graph is associated with a real number, that number is called the weight of the edge.
+- **Self-Loop** - An edge whose source and destination are the same node.
+- **Multi-Edge** - Two or more edges that connect the same pair of nodes.
+
+## Walks, Trails, Paths, Cycles and Circuits
+
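+- **Walk** - An alternating sequence of nodes and edges in which each edge connects the nodes before and after it.
+- **Trail** - A walk in which no edge is visited more than once.
+- **Circuit** - A trail that has the same node at the start and end.
+- **Path** - A walk in which no node is visited more than once.
+- **Cycle** - A circuit in which no node is visited more than once, apart from the start node, which is also the end node.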
+
+## Special Graphs
+
+- **Complete Graph** - A graph having at least one edge between every two nodes.
+- **Connected Graph** - A graph with paths between every pair of nodes.
+- **Tree** - An undirected graph in which any two nodes are connected by exactly one path. The following definitions are equivalent:
+    - An undirected graph that is connected and has no cycles.
+    - An undirected graph that is acyclic, and in which a simple cycle is formed if any edge is added.
+    - An undirected graph that is connected, and that becomes disconnected if any edge is removed.
+    - An undirected graph that is connected and has (number of nodes - 1) edges.
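+
+The last characterization gives a simple test. A minimal sketch, assuming nodes are numbered from 0 and the graph is given as an edge list; the `is_tree` name and its signature are hypothetical, and an empty graph is treated as not being a tree by convention:
+
+```cpp
+#include <utility>
+#include <vector>
+
+// Checks "connected and has (number of nodes - 1) edges".
+bool is_tree(int n, const std::vector<std::pair<int, int>>& edges) {
+    if (n == 0 || (int)edges.size() != n - 1) return false;
+    std::vector<std::vector<int>> adj(n);
+    for (const auto& e : edges) {
+        adj[e.first].push_back(e.second);
+        adj[e.second].push_back(e.first);
+    }
+    // Iterative DFS from node 0, counting the nodes that are reached.
+    std::vector<bool> seen(n, false);
+    std::vector<int> stack = {0};
+    seen[0] = true;
+    int visited = 0;
+    while (!stack.empty()) {
+        int u = stack.back();
+        stack.pop_back();
+        visited++;
+        for (int v : adj[u])
+            if (!seen[v]) { seen[v] = true; stack.push_back(v); }
+    }
+    return visited == n;   // connected iff every node was reached
+}
+```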
diff --git a/docs/graph/img/kruskal.jpg b/docs/graph/img/kruskal.jpg
diff --git a/docs/graph/img/mst.png b/docs/graph/img/mst.png
diff --git a/docs/graph/img/prim.png b/docs/graph/img/prim.png
diff --git a/docs/graph/index.md b/docs/graph/index.md
@@ -6,28 +6,15 @@ title: Graph
 
 **Reviewers:** Yasin Kaya
 
-## Introduction
-
-A graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense "related". The objects correspond to the mathematical abstractions called vertices (also called nodes or points) and each of the related pairs of vertices is called an edge. Typically, a graph is depicted in diagrammatic form as a set of dots for the vertices, joined by lines for the edges. [8]
-
-Why graphs? Graphs are usually used to represent different elements that are somehow related to each other.
-
-A Graph consists of a finite set of vertices(or nodes) and set of edges which connect a pair of nodes. G = (V,E)
-
-V = set of nodes
-
-E = set of edges(e) represented as e = a,b
-
-Graph are used to show a relation between objects. So, some graphs may have directional edges (e.g. people and their love relationships that are not mutual: Alice may love Alex, while Alex is not in love with her and so on), and some graphs may have weighted edges (e.g. people and their relationship in the instance of a debt)
-
-<figure markdown="span">
-![Directed Acyclic Graph](img/directed_acyclic_graph.png)
-<figcaption>Figure 1: a simple unweigted graph</figcaption>
-</figure>
-
+### [Introduction](introduction.md)
+### [Definitions](definitions.md)
+### [Representing Graphs](representing-graphs.md)
 ### [Depth First Search](depth-first-search.md)
 ### [Breadth First Search](breadth-first-search.md)
+### [Cycle Finding](cycle-finding.md)
+### [Union Find](union-find.md)
 ### [Shortest Path](shortest-path.md)
+### [Minimum Spanning Tree](minimum-spanning-tree.md)
 ### [Topological Sort](topological-sort.md)
 
 ## References

diff --git a/docs/graph/introduction.md b/docs/graph/introduction.md
@@ -0,0 +1,22 @@
+---
+title: Introduction
+tags:
+    - Graph
+---
+
+A graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense "related". The objects correspond to the mathematical abstractions called vertices (also called nodes or points) and each of the related pairs of vertices is called an edge. Typically, a graph is depicted in diagrammatic form as a set of dots for the vertices, joined by lines for the edges. [8]
+
+Why graphs? Graphs are usually used to represent different elements that are somehow related to each other.
+
+A graph consists of a finite set of vertices (or nodes) and a set of edges, each of which connects a pair of nodes: G = (V,E)
+
+V = set of nodes
+
+E = set of edges, where each edge e is represented as e = {a,b}
+
+Graphs are used to show relations between objects. So, some graphs may have directed edges (e.g. people and their love relationships that are not mutual: Alice may love Alex, while Alex is not in love with her), and some graphs may have weighted edges (e.g. people and the debts between them).
+
+<figure markdown="span">
+![Directed Acyclic Graph](img/directed_acyclic_graph.png)
+<figcaption>Figure 1: a simple unweighted graph</figcaption>
+</figure>
diff --git a/docs/graph/minimum-spanning-tree.md b/docs/graph/minimum-spanning-tree.md
@@ -0,0 +1,104 @@
+---
+title: Minimum Spanning Tree
+tags:
+    - Graph
+    - Minimum Spanning Tree
+    - Prim
+    - Kruskal
+---
+
+## Definition
+
+Given an undirected, weighted, connected graph $G = (V,E)$, a spanning tree of $G$ is a connected acyclic subgraph that contains all nodes and some of the edges. In a disconnected graph, where there is more than one connected component, the spanning tree is defined as the forest made up of the spanning trees of the connected components.
+
+A minimum spanning tree (MST) is a spanning tree in which the sum of edge weights is minimum. The MST of a graph is not unique in general: there might be more than one spanning tree with the same minimum cost. For example, in a graph where all edges have the same weight, every spanning tree is a minimum spanning tree. Problems that ask you to output the tree itself (and not just the minimum cost) either add constraints so that the answer is unique, or simply accept any minimum spanning tree.
+
+<figure markdown="span">
+![Minimum Spanning Tree](img/mst.png)
+<figcaption>MST of the graph. It spans all nodes of the graph and it is connected.</figcaption>
+</figure>
+
+To find the minimum spanning tree of a graph, we will introduce two algorithms. The first one is Prim's algorithm, which is similar to Dijkstra's algorithm. The other is Kruskal's algorithm, which makes use of the disjoint set data structure. Let's discover each one of them in detail!
+
+## Prim Algorithm
+
+Prim's algorithm is very similar to Dijkstra's shortest path algorithm. In this algorithm we have a set $S$ which represents the explored nodes, and we can again maintain a priority queue to retrieve the node in $V-S$ that is closest to $S$. It is a greedy algorithm, just like Dijkstra's shortest path algorithm.
+
+<figure markdown="span" style="text-align: left">
+```
+G = (V, E)   V set of all nodes, E set of all edges
+T = {}       result, edges of MST
+S = {1}      explored nodes
+while S /= V do
+    let (u, v) be the lowest cost edge such that u in S and v in V - S;
+    T = T U {(u, v)}
+    S = S U {v}
+end
+```
+<figcaption>Prim Algorithm in Pseudo code, what is the problem here?</figcaption>
+</figure>
+
+There is a problem with this implementation: it assumes that the graph is connected. If the graph is not connected, the loop never terminates, because no edge leaves $S$ any more and $S$ can never become equal to $V$. There is a good visualization of Prim's algorithm at [10]. If we use a priority queue, the complexity is $O(E \log V)$.
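+
+One way to avoid the termination problem is to stop when the priority queue runs empty instead of waiting for $S = V$. A minimal sketch, assuming nodes are numbered from 0 and `adj[u]` stores (weight, neighbor) pairs; the name `prim_mst_weight` and this input layout are assumptions made for illustration:
+
+```cpp
+#include <functional>
+#include <queue>
+#include <utility>
+#include <vector>
+
+// Returns the total weight of the MST of the component that contains `start`.
+long long prim_mst_weight(const std::vector<std::vector<std::pair<int, int>>>& adj,
+                          int start) {
+    typedef std::pair<int, int> Item;   // (edge weight, node)
+    std::vector<bool> inS(adj.size(), false);
+    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
+    pq.push(Item(0, start));
+    long long total = 0;
+    while (!pq.empty()) {               // ends even if the graph is disconnected
+        Item top = pq.top();
+        pq.pop();
+        int u = top.second;
+        if (inS[u]) continue;           // stale entry: u was already explored
+        inS[u] = true;
+        total += top.first;
+        for (const Item& edge : adj[u])
+            if (!inS[edge.second]) pq.push(edge);
+    }
+    return total;
+}
+```
+
+Every edge is pushed at most twice, which gives the $O(E \log V)$ bound mentioned above.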
+
+<figure markdown="span">
+![Prim's Algorithm](img/prim.png)
+<figcaption>Example of how Prim Algorithm constructs the MST</figcaption>
+</figure>
+
+## Kruskal Algorithm
+
+In Prim's algorithm we started with a specific node and then repeatedly chose the node closest to the tree built so far. In Kruskal's algorithm we follow a different strategy: we build our MST by choosing one edge at a time, linking our (initially separated) nodes together until the whole graph is connected.
+
+To achieve this, we start with every node in its own group. In addition, we keep the list of edges of the original graph sorted by cost. At each step, we will:
+
+1. Pick the smallest available edge (that is not taken yet)
+2. Link the nodes it connects together, by merging their group into one unified group
+3. Add the cost of the edge to our answer
+
+However, in some cases the edge we pick connects two nodes that are already in the same group (because they were joined earlier by other edges). Adding it would violate the acyclic condition of a spanning tree and, more importantly, would introduce an unnecessary edge that adds cost to the answer. To solve this problem, we only add an edge if, at the time it is processed, it connects two nodes that belong to different groups. This completes the algorithm.
+
+The optimality of Kruskal's algorithm comes from the fact that we take edges from a list sorted by cost. For a more rigorous proof please refer to [11].
+
+So how can we efficiently merge groups of nodes and check which group each node belongs to? We can use the disjoint set data structure, which performs union and find operations in nearly constant amortized time: $\mathcal{O}(\alpha(V))$ with path compression and union by size, where $\alpha$ is the inverse Ackermann function, which is effectively $\mathcal{O}(1)$ in practice.
+
+```cpp
+#include <algorithm>
+#include <utility>
+#include <vector>
+using namespace std;
+
+typedef pair<int,pair<int,int>> edge;
+// represent an edge as a triplet (w,u,v)
+// w is the weight, u and v are the vertices.
+// edge.first is the weight, edge.second.first -> u, edge.second.second -> v
+typedef vector<edge> weighted_graph;
+
+/* union-find data structure utilities (declared here, defined elsewhere) */
+const int maxN = 3005;
+int parent[maxN];
+int ssize[maxN];
+void make_set(int v);
+int find_set(int v);
+void union_sets(int a, int b);
+void init_union_find();
+
+/* Finds and returns the edges of an MST */
+vector<edge> kruskal(vector<edge> &edgeList ){
+    vector<edge> mst;
+    init_union_find();
+    // The lambda sorts the edges by their first element, the weight.
+    sort(edgeList.begin(), edgeList.end(),
+        [](const edge &a, const edge &b) { return a.first < b.first; });
+    for( const auto &e: edgeList){
+        // Take the edge only if its endpoints are in different groups.
+        if( find_set(e.second.first )!= find_set(e.second.second)){
+            mst.push_back(e);
+            union_sets(e.second.first, e.second.second);
+        }
+    }
+    return mst;
+}
+```
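+
+The listing above only declares the union-find helpers. For a version that can be compiled on its own, here is a minimal sketch with a small disjoint set included. The `DisjointSet` and `Edge` structs and the `kruskal_weight` function are illustrative names, nodes are assumed to be numbered from 0, and the function returns only the total MST cost:
+
+```cpp
+#include <algorithm>
+#include <numeric>
+#include <utility>
+#include <vector>
+
+// Disjoint set with path compression and union by size.
+struct DisjointSet {
+    std::vector<int> parent, size;
+    explicit DisjointSet(int n) : parent(n), size(n, 1) {
+        std::iota(parent.begin(), parent.end(), 0);   // every node is its own group
+    }
+    int find(int v) { return parent[v] == v ? v : parent[v] = find(parent[v]); }
+    // Merges the groups of a and b; returns false if they were already merged.
+    bool unite(int a, int b) {
+        a = find(a);
+        b = find(b);
+        if (a == b) return false;
+        if (size[a] < size[b]) std::swap(a, b);
+        parent[b] = a;
+        size[a] += size[b];
+        return true;
+    }
+};
+
+struct Edge { int w, u, v; };   // weight and the two endpoints
+
+// Returns the total weight of a minimum spanning forest of the graph.
+long long kruskal_weight(int n, std::vector<Edge> edges) {
+    std::sort(edges.begin(), edges.end(),
+              [](const Edge& a, const Edge& b) { return a.w < b.w; });
+    DisjointSet ds(n);
+    long long total = 0;
+    for (const Edge& e : edges)
+        if (ds.unite(e.u, e.v)) total += e.w;   // skip edges that would close a cycle
+    return total;
+}
+```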
+
+To calculate the time complexity, observe that we first sort the edges, which takes $\mathcal{O}(E \log E)$. Then we pass through the edges one by one, and each time we check which groups the two endpoints of the edge belong to, and in some cases merge the two groups. In the worst case both operations (finding and merging) happen for every edge, but since the disjoint set data structure guarantees effectively $\mathcal{O}(1)$ amortized time for both operations, we end up with $\mathcal{O}(E)$ amortized time for processing the edges.
+
+So in total we have $\mathcal{O}(E \log E)$ from sorting the edges and $\mathcal{O}(E)$ from processing them, which results in a total of $\mathcal{O}(E \log E)$ (if you don't understand why, please refer to the first bundle, where we discuss time complexity).
+
+<figure markdown="span">
+![Kruskal's Algorithm](img/kruskal.jpg)
+<figcaption>Example of how Kruskal Algorithm constructs the MST</figcaption>
+</figure>
diff --git a/docs/graph/representing-graphs.md b/docs/graph/representing-graphs.md
@@ -0,0 +1,89 @@
+---
+title: Representing Graphs
+tags:
+    - Graph
+---
+
+## Edge Lists
+
+An edge list is simply a list of pairs. Each entry holds the vertex numbers of the two endpoints of an edge, plus any other attributes such as the weight or direction of the edge. [16]
+
+- **\+** Some algorithms need to iterate over all the edges (e.g. Kruskal's algorithm).
+- **\+** All edges are stored exactly once.
+- **\-** It is hard to determine whether two nodes are connected or not.
+- **\-** It is hard to get information about the edges of a specific vertex.
+
+```cpp
+#include <iostream>
+#include <vector>
+using namespace std;
+
+int main(){
+    int edge_number;
+    vector<pair <int,int> > edges;
+    cin >> edge_number;
+    for( int i=0 ; i<edge_number ; i++ ){
+        int a,b;
+        cin >> a >> b;
+        edges.push_back(make_pair(a,b)); // a struct can be used if edges are weighted or have other properties.
+    }
+}
+```
+
+## Adjacency Matrices
+
+Stores the edges in a 2-D matrix. matrix[a][b] holds information about the road from a to b. [16]
+
+- **\+** We can easily check if there is a road between two vertices.
+- **\-** Looping through all edges of a specific node is expensive because you have to check all of the empty cells too. These empty cells also take up a huge amount of memory in a sparse graph with many vertices (for example, when representing a tree).
+
+```cpp
+#include <iostream>
+#include <vector>
+using namespace std;
+int main(){
+    int node_number;
+    vector<vector<int> > Matrix;
+    cin >> node_number;
+    for( int i=0 ; i<node_number ; i++ ){
+        // Create row i once, then read its node_number cells.
+        Matrix.push_back(vector <int> ());
+        for( int j=0 ; j<node_number ; j++ ){
+            int weight;
+            cin >>weight ;
+            Matrix[i].push_back(weight);
+        }
+    }
+}
+```
+
+## Adjacency List
+
+Each node has a list of the nodes it is adjacent to, so there are no empty cells. Memory usage is proportional to the number of nodes plus the number of edges. This is the representation most commonly used in algorithms. [16]
+
+- **\+** You do not have to use space for empty cells.
+- **\+** Easily iterate over all the neighbors of a specific node.
+- **\-** To check whether two nodes are connected, you still need to iterate over all the neighbors of one of them. However, some structures support this operation in O(log N). For example, if no more edges will be added, you can sort each node's neighbor list and then look up a neighbor with binary search.
+
+```cpp
+#include <iostream>
+#include <vector>
+using namespace std;
+
+int main(){
+    int node_number,path_number;
+
+    vector<vector<int> > paths;
+    // use an object instead of int
+    // if you need to store other features
+
+    cin >> node_number >> path_number;
+    // one empty neighbor list per node (nodes are assumed to be numbered from 0)
+    for( int i=0 ; i<node_number ; i++ )
+        paths.push_back(vector <int> ());
+    for( int j=0 ; j< path_number ; j++ ){
+        int beginning_node, end_node;
+        cin >> beginning_node >> end_node;
+
+        paths[ beginning_node ].push_back( end_node ); // directed edge beginning_node -> end_node
+        // paths[ end_node ].push_back(  beginning_node );
+        // ^^^ If edges are undirected, you should push in the reverse direction too
+    }
+}
+```
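+
+The binary search idea from the list above can be sketched as follows, assuming the adjacency list is complete before any lookup is made. The helper names `sort_neighbors` and `has_edge` are hypothetical and not part of the original text:
+
+```cpp
+#include <algorithm>
+#include <vector>
+
+// Sorts every neighbor list once, after all edges have been added.
+void sort_neighbors(std::vector<std::vector<int>>& adj) {
+    for (auto& neighbors : adj)
+        std::sort(neighbors.begin(), neighbors.end());
+}
+
+// Checks for the edge a -> b in O(log N); requires sorted neighbor lists.
+bool has_edge(const std::vector<std::vector<int>>& adj, int a, int b) {
+    return std::binary_search(adj[a].begin(), adj[a].end(), b);
+}
+```
+
+If edges are added later, the affected lists have to be sorted again before the next lookup.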