cpinitiative · crafticat · May 15, 2024 · May 15, 2024 · May 16, 2024 · Jun 19, 2024
@@ -93,11 +93,11 @@
     {
       "uniqueId": "cf-840D",
       "name": "Destiny",
-      "url": "https://codeforces.com/problemset/problem/840/D",
+      "url": "https://codeforces.com/contest/840/problem/D",
       "source": "CF",
       "difficulty": "Hard",
       "isStarred": false,
-      "tags": [],
+      "tags": ["Wavelet"],
       "solutionMetadata": {
         "kind": "autogen-label-from-site",
         "site": "CF"

@@ -1,19 +1,17 @@
 ---
 id: wavelet
 title: 'Wavelet Tree'
-author: Benjamin Qi
+author: Benjamin Qi, Omri Jerbi
 prerequisites:
   - RURQ
-description: "?"
+description: "Wavelet trees support efficient queries for the k-th minimum element in a range."
 frequency: 0
 ---
 
-## Wavelet Tree
+# Wavelet Tree
+Wavelet trees are data structures that support efficient queries for the k-th minimum element in a range by maintaining a segment tree over values instead of indices.
 
 <FocusProblem problem="waveletSam" />
-
-Like a segment tree on values rather than indices.
-
 <Resources>
 	<Resource
 		source = "IOI"
@@ -32,9 +30,235 @@ Like a segment tree on values rather than indices.
 	</Resource>
 </Resources>
 
-### Solution - Range K-th Smallest
+Suppose you want to support the following queries:
+
+- Given a range, count the number of occurrences of value $x$.
+- Given a range, find the $k$-th smallest element.
+
+With a wavelet tree, you can support each of these queries in $\mathcal{O}(\log M)$ time,
+where $M$ is the maximum value in the array.
+
+## Wavelet tree structure
+
+A wavelet tree is a binary tree where each node represents a range of values.
+The root node covers the entire range, and each subsequent level splits the
+range into two halves.
+
+We are going to maintain a segment tree over values instead of indices. Each segment contains the indices whose values lie within the segment's range, stored in a vector in increasing order. Notice that each index appears in only $\mathcal{O}(\log M)$ segments: one per level of the tree.
+
+
+<Info title="Motivation">
+  The motivation for maintaining a segment tree over values lies in the types of queries we want to answer. While traditional segment trees handle range-based queries like sums effectively, they fall short for value-based queries. By maintaining segments over values, we can efficiently address these value-specific queries.
+</Info>
+
+<Spoiler title="Wavelet tree Visualization">
+
+Let's say our array is $[3,5,3,1,2,2,3,4,5,5]$.
+Each node covering the value range from $l$ to $r$ stores the indices of every element whose value lies in that range.
+
+![Wavelet Tree Visualization](./assets/diagram.png)
+</Spoiler>
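
The per-node index lists can also be computed directly. Below is a minimal sketch, assuming half-open value ranges $[lo, hi)$ that are split at the midpoint; `NodeIndices` and `build_nodes` are hypothetical names, and the split convention may differ from the one used in the diagram.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using namespace std;

// Maps a value range [lo, hi) to the array indices stored in that node
using NodeIndices = map<pair<int, int>, vector<int>>;

// Hypothetical helper: for every non-empty node, records the indices
// (in increasing order) whose values fall in [lo, hi)
void build_nodes(const vector<int> &a, const vector<int> &indices, int lo, int hi,
                 NodeIndices &out) {
	assert(lo < hi);
	if (indices.empty()) { return; }
	out[{lo, hi}] = indices;
	if (hi - lo <= 1) { return; }  // leaf: a single value

	int mid = lo + (hi - lo) / 2;
	vector<int> left, right;
	// Iterating in order keeps both child lists sorted by index
	for (int i : indices) { (a[i] < mid ? left : right).push_back(i); }
	build_nodes(a, left, lo, mid, out);
	build_nodes(a, right, mid, hi, out);
}
```

For the array above with value range $[1, 6)$, the node for $[1, 3)$ would hold indices $3, 4, 5$ and the leaf for value $3$ would hold indices $0, 2, 6$.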
+
+## Solving the first type of query
+**Given a range $l$, $r$, count the number of occurrences of value $x$.**
+
+To calculate the number of occurrences from $l$ to $r$, we can use the following
+formula:
+
+$$
+\begin{aligned}
+\texttt{occurrences}(l, r) = \texttt{occurrences}(r) - \texttt{occurrences}(l)
+\end{aligned}
+$$
+
+Here, $\texttt{occurrences}(i)$ counts the occurrences among the first $i$ elements.
+This reduces the problem to counting the number of occurrences in a prefix.
+
+One way to solve the problem is to go to the leaf node of $x$
+and binary search for the number of indices less than $r$.
+However, let's explore a different approach that can also be extended to the
+second type of query.
+
+Instead of binary searching on the leaf, we update $r$ as we recurse down the
+tree.
+If we can determine the position of $r$ in the left and right
+children of a node, we can recurse down the tree and determine its position in the leaf node.
+
+To find the position of $r$ in a node's left and right children, we need to
+determine how many elements of the node precede position $r$ and have a value
+smaller than the middle value (mid).
+This can be done using a prefix sum.
+
+Let's define:
+- $c[i]$ as $1$ if the value at $index[i]$ is smaller than mid, otherwise $0$
+- $prefixB[i]$ as the prefix sum of $c[i]$
+
+Formally, where $a$ is the input array and $prefixB[0] = 0$:
+
+$$
+\begin{aligned}
+c[i] &= \begin{cases} 1 & \text{if } a[index[i]] < mid \\ 0 & \text{otherwise} \end{cases} \\
+prefixB[i] &= prefixB[i - 1] + c[i]
+\end{aligned}
+$$
+
+
+To update $r$ as we recurse down, we do the following:
+- If we recurse left, the new value of $r$ is $prefixB[r]$.
+- If we recurse right, the new value of $r$ is $r - prefixB[r]$.
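
The descent described above can be sketched as follows. This is a minimal sketch, not the implementation used later in this module: it assumes values in a small range $[lo, hi)$, a half-open index range $[l, r)$, and nodes that keep only their prefix array. `CountNode`, `count_prefix`, and `count` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <vector>

using namespace std;

// Hypothetical static node covering values in [lo, hi)
struct CountNode {
	int lo, hi, mid;
	vector<int> prefix_b;  // prefix_b[i] = how many of the first i values are < mid
	unique_ptr<CountNode> left, right;

	CountNode(int lo, int hi, const vector<int> &values)
	    : lo(lo), hi(hi), mid(lo + (hi - lo) / 2) {
		prefix_b.assign(values.size() + 1, 0);
		vector<int> small, large;
		for (size_t i = 0; i < values.size(); i++) {
			bool goes_left = values[i] < mid;
			prefix_b[i + 1] = prefix_b[i] + goes_left;
			(goes_left ? small : large).push_back(values[i]);
		}
		// Only non-empty internal nodes get children
		if (hi - lo > 1 && !values.empty()) {
			left = make_unique<CountNode>(lo, mid, small);
			right = make_unique<CountNode>(mid, hi, large);
		}
	}

	// Occurrences of x among the first r elements of this node's sequence
	int count_prefix(int r, int x) const {
		if (r == 0 || x < lo || x >= hi) { return 0; }
		if (hi - lo <= 1) { return r; }  // leaf: every element here equals x
		if (x < mid) { return left->count_prefix(prefix_b[r], x); }
		return right->count_prefix(r - prefix_b[r], x);
	}

	// Occurrences of x in the half-open index range [l, r)
	int count(int l, int r, int x) const {
		assert(0 <= l && l <= r && r < (int)prefix_b.size());
		return count_prefix(r, x) - count_prefix(l, x);
	}
};
```

At a leaf, the updated $r$ is itself the answer for the prefix, which is why no binary search is needed.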
+
+## Solving the second type of query
+**Given a range $l$, $r$, find the $k$-th smallest element.**
+
+We will determine whether the answer for a given node is in the left or the
+right segment.
+We can calculate how many times the elements within the segments' ranges appear
+in our range $(l, r)$ using our first type of query.
+Note that this also works for non-leaf nodes using the following formula:
+
+$$
+\texttt{occurrences}(l, r) = r - l
+$$
+<Info title="Similar">
+This is similar to counting how many times a value appears up to index $r$ in our previous query. There, we used the updated value of $r$ at the leaf node. Now, we consider the difference between the updated $r$ and the updated $l$.
+</Info>
+
+Therefore, the number of occurrences in the left child is
+
+$$
+\texttt{left\_occurrences} = prefixB[r] - prefixB[l]
+$$
+
+<Info title="Left occurrences">
+  Note that $\texttt{left\_occurrences}$ is the number of indices between $l$ and $r$ whose value is less than mid.
+</Info>
+
+- If $\texttt{left\_occurrences}$ is greater than or equal to $k$, the $k$-th smallest element is in
+  the left subtree. Therefore, we update our range and recurse into the left
+  child.
+- If $\texttt{left\_occurrences}$ is less than $k$, the
+  $k$-th smallest element is in the right subtree. We subtract
+  $\texttt{left\_occurrences}$ from $k$, update our range, and recurse into the right child.
+
+<Info title="Notice">
+  Notice we still update $l, r$ accordingly when we go left or right
+</Info>
+
+The answer is then the value of the leaf node we end up on.
+
+## Implementation
+**Time Complexity:** $\mathcal{O}(Q \cdot \log(M))$
+
+<LanguageSection>
+<CPPSection>
+
+```cpp
+#include <bits/stdc++.h>
+
+using namespace std;
+
+struct Segment {
+	Segment *left = nullptr, *right = nullptr;
+	int l, r, mid;
+	bool children = false;
+	vector<pair<int, int>> indices;  // Index, Value
+	vector<int> prefix_b;
+
+	Segment(int l, int r, const vector<pair<int, int>> &indices)
+	    : l(l), r(r), mid((r + l) / 2), indices(indices) {
+		calculate_b();
+	}
+
+	// Children are created lazily (sparse) since values can go up to 1e9
+	void update() {
+		if (children) { return; }
+		children = true;
+		if (r - l > 1) {
+			// Split the indices for left and right child.
+			// back_inserter is needed because both vectors start out empty.
+			vector<pair<int, int>> leftIndices, rightIndices;
+			partition_copy(indices.begin(), indices.end(), back_inserter(leftIndices),
+			               back_inserter(rightIndices), [this](const pair<int, int> &elem) {
+				               return elem.second < mid;
+			               });
+
+			left = new Segment(l, mid, leftIndices);
+			right = new Segment(mid, r, rightIndices);
+		}
+	}
+
+	// Calculates the prefix B
+	void calculate_b() {
+		int i = 1;
+		int j = 0;
+		prefix_b.resize(indices.size() + 1);
+		for (auto [ind, val] : indices) {
+			if (val < mid) j++;
+			prefix_b[i++] = j;
+		}
+	}
+
+	int find_k_smallest(int a, int b, int k) {
+		update();
+		if (r - l <= 1) { return l; }
+
+		int lb = prefix_b[a];
+		int lr = prefix_b[b];
+		int inLeft = lr - lb;  // Number of values in range [a, b) that are less than mid
+
+		if (k <= inLeft) {
+			return left->find_k_smallest(lb, lr, k);  // Appears in left
+		} else {
+			return right->find_k_smallest(a - lb, b - lr,
+			                              k - inLeft);  // Appears in right
+		}
+	}
+};
+
+int main() {
+	int n, q;
+	cin >> n >> q;
+
+	vector<pair<int, int>> indices;
+	for (int i = 0; i < n; ++i) {
+		int v;
+		cin >> v;
+		indices.emplace_back(i, v);
+	}
+	Segment seg(0, 1e9 + 2, indices);
+
+	for (int i = 0; i < q; ++i) {
+		int a, b, k;
+		cin >> a >> b >> k;
+		k++;
+		cout << seg.find_k_smallest(a, b, k) << " ";
+	}
+}
+```
+</CPPSection>
+</LanguageSection>
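
To sanity check a wavelet tree on small inputs, its answers can be compared against a brute-force reference. The sketch below assumes a half-open range $[l, r)$ and a zero-indexed $k$, which is what the `main` function above appears to read; `kth_smallest_brute` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using namespace std;

// Hypothetical reference: k-th smallest (0-indexed k) among a[l], ..., a[r - 1].
// Runs in O(r - l) expected time per query, so it is only meant for testing.
int kth_smallest_brute(const vector<int> &a, int l, int r, int k) {
	assert(0 <= l && l < r && r <= (int)a.size());
	assert(0 <= k && k < r - l);
	vector<int> window(a.begin() + l, a.begin() + r);
	// nth_element places the element of sorted rank k at position k
	nth_element(window.begin(), window.begin() + k, window.end());
	return window[k];
}
```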
+
+
+
+## Supporting updates
+
+Let's support updates of the following type:
+
+- Change the value at index $i$ to $y$.
+
+We can traverse down to the leaf to remove the old element and also traverse down to add the new element.
+
+So what do the updates change?
+
+- Our indices vector
+- Our prefix vector
+
+To change the indices vector, we can store a set instead of a vector.
+Then erasing and adding values becomes easy.
 
-<IncompleteSection />
+To change the prefix vector, on the other hand, a plain vector is not enough, since a single update can change many of its entries. Instead, we could use a sparse segment tree:
+
+- Erasing and inserting can be done by setting the value at the specific index to $0$ or $1$.
+- Querying for a prefix can be done by querying the segment tree from $0$ to $i$.
+
+This approach is not memory efficient and requires a segment tree implementation.
+A friendlier approach is to use an order statistics tree,
+so that querying for a prefix is equivalent to `order_of_key(i)`.
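
One way the order statistics tree idea might look in code is sketched below. It assumes GCC's policy-based data structures are available, and it simplifies the description above: each node stores the original array indices directly, so $l$ and $r$ never need to be remapped while descending. `IndexSet` and `DynamicNode` are hypothetical names, and changing index $i$ from value $v$ to $y$ is expressed as `erase(i, v)` followed by `insert(i, y)`.

```cpp
#include <cassert>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <memory>

using namespace std;
using namespace __gnu_pbds;

// Order statistics tree over array indices (GCC extension)
using IndexSet =
    tree<int, null_type, less<int>, rb_tree_tag, tree_order_statistics_node_update>;

// Hypothetical dynamic node covering values in [lo, hi)
struct DynamicNode {
	int lo, hi, mid;
	IndexSet indices;  // array indices whose current value lies in [lo, hi)
	unique_ptr<DynamicNode> left, right;

	DynamicNode(int lo, int hi) : lo(lo), hi(hi), mid(lo + (hi - lo) / 2) {}

	// Number of stored indices in the half-open range [a, b)
	int count(int a, int b) const {
		return (int)indices.order_of_key(b) - (int)indices.order_of_key(a);
	}

	// Child responsible for `value`, created on demand; nullptr at a leaf
	DynamicNode *child(int value) {
		if (hi - lo <= 1) { return nullptr; }
		auto &c = value < mid ? left : right;
		if (!c) {
			c = value < mid ? make_unique<DynamicNode>(lo, mid)
			                : make_unique<DynamicNode>(mid, hi);
		}
		return c.get();
	}

	void insert(int index, int value) {
		indices.insert(index);
		if (DynamicNode *c = child(value)) { c->insert(index, value); }
	}

	void erase(int index, int value) {
		indices.erase(index);
		if (DynamicNode *c = child(value)) { c->erase(index, value); }
	}

	// k-th smallest (1-indexed k) value among positions [a, b)
	int kth(int a, int b, int k) const {
		assert(1 <= k && k <= count(a, b));
		if (hi - lo <= 1) { return lo; }
		int in_left = left ? left->count(a, b) : 0;
		if (k <= in_left) { return left->kth(a, b, k); }
		return right->kth(a, b, k - in_left);
	}
};
```

Each update touches one node per level, so under these assumptions both updates and queries would cost $\mathcal{O}(\log M \cdot \log N)$.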
 
 ### Problems
 

@@ -16,6 +16,42 @@
     }
   ],
   "wavelet": [
+    {
+      "uniqueId": "cf-840D",
+      "name": "Destiny",
+      "url": "https://codeforces.com/contest/840/problem/D",
+      "source": "CF",
+      "difficulty": "Normal",
+      "isStarred": false,
+      "tags": ["Wavelet"],
+      "solutionMetadata": {
+        "kind": "none"
+      }
+    },
+    {
+      "uniqueId": "spoj-ILKQUERY2",
+      "name": "I Love Kd-Trees",
+      "url": "https://www.spoj.com/problems/ILKQUERY2/",
+      "source": "SPOJ",
+      "difficulty": "Normal",
+      "isStarred": false,
+      "tags": ["Wavelet"],
+      "solutionMetadata": {
+        "kind": "none"
+      }
+    },
+    {
+      "uniqueId": "coci-20-index",
+      "name": "2021 - Index",
+      "url": "https://evaluator.hsin.hr/tasks/HONI202167index/",
+      "source": "COCI",
+      "difficulty": "Normal",
+      "isStarred": false,
+      "tags": ["Wavelet", "Persistent Segtree"],
+      "solutionMetadata": {
+        "kind": "none"
+      }
+    },
     {
       "uniqueId": "kattis-easyquery",
       "name": "Easy Query",