Wavelet Tree Module #4813

Open
wants to merge 63 commits into
base: master
Changes from 50 commits
Commits
764cbc8
Update Wavelet.mdx
crafticat May 15, 2024
fd171f2
Update Wavelet.mdx
crafticat May 15, 2024
9103e44
Update Wavelet.mdx
crafticat May 16, 2024
0640ad1
Update Wavelet.mdx
crafticat Jun 19, 2024
70f4794
Update Wavelet.problems.json
crafticat Jun 19, 2024
8d77548
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 19, 2024
3c29d69
Requested changes
crafticat Jun 22, 2024
54abaaf
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 22, 2024
d367691
Added image
crafticat Jun 22, 2024
0002d9e
Req changes v2.mdx
crafticat Jun 24, 2024
2978c67
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 24, 2024
a49cd77
Update Wavelet.mdx
crafticat Jun 30, 2024
6df8c0b
Update Wavelet.problems.json
crafticat Jun 30, 2024
0379489
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 30, 2024
a2ae4c1
Update Wavelet.mdx
crafticat Jul 7, 2024
76bd734
Update Wavelet.problems.json
crafticat Jul 7, 2024
22d3b84
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 7, 2024
dc752bd
Update Wavelet.mdx
crafticat Jul 9, 2024
8d1c959
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 9, 2024
f4ad194
Update Wavelet.mdx
crafticat Jul 9, 2024
a5df338
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 9, 2024
fd9509e
fix build
ryanchou-dev Jul 11, 2024
c5394c4
solution metadata
ryanchou-dev Jul 11, 2024
b5864e2
format
ryanchou-dev Jul 11, 2024
56a111c
Update Wavelet.mdx
crafticat Jul 25, 2024
10585a2
Update Wavelet.mdx
crafticat Jul 25, 2024
8eb4b89
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 25, 2024
f964801
Update Wavelet.mdx
crafticat Jul 26, 2024
f459084
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 26, 2024
5738143
Update Wavelet.problems.json
crafticat Aug 15, 2024
89c01e0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 15, 2024
27ca815
Update Wavelet.problems.json
crafticat Aug 15, 2024
0dbb8e1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 15, 2024
3466558
Update content/6_Advanced/Wavelet.mdx
crafticat Aug 21, 2024
cb82a0c
Update content/6_Advanced/Wavelet.mdx
crafticat Aug 21, 2024
1631205
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 21, 2024
75ebdca
Changed problem name.json
crafticat Oct 2, 2024
c49efd4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 2, 2024
fd77752
Changed description.mdx
crafticat Oct 3, 2024
03ddc49
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 3, 2024
5013297
Some minor changes.mdx
crafticat Oct 3, 2024
a3f89eb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 3, 2024
39519df
Changed variable name styles (to snake cases).mdx
crafticat Oct 6, 2024
3a1c37a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 6, 2024
0cd5cfa
Update content/6_Advanced/Wavelet.mdx
crafticat Oct 8, 2024
4ee7ff8
Update content/6_Advanced/Wavelet.mdx
crafticat Oct 8, 2024
43abe71
Update content/6_Advanced/Wavelet.mdx
crafticat Oct 8, 2024
8586c51
Some grammer and latex changes + added motivation
crafticat Oct 8, 2024
3774454
Structure changes + latex errors
crafticat Oct 21, 2024
d36e36f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 21, 2024
920e150
Update content/6_Advanced/Wavelet.mdx
crafticat Oct 25, 2024
e688cb5
Update content/6_Advanced/Wavelet.mdx
crafticat Oct 25, 2024
af86360
Update content/6_Advanced/Wavelet.mdx
crafticat Oct 25, 2024
f5d7c2a
Added code for updates.mdx
crafticat Nov 5, 2024
6f4a74a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 5, 2024
0c64b49
var names changed.mdx
crafticat Nov 5, 2024
ee1f440
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 5, 2024
967677c
Formatting errors and grammar changes .mdx
crafticat Nov 27, 2024
acb53cf
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 27, 2024
1983f24
Update Wavelet.mdx
crafticat Nov 27, 2024
9702306
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 27, 2024
b4c4fbf
Latex changes + desc for updates.mdx
crafticat Nov 30, 2024
46be5f5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 30, 2024
4 changes: 2 additions & 2 deletions content/5_Plat/DC-SRQ.problems.json
@@ -93,11 +93,11 @@
{
"uniqueId": "cf-840D",
"name": "Destiny",
"url": "https://codeforces.com/problemset/problem/840/D",
"url": "https://codeforces.com/contest/840/problem/D",
"source": "CF",
"difficulty": "Hard",
"isStarred": false,
"tags": [],
"tags": ["Wavelet"],
"solutionMetadata": {
"kind": "autogen-label-from-site",
"site": "CF"
237 changes: 230 additions & 7 deletions content/6_Advanced/Wavelet.mdx
@@ -1,19 +1,17 @@
---
id: wavelet
title: 'Wavelet Tree'
author: Benjamin Qi
author: Benjamin Qi, Omri Jerbi
prerequisites:
- RURQ
description: "?"
description: Wavelet trees support efficient queries for the kth minimum element in a range
frequency: 0
---

## Wavelet Tree
# Wavelet Tree
Wavelet trees are data structures that support efficient queries for the k-th minimum element in a range by maintaining a segment tree over values instead of indices.

<FocusProblem problem="waveletSam" />

Like a segment tree on values rather than indices.

<Resources>
<Resource
source = "IOI"
@@ -32,9 +30,234 @@ Like a segment tree on values rather than indices.
</Resource>
</Resources>

## Introduction
Suppose you want to support the following queries:

- Given a range, count the number of occurrences of value $x$.
- Given a range, find the $k$-th smallest element.

With a wavelet tree, you can easily support those queries in $\mathcal{O}(\log M)$ time,
where $M$ is the maximum value in the array.

### Wavelet tree structure

To answer value-based queries efficiently, we'll create a segment tree where each node represents a range of values instead of a range of indices. Just like in a normal segment tree, each subsequent level splits the range into two halves. Note that an index appears in exactly one node per level, so it appears in at most $\mathcal{O}(\log M)$ nodes.

<Spoiler title="Wavelet Tree Visualization">

Let's say our array is $[3,5,3,1,2,2,3,4,5,5]$.
Each node stores an array with the indices of every element whose value lies between the node's bounds $l$ and $r$.

![Wavelet Tree Visualization](./assets/diagram.png)
</Spoiler>

### Solution - Range K-th Smallest
Before we solve this problem, let's consider a simpler version where we are asked, given a range, to count the number of occurrences of value $x$.

## Solving the first type of query
**Given a range $l$, $r$, count the number of occurrences of value $x$.**

To calculate the number of occurrences from $l$ to $r$, we can use the following
formula:

$$
\begin{aligned}
\texttt{occurrences}(l, r) = \texttt{occurrences}(r) - \texttt{occurrences}(l)
\end{aligned}
$$

Here $\texttt{occurrences}(i)$ counts the occurrences of $x$ among the first $i$ elements, so this reduces the problem to counting the number of occurrences in a prefix.
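
As a small worked example, assume 0-indexed positions and a half-open range $[l, r)$ (an assumption chosen to match the prefix formula above). For the array $[3,5,3,1,2,2,3,4,5,5]$, $x = 3$, $l = 2$ and $r = 7$:

$$
\begin{aligned}
\texttt{occurrences}(7) &= 3 && \text{(first 7 elements: } 3,5,3,1,2,2,3\text{)} \\
\texttt{occurrences}(2) &= 1 && \text{(first 2 elements: } 3,5\text{)} \\
\texttt{occurrences}(2, 7) &= 3 - 1 = 2
\end{aligned}
$$

This matches a direct count over positions $2$ through $6$, which hold $3,1,2,2,3$.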

One way to solve the problem is to go to the leaf node
and perform a binary search for the number of indices less than $r$.
However, let's explore a different approach that can also be extended to the
second type of query.

Instead of binary searching on the leaf, we update $r$ as we recurse down the
tree.
If we can determine the position (index) of $r$ in the left and right children of a node, we can recurse down the tree and determine its position in the leaf node.

To find the position of $r$ in a node's left and right children, we need to
determine how many elements that precede position $r$ have a value smaller than
the middle value (mid).
This can be done using a prefix sum.

Let's define:
- $c[i]$ as $1$ if the value of the $i$-th element stored in the node is smaller than mid, and $0$ otherwise
- $\texttt{prefix\_b}[i]$ as the prefix sum of $c$ over the first $i$ elements

Formally, writing $\text{index}[i]$ for the $i$-th index stored in the node and $\text{value}$ for the original array:

$$
c[i] = \begin{cases}
1, & \text{if } \text{value}[\text{index}[i]] < \text{mid} \\
0, & \text{otherwise}
\end{cases}
$$
$$
\texttt{prefix\_b}[0] = 0, \qquad \texttt{prefix\_b}[i] = \texttt{prefix\_b}[i - 1] + c[i]
$$
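
The definition above can be sketched in a few lines of C++. This is only an illustration: `build_prefix_b` is a hypothetical helper name, and it takes the node's values directly (in index order) rather than index/value pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the original text): builds prefix_b for one node.
// `values` holds the values of the node's elements in index order, and `mid`
// is the node's split point (values < mid belong to the left child).
std::vector<int> build_prefix_b(const std::vector<int> &values, int mid) {
	std::vector<int> prefix_b(values.size() + 1, 0);
	for (std::size_t i = 0; i < values.size(); i++) {
		int c = values[i] < mid ? 1 : 0;  // c[i + 1] in the 1-indexed notation above
		prefix_b[i + 1] = prefix_b[i] + c;
	}
	// Sanity check: the left child can never receive more elements than the node holds.
	assert(prefix_b.back() <= static_cast<int>(values.size()));
	return prefix_b;
}
```

For the array $[3,5,3,1,2,2,3,4,5,5]$ with mid $= 3$, only the values $1, 2, 2$ at positions $3$, $4$ and $5$ are smaller than mid, so the prefix sums stay at $0$ until those positions and end at $3$.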


To update $r$ as we recurse down, we do the following:
- If we recurse left, the new value of $r$ is $\texttt{prefix\_b}[r]$
- If we recurse right, the new value of $r$ is $r - \texttt{prefix\_b}[r]$
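
The descent for the first type of query can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's implementation: `Node`, `count_prefix` and `count` are hypothetical names, the tree is built eagerly, each node covers the half-open value range $[lo, hi)$, and queries use 0-indexed half-open ranges $[l, r)$.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical minimal node covering the values in [lo, hi).
struct Node {
	int lo, hi, mid;
	std::vector<int> prefix_b;  // prefix_b[i] = how many of the first i values are < mid
	std::unique_ptr<Node> left, right;

	Node(int lo, int hi, const std::vector<int> &values)
	    : lo(lo), hi(hi), mid(lo + (hi - lo) / 2), prefix_b(values.size() + 1, 0) {
		// Leaves and empty nodes have no children.
		if (hi - lo <= 1 || values.empty()) return;
		std::vector<int> left_values, right_values;
		for (std::size_t i = 0; i < values.size(); i++) {
			bool goes_left = values[i] < mid;
			prefix_b[i + 1] = prefix_b[i] + (goes_left ? 1 : 0);
			(goes_left ? left_values : right_values).push_back(values[i]);
		}
		left = std::make_unique<Node>(lo, mid, left_values);
		right = std::make_unique<Node>(mid, hi, right_values);
	}

	// Occurrences of x among the first r elements stored in this node.
	int count_prefix(int x, int r) const {
		if (r == 0) return 0;      // nothing precedes r (also covers empty nodes)
		if (hi - lo <= 1) return r;  // leaf: every element here equals x
		int to_left = prefix_b[r];
		return x < mid ? left->count_prefix(x, to_left)
		               : right->count_prefix(x, r - to_left);
	}

	// Occurrences of x in positions [l, r).
	int count(int x, int l, int r) const {
		assert(lo <= x && x < hi);
		return count_prefix(x, r) - count_prefix(x, l);
	}
};
```

With `Node root(1, 6, {3, 5, 3, 1, 2, 2, 3, 4, 5, 5})`, the call `root.count(3, 2, 7)` walks right from the root and then left, updating $r$ at each step exactly as in the two rules above.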

Now let's try to solve our main problem.

## Solving the second type of query
**Given a range $l$, $r$, find the $k$-th smallest element.**

We will determine whether the answer for a given node is in the left or the
right segment.
We can calculate how many times the elements within the segments' ranges appear
in our range $(l, r)$ using our first type of query.
Note that this also works for non-leaf nodes using the following formula:

$$
\texttt{occurrences}(l, r) = r - l
$$
<Info title="Similar">
This is similar to counting how many times a value appears up to index $r$ in our previous query. We did this by using the new $r$ value at the leaf node. Now, we instead consider the difference between the updated $r$ and the updated $l$.
</Info>

Therefore, the number of occurrences in the left child is

$$
\texttt{left\_occurrences} = \texttt{prefix\_b}[r] - \texttt{prefix\_b}[l]
$$

<Info title="Left occurrences">
Note that $\texttt{left\_occurrences}$ is the number of indices between $l$ and $r$ whose value is less than mid.

</Info>

- If $\texttt{left\_occurrences}$ is greater than or equal to $k$, the $k$-th smallest element is in
  the left subtree. Therefore, we update our range and recurse into the left
  child.
- If $\texttt{left\_occurrences}$ is less than $k$, the
  $k$-th smallest element is in the right subtree. We subtract
  $\texttt{left\_occurrences}$ from $k$, update our range, and recurse into the right child.

<Info title="Notice">
Notice that we still update $l$ and $r$ accordingly when we go left or right.
</Info>
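
As a worked trace of the steps above, assume half-open value ranges $[lo, hi)$ with $\text{mid} = \lfloor (lo + hi) / 2 \rfloor$ and a 1-indexed $k$ (assumptions made for this example). For the array $[3,5,3,1,2,2,3,4,5,5]$, a root covering $[1, 6)$, and the query $(l, r) = (0, 10)$, $k = 4$:

$$
\begin{aligned}
&\text{node } [1, 6),\ \text{mid} = 3: && \texttt{left\_occurrences} = 3 - 0 = 3 < 4 \Rightarrow \text{go right},\ k = 1,\ (l, r) = (0, 7) \\
&\text{node } [3, 6),\ \text{mid} = 4: && \texttt{left\_occurrences} = 3 - 0 = 3 \ge 1 \Rightarrow \text{go left},\ k = 1,\ (l, r) = (0, 3) \\
&\text{node } [3, 4): && \text{leaf, so the answer is } 3
\end{aligned}
$$

The sorted array is $1,2,2,3,3,3,4,5,5,5$, whose fourth element is indeed $3$.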

The answer is then the value of the leaf node we end up on.

In conclusion, we maintain our range bounds $l$ and $r$ as we recurse down the tree. For the $k$-th smallest query, the answer is the value of the leaf we reach; for the occurrence-count query, once we reach the leaf we can return $r - l$.

## Implementation
**Time Complexity:** $\mathcal{O}(Q \cdot \log(M))$

<LanguageSection>
<CPPSection>

```cpp
#include <bits/stdc++.h>

using namespace std;
constexpr int MAX_VAL = 1e9 + 2;

struct Segment {
	Segment *left = nullptr, *right = nullptr;
	int l, r, mid;
	bool children = false;
	vector<pair<int, int>> indices;  // Index, Value
	vector<int> prefix_b;

	Segment(int l, int r, const vector<pair<int, int>> &indices)
	    : l(l), r(r), mid((r + l) / 2), indices(indices) {
		// Calculates prefix_b: prefix_b[i] is the number of the first i
		// elements of this node whose value is less than mid
		int i = 1;
		int j = 0;
		prefix_b.resize(indices.size() + 1);
		for (auto [ind, val] : indices) {
			if (val < mid) j++;
			prefix_b[i++] = j;
		}
	}

	// Sparse since values can go up to 1e9: children are created lazily
	void update() {
		if (children) { return; }
		children = true;
		if (r - l > 1) {
			// Split the indices for left and right child.
			// back_inserter is required because both vectors start out empty.
			vector<pair<int, int>> left_indices, right_indices;
			partition_copy(indices.begin(), indices.end(),
			               back_inserter(left_indices),
			               back_inserter(right_indices),
			               [this](const pair<int, int> &elem) {
				               return elem.second < mid;
			               });

			left = new Segment(l, mid, left_indices);
			right = new Segment(mid, r, right_indices);
		}
	}

	// Returns the k-th smallest (1-indexed) value among positions [a, b) of this node
	int find_k_smallest(int a, int b, int k) {
		update();
		if (r - l <= 1) { return l; }

		int lb = prefix_b[a];
		int lr = prefix_b[b];
		int in_left = lr - lb;  // Number of values in range [a, b) that are less than mid

		if (k <= in_left) {
			return left->find_k_smallest(lb, lr, k);  // Appears in left
		} else {
			return right->find_k_smallest(a - lb, b - lr,
			                              k - in_left);  // Appears in right
		}
	}
};

int main() {
	int n, q;
	cin >> n >> q;

	vector<pair<int, int>> indices;
	for (int i = 0; i < n; ++i) {
		int v;
		cin >> v;
		indices.emplace_back(i, v);
	}
	Segment seg(0, MAX_VAL, indices);

	for (int i = 0; i < q; ++i) {
		int a, b, k;
		cin >> a >> b >> k;
		k++;  // The input k is 0-indexed; find_k_smallest expects it 1-indexed
		cout << seg.find_k_smallest(a, b, k) << " ";
	}
}
```
</CPPSection>
</LanguageSection>
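
To sanity-check an implementation like the one above on small inputs, a brute-force reference can help. The sketch below is an assumption-laden aid rather than part of the solution: `kth_smallest_brute` is a hypothetical name, and it assumes the same conventions as the input format read in `main` (a half-open range $[l, r)$ and a 0-indexed $k$).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical reference: k-th smallest (0-indexed k) among a[l], ..., a[r - 1].
// Runs in O(r - l) expected time per query, so it is only suitable for testing.
int kth_smallest_brute(const std::vector<int> &a, int l, int r, int k) {
	assert(0 <= l && l < r && r <= static_cast<int>(a.size()));
	assert(0 <= k && k < r - l);
	std::vector<int> window(a.begin() + l, a.begin() + r);
	// nth_element places the element that would be at position k after sorting
	std::nth_element(window.begin(), window.begin() + k, window.end());
	return window[k];
}
```

Comparing its answers with the wavelet tree on random small arrays is a cheap way to catch off-by-one mistakes in how $l$, $r$ and $k$ are updated.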



## Supporting updates

Let's support updates of type:
- change value at index $i$ to $y$

We can traverse down to the leaf to remove the old element and also traverse down to add the new element.

So what do the updates change?
- Our indices vector
- Our prefix vector

To update the indices, we can store them in a set instead of a vector.
Erasing and adding values then becomes easy.

<IncompleteSection />
On the other hand, a single update can change many entries of the prefix vector, so we can't maintain it as a plain vector. What we could do instead is use a sparse segment tree:
- erasing and inserting can be done by setting the value at the specific index to 0 or 1
- querying for a prefix can be done by querying the segment tree from 0 to $i$

This approach is not memory efficient and requires a segment tree implementation.
A friendlier approach is to use an order statistics tree, so that querying for a prefix is equivalent to `order_of_key(i)`.
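
One possible reading of the order statistics tree idea, sketched with the GNU policy-based data structures that ship with GCC. `LeftIndexSet` and its method names are hypothetical. The sketch assumes each node keeps the original array indices whose value is less than mid, so that `prefix(i)` plays the role of $\texttt{prefix\_b}[i]$.

```cpp
#include <cassert>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <functional>

// Order statistics tree over ints (GCC-specific extension).
using ordered_set =
    __gnu_pbds::tree<int, __gnu_pbds::null_type, std::less<int>,
                     __gnu_pbds::rb_tree_tag,
                     __gnu_pbds::tree_order_statistics_node_update>;

// Hypothetical per-node bookkeeping that replaces the static prefix vector.
struct LeftIndexSet {
	ordered_set left_indices;  // original indices whose value is < mid

	void insert(int index) { left_indices.insert(index); }
	void erase(int index) { left_indices.erase(index); }

	// Number of stored indices strictly less than i.
	int prefix(int i) const {
		return static_cast<int>(left_indices.order_of_key(i));
	}
};
```

Under this assumption, the number of elements in $[l, r)$ that go to the left child is `prefix(r) - prefix(l)`, and an update only needs one `erase` and one `insert` per node on the affected root-to-leaf paths.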

### Problems

36 changes: 36 additions & 0 deletions content/6_Advanced/Wavelet.problems.json
@@ -16,6 +16,42 @@
}
],
"wavelet": [
{
"uniqueId": "cf-840D",
"name": "Destiny",
"url": "https://codeforces.com/contest/840/problem/D",
"source": "CF",
"difficulty": "Normal",
"isStarred": false,
"tags": ["Wavelet"],
"solutionMetadata": {
"kind": "none"
}
},
{
"uniqueId": "spoj-ILKQUERY2",
"name": "I Love Kd-Trees",
"url": "https://www.spoj.com/problems/ILKQUERY2/",
"source": "SPOJ",
"difficulty": "Normal",
"isStarred": false,
"tags": ["Wavelet"],
"solutionMetadata": {
"kind": "none"
}
},
{
"uniqueId": "coci-20-index",
"name": "2021 - Index",
"url": "https://evaluator.hsin.hr/tasks/HONI202167index/",
"source": "COCI",
"difficulty": "Normal",
"isStarred": false,
"tags": ["Wavelet, Persistent Segtree"],
"solutionMetadata": {
"kind": "none"
}
},
{
"uniqueId": "kattis-easyquery",
"name": "Easy Query",
Binary file added content/6_Advanced/assets/diagram.png