\section{Algorithms}
\label{sec:algorithms}
%-------------------------------------------------------------------------------
\subsection{Breadth-First Search (BFS)}
%-------------------------------------------------------------------------------
\label{sec:bfs}
The breadth-first search (BFS)
builds on the observation that vector-matrix multiplication $\grbv{f}\grbt\grbm{A}$ expresses
navigation from the nodes selected by vector $\grbv{f}$ in the graph represented
by $\grbm{A}$.
The direction-optimizing push/pull BFS \cite{DBLP:conf/sc/BeamerAP12} is simple
to express in GraphBLAS \cite{DBLP:conf/icpp/YangBO18}. If $\grbm{A}$ is held by row,
then $\grbv{f}\grbt\grbm{A}$ is a push step, while $\grbm{B}\grbv{f}$ is a pull step, where
$\grbm{B}=\grbm{A}\grbt$ is the explicit transpose of $\grbm{A}$, also held by row.
Other \grb libraries, \eg GraphBLAST, store both directions and perform
direction-optimization automatically~\cite{DBLP:journals/corr/abs-1908-01407}.
The push-only BFS is shown in
\autoref{alg:bfs-parents}, while the push/pull BFS is \autoref{alg:bfs-parents-do}.
The GraphBLAS BFS relies on the $\grbanysecondi$ %ANY-SECONDI
semiring to compute a single step,
$\grbv{q}\grbt \grbmaskreplace{\grbneg \grbstr{\grbv{p}\grbt}} = \grbv{q}\grbt\grbm{A}$, where $\grbv{q}$ is the current frontier,
%$\grbv{q} \grbmask{\grbneg \grbstr{\grbv{p}}} = \grbv{q}\grbt\grbm{A}$, where $\grbv{q}$ is the current frontier
$\grbv{p}$ is the parent vector, and $\grbm{A}$ is the adjacency matrix.
The mask is a \emph{complemented structural mask}, which means the mask corresponds to the
empty elements of the mask vector $\grbv{p}$. Replace semantics are indicated (by the \emph{r} in the mask expression),
so any elements of the output vector other than those selected by the mask are deleted.
As a result, the assignment to the parent vector on line 8 updates the vector with the parents of the newly visited nodes.
%
%This step assigns the parents of newly visited nodes, which do not yet have a parent,
%using the complemented structural mask with replace semantics $\grbmaskreplace{\grbneg \grbstr{\grbv{p}}}$.
%using the complemented structural mask $\grbmask{\grbneg \grbstr{\grbv{p}}}$.
Consider a matrix multiply for conventional linear algebra, where the $\grbplus$ %PLUS
monoid sums a set of $t$ entries to obtain a single scalar for computing
$c_{ij} = \sum a_{ik} b_{kj}$ in the matrix multiply $\grbm{C} = \grbm{A}\grbm{B}$. The $\grbany$ %ANY
monoid performs the reduction of $t$ entries to a single number by merely selecting
any one of the $t$ entries as the result $c_{ij}$. The selection is done
non-deterministically, allowing for a benign race condition. In the BFS, this
corresponds to selecting any valid parent of a newly discovered node. Indeed,
the creation of the $\grbany$ %ANY
operator was inspired by Scott Beamer's \verb'bfs.cc'
method in the GAP benchmark, which has the same benign race condition. The $\grbany$ %ANY
monoid translates this benign race condition, used to construct a
valid BFS tree, into a linear algebraic operation suitable for implementation
in GraphBLAS.
The $\grbsecondi$ %SECONDI
operator is the multiplicative operator in the $\grbanysecondi$ %ANY-SECONDI
semiring: the result of $a_{ik} b_{kj}$ is simply the index $k$, for the matrix
multiply $\grbm{C} = \grbm{A}\grbm{B}$. This gives the id of the parent node for a newly
discovered node in the next frontier. The $\grbany$ %ANY
monoid then selects any valid
parent $k$.
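A single push step can be written directly with the GraphBLAS C API. The
following is a minimal sketch rather than the LAGraph code itself: it assumes
the SuiteSparse-specific \verb'GxB_ANY_SECONDI_INT64' semiring and the
predefined \verb'GrB_DESC_RSC' and \verb'GrB_DESC_S' descriptors, and omits
most error handling.
\begin{verbatim}
#include <GraphBLAS.h>

// One push step of the parent BFS (illustrative sketch): q is the current
// frontier, p the parent vector, and A the n-by-n adjacency matrix.
// GrB_DESC_RSC requests the complemented structural mask with replace
// semantics described above.
GrB_Info bfs_push_step (GrB_Vector q, GrB_Vector p, GrB_Matrix A, GrB_Index n)
{
    // q'<!struct(p),replace> = q'*A, with ANY_SECONDI computing parent ids
    GrB_Info info = GrB_vxm (q, p, NULL, GxB_ANY_SECONDI_INT64, q, A,
        GrB_DESC_RSC) ;
    if (info != GrB_SUCCESS) return (info) ;
    // p<struct(q)> = q : record the parents of the newly visited nodes
    return (GrB_assign (p, q, NULL, q, GrB_ALL, n, GrB_DESC_S)) ;
}
\end{verbatim}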
%
\input{algorithms/bfs}
%-------------------------------------------------------------------------------
\subsection{Betweenness Centrality (BC)}
%-------------------------------------------------------------------------------
\label{sec:bc}
\input{algorithms/bc}
%
The vertex betweenness centrality metric is based on the number of
shortest paths through any given node,
$ \sum_{s \ne i \ne t} {\sigma (s, t|i)}/{\sigma(s,t)}, $
where $\sigma(s,t)$ is the total number of shortest paths from node $s$ to $t$,
and $\sigma(s,t|i)$ is the total number of shortest paths from node $s$ to $t$
that pass through node $i$. This is expensive to compute, so in practice
a random subset of source nodes (a {\em batch}) of size $\mathit{ns}$ is chosen.
As in the BFS, direction optimization is simple to add to the LAGraph
algorithm for batched betweenness centrality (BC).
It only requires a simple heuristic to determine which
direction to use, followed by masked matrix-matrix multiplication with the
matrix or its transpose: $\grbm{F} \grbmask{\grbneg \grbstr{\grbm{P}}} = \grbm{F}\grbm{B}\grbt$ (the pull) or $\grbm{F}
\grbmask{\grbneg \grbstr{\grbm{P}}} = \grbm{F} \grbm{A}$ (the push), where $\grbm{A}$ is the adjacency matrix of
the graph and $\grbm{B} = \grbm{A}\grbt$ is its explicit transpose, $\grbm{F}$ is the frontier, and the
complemented structural mask $\grbneg \grbstr{\grbm{P}}$ is the set of unvisited nodes. The multiplication
$\grbm{F} \grbm{B}\grbt$ relies on the descriptor to represent the transpose of $\grbm{B}$, which is not
explicitly transposed. In the backward phase, the pull step is $\grbm{W} = \grbm{W} \grbm{A}\grbt$ while
the push is $\grbm{W} = \grbm{W} \grbm{B}$, where $\grbm{W}$ is the $\mathit{ns}$-by-$n$ matrix in which centrality is
accumulated (where $\mathit{ns}=4$ is a typical batch size).
To simplify the presentation of the entire BC algorithm, \autoref{alg:bc} does
not show the direction optimization. It is the same transformation as in the pair
of BFS algorithms, where the push-only step (line 5 of
\autoref{alg:bfs-parents}) is expanded to a push/pull heuristic (lines 4--7 of
\autoref{alg:bfs-parents-do}).
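As an illustration, the forward push/pull step can be sketched with masked
\verb'GrB_mxm' calls. This is a sketch only: the semiring is left as a
parameter, the push/pull heuristic itself is not shown, and the predefined
descriptors \verb'GrB_DESC_SC' and \verb'GrB_DESC_SCT1' (complemented
structural mask, with the second input transposed in the pull case) are
assumed.
\begin{verbatim}
#include <stdbool.h>
#include <GraphBLAS.h>

// One forward step of batched BC (illustrative sketch).  F is the ns-by-n
// frontier, P the matrix whose structure marks visited nodes (used as a
// complemented structural mask), A the adjacency matrix, and B = A' its
// explicit transpose.  do_pull stands in for the push/pull heuristic.
GrB_Info bc_forward_step (GrB_Matrix F, GrB_Matrix P, GrB_Matrix A,
    GrB_Matrix B, GrB_Semiring semiring, bool do_pull)
{
    if (do_pull)
    {
        // pull: F<!struct(P)> = F*B', with B transposed via the descriptor
        return (GrB_mxm (F, P, NULL, semiring, F, B, GrB_DESC_SCT1)) ;
    }
    else
    {
        // push: F<!struct(P)> = F*A
        return (GrB_mxm (F, P, NULL, semiring, F, A, GrB_DESC_SC)) ;
    }
}
\end{verbatim}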
%-------------------------------------------------------------------------------
\subsection{PageRank (PR)}
%-------------------------------------------------------------------------------
\label{sec:pagerank}
PageRank (PR) computes the importance of each node as a recursively-defined
metric: a web page is important if important pages link to it.
\autoref{alg:pagerank} shows the GraphBLAS implementation of PR as specified in
the GAP benchmark. It uses the $\grbplussecond$ semiring, where
$\grboperator{second}(x,y)=y$, so it can ignore any edge weights in the input
matrix. The PR in GAP does not properly handle dangling vertices in the graph.
The Graphalytics benchmark has a PageRank variant which avoids this
problem~\cite{DBLP:journals/corr/abs-2011-15028}. We have included this
version to compare its performance with the GAP benchmark algorithm
\verb'pr.cc'.
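The sketch below shows the core of one PR iteration, built from spec-level
objects only. It is illustrative, not the LAGraph code: the helper name and
single-precision types are chosen for the example, the damping and teleport
terms are omitted, and dangling nodes are not handled, mirroring the
limitation noted above.
\begin{verbatim}
#include <GraphBLAS.h>

// Core of one PageRank iteration (illustrative sketch): t = A'*(r./d),
// where r holds the current ranks and d the out-degrees.  The plus_second
// semiring uses the values of the vector and ignores any values stored in A.
GrB_Info pr_core_step (GrB_Vector t, GrB_Matrix A, GrB_Vector r, GrB_Vector d)
{
    GrB_Semiring plus_second ;
    GrB_Semiring_new (&plus_second, GrB_PLUS_MONOID_FP32, GrB_SECOND_FP32) ;
    GrB_Index n ;
    GrB_Vector_size (&n, r) ;
    GrB_Vector w ;
    GrB_Vector_new (&w, GrB_FP32, n) ;
    // w = r ./ d : divide each rank by its node's out-degree
    GrB_eWiseMult (w, NULL, NULL, GrB_DIV_FP32, r, d, NULL) ;
    // t = A'*w : GrB_DESC_T0 transposes the matrix input
    GrB_mxv (t, NULL, NULL, plus_second, A, w, GrB_DESC_T0) ;
    GrB_Vector_free (&w) ;
    GrB_Semiring_free (&plus_second) ;
    return (GrB_SUCCESS) ;
}
\end{verbatim}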
%
\input{algorithms/pagerank}
%-------------------------------------------------------------------------------
\subsection{Single-Source Shortest Paths (SSSP)}
%-------------------------------------------------------------------------------
\label{sec:sssp}
A Delta-Stepping Single-Source Shortest Paths algorithm in GraphBLAS is shown in
\autoref{alg:sssp-delta-stepping}. It relies on the $\grbminplus$ semiring.
Since the algorithm is fairly complex, we refer the reader to
\cite{DBLP:conf/ipps/SridharBMSLM19} for a description of the method.
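The delta-stepping algorithm itself is too long to sketch here, but the
min-plus relaxation it builds on is a single GraphBLAS call. The sketch below
is a plain Bellman-Ford-style relaxation of every edge, not the delta-stepping
code, and constructs the semiring from spec-level objects.
\begin{verbatim}
#include <GraphBLAS.h>

// One Bellman-Ford-style relaxation over all edges (illustrative sketch,
// not the delta-stepping code): d = min (d, d min.+ A), where d is the
// current distance vector and A holds the edge weights.
GrB_Info relax_all_edges (GrB_Vector d, GrB_Matrix A)
{
    GrB_Semiring min_plus ;
    GrB_Info info = GrB_Semiring_new (&min_plus, GrB_MIN_MONOID_FP64,
        GrB_PLUS_FP64) ;
    if (info != GrB_SUCCESS) return (info) ;
    // d<accum=min> = d'*A : relax every edge once
    info = GrB_vxm (d, NULL, GrB_MIN_FP64, min_plus, d, A, NULL) ;
    GrB_Semiring_free (&min_plus) ;
    return (info) ;
}
\end{verbatim}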
%
\input{algorithms/sssp-delta-stepping}
%-------------------------------------------------------------------------------
\subsection{Triangle Counting (TC)}
%-------------------------------------------------------------------------------
\label{sec:triangle-count}
The triangle counting (TC) problem is to compute the number of unique cliques
of size 3 in a graph. The TC algorithm is shown in
\autoref{alg:triangle-count-sandiadot}, based on \cite{8091043}.
%
\input{algorithms/triangle-count}
%
It starts with a heuristic that decides when
to sort the input graph by ascending degree. Next, it constructs the lower and
upper triangular parts and computes a masked matrix multiply using the
$\grbpluspair$ semiring. Internally, a dot-product method is used in SS:GrB,
because $\grbm{U}$ is transposed via the descriptor. The $\grboperator{pair}$
operator is the simple function $\grboperator{pair}(x,y)=1$. When used in a semiring,
it acts like the $\grboperator{times}$ operator of the conventional semiring,
except that it can ignore the values of its inputs and treat them both as~1.
This semiring is useful for structural computations, such as triangle counting,
when the edge weights of a graph may be present but should be ignored in a
particular algorithm.
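The masked multiply and reduction can be sketched as follows. This sketch
assumes the SuiteSparse-specific \verb'GxB_PLUS_PAIR_INT64' semiring and the
predefined \verb'GrB_DESC_ST1' descriptor (structural mask, second input
transposed); the sort heuristic and error handling are omitted.
\begin{verbatim}
#include <GraphBLAS.h>

// Core of the dot-product triangle count (illustrative sketch):
// C<struct(L)> = L*U', then ntri = sum (C), where L and U are the strictly
// lower and upper triangular parts of the symmetric adjacency matrix.
GrB_Info count_triangles (int64_t *ntri, GrB_Matrix L, GrB_Matrix U)
{
    GrB_Index n ;
    GrB_Matrix_nrows (&n, L) ;
    GrB_Matrix C ;
    GrB_Matrix_new (&C, GrB_INT64, n, n) ;
    // masked multiply with plus_pair: only entries in L are computed
    GrB_mxm (C, L, NULL, GxB_PLUS_PAIR_INT64, L, U, GrB_DESC_ST1) ;
    // ntri = sum (C)
    GrB_Matrix_reduce_INT64 (ntri, NULL, GrB_PLUS_MONOID_INT64, C, NULL) ;
    GrB_Matrix_free (&C) ;
    return (GrB_SUCCESS) ;
}
\end{verbatim}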
%-------------------------------------------------------------------------------
\subsection{Connected Components}
%-------------------------------------------------------------------------------
\label{sec:connected-components}
The connected components algorithm in LAGraph (\autoref{alg:fastsv})
was developed by Zhang, Azad, and Bulu{\c{c}}
\cite{ZHANG202014,DBLP:conf/ppsc/ZhangAH20}. The method maintains a forest of
trees represented by a parent vector, and iteratively merges trees until no
more merging is possible. The method as shown in \autoref{alg:fastsv} is a
simplified variant that operates on the entire graph. In the LAGraph
version, a subgraph is constructed first, the method finds the connected
components of that subgraph, and it then operates on the entire graph.
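To illustrate the parent-vector representation, the scalar sketch below shows
the tree-shortcutting idea only; it is not the GraphBLAS formulation used in
\autoref{alg:fastsv}.
\begin{verbatim}
#include <stdbool.h>
#include <stdint.h>

// Scalar illustration of the parent-vector idea (not the FastSV code):
// repeatedly replace each node's parent with its grandparent until every
// tree in the forest is a star, i.e., every parent is a root.
void shortcut (int64_t *parent, int64_t n)
{
    bool changed = true ;
    while (changed)
    {
        changed = false ;
        for (int64_t i = 0 ; i < n ; i++)
        {
            int64_t gp = parent [parent [i]] ;   // grandparent of node i
            if (gp != parent [i])
            {
                parent [i] = gp ;
                changed = true ;
            }
        }
    }
}
\end{verbatim}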
\input{algorithms/fastsv}