Add tool to detect/fix DB inconsistencies (#2840)
This tool checks and can repair certain inconsistencies between the `user`, `user_keys` and `user_handle` tables in brig.

More context on how this was/is useful under these issues:

- https://wearezeta.atlassian.net/browse/SQSERVICES-1798
- https://wearezeta.atlassian.net/browse/SQSERVICES-1797

(A precursor tool to check inconsistencies between spar and brig tables, if deemed useful, could be exhumed from git history of [PR #2840](#2840) or [one commit](2e06428) and incorporated here)

This tool writes findings into an output file as JSON lines, so they can be more easily analysed. The tool should run on a cluster (as opposed to through port-forwarding from a local machine) for speed. Please do watch metrics while it runs, though: a few thousand parallelized table scan pagination requests per second can have a performance impact on the whole database.

See the README for more details on how to use/deploy this tool.

Co-authored-by: jschaul <[email protected]>
akshaymankar and jschaul authored Dec 13, 2022
1 parent 403d3d9 commit 578854f
Showing 13 changed files with 1,209 additions and 1 deletion.
1 change: 1 addition & 0 deletions cabal.project
@@ -48,6 +48,7 @@ packages:
, tools/db/find-undead/
, tools/db/move-team/
, tools/db/repair-handles/
, tools/db/inconsistencies/
, tools/rex/
, tools/stern/

1 change: 1 addition & 0 deletions changelog.d/5-internal/pr-2840
@@ -0,0 +1 @@
Add 'inconsistencies' tool to check for and repair certain kinds of data inconsistencies across different Cassandra tables.
1 change: 1 addition & 0 deletions nix/local-haskell-packages.nix
@@ -46,6 +46,7 @@
auto-whitelist = hself.callPackage ../tools/db/auto-whitelist/default.nix { inherit gitignoreSource; };
billing-team-member-backfill = hself.callPackage ../tools/db/billing-team-member-backfill/default.nix { inherit gitignoreSource; };
find-undead = hself.callPackage ../tools/db/find-undead/default.nix { inherit gitignoreSource; };
inconsistencies = hself.callPackage ../tools/db/inconsistencies/default.nix { inherit gitignoreSource; };
migrate-sso-feature-flag = hself.callPackage ../tools/db/migrate-sso-feature-flag/default.nix { inherit gitignoreSource; };
move-team = hself.callPackage ../tools/db/move-team/default.nix { inherit gitignoreSource; };
repair-handles = hself.callPackage ../tools/db/repair-handles/default.nix { inherit gitignoreSource; };
3 changes: 2 additions & 1 deletion nix/wire-server.nix
@@ -82,6 +82,7 @@ let
stern = [ "stern" ];

billing-team-member-backfill = [ "billing-team-member-backfill" ];
inconsistencies = [ "inconsistencies" ];
api-simulations = [ "api-smoketest" "api-loadtest" ];
zauth = [ "zauth" ];
};
@@ -300,7 +301,7 @@ let
pkgs.helm
pkgs.helmfile
pkgs.hlint
( hlib.justStaticExecutables pkgs.haskellPackages.apply-refact )
(hlib.justStaticExecutables pkgs.haskellPackages.apply-refact)
pkgs.jq
pkgs.kubectl
pkgs.nixpkgs-fmt
138 changes: 138 additions & 0 deletions tools/db/inconsistencies/README.md
@@ -0,0 +1,138 @@
# inconsistencies

This tool checks and can repair certain inconsistencies between the `user`, `user_keys` and `user_handle` tables in brig.

More context on how this was/is useful under these issues:

- https://wearezeta.atlassian.net/browse/SQSERVICES-1798
- https://wearezeta.atlassian.net/browse/SQSERVICES-1797

(A precursor tool to check inconsistencies between spar and brig tables, if deemed useful, could be exhumed from git history of [PR #2840](https://github.com/wireapp/wire-server/pull/2840) or [one commit](https://github.com/wireapp/wire-server/pull/2840/commits/2e06428d10508328bcf2d829b16a7cc75ee72386) and incorporated here)

This tool writes findings into an output file as JSON lines, so they can be more easily analysed. The tool should run on a cluster (as opposed to through port-forwarding from a local machine) for speed. Please do watch metrics while it runs, though: a few thousand parallelized table scan pagination requests per second can have a performance impact on the whole database.
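
As a rough illustration of the kind of analysis the JSON lines format allows, the Python sketch below tallies findings by their `status.value`. The record shape is an assumption taken from the fields used in the `jq` filter further down in this README; other commands may emit different fields. `tally_by_status` is a hypothetical helper, not part of the tool.

```python
import json
from collections import Counter
from typing import Iterable


def tally_by_status(lines: Iterable[str]) -> Counter:
    """Count findings per `status.value`; blank lines are skipped.

    Assumes one JSON object per line with an optional nested
    `status.value` field (an assumption, see the text above).
    """
    counts: Counter = Counter()
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        status = record.get("status")
        value = status.get("value") if isinstance(status, dict) else None
        counts["<missing>" if value is None else str(value)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample records; real input would come from inconsistencies.log.
    sample = [
        '{"userId": "u1", "status": {"value": "active", "writetime": "t1"}}',
        '{"userId": "u2", "status": {"value": "deleted", "writetime": "t2"}}',
        '{"userId": "u3"}',
    ]
    for status, count in tally_by_status(sample).most_common():
        print(status, count)
```

To run it against real output, pass an open file handle instead of the sample list, for example `tally_by_status(open("inconsistencies.log"))`.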

## How to run and make sense of data

1. Build image

```
make build-image-inconsistencies
```

2. Push image

```
docker push <image-including-tag-from-above-output>
```

3. Run it in K8s using the pod manifest below. **Update the `image` field and `args` appropriately.**

Inside the affected cluster's context (e.g. `targets/wire/staging/app`), open a PR with a pod manifest file that can be created using `kubectl apply -f <filename>`

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inconsistencies
  labels:
    app: inconsistencies
spec:
  restartPolicy: Never
  containers:
    - name: inconsistencies
      image: <image-in-your-personal-docker-repo>
      imagePullPolicy: Always
      args:
        - handle-less-users # adjust to the command you need, see Options.hs
        - --cassandra-host-brig
        - brig-brig-eks-service.databases
        - --cassandra-keyspace-brig
        - brig
        - --inconsistencies-file
        - /inconsistencies.log
```
4. Wait for the process to finish. Watch the logs: the tool will say something like "sleeping for 4 hours" and then close all connections to Cassandra.
5. Copy the logs using `kubectl cp`

```
kubectl cp inconsistencies:/inconsistencies.log inconsistencies.log
```
6. **IMPORTANT:** Delete the pod. The easiest way to do this is with `kubectl delete -f <filename>` (which also deletes any configmap)
7. Convert logs into CSV:
```bash
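# `>|` overwrites an existing file even when noclobber is set
# (portable equivalent of zsh's `>!`, which bash does not understand).
cat inconsistencies.log |
  jq -r '[.userId, .status.value, .status.writetime, .userHandle.value, .userHandle.writetime, .handleClaimUser.value, .handleClaimUser.writetime] | @csv' >| inconsistencies.csv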
```

You can look at this data using any tool you are comfortable with.

8. From a CSV file, you may extract only handles/emails/keys to feed into repair using awk/grep:

```bash
cat inconsistencies.csv | awk -F ',' '{print $1}' | grep -v '^"+' | xargs -n 1 echo > dangling-email-keys.txt
```
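
If `jq` and `awk` are not at hand, steps 7 and 8 can be approximated in Python. The sketch below is not part of the tool: `COLUMNS`, `dig`, `to_rows` and `first_column` are hypothetical names, and the column list simply mirrors the `jq` filter from step 7, so it assumes records of that shape. Unlike `jq`'s `@csv`, Python's `csv.writer` only quotes fields when needed, so the output is equivalent but not byte-identical.

```python
import csv
import io
import json
from typing import Any, Iterable, Iterator, List, Sequence, Tuple

# Same fields, in the same order, as the jq filter in step 7.
COLUMNS: List[Tuple[str, ...]] = [
    ("userId",),
    ("status", "value"),
    ("status", "writetime"),
    ("userHandle", "value"),
    ("userHandle", "writetime"),
    ("handleClaimUser", "value"),
    ("handleClaimUser", "writetime"),
]


def dig(record: Any, path: Sequence[str]) -> Any:
    """Follow `path` through nested dicts; None if any step is missing."""
    current = record
    for key in path:
        if not isinstance(current, dict):
            return None
        current = current.get(key)
    return current


def to_rows(lines: Iterable[str]) -> Iterator[List[Any]]:
    """Turn JSON lines into rows of COLUMNS values, skipping blank lines."""
    for line in lines:
        if line.strip():
            record = json.loads(line)
            yield [dig(record, path) for path in COLUMNS]


def first_column(rows: Iterable[Sequence[Any]], skip_prefix: str = "+") -> Iterator[str]:
    """Yield the first field of each row, dropping values that start with
    `skip_prefix` (mirrors `awk '{print $1}' | grep -v '^"+'` from step 8)."""
    for row in rows:
        value = row[0]
        if value is not None and not str(value).startswith(skip_prefix):
            yield str(value)


if __name__ == "__main__":
    # Made-up sample; real input would be the lines of inconsistencies.log.
    sample = [
        '{"userId": "u1", "status": {"value": "active", "writetime": "t1"}}',
        '{"userId": "+4915112345678"}',
    ]
    rows = list(to_rows(sample))
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)  # None is written as an empty field
    print(buffer.getvalue(), end="")
    print(list(first_column(rows)))
```

Which field ends up in the first column depends on the command that produced the log, so check a few lines of output before feeding the extracted values into a repair run.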

## How to repair some data

First, you need to extract a list of emails/handles/UUIDs you wish to repair. The code will still perform checks on whether these inputs actually need any kind of repairing (backfilling into tables or removing from tables).

You can run the same container with the relevant command, its additional flags (including `--repair-data`), and a mounted configmap holding the input values (called `input` here for simplicity). See `Options.hs` in the source code for the available options.

At least the following are supported:

- `missing-email-keys` (and a mounted configmap containing newline-separated UUIDs)
- `dangling-handles` (and a mounted configmap containing newline-separated handles)
- `dangling-keys` (and a mounted configmap containing newline-separated emails)

Example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inconsistencies
  labels:
    app: inconsistencies
spec:
  restartPolicy: Never
  containers:
    - name: inconsistencies
      image: quay.io/wire/inconsistencies:<tag>
      imagePullPolicy: Always
      args:
        - missing-email-keys
        - --input-file
        - /input/input
        - --repair-data
        - --cassandra-host-brig
        - brig-brig-eks-service.databases
        - --cassandra-keyspace-brig
        - brig
        - --inconsistencies-file
        - /inconsistencies.log
      volumeMounts:
        - name: input
          mountPath: "/input"
          readOnly: true
  volumes:
    - name: input
      configMap:
        name: input
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: input
data:
  input: |
    2a7de2ba-754c-11ed-b14d-00163e5e6c00
    3049c812-754c-11ed-b56e-00163e5e6c00
    ...
```
Apply the manifest as usual; it should execute quickly. Make sure to export `inconsistencies.log` and check the actual logs, then delete the resources created (`kubectl delete -f ...`).
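
For longer input lists, writing the ConfigMap by hand gets tedious. The following Python sketch renders one from a list of values, under the assumption that the ConfigMap name and key are both `input` as in the example above. `render_configmap` and `is_uuid` are hypothetical helpers, not part of the tool; the UUID check only makes sense for `missing-email-keys`, since the other commands take handles or emails.

```python
import uuid
from typing import Iterable


def is_uuid(value: str) -> bool:
    """True if `value` parses as a UUID. Note that uuid.UUID is lenient
    and also accepts forms without hyphens or with surrounding braces."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def render_configmap(values: Iterable[str], name: str = "input", key: str = "input") -> str:
    """Render a ConfigMap manifest holding newline-separated `values`
    as a YAML literal block. Blank entries are dropped and surrounding
    whitespace is stripped."""
    cleaned = [v.strip() for v in values if v.strip()]
    lines = [
        "apiVersion: v1",
        "kind: ConfigMap",
        "metadata:",
        f"  name: {name}",
        "data:",
        f"  {key}: |",
    ]
    lines += [f"    {v}" for v in cleaned]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The two UUIDs from the example manifest above.
    ids = [
        "2a7de2ba-754c-11ed-b14d-00163e5e6c00",
        "3049c812-754c-11ed-b56e-00163e5e6c00",
    ]
    invalid = [i for i in ids if not is_uuid(i)]
    if invalid:
        raise SystemExit(f"not UUIDs: {invalid}")
    print(render_configmap(ids), end="")
```

The output can be saved next to the pod manifest (separated by `---`) and applied with `kubectl apply -f <filename>` like any other manifest.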
76 changes: 76 additions & 0 deletions tools/db/inconsistencies/default.nix
@@ -0,0 +1,76 @@
# WARNING: GENERATED FILE, DO NOT EDIT.
# This file is generated by running hack/bin/generate-local-nix-packages.sh and
# must be regenerated whenever local packages are added or removed, or
# dependencies are added or removed.
{ mkDerivation
, aeson
, base
, brig
, brig-types
, bytestring
, case-insensitive
, cassandra-util
, conduit
, containers
, extended
, extra
, galley-types
, gitignoreSource
, HsOpenSSL
, http-client
, imports
, lens
, lib
, multihash
, optparse-applicative
, saml2-web-sso
, string-conversions
, text
, time
, tinylog
, types-common
, unliftio
, uri-bytestring
, uuid
, wire-api
}:
mkDerivation {
pname = "inconsistencies";
version = "1.0.0";
src = gitignoreSource ./.;
isLibrary = false;
isExecutable = true;
executableHaskellDepends = [
aeson
base
brig
brig-types
bytestring
case-insensitive
cassandra-util
conduit
containers
extended
extra
galley-types
HsOpenSSL
http-client
imports
lens
multihash
optparse-applicative
saml2-web-sso
string-conversions
text
time
tinylog
types-common
unliftio
uri-bytestring
uuid
wire-api
];
description = "Find handles which belong to deleted users";
license = lib.licenses.agpl3Only;
mainProgram = "inconsistencies";
}
99 changes: 99 additions & 0 deletions tools/db/inconsistencies/inconsistencies.cabal
@@ -0,0 +1,99 @@
cabal-version: 1.12
name: inconsistencies
version: 1.0.0
synopsis: Find handles which belong to deleted users.
category: Network
author: Wire Swiss GmbH
maintainer: Wire Swiss GmbH <[email protected]>
copyright: (c) 2020 Wire Swiss GmbH
license: AGPL-3
build-type: Simple

executable inconsistencies
  main-is: Main.hs
  other-modules:
    DanglingHandles
    DanglingUserKeys
    EmailLessUsers
    HandleLessUsers
    Options
    Paths_inconsistencies

  hs-source-dirs: src
  default-extensions:
    NoImplicitPrelude
    AllowAmbiguousTypes
    BangPatterns
    ConstraintKinds
    DataKinds
    DefaultSignatures
    DeriveFunctor
    DeriveGeneric
    DeriveLift
    DeriveTraversable
    DerivingStrategies
    DerivingVia
    EmptyCase
    FlexibleContexts
    FlexibleInstances
    FunctionalDependencies
    GADTs
    InstanceSigs
    KindSignatures
    LambdaCase
    MultiParamTypeClasses
    MultiWayIf
    NamedFieldPuns
    OverloadedStrings
    PackageImports
    PatternSynonyms
    PolyKinds
    QuasiQuotes
    RankNTypes
    ScopedTypeVariables
    StandaloneDeriving
    TupleSections
    TypeApplications
    TypeFamilies
    TypeFamilyDependencies
    TypeOperators
    UndecidableInstances
    ViewPatterns

  ghc-options:
    -O2 -Wall -Wincomplete-uni-patterns -Wincomplete-record-updates
    -Wpartial-fields -fwarn-tabs -optP-Wno-nonportable-include-path
    -funbox-strict-fields -threaded -with-rtsopts=-N -with-rtsopts=-T
    -rtsopts

  build-depends:
      aeson
    , base
    , brig
    , brig-types
    , bytestring
    , case-insensitive
    , cassandra-util
    , conduit
    , containers
    , extended
    , extra
    , galley-types
    , HsOpenSSL
    , http-client
    , imports
    , lens
    , multihash
    , optparse-applicative
    , saml2-web-sso
    , string-conversions
    , text
    , time
    , tinylog
    , types-common
    , unliftio
    , uri-bytestring
    , uuid
    , wire-api

  default-language: Haskell2010