Add tool to detect/fix DB inconsistencies (#2840)
This tool checks and can repair certain inconsistencies between the `user`, `user_keys` and `user_handle` tables in brig.

More context on how this was/is useful under these issues:

- https://wearezeta.atlassian.net/browse/SQSERVICES-1798
- https://wearezeta.atlassian.net/browse/SQSERVICES-1797

(A precursor tool to check inconsistencies between spar and brig tables, if deemed useful, could be exhumed from git history of [PR #2840](#2840) or [one commit](2e06428) and incorporated here)

This tool writes findings into an output file as JSON lines, so it can be more easily analysed. The tool should run on a cluster (as opposed to through port-forwarding from a local machine) for speed. Though please do watch metrics when running this, as a few thousand parallelized table scan pagination requests per second can have a performance impact on the whole database.

See the README for more details on how to use/deploy this tool.

Co-authored-by: jschaul <[email protected]>
1 parent 403d3d9, commit 578854f

Showing 13 changed files with 1,209 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Add 'inconsistencies' tool to check for and repair certain kinds of data inconsistencies across different Cassandra tables. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,138 @@ | ||
# inconsistencies | ||
|
||
This tool checks and can repair certain inconsistencies between the `user`, `user_keys` and `user_handle` tables in brig. | ||
|
||
More context on how this was/is useful under these issues: | ||
|
||
- https://wearezeta.atlassian.net/browse/SQSERVICES-1798 | ||
- https://wearezeta.atlassian.net/browse/SQSERVICES-1797 | ||
|
||
(A precursor tool to check inconsistencies between spar and brig tables, if deemed useful, could be exhumed from git history of [PR #2840](https://github.com/wireapp/wire-server/pull/2840) or [one commit](https://github.com/wireapp/wire-server/pull/2840/commits/2e06428d10508328bcf2d829b16a7cc75ee72386) and incorporated here) | ||
|
||
This tool writes its findings to an output file as JSON lines, so they can be analysed more easily. For speed, run the tool inside the cluster (as opposed to through port-forwarding from a local machine). Please watch metrics while it runs, though: a few thousand parallelized table-scan pagination requests per second can have a performance impact on the whole database. | ||
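
Because each finding is one JSON object per line, a quick sanity check before any deeper analysis is to count the records and look for lines that appear cut off. The sketch below uses only `grep`. The function names are made up for illustration and are not part of the tool, and the check assumes that every record is a single-line JSON object ending in `}`.

```bash
# Hypothetical helpers, not part of the inconsistencies tool.

# Print the number of non-empty lines (one finding per line).
count_findings() {
  # grep -c exits non-zero when the count is 0, so ignore its status.
  grep -c . "$1" || true
}

# Print non-empty lines that do not look like a complete JSON object,
# e.g. a last line that was cut off while being written.
incomplete_findings() {
  grep -v '^{.*}$' "$1" | grep . || true
}
```

For example, `count_findings inconsistencies.log` prints the number of findings, and `incomplete_findings inconsistencies.log` prints nothing when every line looks complete.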
|
||
## How to run and make sense of data | ||
|
||
1. Build image | ||
|
||
``` | ||
make build-image-inconsistencies | ||
``` | ||
|
||
2. Push image | ||
|
||
``` | ||
docker push <image-including-tag-from-above-output> | ||
``` | ||
|
||
3. Run it in K8s using the pod YAML below. **Update the image field and args appropriately**: | ||
|
||
Inside the affected cluster's context (e.g. `targets/wire/staging/app`), open a PR with a pod manifest file that can be created using `kubectl apply -f <filename>` | ||
|
||
```yaml | ||
apiVersion: v1 | ||
kind: Pod | ||
metadata: | ||
name: inconsistencies | ||
labels: | ||
app: inconsistencies | ||
spec: | ||
restartPolicy: Never | ||
containers: | ||
- name: inconsistencies | ||
image: <image-in-your-personal-docker-repo> | ||
imagePullPolicy: Always | ||
args: | ||
- handle-less-users # adjust to the command you need, see Options.hs | ||
- --cassandra-host-brig | ||
- brig-brig-eks-service.databases | ||
- --cassandra-keyspace-brig | ||
- brig | ||
- --inconsistencies-file | ||
- /inconsistencies.log | ||
``` | ||
4. Wait for the process to finish. Watch the logs: it will say something like "sleeping for 4 hours" and then close all connections to Cassandra. | ||
5. Copy the logs using `kubectl cp` | ||
|
||
``` | ||
kubectl cp inconsistencies:/inconsistencies.log inconsistencies.log | ||
``` | ||
6. **IMPORTANT:** Delete the pod. The easiest way to do this is with `kubectl delete -f <filename>` (which also deletes any configmap) | ||
7. Convert logs into CSV: | ||
```bash | ||
cat inconsistencies.log | | ||
jq -r '[.userId, .status.value, .status.writetime, .userHandle.value, .userHandle.writetime, .handleClaimUser.value, .handleClaimUser.writetime] | @csv' >| inconsistencies.csv | ||
``` | ||
|
||
You can look at this data using any tool you are comfortable with. | ||
|
||
8. From a CSV file, you may extract only handles/emails/keys to feed into repair using awk/grep: | ||
|
||
```bash | ||
cat inconsistencies.csv | awk -F ',' '{print $1}' | grep -v '^"+' | xargs -n 1 echo > dangling-email-keys.txt | ||
``` | ||
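
In the pipeline above, `xargs -n 1 echo` serves only to strip the surrounding double quotes, but `xargs` also interprets quotes and backslashes inside the data, so a value containing an apostrophe can make it fail. A variant that strips the quotes with `sed` may be more predictable. This is a minimal sketch: `first_column` is a hypothetical helper name, and it assumes the first column contains no embedded commas or escaped quotes.

```bash
# Hypothetical helper: read CSV on stdin and print the first column,
# unquoted, skipping values that start with "+".
# Assumes the first column has no embedded commas or escaped quotes.
first_column() {
  awk -F ',' '{print $1}' \
    | grep -v '^"+' \
    | sed -e 's/^"//' -e 's/"$//'
}
```

Usage would then be `first_column < inconsistencies.csv > dangling-email-keys.txt`.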
|
||
## How to repair some data | ||
|
||
First, you need to extract a list of emails/handles/UUIDs you wish to repair. The code will still perform checks on whether these inputs actually need any kind of repairing (backfilling into tables or removing from tables). | ||
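
For the UUID case, it can be worth cleaning the list before it goes into a configmap. The sketch below keeps only well-formed UUIDs, lower-cases them and removes duplicates. `valid_uuids` is a hypothetical helper, and how the tool itself reacts to malformed input lines is not covered here. Handles and emails would need their own pattern.

```bash
# Hypothetical helper: read candidate lines on stdin and print only
# well-formed UUIDs, lower-cased, sorted and de-duplicated.
valid_uuids() {
  grep -E -i '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$' \
    | tr 'A-F' 'a-f' \
    | sort -u
}
```

For example, `valid_uuids < candidates.txt > uuids.txt`, where both file names are placeholders.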
|
||
You can run the same container with additional flags of the command, a configmap with values (for simplicity called `input`), and the `--repair-data` flag. See source code under `Options.hs`. | ||
|
||
At least the following are supported: | ||
|
||
- `missing-email-keys` (and a mounted configmap containing newline-separated UUIDs) | ||
- `dangling-handles` (and a mounted configmap containing newline-separated handles) | ||
- `dangling-keys` (and a mounted configmap containing newline-separated emails) | ||
|
||
Example: | ||
|
||
```yaml | ||
apiVersion: v1 | ||
kind: Pod | ||
metadata: | ||
name: inconsistencies | ||
labels: | ||
app: inconsistencies | ||
spec: | ||
restartPolicy: Never | ||
containers: | ||
- name: inconsistencies | ||
image: quay.io/wire/inconsistencies:<tag> | ||
imagePullPolicy: Always | ||
args: | ||
- missing-email-keys | ||
- --input-file | ||
- /input/input | ||
- --repair-data | ||
- --cassandra-host-brig | ||
- brig-brig-eks-service.databases | ||
- --cassandra-keyspace-brig | ||
- brig | ||
- --inconsistencies-file | ||
- /inconsistencies.log | ||
volumeMounts: | ||
- name: input | ||
mountPath: "/input" | ||
readOnly: true | ||
volumes: | ||
- name: input | ||
configMap: | ||
name: input | ||
--- | ||
apiVersion: v1 | ||
kind: ConfigMap | ||
metadata: | ||
name: input | ||
data: | ||
input: | | ||
2a7de2ba-754c-11ed-b14d-00163e5e6c00 | ||
3049c812-754c-11ed-b56e-00163e5e6c00 | ||
... | ||
``` | ||
Apply as usual; this should execute quickly. Make sure to export `inconsistencies.log` and check the actual logs, then delete the resources created (`kubectl delete -f ...`). |
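
Writing the ConfigMap by hand gets tedious for longer lists. As a rough sketch, the manifest can be generated from a newline-separated file with a small shell function. `configmap_from_lines` is a hypothetical helper, and the ConfigMap name and key (`input`) simply mirror the example above.

```bash
# Hypothetical helper: read lines on stdin and print a ConfigMap manifest
# named "input" whose "input" key holds those lines.
configmap_from_lines() {
  cat <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: input
data:
  input: |
EOF
  # Indent every input line so it sits inside the YAML block scalar.
  sed 's/^/    /'
}
```

For example, `configmap_from_lines < uuids.txt > input-configmap.yaml` (placeholder file names). If `kubectl` is at hand, `kubectl create configmap input --from-file=input=uuids.txt --dry-run=client -o yaml` should produce an equivalent manifest.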
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
# WARNING: GENERATED FILE, DO NOT EDIT. | ||
# This file is generated by running hack/bin/generate-local-nix-packages.sh and | ||
# must be regenerated whenever local packages are added or removed, or | ||
# dependencies are added or removed. | ||
{ mkDerivation | ||
, aeson | ||
, base | ||
, brig | ||
, brig-types | ||
, bytestring | ||
, case-insensitive | ||
, cassandra-util | ||
, conduit | ||
, containers | ||
, extended | ||
, extra | ||
, galley-types | ||
, gitignoreSource | ||
, HsOpenSSL | ||
, http-client | ||
, imports | ||
, lens | ||
, lib | ||
, multihash | ||
, optparse-applicative | ||
, saml2-web-sso | ||
, string-conversions | ||
, text | ||
, time | ||
, tinylog | ||
, types-common | ||
, unliftio | ||
, uri-bytestring | ||
, uuid | ||
, wire-api | ||
}: | ||
mkDerivation { | ||
pname = "inconsistencies"; | ||
version = "1.0.0"; | ||
src = gitignoreSource ./.; | ||
isLibrary = false; | ||
isExecutable = true; | ||
executableHaskellDepends = [ | ||
aeson | ||
base | ||
brig | ||
brig-types | ||
bytestring | ||
case-insensitive | ||
cassandra-util | ||
conduit | ||
containers | ||
extended | ||
extra | ||
galley-types | ||
HsOpenSSL | ||
http-client | ||
imports | ||
lens | ||
multihash | ||
optparse-applicative | ||
saml2-web-sso | ||
string-conversions | ||
text | ||
time | ||
tinylog | ||
types-common | ||
unliftio | ||
uri-bytestring | ||
uuid | ||
wire-api | ||
]; | ||
description = "Find handles which belong to deleted users"; | ||
license = lib.licenses.agpl3Only; | ||
mainProgram = "inconsistencies"; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
cabal-version: 1.12 | ||
name: inconsistencies | ||
version: 1.0.0 | ||
synopsis: Find handles which belong to deleted users. | ||
category: Network | ||
author: Wire Swiss GmbH | ||
maintainer: Wire Swiss GmbH <[email protected]> | ||
copyright: (c) 2020 Wire Swiss GmbH | ||
license: AGPL-3 | ||
build-type: Simple | ||
|
||
executable inconsistencies | ||
main-is: Main.hs | ||
other-modules: | ||
DanglingHandles | ||
DanglingUserKeys | ||
EmailLessUsers | ||
HandleLessUsers | ||
Options | ||
Paths_inconsistencies | ||
|
||
hs-source-dirs: src | ||
default-extensions: | ||
NoImplicitPrelude | ||
AllowAmbiguousTypes | ||
BangPatterns | ||
ConstraintKinds | ||
DataKinds | ||
DefaultSignatures | ||
DeriveFunctor | ||
DeriveGeneric | ||
DeriveLift | ||
DeriveTraversable | ||
DerivingStrategies | ||
DerivingVia | ||
EmptyCase | ||
FlexibleContexts | ||
FlexibleInstances | ||
FunctionalDependencies | ||
GADTs | ||
InstanceSigs | ||
KindSignatures | ||
LambdaCase | ||
MultiParamTypeClasses | ||
MultiWayIf | ||
NamedFieldPuns | ||
OverloadedStrings | ||
PackageImports | ||
PatternSynonyms | ||
PolyKinds | ||
QuasiQuotes | ||
RankNTypes | ||
ScopedTypeVariables | ||
StandaloneDeriving | ||
TupleSections | ||
TypeApplications | ||
TypeFamilies | ||
TypeFamilyDependencies | ||
TypeOperators | ||
UndecidableInstances | ||
ViewPatterns | ||
|
||
ghc-options: | ||
-O2 -Wall -Wincomplete-uni-patterns -Wincomplete-record-updates | ||
-Wpartial-fields -fwarn-tabs -optP-Wno-nonportable-include-path | ||
-funbox-strict-fields -threaded -with-rtsopts=-N -with-rtsopts=-T | ||
-rtsopts | ||
|
||
build-depends: | ||
aeson | ||
, base | ||
, brig | ||
, brig-types | ||
, bytestring | ||
, case-insensitive | ||
, cassandra-util | ||
, conduit | ||
, containers | ||
, extended | ||
, extra | ||
, galley-types | ||
, HsOpenSSL | ||
, http-client | ||
, imports | ||
, lens | ||
, multihash | ||
, optparse-applicative | ||
, saml2-web-sso | ||
, string-conversions | ||
, text | ||
, time | ||
, tinylog | ||
, types-common | ||
, unliftio | ||
, uri-bytestring | ||
, uuid | ||
, wire-api | ||
|
||
default-language: Haskell2010 |