Nico Korthout edited this page Dec 11, 2024 · 8 revisions

Advanced zdb usage

This page describes some advanced ways to use zdb.

Find all active process instances

Each process instance has an entry in the PROCESS_INSTANCE_KEY_BY_DEFINITION_KEY column family. Its key is a composite key: the first part is the processDefinitionKey and the second part is the processInstanceKey.

ZEEBE_RUNTIME_DIRECTORY=raft-partition/partitions/1/runtime
zdb state list -p "$ZEEBE_RUNTIME_DIRECTORY" \
    -cf PROCESS_INSTANCE_KEY_BY_DEFINITION_KEY \
  > processInstancesPerDefinition.json

Now we can use jq to extract the processInstanceKey using capture(). This allows us to use a regular expression with capture groups to extract part of the value.

jq '.data[].key | capture("\\d+:(?<key>\\d+)") | (.key)|tonumber' processInstancesPerDefinition.json

Note that we pass the captured key through tonumber, otherwise the output is a string.

Also output the corresponding process definition key

You could also capture both the processDefinitionKey and processInstanceKey:

jq '.data[].key | capture("(?<def>\\d+):(?<pi>\\d+)") | { "processInstanceKey": (.pi)|tonumber,  "processDefinitionKey": (.def)|tonumber }'\
  processInstancesPerDefinition.json --compact-output

Example output:

{"processInstanceKey":4503599628409648,"processDefinitionKey":2251799813685662}
{"processInstanceKey":4503599628409655,"processDefinitionKey":2251799813685662}
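If you only need to know how many active instances each process definition has, the same capture can feed jq's group_by. The following is a minimal sketch: the function name count_instances_per_definition is hypothetical, and it assumes the file has the shape used above (a top-level data array whose keys look like processDefinitionKey:processInstanceKey).

```shell
# Hypothetical helper: counts active process instances per process definition.
# $1: the file written by the `zdb state list` command above (assumed shape).
count_instances_per_definition() {
  jq --compact-output '
    [ .data[].key | capture("(?<def>\\d+):(?<pi>\\d+)") ]
    | group_by(.def)
    | map({ processDefinitionKey: (.[0].def | tonumber), activeInstances: length })
  ' "$1"
}
```

Call it as count_instances_per_definition processInstancesPerDefinition.json. The result is a single JSON array with one object per process definition.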

Find all active incidents

Each incident has an entry in the INCIDENTS column family.

ZEEBE_RUNTIME_DIRECTORY=raft-partition/partitions/1/runtime
zdb state list -p "$ZEEBE_RUNTIME_DIRECTORY" -cf INCIDENTS > incidents.json

Using jq, we can extract the incidentKey and the processInstanceKey that it was raised in:

jq '.data[] | { "incidentKey": (.key)|tonumber, "processInstanceKey": .value.incidentRecord.processInstanceKey }' \
  incidents.json --compact-output

Example output:

{"incidentKey":4503599926214237,"processInstanceKey":4503599926213857}
{"incidentKey":4503599926214256,"processInstanceKey":4503599926213857}
{"incidentKey":4503599926219685,"processInstanceKey":4503599925444404}
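As the example output shows, one process instance can have several incidents. A possible way to see them grouped per process instance is sketched below. The function name incidents_per_instance is hypothetical, and the sketch assumes incidents.json has the shape used above.

```shell
# Hypothetical helper: lists the incident keys of each process instance.
# $1: incidents.json as written by the `zdb state list` command above.
incidents_per_instance() {
  jq --compact-output '
    [ .data[] | { incidentKey: (.key | tonumber),
                  processInstanceKey: .value.incidentRecord.processInstanceKey } ]
    | group_by(.processInstanceKey)
    | .[]
    | { processInstanceKey: .[0].processInstanceKey, incidentKeys: map(.incidentKey) }
  ' "$1"
}
```

Call it as incidents_per_instance incidents.json. It prints one compact JSON object per process instance.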

Find the highest process model version

zdb process list -p snapshot/ > processes.json # get all process model meta data
cat processes.json | jq '.[] | select( .bpmnProcessId == "PROCESS_ID" ).version' | sort -n | tail -1 # sort and tail to find the highest version
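The sort and tail steps can also be done inside jq. This is a minimal sketch with a hypothetical function name, highest_process_version; it assumes processes.json is a JSON array of objects with bpmnProcessId and version fields, as the command above implies.

```shell
# Hypothetical helper: prints the highest deployed version of one process.
# $1: processes.json as written above, $2: the BPMN process id to look for.
highest_process_version() {
  jq --arg id "$2" '[ .[] | select(.bpmnProcessId == $id) | .version ] | max' "$1"
}
```

Call it as highest_process_version processes.json PROCESS_ID. Passing the id with --arg avoids editing the jq program for each process. If no process matches, jq prints null.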

Read the process model from a create deployment command

Generally, you can inspect deployed processes, but when a deployment was rejected, the process is not available in the state. If you want to find it, you'll need to read it from the rejected create deployment command.

zdb log print -p=/tmp/data/raft-partition/partitions/1/ \
  | jq '. | select(.valueType == "DEPLOYMENT") | select(.intent == "CREATE") | .value.resources[].resource' \
  | cut -c 2- | rev| cut -c 2- | rev \
  | base64 -D \
  | pbcopy
  1. print the records from the log zdb log print -p=<path> (or print a specific record if you know the position zdb log search --position=<position> -p=<path>)
  2. filter for the DEPLOYMENT:CREATE record and retrieve its resource: jq '. | select(.valueType == "DEPLOYMENT") | select(.intent == "CREATE") | .value.resources[].resource'
  3. cut off the " from the json string: cut -c 2- | rev| cut -c 2- | rev
  4. decode the base64 resource: base64 -D (the macOS flag; GNU coreutils uses base64 -d)
  5. copy to clipboard: pbcopy (macOS only)

If your resource is a BPMN model, you'll now have an XML representation of it. You can paste this into the XML tab of a new BPMN diagram in Camunda Modeler.
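The base64 -D and pbcopy steps in the pipeline above are specific to macOS. A more portable variant is sketched here: jq's --raw-output option removes the surrounding quotes, and its @base64d filter (jq 1.6 or newer) does the decoding. The function name decode_deployment_resources is hypothetical, and the sketch assumes the same record shape as the command above.

```shell
# Hypothetical helper: reads log records on stdin and prints every resource of
# each DEPLOYMENT:CREATE record as decoded text.
# Assumes the record shape used in the pipeline above and jq 1.6 or newer.
decode_deployment_resources() {
  jq --raw-output '
    select(.valueType == "DEPLOYMENT" and .intent == "CREATE")
    | .value.resources[].resource
    | @base64d
  '
}
```

Call it as zdb log print -p=<path> | decode_deployment_resources > resource.bpmn. If the log contains several deployments or resources, they are written one after another, so you may want to narrow the selection first.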

Extract specific process instance variables

Variables are stored in the VARIABLES column family. This column family is structured as follows:

# <key> -> {value}
<scopeKey:variableName> -> {key, value}
- scopeKey (long): key referencing the element instance where the variable exists
- variableName (string): user-provided variable name
- key (long): unique key for the variable
- value (string): base64 encoded msg-pack value

With that information, we can figure out how to read data from this column family:

zdb state list -p=raft-partition/partitions/1/runtime -cf=VARIABLES -kf='ls' > variables.json

Let's unpack that for a moment:

  • -p=raft-partition/partitions/1/runtime: we tell zdb which data to read, in this case the runtime data of partition 1. You can also read data from a different partition or from a specific snapshot of the same partition.
  • -cf=VARIABLES: this tells zdb to read/output the data from the VARIABLES column family.
  • -kf='ls': this tells zdb the format of the keys, in our case it's a long scopeKey followed by a string variableName.
  • > variables.json: zdb's output is JSON by default. As it may take time for zdb to read through all the data, we store it in a local file. This allows us to pass it through jq quickly and repeatedly while we figure out the right filtering and transformations.

Filtering the variables

Now we can start filtering the variables data to find those variables we're interested in using jq:

jq '[.data[] | select(.key | test("^[0-9]+:parcelId$"))]' variables.json > parcelIdVars.json

This filters the data property of variables.json for objects whose key matches the regex ^[0-9]+:parcelId$. In other words, we're left with all variables named parcelId. The anchors ^ and $ keep out variables whose name merely contains parcelId (for example, a hypothetical parcelIdOld). The output is wrapped in an array [..] so that the result is a JSON array that we can easily transform further. We store the result directly in parcelIdVars.json for easy access.

Scope key and parcel id

Next we can extract the scope key for each variable. The scope key refers to the element instance where the variable exists. For global variables the scope key would reference the process instance directly, but as we're dealing with local variables it's referencing an element instance inside the process instance.

In the end, we want the processInstanceKey and the value of the parcelId variable. As an intermediate step, we can transform the data into objects that contain only the scopeKey and the variable's value.

jq '[.[] | { "scopeKey": (.key | capture("(?<scope>[0-9]+):parcelId").scope), "parcelId": .value.value }]' \
   parcelIdVars.json > parcelIdsWithScopeKey.json
  • [..]: we wrap the resulting objects in an array again
  • .[] | { .. }: this part constructs a new json object for each entry in the array
  • (.key | capture("(?<scope>[0-9]+):parcelId").scope): uses a regex with a named capture group scope that we can use to access the scopeKey from the structured key
  • "parcelId": .value.value: move the variable's value to parcelId
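The filter and transform steps can also be combined into one helper that works for any variable name. This is a sketch under assumptions: the function name variables_by_name is hypothetical, and it relies on the scopeKey:variableName key format described above.

```shell
# Hypothetical helper: emits { scopeKey, value } for every variable with a given name.
# $1: variables.json as written above, $2: the exact variable name.
variables_by_name() {
  jq --arg name "$2" '
    [ .data[]
      | (.key | capture("^(?<scope>[0-9]+):(?<name>.*)$")) as $k  # split the structured key
      | select($k.name == $name)                                  # exact name match
      | { scopeKey: $k.scope, value: .value.value } ]
  ' "$1"
}
```

Call it as variables_by_name variables.json parcelId. Comparing the captured name with == rather than embedding it in the regex means names containing regex metacharacters need no escaping.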

This results in the following data (example):

[
  {
    "scopeKey": "2251800441788927",
    "parcelId": "<REDACTED>"
  },
  #...
]

Decoding the parcel id

Variable values are msg-pack values encoded using base64. You can use jq to decode the base64 encoding directly. However, the decoded bytes are still msg-pack, a binary format, so the result can contain bytes that are not printable text. It may therefore be better to pass along the base64 encoded values when using them in text. If you do need to decode the variable values, here's how you can do that using jq:

jq '[.[] | .parcelId |= @base64d]' parcelIdsWithScopeKey.json > parcelIdsDecodedWithScopeKey.json
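To see why a decoded value can look odd as text, it can help to inspect the raw bytes. The sketch below decodes one base64 value and prints it as hex. The function name base64_to_hex is hypothetical, and it uses the GNU spelling base64 -d (the deployment example on this page uses -D, the macOS spelling).

```shell
# Hypothetical helper: prints the raw bytes of one base64 encoded value as hex.
# $1: the base64 string, e.g. a variable value taken from variables.json.
# Uses GNU `base64 -d`; adjust the flag if your base64 expects `-D`.
base64_to_hex() {
  printf '%s' "$1" | base64 -d | od -An -tx1
}
```

For example, base64_to_hex "o2FiYw==" prints the bytes a3 61 62 63. The leading a3 is the msg-pack header for a 3-byte string, and the remaining bytes are the text abc.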

Finding the process instance key

You can find the process instance key of any element instance through zdb and jq:

zdb instance entity 2251800441788927 -p=raft-partition/partitions/1/runtime \
  | jq '.elementRecord.processInstanceRecord.processInstanceKey'
2251800441788927

We might be able to use this programmatically, to transform all found variables. However, I've not been able to figure out how to do this easily. At the time of writing, jq does not (yet) support calling shell commands directly from jq. There are some ideas outlined here: https://stackoverflow.com/q/43192556/3442860.
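One possible workaround is to drive the loop from the shell instead of from jq: let jq print the scope keys, then call zdb once per key. This is only a sketch. The function name resolve_process_instance_keys is hypothetical, and it assumes parcelIdsWithScopeKey.json has the shape shown earlier and that zdb instance entity behaves as in the command above.

```shell
# Hypothetical helper: prints "<scopeKey><TAB><processInstanceKey>" per distinct scope key.
# $1: parcelIdsWithScopeKey.json as written above, $2: the runtime directory.
resolve_process_instance_keys() {
  jq --raw-output '.[].scopeKey' "$1" | sort -u | while read -r scope_key; do
    # </dev/null keeps zdb from consuming the list of keys the loop reads from.
    pi_key=$(zdb instance entity "$scope_key" -p="$2" </dev/null \
      | jq '.elementRecord.processInstanceRecord.processInstanceKey')
    printf '%s\t%s\n' "$scope_key" "$pi_key"
  done
}
```

Call it as resolve_process_instance_keys parcelIdsWithScopeKey.json raft-partition/partitions/1/runtime. Note that this starts zdb once per scope key, so it may be slow for large data sets.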

Find the timestamp of the last change

Sometimes, we want to know what moment in time is reflected by our data folder. While a snapshot is made at a certain position, we cannot easily tell this about the runtime state. However, we can use the log to determine when the last change happened because each record has a timestamp.

We don't want to inspect the last record per se, as this could be a COMMAND appended after any changes happened. It's more accurate to inspect the EVENT and COMMAND_REJECTION records for their timestamp.

The following looks for the last EVENT or COMMAND_REJECTION in the most recent 1000 (set by RECORD_LIMIT) records, and prints its timestamp.

RECORD_LIMIT=1000
ZEEBE_LOG_DIRECTORY=raft-partition/partitions/1
HIGHEST_RECORD_POSITION=$(zdb log status -p "$ZEEBE_LOG_DIRECTORY" | jq '.highestRecordPosition')
FROM_POSITION="$((HIGHEST_RECORD_POSITION-RECORD_LIMIT))"
zdb log print -p "$ZEEBE_LOG_DIRECTORY" --fromPosition="$FROM_POSITION" \
  | jq '.[].entries[] | select(.recordType == "EVENT" or .recordType == "COMMAND_REJECTION") | .timestamp' \
  | tail -n 1
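Two small additions can make this script more robust and its output easier to read. The sketch below rests on two assumptions that are not confirmed by this page: that 1 is the lowest valid record position, and that record timestamps are milliseconds since the Unix epoch. Both function names are hypothetical.

```shell
# Hypothetical helper: computes the start position, never going below 1
# (assumption: 1 is the lowest valid record position).
# $1: highest record position, $2: record limit.
from_position() {
  from=$(( $1 - $2 ))
  if [ "$from" -lt 1 ]; then from=1; fi
  printf '%s\n' "$from"
}

# Hypothetical helper: converts epoch milliseconds read from stdin to ISO 8601 (UTC).
# Assumes the timestamp is milliseconds since the Unix epoch.
millis_to_iso() {
  jq --raw-output '. / 1000 | floor | todate'
}
```

Use FROM_POSITION="$(from_position "$HIGHEST_RECORD_POSITION" "$RECORD_LIMIT")" when the log holds fewer records than RECORD_LIMIT, and append | millis_to_iso after tail -n 1 to print a readable date.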