[CRASH] KeyDB 6.3.4 crash on replication db.cpp:2944 'm_fTrackingChanges >= 0' is not true #874

Open
sbaier1 opened this issue Oct 8, 2024 · 12 comments

sbaier1 commented Oct 8, 2024

Crash report

Looks similar to #541, but I'm running 6.3.4 (I also tried the current HEAD of main, FWIW).

Paste the complete crash log between the quotes below. Please include a few lines from the log preceding the crash report to provide some context.

1:16:S 08 Oct 2024 13:42:13.924 * MASTER <-> REPLICA sync started


=== KEYDB BUG REPORT START: Cut & paste starting from here ===
1:16:S 08 Oct 2024 13:42:13.924 # === ASSERTION FAILED ===
1:16:S 08 Oct 2024 13:42:13.924 # ==> db.cpp:2944 'm_fTrackingChanges >= 0' is not true

------ STACK TRACE ------

Backtrace:
keydb-server *:6379(redisDbPersistentData::processChanges(bool)+0x108) [0xaaaae3078338]
keydb-server *:6379(beforeSleep(aeEventLoop*)+0xbac) [0xaaaae303c4cc]
keydb-server *:6379(aeProcessEvents+0x3d0) [0xaaaae30338b0]
keydb-server *:6379(aeMain+0xb4) [0xaaaae30340f4]
keydb-server *:6379(workerThreadMain(void*)+0x100) [0xaaaae3052500]
/lib/aarch64-linux-gnu/libc.so.6(+0x7d5c8) [0xffffa12cd5c8]
/lib/aarch64-linux-gnu/libc.so.6(+0xe5edc) [0xffffa1335edc]

------ INFO OUTPUT ------
# Server
redis_version:255.255.255
redis_git_sha1:603ebb27
redis_git_dirty:0
redis_build_id:c822dc7b2d5ab23f
redis_mode:standalone
os:Linux 6.10.0-linuxkit aarch64
arch_bits:64
multiplexing_api:epoll
atomicvar_api:atomic-builtin
gcc_version:11.4.0
process_id:1
process_supervised:no
run_id:ce43730b4900f3d074ce27dbf5828f82ffb6fabc
tcp_port:6379
server_time_usec:1728394933926041
uptime_in_seconds:63
uptime_in_days:0
hz:10
configured_hz:10
lru_clock:341685
executable:/keydb-server
config_file:/tmp4/redis.conf
availability_zone:
features:cluster_mget

# Clients
connected_clients:0
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:0
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
current_client_thread:0
thread_0_clients:0

# Memory
used_memory:1660016
used_memory_human:1.58M
used_memory_rss:22654976
used_memory_rss_human:21.61M
used_memory_peak:1679504
used_memory_peak_human:1.60M
used_memory_peak_perc:98.84%
used_memory_overhead:1677696
used_memory_startup:1677696
used_memory_dataset:18446744073709533936
used_memory_dataset_perc:1844674407370955161600.00%
allocator_allocated:2357320
allocator_active:2748416
allocator_resident:6537216
total_system_memory:14640783360
total_system_memory_human:13.64G
used_memory_lua:37888
used_memory_lua_human:37.00K
used_memory_scripts:0
used_memory_scripts_human:0B
number_of_cached_scripts:0
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.17
allocator_frag_bytes:391096
allocator_rss_ratio:2.38
allocator_rss_bytes:3788800
rss_overhead_ratio:3.47
rss_overhead_bytes:16117760
mem_fragmentation_ratio:13.66
mem_fragmentation_bytes:20996816
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_clients_slaves:0
mem_clients_normal:0
mem_aof_buffer:0
mem_allocator:jemalloc-5.2.1
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0
storage_provider:none
available_system_memory:unavailable

# Persistence
loading:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1728394870
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
rdb_last_cow_size:0
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0

# Stats
total_connections_received:0
total_commands_processed:3
instantaneous_ops_per_sec:0
total_net_input_bytes:1297999586
total_net_output_bytes:0
instantaneous_input_kbps:29962.83
instantaneous_output_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:0
evicted_keys:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0
total_forks:0
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:0
dump_payload_sanitizations:0
total_reads_processed:0
total_writes_processed:0
instantaneous_lock_contention:1
avg_lock_contention:0.375000
storage_provider_read_hits:0
storage_provider_read_misses:0

# Replication
role:slave
master_global_link_status:down
connected_masters:0
master_host:host.docker.internal
master_port:6378
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
slave_read_repl_offset:1
slave_repl_offset:1
master_link_down_since_seconds:-1
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:e8faa2cd336dccfdc1e23cd9417b00eb2dc2ddb4
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

# CPU
used_cpu_sys:3.137355
used_cpu_user:0.429532
used_cpu_sys_children:0.000000
used_cpu_user_children:0.001520
server_threads:1
long_lock_waits:0
used_cpu_sys_main_thread:2.814500
used_cpu_user_main_thread:0.331181

# Modules
module:name=search,ver=20614,api=1,filters=0,usedby=[],using=[ReJSON],options=[handle-io-errors]
module:name=timeseries,ver=11202,api=1,filters=0,usedby=[],using=[],options=[handle-io-errors]
module:name=ReJSON,ver=20600,api=1,filters=0,usedby=[search],using=[],options=[handle-io-errors]

# Commandstats
cmdstat_info:calls=3,usec=46,usec_per_call=15.33,rejected_calls=0,failed_calls=0

# Errorstats

# Cluster
cluster_enabled:0

# Keyspace

# KeyDB
mvcc_depth:0

------ CLIENT LIST OUTPUT ------

------ MODULES INFO OUTPUT ------
# search_version
search_version:2.6.14
search_redis_version:255.255.255 - oss

# search_index
search_number_of_indexes:0

# search_fields_statistics

# search_dialect_statistics
search_dialect_1:0
search_dialect_2:0
search_dialect_3:0

# search_runtime_configurations
search_concurrent_mode:OFF
search_enableGC:ON
search_minimal_term_prefix:2
search_maximal_prefix_expansions:200
search_query_timeout_ms:500
search_timeout_policy:return
search_cursor_read_size:1000
search_cursor_max_idle_time:300000
search_max_doc_table_size:1000000
search_max_search_results:1000000
search_max_aggregate_results:-1
search_search_pool_size:20
search_index_pool_size:8
search_gc_scan_size:100
search_min_phonetic_term_length:3

# ReJSON_trace
ReJSON_trace:   0: redis_module::base_info_func
             at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/redis-module-1.0.1/src/lib.rs:73:37
   1: rejson::__info_func
             at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/redis-module-1.0.1/src/macros.rs:120:13
   2: _Z18modulesCollectInfoPcPKcii
             at /opt/keydb/src/module.cpp:7270:24
   3: _Z14logModulesInfov
             at /opt/keydb/src/debug.cpp:1818:40
   4: _Z16printCrashReportv
             at /opt/keydb/src/debug.cpp:2077:19
      _serverAssert
             at /opt/keydb/src/debug.cpp:1014:25
   5: _ZN21redisDbPersistentData14processChangesEb
             at /opt/keydb/src/db.cpp:2944:5
   6: _Z11beforeSleepP11aeEventLoop
             at /opt/keydb/src/server.cpp:2951:55
   7: aeProcessEvents
             at /opt/keydb/src/ae.cpp:755:35
   8: aeMain
             at /opt/keydb/src/ae.cpp:823:24
   9: _Z16workerThreadMainPv
             at /opt/keydb/src/server.cpp:7386:15
  10: <unknown>
  11: <unknown>


------ FAST MEMORY TEST ------
1:16:S 08 Oct 2024 13:42:13.961 # main thread terminated
1:16:S 08 Oct 2024 13:42:13.961 # Bio thread for job type #0 terminated
1:16:S 08 Oct 2024 13:42:13.962 # Bio thread for job type #1 terminated
1:16:S 08 Oct 2024 13:42:13.962 # Bio thread for job type #2 terminated

Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.

=== KEYDB BUG REPORT END. Make sure to include from START to END. ===

       Please report the crash by opening an issue on github:

           https://github.com/JohnSully/KeyDB/issues

  Suspect RAM error? Use keydb-server --test-memory to verify it.

Additional information

Context: I am evaluating KeyDB as an alternative for a current Redis (7.2.0 FWIW) setup. I was trying to use replicaof to run KeyDB as a read-replica for an existing redis replica. The replica contains various time-series and JSON objects.

Out of curiosity, I've also tried running the same image on amd64 and with server-threads 1; neither resolved the issue, so it seems to be independent of both architecture and thread-safety (for the modules).

The crash happens during replication, while no clients are connected yet.

The configuration file is unremarkable. The minimal config that reproduces the issue is just loadmodule + replicaof to connect to the replica.
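
One detail in the INFO output above: used_memory_dataset is reported as 18446744073709533936 even though used_memory (1660016) is smaller than used_memory_overhead (1677696). That is consistent with the difference wrapping around in an unsigned 64-bit integer. The sketch below only checks the arithmetic; that the field is computed as used_memory - used_memory_overhead is an assumption, not something confirmed in this thread.

```python
# Values copied from the "# Memory" section of the crash report above.
USED_MEMORY = 1660016
USED_MEMORY_OVERHEAD = 1677696
REPORTED_DATASET = 18446744073709533936


def wrapping_sub_u64(a: int, b: int) -> int:
    """Subtract b from a the way an unsigned 64-bit integer would (modulo 2**64)."""
    return (a - b) % (1 << 64)


# Assumption: the dataset size is derived as used_memory - used_memory_overhead.
dataset = wrapping_sub_u64(USED_MEMORY, USED_MEMORY_OVERHEAD)
print(dataset == REPORTED_DATASET)
```

If that assumption holds, the odd value is a reporting artifact of an empty dataset and is probably unrelated to the assertion itself.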

  1. OS distribution and version

I'm using a container build based on ubuntu 22.04 including a patch from another issue here to get redisearch to work with KeyDB.

  2. Steps to reproduce (if any)
  • build container
  • run replicaof from a Redis 7.2.0 deployment containing JSON+timeseries objects

Dockerfile FWIW

FROM ubuntu:22.04 AS builder

# Avoid interactive prompts
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8
ENV TZ=America/New_York

# Consolidated dependencies to improve build speed + layer efficiency
RUN apt-get update && \
    apt-get install -y --no-install-recommends locales && \
    locale-gen en_US.UTF-8 && \
    update-locale LANG=en_US.UTF-8 && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
    build-essential \
    curl \
    git \
    unzip \
    libssl-dev \
    pkg-config \
    python3 \
    autoconf \
    automake \
    libtool \
    clang \
    libclang-dev \
    nasm \
    autotools-dev \
    libjemalloc-dev \
    tcl \
    tcl-dev \
    redis \
    uuid-dev \
    libcurl4-openssl-dev \
    libbz2-dev \
    libzstd-dev \
    liblz4-dev \
    libsnappy-dev \
    && rm -rf /var/lib/apt/lists/*

# Install Rust (required for building RedisJSON)
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y
ENV PATH="/root/.cargo/bin:${PATH}"

RUN mkdir -p /opt/keydb && \
    cd /opt/keydb && \
    git clone https://github.com/Snapchat/KeyDB.git --branch v6.3.4 . && \
    git submodule update --init --recursive && \
    make -j4 && \
    make install


RUN apt-get update && \
    apt-get install -y \
    wget \
    && rm -rf /var/lib/apt/lists/*

RUN mkdir -p /usr/lib/redis/modules

# via https://github.com/Snapchat/KeyDB/issues/708#issuecomment-1817843245
COPY fix-search.patch /tmp/
# RedisSearch module
RUN mkdir -p /opt/redis-modules/redisearch && \
    cd /opt/redis-modules/redisearch && \
    git clone --depth 1 https://github.com/RediSearch/RediSearch.git --branch v2.6.14 . && \
    git submodule update --init --recursive && \
    git apply /tmp/fix-search.patch && \
    make setup && \
    rm -rf /var/lib/apt/lists/* && \
    make build DEBUG=0 -j4 && \
    make pack && \
    cp bin/artifacts/redisearch*.zip /usr/lib/redis/modules && \
    cp /opt/redis-modules/redisearch/bin/linux-*-release/search/redisearch.so /usr/lib/redis/modules/redisearch.so && \
    make clean

# RedisTimeSeries module
RUN mkdir -p /opt/redis-modules/redistimeseries && \
    cd /opt/redis-modules/redistimeseries && \
    git clone --depth 1 https://github.com/RedisTimeSeries/RedisTimeSeries.git --branch v1.12.2 . && \
    git submodule update --init --recursive

RUN cd /opt/redis-modules/redistimeseries && \
    ./deps/readies/bin/getpy3 && \
    sbin/system-setup.py && \
    make deps && \
    rm -rf /var/lib/apt/lists/* && \
    make build DEBUG=0 -j4 && \
    make pack && \
    cp bin/artifacts/*.zip /usr/lib/redis/modules && \
    cp /opt/redis-modules/redistimeseries/bin/linux-*-release/redistimeseries.so /usr/lib/redis/modules/redistimeseries.so && \
    make clean


# Download and compile Redis modules

# RedisJSON module
RUN mkdir -p /opt/redis-modules/redisjson && \
    cd /opt/redis-modules/redisjson && \
    git clone --depth 1 https://github.com/RedisJSON/RedisJSON.git --branch v2.6.0 . && \
    git submodule update --init --recursive && \
    ./deps/readies/bin/getpy3 --modern && \
    ./deps/readies/bin/system-setup.py && \
    rm -rf /var/lib/apt/lists/* && \
    make DEBUG=0 -j4 && \
    make pack && \
    cp bin/artifacts/*.zip /usr/lib/redis/modules && \
    cp /opt/redis-modules/redisjson/bin/linux-*-release/target/release/librejson.so /usr/lib/redis/modules/librejson.so && \
    make clean

RUN cd /opt/redis-modules/redisjson && \
    mkdir -p /usr/lib/redis/modules/ && \
    cp bin/linux-*-release/rejson.so /usr/lib/redis/modules/

RUN apt-get update && apt-get install -y \
    cmake \
    libuv1-dev \
    && rm -rf /var/lib/apt/lists/*

FROM ubuntu:22.04


# Consolidated dependencies to improve build speed + layer efficiency
RUN apt-get update && \
    apt-get install -y --no-install-recommends locales && \
    locale-gen en_US.UTF-8 && \
    update-locale LANG=en_US.UTF-8 && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

RUN mkdir -p /usr/lib/redis/modules

COPY --from=builder /usr/lib/redis/modules/redisearch.so /usr/lib/redis/modules
COPY --from=builder /usr/lib/redis/modules/librejson.so /usr/lib/redis/modules
COPY --from=builder /usr/lib/redis/modules/redistimeseries.so /usr/lib/redis/modules
COPY --from=builder /usr/local/bin/keydb-server /usr/local/bin

# Configure KeyDB to load the modules
RUN mkdir -p /etc/keydb && \
    echo "loadmodule /usr/lib/redis/modules/librejson.so" >> /etc/keydb/redis.conf && \
    echo "loadmodule /usr/lib/redis/modules/redisearch.so" >> /etc/keydb/redis.conf && \
    echo "loadmodule /usr/lib/redis/modules/redistimeseries.so" >> /etc/keydb/redis.conf

# Expose default KeyDB ports
EXPOSE 6379

# Metadata
LABEL org.opencontainers.image.version="keydb-6.3.4-redisearch-2.6.14-redistimeseries-1.12.2-redisjson-2.6.0"

# Command to start KeyDB
CMD ["keydb-server", "/etc/keydb/redis.conf"]

patch

Index: src/spec.c
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/spec.c b/src/spec.c
--- a/src/spec.c	(revision 1f6278e1b3c9730a693f8b8b2a651b53a7dad5ed)
+++ b/src/spec.c	(date 1726578731076)
@@ -2480,6 +2480,7 @@
                                  void *data) {
   if (subevent == REDISMODULE_SUBEVENT_LOADING_RDB_START ||
       subevent == REDISMODULE_SUBEVENT_LOADING_AOF_START ||
+      subevent == REDISMODULE_SUBEVENT_LOADING_FLASH_START ||
       subevent == REDISMODULE_SUBEVENT_LOADING_REPL_START) {
     Indexes_Free(specDict_g);
     if (legacySpecDict) {
Index: src/redismodule.h
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/redismodule.h b/src/redismodule.h
--- a/src/redismodule.h	(revision 1f6278e1b3c9730a693f8b8b2a651b53a7dad5ed)
+++ b/src/redismodule.h	(date 1726578763608)
@@ -336,6 +336,7 @@
 #define REDISMODULE_SUBEVENT_PERSISTENCE_RDB_START 0
 #define REDISMODULE_SUBEVENT_PERSISTENCE_AOF_START 1
 #define REDISMODULE_SUBEVENT_PERSISTENCE_SYNC_RDB_START 2
+#define REDISMODULE_SUBEVENT_LOADING_FLASH_START 6
 #define REDISMODULE_SUBEVENT_PERSISTENCE_ENDED 3
 #define REDISMODULE_SUBEVENT_PERSISTENCE_FAILED 4
 #define _REDISMODULE_SUBEVENT_PERSISTENCE_NEXT 5

@keithchew

Hi @sbaier1

Just as a side note, the patch above only gets KeyDB to start up with the redisearch module with FLASH. Once started, however, the indexes will not be created, so it will not actually work. You will need additional work on startup to:

  1. Create indexes
  2. Feed data from FLASH to module to initialise indexes

I am unfamiliar with redisjson and timeseries, but I suspect they might also need some work, like redisearch, to work with FLASH. If you disable FLASH, does replication start to work?

sbaier1 commented Oct 17, 2024

I think that was mostly a red herring; I just included all the info for completeness. It crashes in the same way regardless of:

  • whether FLASH is enabled or disabled
  • whether redisearch is even loaded at all

At this point I could imagine it's simply due to the master I'm replicating from being upstream Redis 7.2.0 while KeyDB is based on 6.x, so maybe it's just a format error causing the crash (which is still not great cosmetically, though).
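
One way to check the format-mismatch theory is to look at the RDB version the master produces. An RDB file starts with the 5-byte magic `REDIS` followed by a 4-digit ASCII version number. A minimal sketch that reads it; the function name and the sample header are illustrative, and which versions a given KeyDB build accepts should be checked against its source rather than taken from here.

```python
def rdb_version(header: bytes) -> int:
    """Return the RDB format version encoded in the first 9 bytes of a dump."""
    if len(header) < 9 or header[:5] != b"REDIS":
        raise ValueError("not an RDB file: missing 'REDIS' magic")
    digits = header[5:9]
    if not digits.isdigit():
        raise ValueError(f"malformed RDB version field: {digits!r}")
    return int(digits)


# Illustrative header only; in practice read it from a dump file, e.g.
#   with open("dump.rdb", "rb") as f:
#       version = rdb_version(f.read(9))
print(rdb_version(b"REDIS0011"))
```

Comparing this number for the Redis master's dump with the highest version the KeyDB build can load would confirm or rule out the theory.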

dt-bernd commented Dec 3, 2024

I am facing the same issue as the OP. I am trying to migrate to KeyDB from Redis 7.4.1. I tried to create a KeyDB instance as a replica of the Redis server and got the same errors as above.

In fact, I can reproduce it locally with the following docker-compose.yml:

services:
  redis:
    image: redis:7.4.1
    
  keydb:
    image: eqalpha/keydb
    depends_on:
      - redis    
    command: keydb-server /etc/keydb/keydb.conf --replicaof redis 6379

And then run docker compose up

Which gives me the following output before the KeyDB container exits:

keydb-1  | === KEYDB BUG REPORT START: Cut & paste starting from here ===
keydb-1  | 1:25:S 03 Dec 2024 09:51:28.506 # === ASSERTION FAILED ===
keydb-1  | 1:25:S 03 Dec 2024 09:51:28.506 # ==> db.cpp:2935 'm_fTrackingChanges >= 0' is not true
keydb-1  | 
keydb-1  | ------ STACK TRACE ------
keydb-1  | 
keydb-1  | Backtrace:
keydb-1  | keydb-server *:6379(redisDbPersistentData::processChanges(bool)+0x100) [0xaaaaafca0530]
keydb-1  | keydb-server *:6379(beforeSleep(aeEventLoop*)+0x8e0) [0xaaaaafcb39f8]
keydb-1  | keydb-server *:6379(aeProcessEvents+0x348) [0xaaaaafcdf598]
keydb-1  | keydb-server *:6379(aeMain+0x94) [0xaaaaafce08cc]
keydb-1  | keydb-server *:6379(workerThreadMain(void*)+0x80) [0xaaaaafc1b3f0]
keydb-1  | /lib/aarch64-linux-gnu/libpthread.so.0(+0x7088) [0xffffa8971088]
keydb-1  | 
keydb-1  | ------ INFO OUTPUT ------
keydb-1  | # Server
keydb-1  | redis_version:6.3.4
keydb-1  | redis_git_sha1:7e7e5e57
keydb-1  | redis_git_dirty:1
keydb-1  | redis_build_id:eddbc1b3c341a077
keydb-1  | redis_mode:standalone
keydb-1  | os:Linux 6.11.9-orbstack-00279-g4cf512143f2e aarch64
keydb-1  | arch_bits:64
keydb-1  | multiplexing_api:epoll
keydb-1  | atomicvar_api:atomic-builtin
keydb-1  | gcc_version:7.5.0
keydb-1  | process_id:1
keydb-1  | process_supervised:no
keydb-1  | run_id:72fe9e99600a4eb9c06edca658c696c584845a83
keydb-1  | tcp_port:6379
keydb-1  | server_time_usec:1733219488507335
keydb-1  | uptime_in_seconds:5
keydb-1  | uptime_in_days:0
keydb-1  | hz:10
keydb-1  | configured_hz:10
keydb-1  | lru_clock:5166240
keydb-1  | executable:/data/keydb-server
keydb-1  | config_file:/etc/keydb/keydb.conf
keydb-1  | availability_zone:
keydb-1  | features:cluster_mget
keydb-1  | 
keydb-1  | # Clients
keydb-1  | connected_clients:0
keydb-1  | cluster_connections:0
keydb-1  | maxclients:10000
keydb-1  | client_recent_max_input_buffer:0
keydb-1  | client_recent_max_output_buffer:0
keydb-1  | blocked_clients:0
keydb-1  | tracking_clients:0
keydb-1  | clients_in_timeout_table:0
keydb-1  | current_client_thread:0
keydb-1  | thread_0_clients:0
keydb-1  | thread_1_clients:0
keydb-1  | 
keydb-1  | # Memory
keydb-1  | used_memory:2106104
keydb-1  | used_memory_human:2.01M
keydb-1  | used_memory_rss:15327232
keydb-1  | used_memory_rss_human:14.62M
keydb-1  | used_memory_peak:2125720
keydb-1  | used_memory_peak_human:2.03M
keydb-1  | used_memory_peak_perc:99.08%
keydb-1  | used_memory_overhead:2123928
keydb-1  | used_memory_startup:2123928
keydb-1  | used_memory_dataset:18446744073709533792
keydb-1  | used_memory_dataset_perc:1844674407370955161600.00%
keydb-1  | allocator_allocated:3074808
keydb-1  | allocator_active:3674112
keydb-1  | allocator_resident:7458816
keydb-1  | total_system_memory:5571833856
keydb-1  | total_system_memory_human:5.19G
keydb-1  | used_memory_lua:37888
keydb-1  | used_memory_lua_human:37.00K
keydb-1  | used_memory_scripts:0
keydb-1  | used_memory_scripts_human:0B
keydb-1  | number_of_cached_scripts:0
keydb-1  | maxmemory:0
keydb-1  | maxmemory_human:0B
keydb-1  | maxmemory_policy:noeviction
keydb-1  | allocator_frag_ratio:1.19
keydb-1  | allocator_frag_bytes:599304
keydb-1  | allocator_rss_ratio:2.03
keydb-1  | allocator_rss_bytes:3784704
keydb-1  | rss_overhead_ratio:2.05
keydb-1  | rss_overhead_bytes:7868416
keydb-1  | mem_fragmentation_ratio:7.21
keydb-1  | mem_fragmentation_bytes:13202056
keydb-1  | mem_not_counted_for_evict:0
keydb-1  | mem_replication_backlog:0
keydb-1  | mem_clients_slaves:0
keydb-1  | mem_clients_normal:0
keydb-1  | mem_aof_buffer:0
keydb-1  | mem_allocator:jemalloc-5.2.1
keydb-1  | active_defrag_running:0
keydb-1  | lazyfree_pending_objects:0
keydb-1  | lazyfreed_objects:0
keydb-1  | storage_provider:none
keydb-1  | available_system_memory:unavailable
keydb-1  | 
keydb-1  | # Persistence
keydb-1  | loading:0
keydb-1  | current_cow_size:0
keydb-1  | current_cow_size_age:0
keydb-1  | current_fork_perc:0.00
keydb-1  | current_save_keys_processed:0
keydb-1  | current_save_keys_total:0
keydb-1  | rdb_changes_since_last_save:0
keydb-1  | rdb_bgsave_in_progress:0
keydb-1  | rdb_last_save_time:1733219483
keydb-1  | rdb_last_bgsave_status:ok
keydb-1  | rdb_last_bgsave_time_sec:-1
keydb-1  | rdb_current_bgsave_time_sec:-1
keydb-1  | rdb_last_cow_size:0
keydb-1  | aof_enabled:0
keydb-1  | aof_rewrite_in_progress:0
keydb-1  | aof_rewrite_scheduled:0
keydb-1  | aof_last_rewrite_time_sec:-1
keydb-1  | aof_current_rewrite_time_sec:-1
keydb-1  | aof_last_bgrewrite_status:ok
keydb-1  | aof_last_write_status:ok
keydb-1  | aof_last_cow_size:0
keydb-1  | module_fork_in_progress:0
keydb-1  | module_fork_last_cow_size:0
keydb-1  | 
keydb-1  | # Stats
keydb-1  | total_connections_received:0
keydb-1  | total_commands_processed:0
keydb-1  | instantaneous_ops_per_sec:0
keydb-1  | total_net_input_bytes:211
keydb-1  | total_net_output_bytes:0
keydb-1  | instantaneous_input_kbps:0.00
keydb-1  | instantaneous_output_kbps:0.00
keydb-1  | rejected_connections:0
keydb-1  | sync_full:0
keydb-1  | sync_partial_ok:0
keydb-1  | sync_partial_err:0
keydb-1  | expired_keys:0
keydb-1  | expired_stale_perc:0.00
keydb-1  | expired_time_cap_reached_count:0
keydb-1  | expire_cycle_cpu_milliseconds:0
keydb-1  | evicted_keys:0
keydb-1  | keyspace_hits:0
keydb-1  | keyspace_misses:0
keydb-1  | pubsub_channels:0
keydb-1  | pubsub_patterns:0
keydb-1  | latest_fork_usec:0
keydb-1  | total_forks:0
keydb-1  | migrate_cached_sockets:0
keydb-1  | slave_expires_tracked_keys:0
keydb-1  | active_defrag_hits:0
keydb-1  | active_defrag_misses:0
keydb-1  | active_defrag_key_hits:0
keydb-1  | active_defrag_key_misses:0
keydb-1  | tracking_total_keys:0
keydb-1  | tracking_total_items:0
keydb-1  | tracking_total_prefixes:0
keydb-1  | unexpected_error_replies:0
keydb-1  | total_error_replies:0
keydb-1  | dump_payload_sanitizations:0
keydb-1  | total_reads_processed:0
keydb-1  | total_writes_processed:0
keydb-1  | instantaneous_lock_contention:1
keydb-1  | avg_lock_contention:0.046875
keydb-1  | storage_provider_read_hits:0
keydb-1  | storage_provider_read_misses:0
keydb-1  | 
keydb-1  | # Replication
keydb-1  | role:slave
keydb-1  | master_global_link_status:down
keydb-1  | connected_masters:0
keydb-1  | master_host:redis
keydb-1  | master_port:6379
keydb-1  | master_link_status:down
keydb-1  | master_last_io_seconds_ago:-1
keydb-1  | master_sync_in_progress:0
keydb-1  | slave_read_repl_offset:1
keydb-1  | slave_repl_offset:1
keydb-1  | master_link_down_since_seconds:-1
keydb-1  | slave_priority:100
keydb-1  | slave_read_only:1
keydb-1  | replica_announced:1
keydb-1  | connected_slaves:0
keydb-1  | master_failover_state:no-failover
keydb-1  | master_replid:e1a34b206254c51663f4576488050e743f19d151
keydb-1  | master_replid2:0000000000000000000000000000000000000000
keydb-1  | master_repl_offset:0
keydb-1  | second_repl_offset:-1
keydb-1  | repl_backlog_active:0
keydb-1  | repl_backlog_size:1048576
keydb-1  | repl_backlog_first_byte_offset:0
keydb-1  | repl_backlog_histlen:0
keydb-1  | 
keydb-1  | # CPU
keydb-1  | used_cpu_sys:0.034106
keydb-1  | used_cpu_user:0.040707
keydb-1  | used_cpu_sys_children:0.006105
keydb-1  | used_cpu_user_children:0.002272
keydb-1  | server_threads:2
keydb-1  | long_lock_waits:0
keydb-1  | used_cpu_sys_main_thread:0.010015
keydb-1  | used_cpu_user_main_thread:0.016024
keydb-1  | 
keydb-1  | # Modules
keydb-1  | 
keydb-1  | # Commandstats
keydb-1  | 
keydb-1  | # Errorstats
keydb-1  | 
keydb-1  | # Cluster
keydb-1  | cluster_enabled:0
keydb-1  | 
keydb-1  | # Keyspace
keydb-1  | 
keydb-1  | # KeyDB
keydb-1  | mvcc_depth:0
keydb-1  | 
keydb-1  | ------ CLIENT LIST OUTPUT ------
keydb-1  | 
keydb-1  | ------ MODULES INFO OUTPUT ------
keydb-1  | 
keydb-1  | ------ FAST MEMORY TEST ------
keydb-1  | 1:25:S 03 Dec 2024 09:51:28.507 # main thread terminated
keydb-1  | 1:25:S 03 Dec 2024 09:51:28.508 # Bio thread for job type #0 terminated
keydb-1  | 1:25:S 03 Dec 2024 09:51:28.508 # Bio thread for job type #1 terminated
keydb-1  | 1:25:S 03 Dec 2024 09:51:28.508 # Bio thread for job type #2 terminated
keydb-1  | 
keydb-1  | Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.
keydb-1  | 
keydb-1  | === KEYDB BUG REPORT END. Make sure to include from START to END. ===
keydb-1  | 
keydb-1  |        Please report the crash by opening an issue on github:
keydb-1  | 
keydb-1  |            https://github.com/JohnSully/KeyDB/issues
keydb-1  | 
keydb-1  |   Suspect RAM error? Use keydb-server --test-memory to verify it.

@keithchew

hi @dt-bernd

Just briefly looking at the code, I can see a possible code flow which can cause the crash. Can you try this to see if the crash still happens:

  • start redis
  • start keydb in standalone mode
  • once keydb starts up, issue cli command replicaof

If there is no crash, keydb is likely calling beforeSleep() when doing replication at startup, before it can fully initialise.
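
The third step above can also be scripted without a CLI client. A minimal sketch, assuming KeyDB is already running in standalone mode and reachable at 127.0.0.1:6379, and that the Redis master is reachable as redis:6379 (both addresses are placeholders). It encodes REPLICAOF in the RESP wire format and sends it over a plain socket; the helper names are made up for this example.

```python
import socket


def encode_resp(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def send_replicaof(keydb_host: str, keydb_port: int,
                   master_host: str, master_port: int) -> bytes:
    """Send REPLICAOF to a running server and return its raw first reply."""
    with socket.create_connection((keydb_host, keydb_port), timeout=5) as sock:
        sock.sendall(encode_resp("REPLICAOF", master_host, str(master_port)))
        return sock.recv(4096)


# Example (placeholder addresses, needs a running server):
#   print(send_replicaof("127.0.0.1", 6379, "redis", 6379))
```

Issuing the command only after the server has finished starting separates "crash during startup replication" from "crash during any replication from this master".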

dt-bernd commented Dec 3, 2024

Thanks for having a look @keithchew - I've tried that with the same result:

keydb-1  | === KEYDB BUG REPORT START: Cut & paste starting from here ===
keydb-1  | 1:27:S 03 Dec 2024 22:23:15.279 # === ASSERTION FAILED ===
keydb-1  | 1:27:S 03 Dec 2024 22:23:15.279 # ==> db.cpp:2935 'm_fTrackingChanges >= 0' is not true
keydb-1  | 
keydb-1  | ------ STACK TRACE ------
keydb-1  | 
keydb-1  | Backtrace:
keydb-1  | keydb-server *:6379(redisDbPersistentData::processChanges(bool)+0x100) [0xaaaab9d00530]
keydb-1  | keydb-server *:6379(beforeSleep(aeEventLoop*)+0x8e0) [0xaaaab9d139f8]
keydb-1  | keydb-server *:6379(aeProcessEvents+0x348) [0xaaaab9d3f598]
keydb-1  | keydb-server *:6379(aeMain+0x94) [0xaaaab9d408cc]
keydb-1  | keydb-server *:6379(workerThreadMain(void*)+0x80) [0xaaaab9c7b3f0]
keydb-1  | /lib/aarch64-linux-gnu/libpthread.so.0(+0x7088) [0xffffa6c13088]
keydb-1  | 
keydb-1  | ------ INFO OUTPUT ------
keydb-1  | # Server
keydb-1  | redis_version:6.3.4
keydb-1  | redis_git_sha1:7e7e5e57
keydb-1  | redis_git_dirty:1
keydb-1  | redis_build_id:eddbc1b3c341a077
keydb-1  | redis_mode:standalone
keydb-1  | os:Linux 6.11.9-orbstack-00279-g4cf512143f2e aarch64
keydb-1  | arch_bits:64
keydb-1  | multiplexing_api:epoll
keydb-1  | atomicvar_api:atomic-builtin
keydb-1  | gcc_version:7.5.0
keydb-1  | process_id:1
keydb-1  | process_supervised:no
keydb-1  | run_id:ba1c9411277b059008195e719c892e2f55fa41ad
keydb-1  | tcp_port:6379
keydb-1  | server_time_usec:1733264595280848
keydb-1  | uptime_in_seconds:152
keydb-1  | uptime_in_days:0
keydb-1  | hz:10
keydb-1  | configured_hz:10
keydb-1  | lru_clock:5211347
keydb-1  | executable:/data/keydb-server
keydb-1  | config_file:/etc/keydb/keydb.conf
keydb-1  | availability_zone:
keydb-1  | features:cluster_mget
keydb-1  | 
keydb-1  | # Clients
keydb-1  | connected_clients:3
keydb-1  | cluster_connections:0
keydb-1  | maxclients:10000
keydb-1  | client_recent_max_input_buffer:56
keydb-1  | client_recent_max_output_buffer:0
keydb-1  | blocked_clients:0
keydb-1  | tracking_clients:0
keydb-1  | clients_in_timeout_table:0
keydb-1  | current_client_thread:0
keydb-1  | thread_0_clients:3
keydb-1  | thread_1_clients:0
keydb-1  | 
keydb-1  | # Memory
keydb-1  | used_memory:2170128
keydb-1  | used_memory_human:2.07M
keydb-1  | used_memory_rss:13836288
keydb-1  | used_memory_rss_human:13.20M
keydb-1  | used_memory_peak:2271136
keydb-1  | used_memory_peak_human:2.17M
keydb-1  | used_memory_peak_perc:95.55%
keydb-1  | used_memory_overhead:2164264
keydb-1  | used_memory_startup:2102720
keydb-1  | used_memory_dataset:5864
keydb-1  | used_memory_dataset_perc:8.70%
keydb-1  | allocator_allocated:3098320
keydb-1  | allocator_active:3907584
keydb-1  | allocator_resident:7913472
keydb-1  | total_system_memory:5571837952
keydb-1  | total_system_memory_human:5.19G
keydb-1  | used_memory_lua:37888
keydb-1  | used_memory_lua_human:37.00K
keydb-1  | used_memory_scripts:0
keydb-1  | used_memory_scripts_human:0B
keydb-1  | number_of_cached_scripts:0
keydb-1  | maxmemory:0
keydb-1  | maxmemory_human:0B
keydb-1  | maxmemory_policy:noeviction
keydb-1  | allocator_frag_ratio:1.26
keydb-1  | allocator_frag_bytes:809264
keydb-1  | allocator_rss_ratio:2.03
keydb-1  | allocator_rss_bytes:4005888
keydb-1  | rss_overhead_ratio:1.75
keydb-1  | rss_overhead_bytes:5922816
keydb-1  | mem_fragmentation_ratio:6.38
keydb-1  | mem_fragmentation_bytes:11667920
keydb-1  | mem_not_counted_for_evict:0
keydb-1  | mem_replication_backlog:0
keydb-1  | mem_clients_slaves:0
keydb-1  | mem_clients_normal:61544
keydb-1  | mem_aof_buffer:0
keydb-1  | mem_allocator:jemalloc-5.2.1
keydb-1  | active_defrag_running:0
keydb-1  | lazyfree_pending_objects:0
keydb-1  | lazyfreed_objects:0
keydb-1  | storage_provider:none
keydb-1  | available_system_memory:unavailable
keydb-1  | 
keydb-1  | # Persistence
keydb-1  | loading:0
keydb-1  | current_cow_size:0
keydb-1  | current_cow_size_age:0
keydb-1  | current_fork_perc:0.00
keydb-1  | current_save_keys_processed:0
keydb-1  | current_save_keys_total:0
keydb-1  | rdb_changes_since_last_save:0
keydb-1  | rdb_bgsave_in_progress:0
keydb-1  | rdb_last_save_time:1733264443
keydb-1  | rdb_last_bgsave_status:ok
keydb-1  | rdb_last_bgsave_time_sec:-1
keydb-1  | rdb_current_bgsave_time_sec:-1
keydb-1  | rdb_last_cow_size:0
keydb-1  | aof_enabled:0
keydb-1  | aof_rewrite_in_progress:0
keydb-1  | aof_rewrite_scheduled:0
keydb-1  | aof_last_rewrite_time_sec:-1
keydb-1  | aof_current_rewrite_time_sec:-1
keydb-1  | aof_last_bgrewrite_status:ok
keydb-1  | aof_last_write_status:ok
keydb-1  | aof_last_cow_size:0
keydb-1  | module_fork_in_progress:0
keydb-1  | module_fork_last_cow_size:0
keydb-1  | 
keydb-1  | # Stats
keydb-1  | total_connections_received:3
keydb-1  | total_commands_processed:24
keydb-1  | instantaneous_ops_per_sec:0
keydb-1  | total_net_input_bytes:64241951
keydb-1  | total_net_output_bytes:49441
keydb-1  | instantaneous_input_kbps:37466.81
keydb-1  | instantaneous_output_kbps:0.00
keydb-1  | rejected_connections:0
keydb-1  | sync_full:0
keydb-1  | sync_partial_ok:0
keydb-1  | sync_partial_err:0
keydb-1  | expired_keys:0
keydb-1  | expired_stale_perc:0.00
keydb-1  | expired_time_cap_reached_count:0
keydb-1  | expire_cycle_cpu_milliseconds:3
keydb-1  | evicted_keys:0
keydb-1  | keyspace_hits:0
keydb-1  | keyspace_misses:0
keydb-1  | pubsub_channels:0
keydb-1  | pubsub_patterns:0
keydb-1  | latest_fork_usec:0
keydb-1  | total_forks:0
keydb-1  | migrate_cached_sockets:0
keydb-1  | slave_expires_tracked_keys:0
keydb-1  | active_defrag_hits:0
keydb-1  | active_defrag_misses:0
keydb-1  | active_defrag_key_hits:0
keydb-1  | active_defrag_key_misses:0
keydb-1  | tracking_total_keys:0
keydb-1  | tracking_total_items:0
keydb-1  | tracking_total_prefixes:0
keydb-1  | unexpected_error_replies:0
keydb-1  | total_error_replies:0
keydb-1  | dump_payload_sanitizations:0
keydb-1  | total_reads_processed:22
keydb-1  | total_writes_processed:22
keydb-1  | instantaneous_lock_contention:1
keydb-1  | avg_lock_contention:0.046875
keydb-1  | storage_provider_read_hits:0
keydb-1  | storage_provider_read_misses:0
keydb-1  | 
keydb-1  | # Replication
keydb-1  | role:slave
keydb-1  | master_global_link_status:down
keydb-1  | connected_masters:0
keydb-1  | master_host:redis
keydb-1  | master_port:6379
keydb-1  | master_link_status:down
keydb-1  | master_last_io_seconds_ago:-1
keydb-1  | master_sync_in_progress:0
keydb-1  | slave_read_repl_offset:1
keydb-1  | slave_repl_offset:1
keydb-1  | master_link_down_since_seconds:-1
keydb-1  | slave_priority:100
keydb-1  | slave_read_only:1
keydb-1  | replica_announced:1
keydb-1  | connected_slaves:0
keydb-1  | master_failover_state:no-failover
keydb-1  | master_replid:17e6236dfa59326d11c1ebd7e78cd9c4e000c764
keydb-1  | master_replid2:0000000000000000000000000000000000000000
keydb-1  | master_repl_offset:0
keydb-1  | second_repl_offset:-1
keydb-1  | repl_backlog_active:0
keydb-1  | repl_backlog_size:1048576
keydb-1  | repl_backlog_first_byte_offset:0
keydb-1  | repl_backlog_histlen:0
keydb-1  | 
keydb-1  | # CPU
keydb-1  | used_cpu_sys:0.745687
keydb-1  | used_cpu_user:0.781449
keydb-1  | used_cpu_sys_children:0.011979
keydb-1  | used_cpu_user_children:0.000931
keydb-1  | server_threads:2
keydb-1  | long_lock_waits:16
keydb-1  | used_cpu_sys_main_thread:0.459630
keydb-1  | used_cpu_user_main_thread:0.424092
keydb-1  | 
keydb-1  | # Modules
keydb-1  | 
keydb-1  | # Commandstats
keydb-1  | cmdstat_client:calls=4,usec=69,usec_per_call=17.25,rejected_calls=0,failed_calls=0
keydb-1  | cmdstat_config:calls=2,usec=123,usec_per_call=61.50,rejected_calls=0,failed_calls=0
keydb-1  | cmdstat_info:calls=12,usec=3185,usec_per_call=265.42,rejected_calls=0,failed_calls=0
keydb-1  | cmdstat_module:calls=1,usec=10,usec_per_call=10.00,rejected_calls=0,failed_calls=0
keydb-1  | cmdstat_scan:calls=2,usec=181,usec_per_call=90.50,rejected_calls=0,failed_calls=0
keydb-1  | cmdstat_dbsize:calls=2,usec=3,usec_per_call=1.50,rejected_calls=0,failed_calls=0
keydb-1  | cmdstat_replicaof:calls=1,usec=5084475,usec_per_call=5084475.00,rejected_calls=0,failed_calls=0
keydb-1  | 
keydb-1  | # Errorstats
keydb-1  | 
keydb-1  | # Cluster
keydb-1  | cluster_enabled:0
keydb-1  | 
keydb-1  | # Keyspace
keydb-1  | 
keydb-1  | # KeyDB
keydb-1  | mvcc_depth:0
keydb-1  | 
keydb-1  | ------ CLIENT LIST OUTPUT ------
keydb-1  | id=4 addr=192.168.97.1:65200 laddr=192.168.97.3:6379 fd=20 name=redisinsight-common-12899187-7a14-43f6-987b-ece14bf5767f---1-1- age=34 idle=6 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20496 events=r cmd=info user=default redir=-1
keydb-1  | id=5 addr=192.168.97.1:24622 laddr=192.168.97.3:6379 fd=21 name=redisinsight-browser-12899187-7a14-43f6-987b-ece14bf5767f---1-1- age=34 idle=26 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20536 events=r cmd=scan user=default redir=-1
keydb-1  | id=6 addr=192.168.97.1:33488 laddr=192.168.97.3:6379 fd=22 name=redisinsight-cli-12899187-7a14-43f6-987b-ece14bf5767f--9e492a1a-0311-4573-82db-1b509f46e06e-1-1- age=23 idle=6 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 argv-mem=0 obl=0 oll=0 omem=0 tot-mem=20512 events=r cmd=replicaof user=default redir=-1
keydb-1  | 
keydb-1  | ------ MODULES INFO OUTPUT ------
keydb-1  | 
keydb-1  | ------ FAST MEMORY TEST ------
keydb-1  | 1:27:S 03 Dec 2024 22:23:15.281 # main thread terminated
keydb-1  | 1:27:S 03 Dec 2024 22:23:15.281 # Bio thread for job type #0 terminated
keydb-1  | 1:27:S 03 Dec 2024 22:23:15.281 # Bio thread for job type #1 terminated
keydb-1  | 1:27:S 03 Dec 2024 22:23:15.281 # Bio thread for job type #2 terminated
keydb-1  | 
keydb-1  | Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.
keydb-1  | 
keydb-1  | === KEYDB BUG REPORT END. Make sure to include from START to END. ===

keithchew commented Dec 3, 2024

hi @dt-bernd

Did the crash happen as soon as you entered the cli command? Note that you need to remove the replicaof command from your docker-compose.yml so keydb starts up as master.

dt-bernd commented Dec 4, 2024

@keithchew - Yes. I updated the docker compose command option so that KeyDB started in standalone mode, and I made sure everything was working. I then manually issued the "REPLICAOF" command.

I then get the following output in the logs:

keydb-1  | 1:27:S 04 Dec 2024 19:39:03.721 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
keydb-1  | 1:27:S 04 Dec 2024 19:39:03.721 * Connecting to MASTER redis:6379
keydb-1  | 1:27:S 04 Dec 2024 19:39:03.750 * MASTER <-> REPLICA sync started
keydb-1  | 1:27:S 04 Dec 2024 19:39:03.750 * REPLICAOF redis:6379 enabled (user request from 'id=6 addr=192.168.97.1:48926 laddr=192.168.97.3:6379 fd=22 name=redisinsight-cli-12899187-7a14-43f6-987b-ece14bf5767f--e00c05f9-a8a6-40cf-968e-0a22c4dfe822-1-1- age=6 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=40954 argv-mem=18 obl=0 oll=0 omem=0 tot-mem=61482 events=r cmd=replicaof user=default redir=-1')
keydb-1  | 1:27:S 04 Dec 2024 19:39:03.751 * Non blocking connect for SYNC fired the event.
keydb-1  | 1:27:S 04 Dec 2024 19:39:03.755 * Master does not support REPLPING, sending PING instead...
keydb-1  | 1:27:S 04 Dec 2024 19:39:03.755 * Non blocking connect for SYNC fired the event.
keydb-1  | 1:27:S 04 Dec 2024 19:39:03.763 * Master replied to PING, replication can continue...
keydb-1  | 1:27:S 04 Dec 2024 19:39:03.764 # non-fatal: Master doesn't understand REPLCONF uuid
keydb-1  | 1:27:S 04 Dec 2024 19:39:03.764 * Trying a partial resynchronization (request 7ccdaad7eb49851df611e9c74c4da280ec2463f8:1).
redis-1  | 1:M 04 Dec 2024 19:39:03.764 * Replica 192.168.97.3:6379 asks for synchronization
redis-1  | 1:M 04 Dec 2024 19:39:03.765 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '7ccdaad7eb49851df611e9c74c4da280ec2463f8', my replication IDs are '774e6cb9dc767e5f9e3f788ddcce566793d2a960' and 'c80356c7af9820a1b7462d8cd50d9d8f3d05b339')
redis-1  | 1:M 04 Dec 2024 19:39:03.765 * Delay next BGSAVE for diskless SYNC
redis-1  | 1:M 04 Dec 2024 19:39:08.181 * Starting BGSAVE for SYNC with target: replicas sockets
keydb-1  | 1:27:S 04 Dec 2024 19:39:08.182 * Full resync from master: 774e6cb9dc767e5f9e3f788ddcce566793d2a960:4053
keydb-1  | 1:27:S 04 Dec 2024 19:39:08.182 * Discarding previously cached master state.
redis-1  | 1:M 04 Dec 2024 19:39:08.193 * Background RDB transfer started by pid 23
keydb-1  | 1:27:S 04 Dec 2024 19:39:08.213 * MASTER <-> REPLICA sync: receiving streamed RDB from master with EOF to disk
redis-1  | 23:C 04 Dec 2024 19:39:08.737 * Fork CoW for RDB: current 1 MB, peak 1 MB, average 0 MB
redis-1  | 1:M 04 Dec 2024 19:39:08.737 * Diskless rdb transfer, done reading from pipe, 1 replicas still up.
keydb-1  | 1:27:S 04 Dec 2024 19:39:08.738 * MASTER <-> REPLICA sync: Flushing old data
keydb-1  | 1:27:S 04 Dec 2024 19:39:08.738 * MASTER <-> REPLICA sync: Loading DB in memory
redis-1  | 1:M 04 Dec 2024 19:39:08.750 * Connection with replica 192.168.97.3:6379 lost.
keydb-1  | 1:27:S 04 Dec 2024 19:39:08.749 # Can't handle RDB format version 12
keydb-1  | 1:27:S 04 Dec 2024 19:39:08.749 # Failed trying to load the MASTER synchronization DB from disk
keydb-1  | 1:27:S 04 Dec 2024 19:39:08.750 * Reconnecting to MASTER redis:6379 after failure
keydb-1  | 1:27:S 04 Dec 2024 19:39:08.775 * MASTER <-> REPLICA sync started

followed by the error above.

keithchew commented Dec 4, 2024

I see it! When keydb encounters "Can't handle RDB format version 12", the method exits without doing "goto eoferr", so change tracking is never resumed.
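
To make that failure mode concrete, here is a toy Python sketch of the same class of bug. It is not KeyDB's code: the `Tracker` class, its `suspend`/`resume` methods, and the direction in which the counter moves are all assumptions made for illustration. It only shows how an early exit that skips the cleanup path can leave a counter negative, so that a later `>= 0` assertion fails.

```python
class Tracker:
    """Hypothetical stand-in for a change-tracking counter (not KeyDB's type)."""

    def __init__(self):
        self.tracking = 0

    def suspend(self):
        # Assumption: pausing tracking decrements the counter.
        self.tracking -= 1

    def resume(self):
        self.tracking += 1

    def process_changes(self):
        # Mirrors the shape of the failed assertion in the crash log.
        assert self.tracking >= 0, "tracking counter went negative"


def load_buggy(tracker, version, max_version=9):
    """Early return on an unsupported version skips resume()."""
    tracker.suspend()
    if version > max_version:
        return False  # cleanup path is never reached
    tracker.resume()
    return True


def load_fixed(tracker, version, max_version=9):
    """try/finally plays the role of a shared cleanup label."""
    tracker.suspend()
    try:
        return version <= max_version
    finally:
        tracker.resume()  # runs on every exit path
```

With `load_buggy`, a version 12 payload leaves `tracking` at -1 and the next `process_changes()` call raises; with `load_fixed`, the counter returns to 0 whether or not the load succeeds.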

And looking further, keydb's supported RDB version is 9, so your redis is quite far ahead.

I'm not sure what the best workaround is. The only thing I can think of is to first migrate your redis data to RDB version 9, then move to keydb. Alternatively, if you are sure you don't have version-specific data, then perhaps you can update keydb's code to accept higher versions. Since you are just evaluating, the latter might be a quick way to see whether keydb can load the data.
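
Before trying either route, it can help to confirm which RDB version a dump actually carries. An RDB file begins with the ASCII magic `REDIS` followed by a four-digit version number, so a small script can read it. The sketch below is a minimal example; the function names are made up for illustration, and the limit of 9 is simply the value quoted in this thread, not something read from KeyDB itself.

```python
import io

RDB_MAGIC = b"REDIS"
# Assumption: maximum version taken from the discussion in this thread.
MAX_SUPPORTED_RDB_VERSION = 9


def read_rdb_version(stream):
    """Return the RDB format version stored in the first 9 bytes of a dump."""
    header = stream.read(9)
    if len(header) != 9 or not header.startswith(RDB_MAGIC):
        raise ValueError("not an RDB file: bad magic")
    digits = header[len(RDB_MAGIC):]
    if not digits.isdigit():
        raise ValueError("not an RDB file: bad version field")
    return int(digits)


def can_load(stream, max_version=MAX_SUPPORTED_RDB_VERSION):
    """True if the dump's version is within the given supported range."""
    return 1 <= read_rdb_version(stream) <= max_version


if __name__ == "__main__":
    # In-memory example; for a real file use open(path, "rb") instead.
    sample = io.BytesIO(b"REDIS0012")
    print(read_rdb_version(sample))
```

Run against a dump produced by the source redis, this shows whether the file is within the range the target server accepts before any replication attempt is made.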

dt-bernd commented Dec 4, 2024

@keithchew Thanks, I had also tried the diskless sync option, hoping that might avoid the RDB version issue - but got similar results. Another option was using riot replication to move the data from the Redis 7 format to KeyDB. I was able to replicate the data using the --struct option:

docker run riotx/riot replicate --struct --mode live redis://redis:6379 redis://keydb:6379

I was able to get most of the data across. Unfortunately, one of our use cases is streams, and while it moved the actual stream data, it didn't move the consumer metadata. For anyone else, this might be an option for doing a "live" migration.

keithchew commented Dec 4, 2024

The issue is your data, not the replication method. Because of the version mismatch, keydb does not even attempt to replicate, and its incorrect error handling causes the crash.

Your attempt with the riot --struct method seems painful. If you decide to try my suggestion above, it is only a one-line change in keydb's source code.

keithchew commented Dec 4, 2024

Looking into this a bit more, I think the one-liner has a good chance of working, as streams are from version 9:
https://github.com/sripathikrishnan/redis-rdb-tools/blob/master/docs/RDB_Version_History.textile

I don't know what changed in versions 10, 11, and 12, so you may want to double-check this before proceeding.

dt-bernd commented Dec 4, 2024

Thanks, will see what I can do with the code change and test.
