Support for DEBUG DIGEST module data type callback #21

nnmehta · 2024-11-15T19:49:49Z

Adds the digest callback for the bloom data type, plus DEBUG DIGEST integration tests for scenarios such as COPY, RDB load, and AOF rewrite.

Closes #9

YueTang-Vanessa · 2024-11-15T20:13:41Z

Please check the DCO guide and sign off your PR: https://github.com/valkey-io/valkey-bloom/pull/21/checks?check_run_id=33062120817.

KarthikSubbarao · 2024-11-15T20:30:07Z

src/digest.rs

+
+/// `Digest` is a high-level rust interface to the Valkey module C API
+/// abstracting away the raw C ffi calls.
+pub struct Digest {


I guess this is a temporary solution until the DEBUG wrapper functionality is added to the valkeymodule-rs SDK.

We can remove this once the DEBUG wrapper functionality is released in a new version.

KarthikSubbarao · 2024-11-15T20:37:20Z

src/wrapper/bloom_callback.rs

+    let mut dig = Digest::new(md);
+    let val = &*(value.cast::<BloomFilterType>());
+    dig.add_long_long(val.expansion.into());
+    dig.add_long_long(val.capacity());


capacity() returns a single result for the BloomFilterType, summed across all filters in the object.

We need to add data that captures the whole BloomFilterType, including every sub-filter.

This means we need to add the struct member data from the top-level BloomFilterType structure, and then add the struct member values from each inner BloomFilter structure in the vector.

This is needed for data correctness.

For example: if we just add capacity, we can have a bloom object with an overall capacity of 100 from one single filter. But we can also have another object where this is split across 5 filters adding up to a total of 100. These objects are not the same, so updating the digest logic as described above will distinguish them.
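To make the example concrete, here is a toy model in Python of the collision described above. It is only a sketch: `toy_digest` is a hypothetical stand-in built on SHA-1 and `struct`, not the module's callback or the server's real digest mixing, and the field values are made up for illustration.

```python
import hashlib
import struct


def toy_digest(expansion, fp_rate, filter_capacities, per_filter=True):
    """Toy stand-in for a digest callback (hypothetical, illustration only)."""
    h = hashlib.sha1()
    # Top-level fields of the object.
    h.update(struct.pack("<q", expansion))
    h.update(struct.pack("<d", fp_rate))
    if per_filter:
        # Feed every inner filter's capacity separately.
        for cap in filter_capacities:
            h.update(struct.pack("<q", cap))
    else:
        # Feed only the summed capacity, as capacity() would report it.
        h.update(struct.pack("<q", sum(filter_capacities)))
    return h.hexdigest()


one_filter = [100]
five_filters = [20, 20, 20, 20, 20]

# Summed capacity only: the two different objects collide.
collides = (toy_digest(2, 0.01, one_filter, per_filter=False)
            == toy_digest(2, 0.01, five_filters, per_filter=False))

# Per-filter data: the two objects are told apart.
distinct = toy_digest(2, 0.01, one_filter) != toy_digest(2, 0.01, five_filters)
```

With only the summed capacity the two objects hash identically; feeding each inner filter separately separates them.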

Signed-off-by: Nihal Mehta <[email protected]>

KarthikSubbarao · 2024-11-22T19:14:09Z

src/digest.rs

+    pub dig: *mut raw::RedisModuleDigest,
+}
+
+impl Digest {


Let's move this file into the wrapper directory

Signed-off-by: Nihal Mehta <[email protected]>

KarthikSubbarao · 2024-11-22T23:05:30Z

tests/test_aofrewrite.py

@@ -20,7 +20,10 @@ def test_basic_aofrewrite_and_restore(self):
        bf_info_result_1 = client.execute_command('BF.INFO testSave')
        assert(len(bf_info_result_1)) != 0
        curr_item_count_1 = client.info_obj().num_keys()
-
+        # cmd debug digest
+        client.debug_digest()


Can we check that this is not None? Also, when we restart the server later on, can we compare and check that they are the same?

KarthikSubbarao · 2024-11-22T23:06:27Z

tests/test_aofrewrite.py

        assert bf_info_result_2 == bf_info_result_1
+        assert debug_restore == debug_original
        client.execute_command('DEL testSave')

    def test_aofrewrite_bloomfilter_metrics(self):


Let's also add a debug digest test here

KarthikSubbarao · 2024-11-22T23:07:57Z

tests/test_basic.py

@@ -39,10 +39,15 @@ def test_copy_and_exists_cmd(self):
        assert client.execute_command('EXISTS filter') == 1
        mexists_result = client.execute_command('BF.MEXISTS filter item1 item2 item3 item4')
        assert len(madd_result) == 4 and len(mexists_result) == 4
+        # cmd debug digest
+        client.debug_digest()


Can we check that this is not None? Also, when we restart the server later on, can we compare and check that they are the same?

KarthikSubbarao · 2024-11-22T23:08:03Z

tests/test_save_and_restore.py

@@ -14,7 +14,9 @@ def test_basic_save_and_restore(self):
        bf_info_result_1 = client.execute_command('BF.INFO testSave')
        assert(len(bf_info_result_1)) != 0
        curr_item_count_1 = client.info_obj().num_keys()
-
+        client.debug_digest()


Can we check that this is not None, and not 0000000000000000000000000000000000000000? Also, when we restart the server later on, can we compare and check that the digests are the same?
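A possible shape for these checks, as a sketch only. The helper names (`assert_valid_digest`, `assert_digest_unchanged`) are hypothetical and not part of the test framework, and the sketch assumes the value returned by `client.debug_digest()` is either `str` or `bytes` holding 40 hex characters.

```python
ZERO_DIGEST = "0" * 40  # the all-zero digest mentioned in the review comment


def normalize_digest(digest):
    """Return the digest as str; clients may hand back bytes (assumption)."""
    if isinstance(digest, bytes):
        return digest.decode("ascii")
    return digest


def assert_valid_digest(digest):
    """Fail if the digest is missing or is the all-zero digest."""
    assert digest is not None, "DEBUG DIGEST returned None"
    text = normalize_digest(digest)
    assert text != ZERO_DIGEST, "DEBUG DIGEST returned the all-zero digest"
    return text


def assert_digest_unchanged(before, after):
    """Fail unless both digests are valid and identical."""
    assert assert_valid_digest(before) == assert_valid_digest(after)
```

In a test this could be used as `debug_original = assert_valid_digest(client.debug_digest())` before the restart, followed by `assert_digest_unchanged(debug_original, client.debug_digest())` afterwards.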

KarthikSubbarao · 2024-11-22T23:10:26Z

tests/valkeytests/valkey_test_case.py

@@ -474,7 +474,7 @@ def port_tracker_fixture(self, resource_port_tracker):
        self.port_tracker = resource_port_tracker

    def _get_valkey_args(self):
-        self.args.update({"maxmemory":self.maxmemory, "maxmemory-policy":"allkeys-random", "activerehashing":"yes", "databases": self.num_dbs, "repl-diskless-sync": "yes", "save": ""})
+        self.args.update({"maxmemory":self.maxmemory, "maxmemory-policy":"allkeys-random", "activerehashing":"yes", "databases": self.num_dbs, "repl-diskless-sync": "yes", "save": "", "enable-debug-command":"yes"})


Let us also add testing in two other places:

test_correctness.py - the digest should be checked for both scaling and non-scaling filters

test_replication.py - replicated commands should result in the same digest
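For the replication case, the comparison could look roughly like the sketch below. `assert_same_digest` is a hypothetical helper, and `FakeClient` exists only so the snippet runs without a server; a real test would pass the actual client objects for the servers being compared, assuming they expose `debug_digest()` as the tests in this PR do.

```python
class FakeClient:
    """Stand-in for a server client, used only to exercise the helper."""

    def __init__(self, digest):
        self._digest = digest

    def debug_digest(self):
        return self._digest


def assert_same_digest(clients):
    """Fail unless every client reports the same digest; return that digest."""
    digests = [client.debug_digest() for client in clients]
    assert digests, "no clients given"
    assert all(d == digests[0] for d in digests), digests
    return digests[0]


first = FakeClient("ab" * 20)
second = FakeClient("ab" * 20)
shared = assert_same_digest([first, second])
```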

KarthikSubbarao · 2024-11-22T23:11:40Z

src/wrapper/bloom_callback.rs

+    dig.add_long_long(val.expansion.into());
+    dig.add_string_buffer(&val.fp_rate.to_le_bytes());
+    for filter in &val.filters {
+        dig.add_string_buffer(&filter.bloom.bitmap());


Let's also add the sip_keys of every filter to the digest. When we compare two bloom objects, the sip keys of the bloom filters' hash functions should be compared as well.

If they are different, the objects are not the same.

Step 1 - Implement sip_keys on the BloomFilter struct:

    /// Return the keys used by the sip hasher of the raw bloom.
    pub fn sip_keys(&self) -> [(u64, u64); 2] {
        self.bloom.sip_keys()
    }

Step 2 - Here, in the callback, write the 4 numbers (which are u64) into the digest using add_long_long().
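One detail worth keeping in mind for Step 2: the underlying C digest function takes a signed `long long`, so each u64 has to be reinterpreted as a signed 64-bit value. The Python model below only illustrates that flattening and reinterpretation; `u64_to_i64` and `sip_key_words` are hypothetical names, and whether the Rust wrapper's add_long_long() needs an explicit cast is an assumption to verify against the wrapper itself.

```python
def u64_to_i64(value):
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in u64")
    return value - 2**64 if value >= 2**63 else value


def sip_key_words(sip_keys):
    """Flatten [(k0, k1), (k2, k3)] into the four values fed to the digest."""
    return [u64_to_i64(word) for pair in sip_keys for word in pair]


# Made-up keys, including values above the signed range.
words = sip_key_words([(1, 2), (2**64 - 1, 2**63)])
```

The order of the four values matters: the digest is order-sensitive, so both sides of a comparison must feed the keys in the same sequence.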

Signed-off-by: Nihal Mehta <[email protected]>

KarthikSubbarao reviewed Nov 15, 2024

nnmehta force-pushed the unstable branch from 6586124 to 67c5c09 Compare November 15, 2024 20:43

KarthikSubbarao assigned nnmehta Nov 15, 2024

KarthikSubbarao force-pushed the unstable branch 3 times, most recently from 188835b to a33e0e3 Compare November 21, 2024 21:52

nnmehta closed this Nov 21, 2024

nnmehta force-pushed the unstable branch from 67c5c09 to a33e0e3 Compare November 21, 2024 23:10

Support for DEBUG DIGEST module data type callback

4edc19c

Signed-off-by: Nihal Mehta <[email protected]>

nnmehta reopened this Nov 21, 2024

Update test cases

743516d

Signed-off-by: Nihal Mehta <[email protected]>

KarthikSubbarao reviewed Nov 22, 2024

Move digest to wrapper

988f056

Signed-off-by: Nihal Mehta <[email protected]>

KarthikSubbarao reviewed Nov 22, 2024

Update tests

cba8f17

Signed-off-by: Nihal Mehta <[email protected]>
