Support for DEBUG DIGEST module data type callback #21
base: unstable
Conversation
Please check the DCO guide and sign off your PR: https://github.com/valkey-io/valkey-bloom/pull/21/checks?check_run_id=33062120817
src/digest.rs (Outdated)

    /// `Digest` is a high-level rust interface to the Valkey module C API
    /// abstracting away the raw C ffi calls.
    pub struct Digest {
I guess this is a solution until the DEBUG wrapper functionality is added to the valkeymodule-rs SDK. We can remove this once the DEBUG wrapper functionality is released in a new version.
src/wrapper/bloom_callback.rs (Outdated)

    let mut dig = Digest::new(md);
    let val = &*(value.cast::<BloomFilterType>());
    dig.add_long_long(val.expansion.into());
    dig.add_long_long(val.capacity());
capacity() returns a result for the whole BloomFilterType, summed across all filters in the object. We will need to add data that is specific to the overall BloomFilterType, including every sub filter. This means we need to add the struct member data from the top-level BloomFilterType structure, and then add the struct member values from each inner BloomFilter structure in the vector. This is needed for data correctness.

For example: if we just add capacity, one bloom object can have an overall capacity of 100 from a single filter, while another object has that capacity split across 5 filters adding up to a total of 100. These objects are not the same, so updating the debug logic as described above will distinguish them.
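This ambiguity can be illustrated with a small Python simulation. Here sha1 stands in for the module digest, and the lists stand in for per-filter capacities; the real callback uses add_long_long() on the RedisModuleDigest, so this is a sketch of the idea rather than the actual implementation:

```python
import hashlib
import struct

def digest_total_only(filter_capacities):
    """Digest that folds in only the summed capacity (ambiguous)."""
    h = hashlib.sha1()
    h.update(struct.pack("<q", sum(filter_capacities)))
    return h.hexdigest()

def digest_per_filter(filter_capacities):
    """Digest that folds in each sub-filter's capacity individually."""
    h = hashlib.sha1()
    for cap in filter_capacities:
        h.update(struct.pack("<q", cap))
    return h.hexdigest()

one_big = [100]                  # single filter, capacity 100
scaled = [20, 20, 20, 20, 20]    # five filters, total capacity 100

# Total-only digests collide even though the objects differ:
assert digest_total_only(one_big) == digest_total_only(scaled)
# Per-filter digests distinguish the two objects:
assert digest_per_filter(one_big) != digest_per_filter(scaled)
```

Folding in the per-filter values makes the digest sensitive to how the capacity is distributed, which is exactly the correctness property the comment asks for.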
Force-pushed from 188835b to a33e0e3
Signed-off-by: Nihal Mehta <[email protected]>
src/digest.rs (Outdated)

    pub dig: *mut raw::RedisModuleDigest,
    }

    impl Digest {
Let's move this file into the wrapper directory
Signed-off-by: Nihal Mehta <[email protected]>
tests/test_aofrewrite.py (Outdated)

    @@ -20,7 +20,10 @@ def test_basic_aofrewrite_and_restore(self):
    bf_info_result_1 = client.execute_command('BF.INFO testSave')
    assert(len(bf_info_result_1)) != 0
    curr_item_count_1 = client.info_obj().num_keys()

    # cmd debug digest
    client.debug_digest()
Can we check that this is not None? Also, when we restart the server later on, can we compare and check that they are the same?
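A sketch of the suggested check, with FakeClient standing in as a hypothetical substitute for the real test client (its debug_digest() here just hashes a state string for illustration; a real server derives the digest from the dataset):

```python
import hashlib

class FakeClient:
    """Hypothetical stand-in for the test client; debug_digest() is assumed
    to return the server's DEBUG DIGEST value as a hex string."""
    def __init__(self, state):
        self.state = state

    def debug_digest(self):
        # Deterministic placeholder for the server-side digest computation.
        return hashlib.sha1(self.state.encode()).hexdigest()

def check_digest_stable(client_before, client_after):
    """Digest must be present before restart and unchanged after it."""
    digest_original = client_before.debug_digest()
    assert digest_original is not None
    digest_restore = client_after.debug_digest()
    assert digest_restore == digest_original
    return digest_original

# Same dataset before and after a simulated restart yields the same digest.
check_digest_stable(FakeClient("filter-state"), FakeClient("filter-state"))
```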
tests/test_aofrewrite.py (Outdated)

    assert bf_info_result_2 == bf_info_result_1
    assert debug_restore == debug_original
    client.execute_command('DEL testSave')

    def test_aofrewrite_bloomfilter_metrics(self):
Let's also add a debug digest test here
tests/test_basic.py (Outdated)

    @@ -39,10 +39,15 @@ def test_copy_and_exists_cmd(self):
    assert client.execute_command('EXISTS filter') == 1
    mexists_result = client.execute_command('BF.MEXISTS filter item1 item2 item3 item4')
    assert len(madd_result) == 4 and len(mexists_result) == 4
    # cmd debug digest
    client.debug_digest()
Can we check that this is not None? Also, when we restart the server later on, can we compare and check that they are the same?
tests/test_save_and_restore.py (Outdated)

    @@ -14,7 +14,9 @@ def test_basic_save_and_restore(self):
    bf_info_result_1 = client.execute_command('BF.INFO testSave')
    assert(len(bf_info_result_1)) != 0
    curr_item_count_1 = client.info_obj().num_keys()

    client.debug_digest()
Can we check that this is not None? Also, when we restart the server later on, can we compare and check that they are the same?
And not 0000000000000000000000000000000000000000
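The two checks above (present, and not the 40-zero value DEBUG DIGEST reports for an empty dataset) can be captured in a small helper; a minimal sketch, where `is_valid_digest` is a hypothetical name for such a test utility:

```python
# DEBUG DIGEST reports 40 hex zeros (a zeroed SHA-1) when there is
# nothing to digest, so a valid digest must differ from this sentinel.
EMPTY_DIGEST = "0" * 40

def is_valid_digest(digest):
    """True only for a present, non-empty DEBUG DIGEST value."""
    return digest is not None and digest != EMPTY_DIGEST

assert not is_valid_digest(None)
assert not is_valid_digest("0" * 40)
assert is_valid_digest("a33e0e3f" + "0" * 32)
```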
    @@ -474,7 +474,7 @@ def port_tracker_fixture(self, resource_port_tracker):
    self.port_tracker = resource_port_tracker

    def _get_valkey_args(self):
    -    self.args.update({"maxmemory":self.maxmemory, "maxmemory-policy":"allkeys-random", "activerehashing":"yes", "databases": self.num_dbs, "repl-diskless-sync": "yes", "save": ""})
    +    self.args.update({"maxmemory":self.maxmemory, "maxmemory-policy":"allkeys-random", "activerehashing":"yes", "databases": self.num_dbs, "repl-diskless-sync": "yes", "save": "", "enable-debug-command":"yes"})
Let us also add testing in two other places:
- test_correctness.py - both scaling and non-scaling filters should be verified for correctness
- test_replication.py - replicated commands should result in the same digest
src/wrapper/bloom_callback.rs (Outdated)

    dig.add_long_long(val.expansion.into());
    dig.add_string_buffer(&val.fp_rate.to_le_bytes());
    for filter in &val.filters {
        dig.add_string_buffer(&filter.bloom.bitmap());
Let's also add the sip_keys of every filter into the digest. When we compare two bloom objects, the sip keys of the hash functions of the bloom filters should be compared as well. If they are different, they are not the same object.
Step 1 - Implement sip_keys on the BloomFilter struct:

    /// Return the keys used by the sip hasher of the raw bloom.
    pub fn sip_keys(&self) -> [(u64, u64); 2] {
        self.bloom.sip_keys()
    }

Step 2 - Here, from the callback, write the 4 numbers (which are u64) into the digest using add_long_long().
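The effect of the two steps can be simulated in Python: fold the filter bitmap plus the four u64 sip-key halves into a hash, mirroring the add_string_buffer plus four add_long_long calls (sha1 and `digest_with_sip_keys` are stand-ins for illustration, not the module API):

```python
import hashlib
import struct

def digest_with_sip_keys(bitmap, sip_keys):
    """Fold the filter bitmap plus the four u64 sip-key halves
    into one digest, mirroring sip_keys() -> [(u64, u64); 2]."""
    h = hashlib.sha1()
    h.update(bitmap)
    for k0, k1 in sip_keys:
        # Each tuple contributes two unsigned 64-bit values.
        h.update(struct.pack("<QQ", k0, k1))
    return h.hexdigest()

bitmap = b"\x00" * 16
keys_a = [(1, 2), (3, 4)]
keys_b = [(1, 2), (3, 5)]  # one differing sip key

# Identical bitmaps but different hash-function keys: not the same object.
assert digest_with_sip_keys(bitmap, keys_a) != digest_with_sip_keys(bitmap, keys_b)
```

Without the sip keys in the digest, two filters with identical (for example, empty) bitmaps but different hash functions would wrongly compare equal.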
Signed-off-by: Nihal Mehta <[email protected]>
Adds a callback for digest and debug digest, plus integration tests for different scenarios like COPY, RDB load, and AOF rewrite.

Closes #9