[Bug] Malicious node do not send BlockResponse, making it difficult for honest node to complete synchronization #3320

Closed
elderhammer opened this issue Jun 17, 2024 · 1 comment
Labels
bug Incorrect or unexpected behavior

elderhammer commented Jun 17, 2024

Steps to Reproduce

  1. The honest node sends BlockRequests to its peers and receives the following BlockResponses:
2024-06-17T15:16:44.205806Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55193' from '127.0.0.1:5001'
2024-06-17T15:16:45.576134Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55218..55223' from '127.0.0.1:5002'
2024-06-17T15:16:46.051672Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55228..55233' from '127.0.0.1:5002'
2024-06-17T15:16:46.556993Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55248..55253' from '127.0.0.1:5002'
2024-06-17T15:16:47.510517Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55253..55258' from '127.0.0.1:5002'
2024-06-17T15:16:48.252671Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55268..55273' from '127.0.0.1:5002'
2024-06-17T15:16:49.127308Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55283..55288' from '127.0.0.1:5002'
2024-06-17T15:16:49.863806Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55288..55293' from '127.0.0.1:5002'
2024-06-17T15:16:50.554798Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55313..55318' from '127.0.0.1:5002'
2024-06-17T15:16:51.315506Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55328..55333' from '127.0.0.1:5002'
2024-06-17T15:16:51.819418Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55338..55343' from '127.0.0.1:5002'
2024-06-17T15:16:52.492713Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55343..55348' from '127.0.0.1:5002'
2024-06-17T15:16:53.220933Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55348..55353' from '127.0.0.1:5002'
2024-06-17T15:16:53.968895Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55358..55363' from '127.0.0.1:5002'
2024-06-17T15:16:54.744532Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55368..55373' from '127.0.0.1:5002'
2024-06-17T15:16:55.473482Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55398..55403' from '127.0.0.1:5002'
2024-06-17T15:16:56.349889Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55413..55418' from '127.0.0.1:5002'
2024-06-17T15:16:57.224970Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55443..55448' from '127.0.0.1:5002'
2024-06-17T15:26:06.736907Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55203..55208' from '127.0.0.1:5001'
2024-06-17T15:26:09.487232Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55208..55213' from '127.0.0.1:5001'
2024-06-17T15:26:16.721589Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55233..55238' from '127.0.0.1:5001'
2024-06-17T15:26:21.862726Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55243..55248' from '127.0.0.1:5001'
2024-06-17T15:26:26.615743Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55263..55268' from '127.0.0.1:5002'
2024-06-17T15:26:32.238841Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55273..55278' from '127.0.0.1:5001'
2024-06-17T15:26:36.738487Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55293..55298' from '127.0.0.1:5002'
2024-06-17T15:26:37.513502Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55298..55303' from '127.0.0.1:5002'
2024-06-17T15:26:38.650444Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55308..55313' from '127.0.0.1:5002'
2024-06-17T15:26:41.711059Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55318..55323' from '127.0.0.1:5002'
2024-06-17T15:26:56.902826Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55383..55388' from '127.0.0.1:5001'
2024-06-17T15:26:57.573136Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55388..55393' from '127.0.0.1:5001'
2024-06-17T15:27:06.938781Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55423..55428' from '127.0.0.1:5001'
2024-06-17T15:27:11.896375Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55453..55458' from '127.0.0.1:5002'
2024-06-17T15:27:11.943583Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55448..55453' from '127.0.0.1:5001'
2024-06-17T15:27:12.624713Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'BlockResponse 55458..55463' from '127.0.0.1:5002'
  2. The malicious node receives the request but does not send BlockResponse 55203..55208
  3. The honest node's while loop stops because BlockResponse 55203..55208 has not been received
    // Determine if we can sync the ledger without updating the BFT first.
    if current_height <= max_gc_height {
        // Try to advance the ledger *to tip* without updating the BFT.
        while let Some(block) = self.block_sync.process_next_block(current_height) {
            info!("Syncing the ledger to block {}...", block.height());
            self.sync_ledger_with_block_without_bft(block).await?;
            // Update the current height.
            current_height += 1;
        }
        // Sync the storage with the ledger if we should transition to the BFT sync.
        if current_height > max_gc_height {
            if let Err(e) = self.sync_storage_with_ledger_at_bootup().await {
                error!("BFT sync (with bootup routine) failed - {e}");
            }
        }
    }
  4. The honest node executes try_block_sync, but because construct_requests returns an empty set (the withheld heights still count as already requested), no new BlockRequest is sent
    // Ensure the current height is not canonized or already requested.
    if self.check_block_request(height).is_err() {
        // If the sequence of block requests is interrupted, then return early.
        // Otherwise, continue until the first start height that is new.
        match request_hashes.is_empty() {
            true => continue,
            false => break,
        }
    }
  5. The honest node executes remove_timed_out_block_requests, which only cleans up the timed-out requests after 10 minutes
    // Remove timed out block requests.
    request_timestamps.retain(|height, timestamp| {
        let is_obsolete = *height < current_height;
        // Determine if the duration since the request timestamp has exceeded the request timeout.
        let is_time_passed = now.duration_since(*timestamp).as_secs() > BLOCK_REQUEST_TIMEOUT_IN_SECS;
        // Determine if the request is incomplete.
        let is_request_incomplete =
            !requests.get(height).map(|(_, _, peer_ips)| peer_ips.is_empty()).unwrap_or(false);
        // Determine if the request has timed out.
        let is_timeout = is_time_passed && is_request_incomplete;
        // If the request has timed out, or is obsolete, then remove it.
        if is_timeout || is_obsolete {
            trace!("Block request {height} has timed out: is_time_passed = {is_time_passed}, is_request_incomplete = {is_request_incomplete}, is_obsolete = {is_obsolete}");
            // Remove the request entry for the given height.
            requests.remove(height);
            // Remove the response entry for the given height.
            responses.remove(height);
            // Increment the number of timed out block requests.
            num_timed_out_block_requests += 1;
        }
        // Retain if this is not a timeout and is not obsolete.
        !is_timeout && !is_obsolete
    });
  6. Repeat the above process
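
To make the stall easier to see, here is a minimal sketch of the "advance until the first missing height" behavior described in the steps above. It is not the snarkOS implementation: `Block`, `buffered`, and `advance_until_gap` are hypothetical names, the heights are made up, and the real `process_next_block` does more than a map lookup.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Toy stand-in for a block (hypothetical): only the height matters here.
#[derive(Debug, Clone, PartialEq)]
struct Block {
    height: u32,
}

/// Builds a buffer of received responses from half-open height ranges.
fn buffered(ranges: &[Range<u32>]) -> BTreeMap<u32, Block> {
    let mut responses = BTreeMap::new();
    for range in ranges {
        for height in range.clone() {
            responses.insert(height, Block { height });
        }
    }
    responses
}

/// Consumes buffered responses in height order, starting at `current_height`,
/// and stops at the first height with no response. Returns the next height
/// that is still needed.
fn advance_until_gap(mut current_height: u32, responses: &mut BTreeMap<u32, Block>) -> u32 {
    // Same shape as the `while let Some(block) = ...` loop quoted above:
    // a single missing height ends the loop, no matter how many later
    // responses are already buffered.
    while let Some(block) = responses.remove(&current_height) {
        debug_assert_eq!(block.height, current_height);
        current_height += 1;
    }
    current_height
}
```

With responses buffered for heights 100..105 and 110..115, `advance_until_gap(100, ...)` returns 105 and leaves the five later blocks untouched until 105..110 arrives.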

In the worst case, the node's lowest outstanding BlockRequest is always sent to the malicious node, so synchronization almost stalls. Even in the average case, synchronization is extremely slow, because every withheld response blocks progress until the request times out.
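
The cost of each withheld response follows from the timeout check in remove_timed_out_block_requests. The sketch below is a simplified model of that check, not the actual code: `PendingRequest`, `is_timed_out`, and `min_stall` are hypothetical, the 600-second value is assumed from the "10 minutes" mentioned above, and "incomplete" is reduced to "at least one peer still owes a response".

```rust
use std::time::{Duration, Instant};

/// Assumed value: the report says requests are cleaned up after 10 minutes.
const BLOCK_REQUEST_TIMEOUT_IN_SECS: u64 = 600;

/// Simplified request state (hypothetical).
struct PendingRequest {
    /// When the request was sent.
    sent_at: Instant,
    /// How many peers have not answered yet.
    pending_peers: usize,
}

/// A request times out only if the timeout has strictly elapsed
/// and at least one peer has still not answered.
fn is_timed_out(request: &PendingRequest, now: Instant) -> bool {
    let is_time_passed =
        now.duration_since(request.sent_at).as_secs() > BLOCK_REQUEST_TIMEOUT_IN_SECS;
    let is_request_incomplete = request.pending_peers > 0;
    is_time_passed && is_request_incomplete
}

/// Lower bound on the time lost when the lowest pending height is
/// withheld `silent_rounds` times in a row.
fn min_stall(silent_rounds: u32) -> Duration {
    Duration::from_secs(BLOCK_REQUEST_TIMEOUT_IN_SECS) * silent_rounds
}
```

Under these assumptions, three consecutive withheld responses for the same height cost at least 30 minutes, during which later blocks that have already arrived cannot be applied.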

Expected Behavior

If a peer fails to send a BlockResponse several times in a row (for example, 3 times), the node should stop sending BlockRequests to it.
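
One possible shape for this, as a rough sketch only: count consecutive timeouts per peer and skip peers that reach a threshold when choosing where to send requests. Everything here is hypothetical (`PeerMissTracker`, its methods, and `MAX_CONSECUTIVE_MISSES`); it does not reflect existing snarkOS APIs, and a real fix would also need to decide when a skipped peer becomes eligible again.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Hypothetical threshold, taken from the "for example, 3 times" suggestion.
const MAX_CONSECUTIVE_MISSES: u32 = 3;

/// Tracks how many block requests in a row each peer has left unanswered.
#[derive(Default)]
struct PeerMissTracker {
    misses: HashMap<SocketAddr, u32>,
}

impl PeerMissTracker {
    /// Call when a request sent to `peer` timed out without a response.
    fn record_timeout(&mut self, peer: SocketAddr) {
        *self.misses.entry(peer).or_insert(0) += 1;
    }

    /// Call when `peer` delivered a response; this resets its streak.
    fn record_response(&mut self, peer: SocketAddr) {
        self.misses.remove(&peer);
    }

    /// A peer stays eligible while its streak is below the threshold.
    fn is_eligible(&self, peer: &SocketAddr) -> bool {
        self.misses.get(peer).copied().unwrap_or(0) < MAX_CONSECUTIVE_MISSES
    }

    /// Filters the candidate peers down to those still eligible for requests.
    fn eligible_peers(&self, candidates: &[SocketAddr]) -> Vec<SocketAddr> {
        candidates.iter().copied().filter(|peer| self.is_eligible(peer)).collect()
    }
}
```

Resetting the counter on any response keeps slow but honest peers in rotation, while a peer that never answers drops out after three timeouts.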

Your Environment

snarkOS Version: cf83035

@elderhammer elderhammer added the bug Incorrect or unexpected behavior label Jun 17, 2024
@elderhammer elderhammer reopened this Jun 21, 2024
@elderhammer elderhammer changed the title from "[Bug] Malicious validators do not send BlockResponse, making it difficult for honest validators to complete synchronization" to "[Bug] Malicious node do not send BlockResponse, making it difficult for honest node to complete synchronization" Jun 23, 2024
@elderhammer

Duplicate of #3322.
