Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support retry logic for incremental snapshot download #487

Open
clarkezone opened this issue Jan 9, 2025 · 6 comments
Open

feat: support retry logic for incremental snapshot download #487

clarkezone opened this issue Jan 9, 2025 · 6 comments
Assignees

Comments

@clarkezone
Copy link

clarkezone commented Jan 9, 2025

  • investigate existing download logic in accountsdb/download.zig
  • implement automatic retry logic to ensure that an incremental snapshot is always retrieved
  • at a UT

  • rn the method downloadSnapshotsFromGossip downloads a full snapshot and then trys to download an incremental snapshot from the same peer, and if it fails it returns as success
  • we should have an option where people can define they need either:
    • only a full snapshot
    • or a full snapshot and a matching incremental (if no incremental can be found it should error)
@InKryption InKryption self-assigned this Jan 10, 2025
@0xNineteen
Copy link
Contributor

for additional ref: https://github.com/orgs/Syndica/projects/2/views/10?pane=issue&itemId=90046148

@0xNineteen 0xNineteen moved this to 🏗 In progress in Sig Jan 10, 2025
@dnut
Copy link
Contributor

dnut commented Jan 10, 2025

for additional ref: https://github.com/orgs/Syndica/projects/2/views/10?pane=issue&itemId=90046148

This seems different than auto-retry. Do we want to unify these into a single task?

@0xNineteen
Copy link
Contributor

0xNineteen commented Jan 10, 2025

yeah should be a single task i think -- just merged them

@clarkezone
Copy link
Author

This is my proposed logic in pseudocode (didn't include original for brevity, cmd args would be updated also):

//accountsdb/download.zig

function getOrDownloadAndUnpackSnapshot(snapshotcombination, forcenewsnapshotdownload, forceunpack) returing manifest and snapshots
{
  logic to check is accountsdb exists
  if we are forcing, blow away existing accountsdbstate
  
  // full and incremental could be done in parallel in the future as suggested per // PERF: maybe do this in another thread? while downloading the full snapshot
  
  compute if we should downloadFull based on args
  
  if we should downloadFull {
    figure out full download paths
    reliablyDownloadSnapshotFromGossip (full)
  }
  
  compute if we should downloadPartial based on args
  
  if we should downloadIncremental {
    figure out incremental paths
    reliablyDownloadSnapshotFromGossip (incremental)
  }
  
  if we have a snapshot {
    try and unpack it using multiple threads
  }
  
  if we have an incremental snapshot {
    try and unpack it using multiple threads
  }

if we ended up here, ensure we have requested download types and 
  {
    return manifests and snapshots 
  }
else
  {
    return error
  }
}

// download snapshot of requested type with retries
function reliablyDownloadSnapshotFromGossip(snapshottype) returning nothing
{
  loop forever {
    availablesnapshotpeers = find peers from gossip ignoring blacklisted / toslow
    
    for over peers who have snapshot of type snapshottype {
    attempt download
    if too slow {
	    add to slow peers
	    continue loop,  try next peer
    }
  }
}

function downloadFile() returning nothing
{
  // no change
}

@0xNineteen
Copy link
Contributor

unfortunately i dont think itll be that separate -- likely it will look something like this:

  // download full snapshot 
  // for (0..n_incremental_attempts) 
  //           attempt_download_with_matching_full_snapshot() // find a matching peer and attempt download
  //           if success: return // can exit the method woo
  //           else: search for more peers with matching base_slot + hash // add more peers which could match (if any)
  // if no success: download a different full snapshot and re-try again 

@clarkezone
Copy link
Author

Yea I actually got it working o rate weekend but didn't update the issue yet, thx for chiming in

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🏗 In progress
Development

No branches or pull requests

4 participants