refactor: DAG sync and move merge outside of net package #2658
internal/db/merge.go
Outdated
	if err != nil {
		return err
	}
	mt, err := mp.getHeads(ctx)
todo: The flow and concept-status of `mergeProcessor` could be improved, I think. At the moment it has the following problems:
- It is only called in this function, and given that its functions do not return anything at any stage, it is unclear why it is not wrapped up into a single func on `db`.
- Partly because most of the functions do not return anything, it is unclear without reading the implementation of `mergeProcessor` whether the order in which each function is called matters and, assuming it does matter, what that order needs to be.
- It is strange that `getComposites` takes the return value of `getHeads`, when the host object is essentially a state-machine whose primary reason to be appears to be managing state produced by its host functions.
- That `getComposites` does not return anything despite being called `getFoo` confuses things further, and casts more doubt on how many side effects each function has, making me want to read the implementation of `getHeads` to see if it also stores state in the state-machine as well as returning `mt`.
- Partly because of the uncertainty around state-machineness and `mt`, and partly because it is quite dense and recursive code, I found it quite hard to be sure what `getComposites` was actually doing when reading its implementation.
Overall I think `mergeProcessor` needs some work, although what kind of work is required I am not fully sure. Maybe removing the semi-statemachineness and adding some more documentation would be enough.
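One possible direction for the points above, as a minimal sketch and not the PR's actual code: each step returns what it computes, and a single entry point fixes the call order. The names `block`, `store`, `collectUnmerged`, and `mergeFrom` are hypothetical and do not come from `internal/db/merge.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// block is a hypothetical stand-in for a DAG block: an ID plus parent links.
type block struct {
	id      string
	parents []string
}

// store is a hypothetical in-memory block store keyed by block ID.
type store map[string]block

// collectUnmerged walks back from head through parent links and returns the
// blocks that are not yet merged, parents before children. Everything it
// computes is in the return value; it keeps no state on a host object.
func collectUnmerged(s store, head string, merged map[string]bool) ([]block, error) {
	var out []block
	seen := map[string]bool{}
	var visit func(id string) error
	visit = func(id string) error {
		if merged[id] || seen[id] {
			return nil
		}
		seen[id] = true
		b, ok := s[id]
		if !ok {
			return errors.New("missing block: " + id)
		}
		for _, p := range b.parents {
			if err := visit(p); err != nil {
				return err
			}
		}
		out = append(out, b) // parents are appended before their children
		return nil
	}
	if err := visit(head); err != nil {
		return nil, err
	}
	return out, nil
}

// mergeFrom is the single entry point. The order of the steps is fixed
// here, so callers cannot call them in the wrong order.
func mergeFrom(s store, head string, merged map[string]bool) ([]string, error) {
	blocks, err := collectUnmerged(s, head, merged)
	if err != nil {
		return nil, err
	}
	applied := make([]string, 0, len(blocks))
	for _, b := range blocks {
		merged[b.id] = true // stand-in for applying the block to the datastore
		applied = append(applied, b.id)
	}
	return applied, nil
}

func main() {
	s := store{
		"a": {id: "a"},
		"b": {id: "b", parents: []string{"a"}},
		"c": {id: "c", parents: []string{"b"}},
	}
	applied, err := mergeFrom(s, "c", map[string]bool{"a": true})
	fmt.Println(applied, err)
}
```

With this shape, the only mutable state is the `merged` set that the caller passes in, so the side effects of each step are visible from the signatures alone.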
I agree with this. I pushed it as is because I was tired of all the troubleshooting I did but I will improve it before merge.
Let me know if you like the change
I like it, and it is very definitely an improvement, but I think `mergeProcessor` needs some love before this can be merged.
internal/db/merge.go
Outdated
		return nil, err
	}

	col := cols[0].(*collection)
todo: Please document this `[0]`: why it is okay to do so now, why it is safe (and we are sure that there will be at least one), and why it might need to change in the future.
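As a minimal sketch of what such documentation could sit next to, assuming the slice can in principle be empty: the index access is wrapped in a guard so a broken assumption fails with an error and not with an out-of-range panic. The `collection` type, the `firstCollection` helper, and the error text here are hypothetical, and the stated reason for picking the first element is an assumption made for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// collection is a hypothetical stand-in for the concrete collection type.
type collection struct {
	name string
}

// firstCollection returns the first element of cols, or an error when the
// slice is empty.
func firstCollection(cols []*collection) (*collection, error) {
	if len(cols) == 0 {
		return nil, errors.New("no collection found")
	}
	// Assumption for this sketch: all returned collections are equivalent
	// for the purpose of the merge, so the first one is as good as any.
	// If that stops being true, this is the line that needs to change.
	return cols[0], nil
}

func main() {
	col, err := firstCollection([]*collection{{name: "users"}})
	fmt.Println(col.name, err)

	_, err = firstCollection(nil)
	fmt.Println(err)
}
```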
I added a comment. Let me know what you think.
LGTM, thanks :)
a7d3b0a
to
524c547
Compare
Really nice changes! Should make future network improvements much easier.
	// DeleteDocIndex deletes the index for the given document.
	// WARNING: This method is only for internal use and is not supposed to be called by the client
	// as it might compromise the integrity of the database. This method will be removed in the future
	DeleteDocIndex(context.Context, *Document) error
praise: nice to see these removed from the client interface
	bp.handleChildBlocks(ctx, session, block)
	// Initiate a sync of the block's children
	bp.wg.Add(1)
	bp.handleChildBlocks(ctx, block)

	return nil
}

func (bp *blockProcessor) handleChildBlocks(
praise: this is much nicer
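For readers unfamiliar with the `wg.Add(1)` pattern in the excerpt above, here is a minimal sketch of it, assuming child handling runs in its own goroutine (the excerpt does not show whether it does). The `processor` type and its `process` method are hypothetical and only mirror the shape of `blockProcessor`.

```go
package main

import (
	"fmt"
	"sync"
)

// processor is a hypothetical stand-in for a block processor that tracks
// in-flight child work with a WaitGroup.
type processor struct {
	wg   sync.WaitGroup
	mu   sync.Mutex
	seen []string
}

// process records the block and starts one goroutine per child. Add is
// called before the goroutine starts, so Wait can never observe a zero
// counter while child work is still about to begin.
func (p *processor) process(id string, children map[string][]string) {
	p.mu.Lock()
	p.seen = append(p.seen, id)
	p.mu.Unlock()

	for _, child := range children[id] {
		p.wg.Add(1)
		go func(c string) {
			defer p.wg.Done()
			p.process(c, children)
		}(child)
	}
}

func main() {
	children := map[string][]string{
		"root": {"a", "b"},
		"a":    {"c"},
	}
	p := &processor{}
	p.process("root", children)
	p.wg.Wait() // returns once every descendant has been processed
	fmt.Println(len(p.seen))
}
```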
	}

	// If the CRDT is nil, it means the field is not part
	// of the schema and we can safely ignore it.
question: When can this happen? The only time I can think of is if the block was created on a higher schema version with a new field, and then we would still want the block synced. I can't remember if this was fully working, or if there is a ticket for handling the datastore, but I thought the blocks would still be synced (they affect the composite cid, so I don't think it is right to drop them even if we can't yet handle the datastore side).
This is not the block sync process. That has already been taken care of in the `net` package. Here we merge the synced blocks into the datastore. Like you said, this can only happen if the block was created on a different schema version. If the node merging doesn't know about this field, it can't merge it into the datastore.
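The behaviour described in this reply can be sketched as follows, under the assumption that the merge step looks up a CRDT per field and gets nothing back for fields the local schema does not know. The `merger` interface, `register` type, `crdtFor`, and `mergeFields` are hypothetical names for illustration, not the project's API.

```go
package main

import (
	"fmt"
	"sort"
)

// merger is a hypothetical minimal CRDT interface.
type merger interface {
	Merge(value string)
}

// register is a hypothetical last-writer-wins register.
type register struct{ value string }

func (r *register) Merge(value string) { r.value = value }

// crdtFor returns the CRDT for a field, or nil when the field is not part
// of the local schema (for example, it was added in a newer version).
func crdtFor(schema map[string]*register, field string) merger {
	r, ok := schema[field]
	if !ok {
		return nil
	}
	return r
}

// mergeFields applies each update to its field's CRDT and returns the
// fields that were skipped because the local schema does not know them.
// The blocks themselves are assumed to be synced and stored already.
func mergeFields(schema map[string]*register, updates map[string]string) []string {
	var skipped []string
	for field, value := range updates {
		c := crdtFor(schema, field)
		if c == nil {
			skipped = append(skipped, field)
			continue
		}
		c.Merge(value)
	}
	sort.Strings(skipped) // map iteration order is random; sort for stable output
	return skipped
}

func main() {
	schema := map[string]*register{"name": {}}
	skipped := mergeFields(schema, map[string]string{"name": "Fred", "age": "33"})
	fmt.Println(schema["name"].value, skipped)
}
```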
Looks really good Fred - thanks a load for this :) Just one more question for you before merge :)
525783d
to
6cf9e4f
Compare
submitting now (will continue later today)
LGTM! Just have a couple of questions and suggestions.
This helps the current race condition a lot.
f1b6cab
to
c5660b4
Compare
c5660b4
to
9a817eb
Compare
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #2658 +/- ##
===========================================
- Coverage 78.06% 77.97% -0.10%
===========================================
Files 308 308
Lines 23077 23134 +57
===========================================
+ Hits 18015 18037 +22
- Misses 3690 3714 +24
- Partials 1372 1383 +11
... and 12 files with indirect coverage changes
Relevant issue(s)
Resolves #2624
Description
This PR simplifies the DAG sync process within the `net` package and moves the merge functionality to the `db` package. The merge is now initiated via an event channel.
Note: I did a search and replace for `SchemaVersionId` to `SchemaVersionID`. It's in its own commit. I've also removed the `tests/integration/net/order` tests as they are now annoying to maintain and will become even more irrelevant when we refactor the WaitForSync functionality of our test framework.
Another note: I've reduced the severity of the race condition on my Mac. We had a lot of leaking goroutines. What is left of them are WaitForSync methods that sometimes seem to leak, plus the badger cache and libp2p transport, which seem to leak goroutines on close; I'm not sure how to handle these last two.
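To illustrate what "initiated via an event channel" can look like, here is a minimal sketch, assuming the network side publishes an event once a DAG is fully synced and the database side consumes it. The `mergeEvent` type, its fields, and `startMergeLoop` are hypothetical and are not taken from this PR.

```go
package main

import "fmt"

// mergeEvent is a hypothetical event published after a sync completes.
type mergeEvent struct {
	DocID string
	CID   string
}

// startMergeLoop consumes events until the channel is closed and calls
// merge for each one. The returned channel is closed when the loop exits,
// which lets the caller wait for outstanding merges on shutdown.
func startMergeLoop(events <-chan mergeEvent, merge func(mergeEvent)) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		for evt := range events {
			merge(evt)
		}
	}()
	return done
}

func main() {
	events := make(chan mergeEvent, 8)

	var merged []string
	done := startMergeLoop(events, func(evt mergeEvent) {
		merged = append(merged, evt.DocID+"@"+evt.CID)
	})

	// The network side only publishes; it does not need to import or call
	// the database package directly.
	events <- mergeEvent{DocID: "doc1", CID: "cid1"}
	events <- mergeEvent{DocID: "doc2", CID: "cid2"}
	close(events)

	<-done // all published events have been merged at this point
	fmt.Println(merged)
}
```

The design choice this shows is the direction of the dependency: the producer knows only the event type, so the merge logic can live outside the network package.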
Tasks
How has this been tested?
make test
Specify the platform(s) on which this was tested: