refactor: Datastore interfaces replaced with corekv #3298

Draft. Wants to merge 11 commits into base: develop

@jsimnz (Member) commented Dec 8, 2024

Relevant issue(s)

Related to #3275 #1813

Description

This PR exists ONLY as a reference for the ongoing work by Andy in #3275. The refactor is based on a rather old version of Defra (from back in February, ~v0.9). There haven't been many changes to the datastore package since then, but there have been a few changes to the fetcher, beyond the notable open PR #3277.

This draft PR should clearly demonstrate the design approach for integrating corekv, and how we can delete almost all of the datastore package.

One win of replacing go-datastore with corekv that this PR doesn't show is just how much simpler corekv.Iterator is than ds.Query, not only in its API but also in its internals, i.e. how the Query system is actually evaluated.

This is an initial refactor to integrate corekv into the codebase, focusing on two goals:

  1. Simplifying the datastore package design
  2. Setting up the fetcher to use the Seek API in the fetcher.DocumentFetcher design, to enable early filter optimization

Note: The 2nd goal isn't actually implemented. Work started recently on porting the fetcher seek functionality into this refactor branch, but given the currently open fetcher refactor PR #3277, and the decision to delay the filter-seek optimization until it is formally verified to be more efficient, I'll leave that work out of this branch.

Additionally, this refactor should be reviewed as a "plain text" refactor, meaning we're not aiming to optimize or redesign, solely to swap datastore interfaces and implementations with the fewest possible semantic differences.

How has this been tested?

IT HASN'T. It isn't finished enough to compile, and therefore can't be tested yet.

Notes for Reviewers (Andy ;) )

The commits are in a fairly organized, chronological order of operations for the refactor. Some of the commit messages include extra context that may help explain the reasoning.

jsimnz added 11 commits January 3, 2024 04:29
Currently focusing on some low level structures/APIs such as the
basic getter/setter interface changes, handling keys that are now
`[]byte` instead of `ds.Key`, and a few other smaller items.

Haven't started the tests, fetcher, or any calls to the old
`Query` API (need to be migrated to the new Iterator API).

This was a "plain reading" refactor to replace the query API with the
Iterator API. There may need to be different behavior for some reason
(I don't think so, but maybe). The one thing to note that is different
is that creating an Iterator doesn't return an error, and there is no
`result.Error` value per iterator. The reason for the latter is that
the old query.Query batched/collected a lot of results before the
user actually asked for them, so it needed to collect the errors as well.
Since this is a simpler iterator, there is no need for that.
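
To make the shape of that difference concrete, here is a small sketch of a pull-style iterator whose constructor cannot fail and whose results carry no per-entry error. The `Iterator` interface and `sliceIter` type below are assumptions made up for this example; the real corekv method set may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// Iterator is a hypothetical pull-style iterator, NOT the real corekv one.
// Nothing is buffered ahead of the caller, so there are no deferred
// per-result errors to carry around.
type Iterator interface {
	Valid() bool
	Next()
	Key() []byte
	Value() []byte
	Close() error
}

type kv struct{ key, value []byte }

// sliceIter walks an in-memory, key-sorted slice.
type sliceIter struct {
	items []kv
	pos   int
}

// newSliceIter has no error return: it only sorts a copy of the pairs.
func newSliceIter(items []kv) Iterator {
	sorted := append([]kv(nil), items...)
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(sorted[i].key, sorted[j].key) < 0
	})
	return &sliceIter{items: sorted}
}

func (s *sliceIter) Valid() bool   { return s.pos < len(s.items) }
func (s *sliceIter) Next()         { s.pos++ }
func (s *sliceIter) Key() []byte   { return s.items[s.pos].key }
func (s *sliceIter) Value() []byte { return s.items[s.pos].value }
func (s *sliceIter) Close() error  { return nil }

// collectKeys drains the iterator one entry at a time.
func collectKeys(it Iterator) []string {
	var keys []string
	for ; it.Valid(); it.Next() {
		keys = append(keys, string(it.Key()))
	}
	return keys
}

func main() {
	it := newSliceIter([]kv{
		{[]byte("b"), []byte("2")},
		{[]byte("a"), []byte("1")},
	})
	defer it.Close()
	fmt.Println(collectKeys(it))
}
```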

Additionally, there may be a few call sites that didn't close the
iterator/query before this change. I've updated the ones I noticed, but
a few may remain. THIS IS A BUG if they aren't closed.
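
One common way to avoid the unclosed-iterator bug is to defer `Close` immediately after obtaining the iterator, so early returns are covered too. The sketch below uses a hypothetical `trackedIter` type (not from this branch) that records whether it was closed.

```go
package main

import "fmt"

// trackedIter is a hypothetical stand-in for a store iterator that
// records whether Close was called, so leaks are easy to detect.
type trackedIter struct {
	keys   []string
	pos    int
	closed bool
}

func (t *trackedIter) Valid() bool  { return !t.closed && t.pos < len(t.keys) }
func (t *trackedIter) Next()        { t.pos++ }
func (t *trackedIter) Key() string  { return t.keys[t.pos] }
func (t *trackedIter) Close() error { t.closed = true; return nil }

// firstMatch returns early when it finds the key, which is exactly where
// a missing Close tends to hide. The deferred call covers every return path.
func firstMatch(it *trackedIter, want string) (found bool, err error) {
	defer func() {
		// Report a Close failure unless an earlier error is already set.
		if cerr := it.Close(); err == nil {
			err = cerr
		}
	}()
	for ; it.Valid(); it.Next() {
		if it.Key() == want {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	it := &trackedIter{keys: []string{"a", "b", "c"}}
	found, err := firstMatch(it, "b")
	fmt.Println(found, err, it.closed)
}
```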

With the new design, we get to remove the dual Iterator structure
(the first isn't really an iterator, but a handle to create
iterators). Now there is just a single iterator `kvIter`.

We also don't need the previous complex `Iterable` interface that was
defined in the `datastore` package. The corekv Iterator natively
supports both Prefix and Range (start, end) iteration controlled by
the `corekv.IterOptions`.
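
The sketch below shows how a single options struct can drive both prefix and range iteration over sorted keys. The `IterOptions` fields and the `scan` helper are assumptions for illustration, as are the semantics chosen here (`Start` inclusive, `End` exclusive, `Prefix` taking precedence when set); the actual `corekv.IterOptions` may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// IterOptions is a hypothetical options struct, NOT the real
// corekv.IterOptions. Assumed semantics: Start is inclusive, End is
// exclusive, and Prefix takes precedence over Start/End when set.
type IterOptions struct {
	Prefix []byte
	Start  []byte
	End    []byte
}

// scan returns the keys selected by opts. sortedKeys must be in ascending
// byte order. The binary search stands in for a store-level seek.
func scan(sortedKeys [][]byte, opts IterOptions) []string {
	lower := opts.Start
	if len(opts.Prefix) > 0 {
		lower = opts.Prefix
	}
	// Find the first key >= lower (index 0 when lower is empty).
	i := sort.Search(len(sortedKeys), func(i int) bool {
		return bytes.Compare(sortedKeys[i], lower) >= 0
	})
	var out []string
	for ; i < len(sortedKeys); i++ {
		k := sortedKeys[i]
		if len(opts.Prefix) > 0 {
			// Prefix mode: stop at the first key outside the prefix.
			if !bytes.HasPrefix(k, opts.Prefix) {
				break
			}
		} else if len(opts.End) > 0 && bytes.Compare(k, opts.End) >= 0 {
			// Range mode: stop once the exclusive upper bound is reached.
			break
		}
		out = append(out, string(k))
	}
	return out
}

func main() {
	keys := [][]byte{
		[]byte("a/1"), []byte("a/2"), []byte("b/1"), []byte("b/2"), []byte("c/1"),
	}
	fmt.Println(scan(keys, IterOptions{Prefix: []byte("b/")}))
	fmt.Println(scan(keys, IterOptions{Start: []byte("a/2"), End: []byte("c/1")}))
}
```

Under these assumptions, one code path serves both iteration modes, which is the kind of consolidation the commit message describes.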

Again, this is a "plain text" refactor without assuming anything.

NOTE: There is a filter in the indexer iterator file that hasn't been
included, again focusing on the "plain text" refactoring.
@jsimnz added the area/datastore (Related to the datastore / storage engine system) and refactor (This issue specific to or requires *notable* refactoring of existing codebases and components) labels Dec 8, 2024