-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement deleter #470
Conversation
dac5f62
to
008a185
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #470 +/- ##
==========================================
- Coverage 73.22% 73.09% -0.14%
==========================================
Files 49 49
Lines 4602 4642 +40
==========================================
+ Hits 3370 3393 +23
- Misses 923 935 +12
- Partials 309 314 +5 ☔ View full report in Codecov by Sentry. |
Hello @shizhMSFT @Wwwsylvia, This is the start for the delete implementation, I would definitely appreciate a bit of feedback on this PR. |
@carabasdaniel Thanks for your contribution. We recognize and are reviewing this PR. Since it is related to design, it takes more time to be reviewed but should finish in 1 day or two. |
@@ -192,6 +192,22 @@ func (s *Store) Resolve(ctx context.Context, reference string) (ocispec.Descript | |||
return desc, nil | |||
} | |||
|
|||
func (s *Store) Delete(ctx context.Context, target ocispec.Descriptor) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One tricky thing about deleting is that we need to maintain the predecessors relationship as well. See Store.Push and Store.Predecessors.
This also applies to other GraphStorage
that implement PredecessorsFinder
.
content/oci/oci.go
Outdated
resolvers := s.tagResolver.Map() | ||
for reference, desc := range resolvers { | ||
if content.Equal(desc, target) { | ||
s.tagResolver.Remove(reference) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if some is pushing and deleting at the same time? There is a race condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The targetResolver.Map() represent an inconsistent snapshot of the storage content as far as I understand, therefore I'm not sure if a delete of a pushing content would be possible as the tagResolver is actually a syncMap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@carabasdaniel Since it is possible that someone is trying to push and delete at the same time, we have to trade-off some performance for the deletion.
Similar to what I have mentioned in other comments, the current oci.Store
is append-only and thus it has the benefits that multiple push and fetch can happen at the same time without race conditions.
However, adding the functionality of Delete()
makes the store mutable, and we have to change the underlying implementation from sync maps to a RWMutex
to ensure consistency. The drawback is that we no longer can push and delete simultaneously.
Overall, I suggest
- Drop
Delete()
foroci.Store
. - Develop
oci.MutableStore
withDelete()
support.
/cc @Wwwsylvia
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I suggest
Drop Delete() for oci.Store.
Develop oci.MutableStore with Delete() support.
It makes sense.
@carabasdaniel I would also like to learn your use cases of deleting content in OCI layout. Is it for GC purpose? Could you share a little bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Wwwsylvia, @shizhMSFT,
We have a tool used for building OCI images from rego policies: policy CLI.
We have been using the previous major version of oras-go to build our containers and interact with upstream OCI compliant registries. In our case we want the user to be able to build and easily remove containers from the local OCI storage, thus the need to have a delete mechanism in place, keeping the same workflow feel as the docker CLI.
At the moment we use a fork that includes these changes in our v0.2.0 release candidate versions of the policy CLI and it seems to work with our current needs.
I think having a different store might make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Thank you for sharing this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Wwwsylvia for additional use case info... I am working on https://github.com/awslabs/soci-snapshotter which uses https://pkg.go.dev/oras.land/oras-go/content to keep content in the local filesystem. Our artifacts are conceptually similar to image and layer metadata; we store one artifact per image we handle, and one artifact per layer in those images, with one-way references (via a list of digests equivalent to image layers) from artifacts of the first type to artifacts of the second type.
Hi @shizhMSFT @Wwwsylvia, Thanks for the feedback, I'll start working on the required changes. |
8812296
to
8c25175
Compare
content/storage.go
Outdated
// DeleteStorage represents an extension of the Storage interface that includes the Deleter. | ||
type DeleteStorage interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Storage
interface abstracts append-only storages. Adding Deleter
makes the storage a MutableStorage
.
@Wwwsylvia Do you have a better name suggestion on DeleteStorage
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MutableStorage
sounds like that the existing data can be modified or deleted. But in our case, the data can be deleted but cannot be modified.
How about StorageDeleter
or DeletableStorage
?
BTW, do we really need to define this interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type StorageWithDelete interface {
Storage
Deleter
}
This is from my current efforts in the same direction as this PR. If you don't define this interface, folks like me will just have to do it themselves, and our naming divergence will make it harder to find usage examples in the wild.
content/storage.go
Outdated
// DeleteStorage represents an extension of the Storage interface that includes the Deleter. | ||
type DeleteStorage interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MutableStorage
sounds like that the existing data can be modified or deleted. But in our case, the data can be deleted but cannot be modified.
How about StorageDeleter
or DeletableStorage
?
BTW, do we really need to define this interface?
I find myself preparing to implement similar functionality as part of https://github.com/awslabs/soci-snapshotter and want to check in on this PR. I would much prefer to contribute upstream and then consume that. Is this still under development? Would additional effort be welcome? |
@@ -1993,3 +2042,114 @@ func equalDescriptorSet(actual []ocispec.Descriptor, expected []ocispec.Descript | |||
} | |||
return true | |||
} | |||
|
|||
func TestStore_Delete(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can some/most of this be deduplicated with the memory_test.go TestStoreDelete with similar contents? Ditto in storage_test.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like maybe two other changes needed here as well as the deletes
Signed-off-by: carabasdaniel <[email protected]>
Co-authored-by: Terry Howe <[email protected]> Signed-off-by: carabasdaniel <[email protected]>
Co-authored-by: Terry Howe <[email protected]> Signed-off-by: carabasdaniel <[email protected]>
Co-authored-by: Terry Howe <[email protected]> Signed-off-by: carabasdaniel <[email protected]>
Merged suggested changes and rebased on top of latest upstream. Thanks for the help @TerryHowe. |
@carabasdaniel There appear to be two outstanding comments referring to needed functionality for updating My limited understanding is that graph is used for predecessor/successor (aka parent/child) lookups and traversal of content and is populated in Store.Push where it calls graph.Index and during loadIndex when it loads index.json and then calls graph.IndexAll, and resolver is used for looking up content by tags and is populated by Tag (which may require something like your Untag to undo although I haven't looked far enough to determine if it would meet that need) |
Yes it's right. |
Hi @sparr @Wwwsylvia, Currently my bandwidth is very limited so I'm not sure when I'll be able to come back to this. For our current needs we are using a fork that has this change and the UnTag change merged. In addition to the recent changes I made here I also prepared an untag branch that would build on top of this PR for the I would appreciate any help to get this set implemented/merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it is technically challenging to add Delete()
to the existing *oci.Store
, as mentioned in #470 (comment), I would suggest developing a new store named *oci.DeletableStore
where the functionality of Predecessors()
can be dropped so that the difficulty of the development should be dramatically reduced.
It should also meet the requirements of https://github.com/opcr-io/policy as it does not leverage the functionality of Predecessors()
.
@@ -72,6 +72,18 @@ func (s *Store) Resolve(ctx context.Context, reference string) (ocispec.Descript | |||
return s.resolver.Resolve(ctx, reference) | |||
} | |||
|
|||
// Delete a target descriptor for storage. | |||
func (s *Store) Delete(ctx context.Context, target ocispec.Descriptor) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s.resolver
is for the tagging service, i.e. for Resolve()
, as Resolve()
should not resolve a reference to a deleted descriptor. Similarly, s.graph
is for Predecessors()
.
Since this PR mainly focuses on OCI store, we can keep the changes to the Memory store in another PR.
@@ -116,6 +116,25 @@ func (m *Memory) Predecessors(_ context.Context, node ocispec.Descriptor) ([]oci | |||
return res, nil | |||
} | |||
|
|||
// RemoveFromIndex removes the node and all its predecessors from the index and predecessor maps. | |||
func (m *Memory) RemoveFromIndex(ctx context.Context, fetcher content.Fetcher, node ocispec.Descriptor) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems fetcher
is not used anywhere.
// Delete removed a target descriptor from index and storage. | ||
func (s *Store) Delete(ctx context.Context, target ocispec.Descriptor) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in #470 (comment), the Delete()
method might be called concurrently with other methods like Push()
and Resolve()
. Race condition is a concern here.
Speaking only for myself, an implementation of a new Store type with Delete functionality but without Predecessors would not fit my current use case. My current solution is to delete files then recreate index.json then recreate the content store so that the graph will be reindexed from scratch. |
The main technical challenge here is that the current To unblock this PR attempt, @wangxiaoxuan273 could you prototype a |
Where is this going. Any progress on the |
No progress so far. I'm still in the stage of investigating. But it's likely that this PR will not be continued, and we will open a new one to fix this issue. @TerryHowe |
FYI, here is the PR authored by @wangxiaoxuan273: #614 |
This should close: #454