Multiple creations of the same data across a partitioned network results in repeat results in query APIs #394

pospi · 2023-07-18T05:52:22Z

I think this is partially an issue with anchored_record_helpers.rs, but for now it's only possible to de-duplicate this optimistically. (Further work is needed on reads to resolve duplicate writes of the same data in re-synced network partitions.)


pospi · 2023-07-18T06:13:14Z

This issue mostly affects Unit records, since the majority of other record types are made unique at the moment they're created by the addition of some randomBytes() in an internal nonce field.

Now resolved for writes: a synchronised network will no longer unnecessarily add create Action headers. Reopening as an issue to resolve generically for reads of "idempotently unique" data (i.e. data with a manually defined retrieval key) which has been duplicated as a result of a network partition or partial sync.

Aside from the behaviour in hdk_records/src/anchored_record_helpers.rs, this also affects index retrieval logic in hdk_semantic_indexes and hdk_time_indexing. At minimum, any feature which depends on link_if_not_linked in the write phase to ensure uniqueness must also be able to de-duplicate accidental repeat writes in its associated read phase.
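
To make the read-phase requirement concrete, here is a minimal sketch of order-preserving de-duplication using only the Rust standard library. It is not taken from the codebase: `dedupe_by_key` and the key-extraction closure are hypothetical names, and keeping the first occurrence of each key is an assumption (a real fix may need a deterministic tie-break so that all agents agree on which duplicate wins).

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical helper: drop repeat entries that share a retrieval key,
/// keeping the first occurrence and preserving the original order.
pub fn dedupe_by_key<T, K, F>(items: Vec<T>, mut key_of: F) -> Vec<T>
where
    K: Eq + Hash,
    F: FnMut(&T) -> K,
{
    let mut seen = HashSet::new();
    items
        .into_iter()
        // `HashSet::insert` returns false when the key was already present,
        // so any later duplicate of a key is filtered out.
        .filter(|item| seen.insert(key_of(item)))
        .collect()
}
```

In a read path, something like this would be applied to the list of link targets or index results before they are resolved to records, keyed on whatever identifies the "idempotently unique" data (for example the retrieval key or the entry hash).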

There may be other elements to consider in a complete solution, and platform features such as 'bucketing' that could be leveraged to 'compact' data in this way in future versions of the Holochain hdi & hdk libs.

pospi · 2023-07-27T03:18:19Z

I have fixed issues with duplicate Unit record writes in an unpartitioned network in 589aeff. A single agent can no longer cause duplicate entries in the Units read API response by repeatedly writing the same record. This would have actually affected any content-addressable data written into a time index, but Unit is the only record type that operates this way (others have a nonce injected to force them to be unique, so recreating records will result in different hashes).
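
The difference between content-addressed and nonce-bearing records can be illustrated with a small sketch. This is only an analogy: `address_of` is a hypothetical function, and the standard library's `DefaultHasher` stands in for Holochain's real entry hashing, which works differently.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for content addressing: the address depends only
/// on the record content and, if present, its nonce.
pub fn address_of(content: &str, nonce: Option<u64>) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    nonce.hash(&mut hasher);
    hasher.finish()
}
```

Writing the same content twice without a nonce yields the same address, so a repeat write lands on an existing entry and can appear twice in an index. With a distinct nonce per write, each creation gets its own address and is treated as a separate record.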

This does not resolve the issue for writes under partitioned network conditions, which still persists as above.

pospi added the bug label Jul 18, 2023

pospi self-assigned this Jul 18, 2023

pospi closed this as completed in d09dfe9 Jul 18, 2023

pospi changed the title ~~Multiple creations of the same Unit result in repeat results in query.units~~ Multiple creations of the same data across a partitioned network results in repeat results in query APIs Jul 18, 2023

pospi reopened this Jul 18, 2023

pospi removed their assignment Jul 18, 2023

pospi added this to the Holochain core stabilising milestone Jul 27, 2023

pospi mentioned this issue Jul 27, 2023

ResourceSpecifications and Units are duplicated in sync, Agents unclear Carbon-Farm-Network/app-carbon-farm-network#18

Closed
