Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add migration to fix database edges state #5640

Open
wants to merge 2 commits into
base: stable
Choose a base branch
from

Conversation

LucasG0
Copy link
Contributor

@LucasG0 LucasG0 commented Jan 31, 2025

Fixes IFC-1204

Detail of the issue is within migration code

I updated the PR to have 3 separated queries:

  • One fixes incorrect -global- edgrs
  • One fixes missing to time
  • One fixes incorrectly deleted relationships edges when a node has been deleted

Note that, testing separatily each query might not be relevant as:

  • NodeListGetRelationshipsQuery filters on both requested branch AND -global- even for full aware nodes/rels, so setting an edge on global would not break it
  • Removing to time of a deleted edge might not have impact as query would still work as it would detect the deleted edge.

So in the end, a single test has been added reproducing (almost) corrupted state observed on user db.
Note this test does not trigger IFC-1204 bug, but a manual test confirmed this migration fixes it.

@github-actions github-actions bot added the group/backend Issue related to the backend (API Server, Git Agent) label Jan 31, 2025
Copy link

codspeed-hq bot commented Jan 31, 2025

CodSpeed Performance Report

Merging #5640 will not alter performance

Comparing lgu-fix-to-time-migration (699adfb) with stable (7da5c9f)

Summary

✅ 11 untouched benchmarks

pyproject.toml Outdated Show resolved Hide resolved
@LucasG0 LucasG0 force-pushed the lgu-fix-to-time-migration branch 7 times, most recently from 083b5df to a6518f2 Compare February 5, 2025 18:55
"""
Fix corrupted state introduced by Migration012 when duplicating a CoreAccount (branch Aware)
being part of a CoreStandardGroup (branch Agnostic). Database is corrupted at multiple points:
- Old CoreAccount node <> group_member node `active` edge has no `to` time (possibly because of #5590).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, #5590 deserves a dedicated migration fixing missing to time for any couple of deleted edge on -global- with an active edge on another branch.

@LucasG0 LucasG0 requested a review from a team February 5, 2025 19:01
Copy link
Contributor

@ajtmccarty ajtmccarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good start
as I say in my comments, I think we should split this into at least 2 migrations
I took a shot at writing a more general version of what I think the first migration should be

@LucasG0 LucasG0 force-pushed the lgu-fix-to-time-migration branch 7 times, most recently from ab99efb to 4ea0e08 Compare February 11, 2025 15:30
@LucasG0 LucasG0 marked this pull request as ready for review February 11, 2025 15:34
@LucasG0 LucasG0 requested review from a team and ajtmccarty February 11, 2025 15:34
@LucasG0 LucasG0 changed the title Add migration to restore time Add migration to fix wrong db edges state Feb 11, 2025
@LucasG0 LucasG0 changed the title Add migration to fix wrong db edges state Add migration to fix database edges state Feb 11, 2025
@LucasG0 LucasG0 force-pushed the lgu-fix-to-time-migration branch 2 times, most recently from a1d546e to 7828def Compare February 11, 2025 15:50
@LucasG0 LucasG0 force-pushed the lgu-fix-to-time-migration branch from 7828def to a842f14 Compare February 11, 2025 17:38
Copy link
Contributor

@ajtmccarty ajtmccarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is most of the way there
some comments on the queries

@LucasG0 LucasG0 force-pushed the lgu-fix-to-time-migration branch from cd4c02b to 746e80e Compare February 13, 2025 08:50
@LucasG0 LucasG0 force-pushed the lgu-fix-to-time-migration branch from 746e80e to 699adfb Compare February 13, 2025 09:07
@LucasG0 LucasG0 requested review from ajtmccarty and a team February 13, 2025 18:03
Copy link
Contributor

@ajtmccarty ajtmccarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking great, just a few small comments to go

"""

params = {"global_branch": GLOBAL_BRANCH_NAME}
self.params.update(params)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can drop this params update

MATCH (rel)-[active_edge {status: "active"}]-(peer_2)
RETURN active_edge.branch as active_edge_branch
LIMIT 1
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a relationship is created on a branch and then merged into main, there will actually be 2 active edges between the Node node and Relationship node
and it is possible that a node in this situation is deleted on just one of the branches for which it has an active edge
would it work to just use deleted_edge_branch here instead of trying to get active_edge_branch? deleted_edge_branch should be the correct branch to delete, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have added more comments, taking into consideration the fact that relationship and node might not have the same branch support type, we may need to retrieve active branch, in details:

  • If res is agnostic, we should delete on global branch (and we do not use deleted_edge_branch)
  • If rel is aware and deleted node is aware, we should use deleted edge branch
  • If rel is aware and delete node is agnostic, we need to create deleted edges for any branch on which this relationship exists. I did not have in mind situations where there could be multiple active branches, so I think we should actually retrieve any distinct branch for which an active edge exist, instead of a single one as it's currently done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes sense to me

WITH rel, peer_2, branch, branch_level, deleted_time
MATCH (rel)<-[:IS_VISIBLE]-(peer_2)
MERGE (rel)<-[:IS_VISIBLE {status: "deleted", branch: branch, branch_level: branch_level, from: deleted_time}]-(peer_2)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there could also be HAS_OWNER or HAS_SOURCE edges linking a Relationship edge to a Node. I believe that these are not widely used right now, but adding 4 more subqueries to handle them won't hurt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
group/backend Issue related to the backend (API Server, Git Agent)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants