CVR Definition
What we're trying to capture with a CVR is a record of all data that the server has ever sent to the client.
The point of this is so that the server can exclude data it has previously sent the client from future sends. In Replicache, clients get new data by pulling from the server.
What the CVR is doing, on each pull, is the following:
-- initial pull
SELECT id, version FROM foo
-- pull 2
SELECT id, version FROM foo EXCEPT _what was sent in pull1_
-- pull 3
SELECT id, version FROM foo EXCEPT _what was sent in pull1 + pull2_
-- pull 4
SELECT id, version FROM foo EXCEPT _what was sent in pull1 + pull2 + pull3_
-- etc...
Here the contents of the CVR are what is being excepted / excluded from the result set.
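The pull sequence above can be tried end to end with an in-memory SQLite database. This is only a sketch: the `cvr` table and the `pull` helper are hypothetical names, a single client is assumed, and SQLite stands in for whatever database the server actually uses.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, version INTEGER NOT NULL);
    -- Hypothetical CVR table: one (id, version) pair per row already sent.
    CREATE TABLE cvr (id INTEGER NOT NULL, version INTEGER NOT NULL);
    INSERT INTO foo VALUES (1, 1), (2, 1);
""")

def pull(db):
    """Return the rows not yet sent, then record them in the CVR."""
    rows = sorted(db.execute(
        "SELECT id, version FROM foo EXCEPT SELECT id, version FROM cvr"
    ).fetchall())
    db.executemany("INSERT INTO cvr VALUES (?, ?)", rows)
    return rows

first = pull(db)    # initial pull: every row is new to the client
db.execute("UPDATE foo SET version = 2 WHERE id = 2")
db.execute("INSERT INTO foo VALUES (3, 1)")
second = pull(db)   # pull 2: only the updated row and the inserted row
third = pull(db)    # pull 3: nothing changed, so nothing is sent
```

Because `pull` appends to `cvr` every time, each EXCEPT sees everything sent in all earlier pulls.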
For the except step to work correctly, one of the following must hold:
Either every new CVR includes everything that was in the prior CVR: CVR_n = data_from_this_pull + CVR_(n-1)
Or each CVR represents only the current pull, and we check all historical CVRs on each pull
Note that we also want to EXCEPT in the opposite direction. To detect rows that were deleted we need to diff the contents of the CVR with what currently exists in the database.
Note that in both cases we can make an optimization by only keeping one entry for a row in the CVR. In other words, the CVR doesn't need to record every version of every row that was sent, only the latest version of each row sent down. A CVR missing a row is the same as the CVR showing the row at an earlier version than what the database has.
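A minimal in-memory sketch of the points above, assuming the CVR is stored as a mapping from row id to the last version sent (the function name `diff_against_cvr` and the dict layout are illustrative, not taken from Replicache):

```python
def diff_against_cvr(current, cvr):
    """Both arguments map row id -> version.

    Returns (puts, dels, next_cvr) for one pull.
    """
    # Forward EXCEPT: rows the CVR lacks, or records at a different version.
    # A missing row is treated exactly like a row held at an older version.
    puts = {rid: v for rid, v in current.items() if cvr.get(rid) != v}
    # Reverse EXCEPT: rows recorded in the CVR that are gone from the database.
    dels = sorted(rid for rid in cvr if rid not in current)
    # CVR_n = data_from_this_pull + CVR_(n-1), one entry per row,
    # with deleted rows dropped.
    next_cvr = {rid: v for rid, v in {**cvr, **puts}.items() if rid in current}
    return puts, dels, next_cvr

# Initial pull: the client has nothing.
puts1, dels1, cvr1 = diff_against_cvr({"a": 1, "b": 1}, {})
# Later: "a" was updated, "b" was deleted, "c" was created.
puts2, dels2, cvr2 = diff_against_cvr({"a": 2, "c": 1}, cvr1)
```

Overwriting the entry for `"a"` rather than appending a second one is the space optimization: the CVR stays proportional to the number of rows the client holds, not to the number of versions ever sent.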
Why not use cursors?
Why not sync like:
SELECT * FROM foo WHERE version > ?last_pulled_version
where each row has a version that comes from a universal and monotonically increasing source.
The issue is that data can come into view without ever being modified.
Consider this query:
SELECT * FROM todo
JOIN filter ON filter.owner = ?userid AND filter.status = todo.status
WHERE version > ?last_pulled_version;
We're syncing todos, but only those that match a given filter. Suppose this query returns the todo with the largest version, and the user then changes their filter to some other status. The todos that match the new status have come into view, but none of them has a version greater than the cursor, so we never sync them down.
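The failure can be reproduced with a small SQLite session. The schema, the sample rows and the user `alice` are made up for illustration, and the filter table is named `todo_filter` here only to steer clear of the SQL keyword FILTER.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE todo (id INTEGER PRIMARY KEY, status TEXT, version INTEGER);
    CREATE TABLE todo_filter (owner TEXT PRIMARY KEY, status TEXT);
    INSERT INTO todo VALUES (1, 'done', 1), (2, 'open', 2), (3, 'open', 3);
    INSERT INTO todo_filter VALUES ('alice', 'open');
""")

VISIBLE = """
    SELECT todo.id, todo.version FROM todo
    JOIN todo_filter AS f ON f.owner = ? AND f.status = todo.status
"""
CURSOR_PULL = VISIBLE + " WHERE todo.version > ? ORDER BY todo.id"

# First pull with a cursor of 0 returns both open todos.
first = db.execute(CURSOR_PULL, ("alice", 0)).fetchall()
cursor = max(version for _, version in first)

# The user switches the filter; todo 1 comes into view without being modified.
db.execute("UPDATE todo_filter SET status = 'done' WHERE owner = 'alice'")

# Cursor-based pull: todo 1 has version 1, which is not > cursor, so it is missed.
cursor_pull = db.execute(CURSOR_PULL, ("alice", cursor)).fetchall()

# CVR-based pull: everything visible now, minus what was already sent.
cvr = set(first)
cvr_pull = sorted(set(db.execute(VISIBLE, ("alice",)).fetchall()) - cvr)
```

The cursor-based pull comes back empty, while diffing against the CVR finds the todo that entered the view.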
CVRs and client driven sync
Ideally the client drives sync by specifying the queries it is interested in. These queries will change over time as a client navigates through an application.
Since the server needs to know about all the data the client has, we should union all CVRs produced by all pulls sent to a given client group.
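One way to picture the union, as a sketch only: assume each pull's CVR is a mapping from row id to version, and that versions only ever increase so that the highest version is the latest one sent. `union_cvrs` and the sample query names are hypothetical.

```python
def union_cvrs(cvrs):
    """Merge per-pull CVRs (row id -> version) into one view of what the
    client group holds, keeping the highest version recorded per row."""
    merged = {}
    for cvr in cvrs:
        for rid, version in cvr.items():
            merged[rid] = max(version, merged.get(rid, version))
    return merged

# CVRs produced by two earlier pulls for different queries.
inbox_cvr = {"t1": 3, "t2": 1}
done_cvr = {"t2": 2, "t9": 5}
group_cvr = union_cvrs([inbox_cvr, done_cvr])

# A new query's result is diffed against the union, so rows the client
# already received through another query are not sent again.
new_query_rows = {"t2": 2, "t9": 6, "t4": 1}
to_send = {rid: v for rid, v in new_query_rows.items() if group_cvr.get(rid) != v}
```

Here `t2` is skipped because the client already holds it at version 2 via the second query, even though the new query has never been run before.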
Note that CVRs must be versioned. We can't be certain that the client actually received the data that was part of a CVR. The only way we know that the client in fact did receive this data is if the client, on the next pull, sends a CVR version equal to the last CVR we sent the client. If they send an old CVR we wind back to that CVR on the server during that pull.
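The wind-back behaviour might look like the following sketch. `CvrStore` is a hypothetical in-memory stand-in for server-side storage, and treating an unknown CVR version as an empty CVR (a full resend) is an assumption, not something stated above.

```python
class CvrStore:
    """Keeps every CVR produced for one client, keyed by CVR version."""

    def __init__(self):
        self._by_version = {0: {}}  # version 0: the client has nothing
        self._latest = 0

    def pull(self, client_cvr_version, current):
        # Diff against the CVR the client says it holds, not against the
        # most recent CVR the server produced.
        # Assumption: an unknown version falls back to an empty CVR.
        base = self._by_version.get(client_cvr_version, {})
        puts = {rid: v for rid, v in current.items() if base.get(rid) != v}
        dels = sorted(rid for rid in base if rid not in current)
        self._latest += 1
        self._by_version[self._latest] = dict(current)
        return self._latest, puts, dels

store = CvrStore()
v1, puts1, dels1 = store.pull(0, {"a": 1})
# Row "b" appears; suppose the response to this pull never reaches the client.
v2, puts2, dels2 = store.pull(v1, {"a": 1, "b": 1})
# The client retries, still reporting v1, so "b" is sent again.
v3, puts3, dels3 = store.pull(v1, {"a": 1, "b": 1})
```

Had the server diffed the retry against the CVR from the lost response, `puts3` would be empty and the client would never receive `"b"`.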
Implementing a CVR in Postgres
todo: document your findings related to join performance, space saving by overwriting versions, delete culling, etc.
Performance (time)
todo: document query tricks to make this fast as well as how client driven queries make perf a minimal to no problem
Performance (space)