New vocabulary for "Version" #102
What about the term "Update"? An Update can optionally have a version (ID). We can say that "a Braid subscription gives you a stream of updates", which is true whether there are versions involved, patches involved, or parents involved. Colloquially, an "update" brings you up to date on the news. "What's the latest update?" This also makes it easier to talk about the "Update headers" or the "Update body" (whereas, "Version headers" is very confusing, since "Version" is itself a header; and "Version body" is somewhat confusing, because a "Version" may not have a
Here's another example, with the
|
I'm trying to disambiguate between:
I want different words to disambiguate 2 from 1. We can't call them both "version" - that's confusing. "Here's version X" "You only gave me a patch!" "Yeah that's a version" "I wanted the whole document at version X!" ... etc. I support Duane's suggestion of "update" for (2) because that term can refer to:
And that makes good semantic sense. |
The term 'version' has always been a little confusing to me in this context (assuming I'm understanding everyone's latest thinking on it). Maybe someone can clear this up for me. Let's say Alice and Bob are both in possession of "version" D:
Their current state is clearly different. If a client connected to each of them and requested the state associated with version D, what would be reasonable for the client to expect? And if it's valid for "D" to mean different things to different actors, then how is it a useful distinction? Alternatively, maybe this concept of "version" is not intended to be used in the context of requesting state? Or maybe we don't actually want to deal with requesting state at all (meaning that every client has to be capable of doing its own merge resolution)? I've been out of the loop for a bit, so I'm probably just missing something basic. |
Yeah it is a bit confusing. I feel like there’s some patch theory we need to write down in a document somewhere. The way braid sees the world, version D refers to Bob’s document state. Alice’s document state is the merger of versions D and F. It’s named Some versioning systems have compact ways to express “the merger of A and B” - like vector clocks merge |
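One way to read the distinction above: a version ID names a point in the history graph, and a peer's state is determined by the set of versions it has merged. The following is a minimal sketch under assumptions: the `PARENTS` graph (with D and F as concurrent children of a version C) and the `history` helper are hypothetical and only illustrate why "D" and "the merger of D and F" denote different states.

```python
# Hypothetical history: each version ID maps to its parent version IDs.
PARENTS = {
    "C": [],
    "D": ["C"],
    "F": ["C"],  # assumed to be concurrent with D
}


def history(version_ids):
    """Return every version reachable from the given IDs, including themselves."""
    seen = set()
    stack = list(version_ids)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(PARENTS[v])
    return seen


bob = history({"D"})         # Bob's state: version D alone
alice = history({"D", "F"})  # Alice's state: the merger of D and F
```

Under this reading, "D" means the same thing to both peers; Alice simply holds a different version, one that has no single-letter name here.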
I had a chat with @toomim about this the other day. I think the best way to think about it is that there are two different concepts at play: edges and nodes. Nodes are document versions, and edges show how one version turns into a new version with a set of patches. Mike likes that there's enough overlap that the word "Version" can be overloaded to mean either an edge or a node in different contexts. I find it confusing (it adds cognitive load), and I'm obviously not alone. I might put together a PR to replace "version" with "update". Eg:
becomes:
(Each update corresponds to an edge. When an update is applied to the parent version(s), a new version is produced.) |
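The edge/node distinction can be sketched in code. This is an illustration only, not Braid's API: the `Update` class, its field names, and the `apply_update` helper are hypothetical, and patches are simplified to whole-key assignments on a dict.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Update:
    """An edge: describes how parent version(s) turn into a new version."""
    version: Optional[str]                    # optional ID of the resulting version
    parents: Tuple[str, ...]                  # IDs of the version(s) it applies to
    patches: Tuple[Tuple[str, object], ...] = ()  # simplified: (key, value) pairs


def apply_update(state: dict, update: Update) -> dict:
    """Apply an update (edge) to a parent state, producing a new state (node)."""
    new_state = dict(state)  # copy, so the parent version stays intact
    for key, value in update.patches:
        new_state[key] = value
    return new_state


# Node "A": the state before the update.
state_a = {"title": "Draft"}
# Edge A -> B: an update whose parent is version "A".
update_b = Update(version="B", parents=("A",), patches=(("title", "Final"),))
# Node "B": the state after the update.
state_b = apply_update(state_a, update_b)
```

Keeping the two as separate types means code can hold a patch-only update without pretending it has the whole document at that version.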
I'd like to move forward with this issue, because this is coming up in code I'm writing and I'd like to avoid code churn. @toomim - As WG chair, what is our path forward here? Do you want to discuss this issue further in person? Do you have an alternate proposal? Do you agree with the proposal as written? |
Maybe we could use |
All: I'm sorry for dropping the ball as "chair" and letting this issue sit. I should have relinquished that responsibility, as I wasn't able to keep up. For now, let me take off the "chair" hat, and just comment as a group member. As a member, I've been convinced by the arguments of @canadaduane and @josephg, and agree with calling the set of changes that update state to a new version an I also see two additional questions coming up in this thread:
These concepts are all related, and it's probably worth thinking them through together so that we know they are coherent. Perhaps we can define them as something like:
(As for "transaction", I think we might like to reserve the term for future use, to refer to a set of updates in a branch that should all be committed or discarded as an atomic unit.) |
Note that point (2) above also implies that our This is useful, for instance, in the case that a server has received two parallel edits:
And now a new client wants to receive the current snapshot. The server could provide it like:
Without specifying multiple IDs inside a single |
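The parallel-edit case can be sketched as a frontier computation. This is a rough illustration under assumptions: the version IDs and the `frontier` helper are made up, and nothing here reflects the actual header syntax.

```python
def frontier(parents):
    """Return the version IDs that no other version lists as a parent."""
    has_child = {p for ps in parents.values() for p in ps}
    return sorted(v for v in parents if v not in has_child)


# Hypothetical history: two clients edited "root" in parallel.
parallel = {"root": [], "alice-1": ["root"], "bob-1": ["root"]}
current = frontier(parallel)

# Once some peer publishes an update with both as parents, one ID suffices again.
merged = dict(parallel, **{"merge-1": ["alice-1", "bob-1"]})
current_after_merge = frontier(merged)
```

Until a merge update exists, the current state has no single ID of its own, so naming it takes both IDs.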
I've drafted these changes in #113. A review would be appreciated! |
(Chair:) The discussion on PR #113 found consensus on Version and Current-Version, but led to some new ideas for Version ID, Update, and Snapshot. The task is now to find consensus on these terms. We have some suggested alternatives: Alternatives to Version ID:
Alternatives to Update:
Alternative to Snapshot:
Argument for Event:
Argument for Event ID:
Argument against Update:
|
This might not affect Braid, but it matters for notifications in general: a POST on a resource might not necessarily change the representation of the resource. A resource can define its own arbitrary semantics for POST. However, one might still want to emit (and receive) a notification for that POST. In such a case, Update, Mutation and Change are less suitable as terms than Event. |
Ah, you're talking about the architecture of synchronization that I think of as "State Machine Synchronization." This architecture is characterized by:
POST is used for this type of synchronization. Typical example:
This is simple enough when you only have one event type (the new_message event) and only one peer that needs to implement the meaning of it (the server). However, once you start introducing multiple peers, and more complex state, this architecture really sucks. Let's say that you want to add a new state transition, like The problem is entangling the semantics of the application with the synchronization algorithm. The sync algorithm ends up being custom-implemented for the application. And any change to the data schema, or the sync algorithm, has to be faithfully re-implemented in all components of the system, or they get out of sync with race conditions, and risk data corruption. This gets super nasty. This is what we're solving by implementing general state synchronization algorithms, that don't depend on any application's semantics. Mutations become syntactic patches to general state. Merge semantics become a general merge-type. This is what React solved for front-end development. Instead of each UI component implementing a state machine (like you had to do with backbone.js before react), the UI just became a function on top of general synchronized state. This is a big key to the state synchronization revolution— moving from Synchronized State Machines → Functions on Synchronized State. So now, back to your point on general notification— yes, in the model of State Machine Synchronization, it's true that peers subscribe to semantic events, rather than syntactic changes to state. However, it turns out that any semantic event can be expressed equivalently with a syntactic state mutation. For instance, in the In conclusion, I think POST is going the way of the dodo, and so is the "Event" style of synchronization, and I think we'll do better to think in terms of Updates, Changes, or Mutations to state. 
We should still support the event information using a header, like "Mutation-Description: new_message(...)", in order to ease interoperability with legacy systems built as synchronized state machines, but I think our children will thank us if we give them this higher-level abstraction to work with. |
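The claim that a semantic event can be expressed as a syntactic mutation can be sketched as follows. This is a simplified illustration, not a Braid implementation: the patch dict layout and both function names are hypothetical, and the state is assumed to be a plain list of messages.

```python
def new_message_as_patch(messages, text):
    """Express the semantic event new_message(text) as a syntactic patch:
    an insertion at the end of the messages array."""
    end = len(messages)
    return {
        "range": (end, end),   # empty range at the end means "append"
        "content": [text],
        # Optional hint for legacy peers, in the spirit of Mutation-Description.
        "description": f"new_message({text!r})",
    }


def apply_patch(state, patch):
    """Generic splice: knows nothing about chat messages."""
    start, stop = patch["range"]
    return state[:start] + patch["content"] + state[stop:]


messages = ["hi"]
patch = new_message_as_patch(messages, "hello")
messages = apply_patch(messages, patch)
```

Only `new_message_as_patch` knows about the application; any peer that can run `apply_patch` can stay in sync without it.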
All of the above notwithstanding, if I subscribe to an HTTP resource (for notifications in general, not necessarily Braid's state synchronization) and someone does a POST on that resource, I would rather like to be informed of the (irrespective of whether it mutates the resource or not). Thus, the trigger of the notification is an event that may or may not result in a mutation of that resource (that information could be useful, for example, "to know" to start observing a newly created resource elsewhere). |
Can you illustrate that with a concrete example? Can you show me the network messages that you're envisioning? What would be the POST to the server? What would the notification message look like that you want back? What would you do with that message? |
https://cxres.github.io/prep/draft-gupta-httpbis-per-resource-events.html#figure-7 I know after that message that I might want to look at |
So this notification says:
Is this just echoing the post request back to the clients? If there was a body, would that body go into the notification too? And in this scenario, are you imagining the post to /foo would result in a change to /foo's state? I figured it was going to mutate state somewhere else. I'm trying to understand the scope of the data you want back on the client, and why, by getting more of the scenario fleshed out. |
GET / HTTP/1.1
Accept-Events: PREP
POST / HTTP/1.1
Slug: foo
<body>
Now, it is not required here that state of The body of the POST request is intended for |
Thank you, that clarifies. So I'm hearing that you don't think this is needed for State Synchronization or Braid-HTTP, but that you want it for notifications in general. Do you think Braid-HTTP should support or relate to such general notifications? I have been thinking we don't need general notification streams in Braid or HTTP, because we can express everything as State Synchronization. The way I'd do the above as State Synchronization is:
GET / HTTP/1.1
Subscribe: true

...the server responds with the current index of articles:

HTTP/1.1 209 Subscription
Subscribe: true
Content-Length: 2

[]

...which happens to be empty thus far.
PUT /
Content-Range: json [0:0]
Content-Length: 423
[{"slug": "/foo", ...}]
Content-Range: json [0:0]
Content-Length: 423
[{"slug": "/foo", ...}]

This expresses the same behavior using State Sync, and we don't need general notifications. If you think we do need general notifications, I'd be curious why. I think constraining realtime updates to State Synchronization is more in-line with HTTP and ReST. Remember that the big advantage of ReST was that it constrained how people program networked computing systems:
Following these constraints meant that your system was more likely to be scalable, interoperable, etc. Likewise, I think the constraint of State Sync has better properties than a system running on general event streams. |
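On the subscriber side, the exchange above amounts to splicing each update into the local copy of the index. A minimal sketch, assuming that `json [0:0]` denotes a half-open array range as the example suggests; the `apply_json_range` helper is hypothetical, and the patch body is shortened to the one field shown in the thread.

```python
import json
import re


def apply_json_range(state, content_range, body):
    """Splice the JSON array in `body` into `state` at a range like 'json [0:0]'."""
    match = re.fullmatch(r"json \[(\d+):(\d+)\]", content_range)
    if match is None:
        raise ValueError(f"unsupported range: {content_range!r}")
    start, stop = int(match.group(1)), int(match.group(2))
    return state[:start] + json.loads(body) + state[stop:]


index = json.loads("[]")  # initial snapshot from the subscription response
index = apply_json_range(index, "json [0:0]", '[{"slug": "/foo"}]')
```

The subscriber never needs to know that the change came from a new article being created; it only applies the range patch.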
What I am saying is that we need a common language, because the updates that effect State Synchronization are a proper subset of general notifications. The notion of events, and the fact that not all events mutate the state of a resource but may instead affect other resources, has to be part of the language. This (the language part, the thing we are discussing in this issue) is independent of how you want to do state synchronization. |
Thank you for clarifying. I can now see more specifically where my disagreement lies:
Although it is true that State Sync can be expressed through a general notification mechanism, that does not mean that it should be. Consider that HTTP could be expressed on top of a general RPC standard. But it is not. HTTP's strength is in being constrained in the right ways, as Roy Fielding articulates in his thesis. My argument is that State Sync is an advantageous set of constraints, because it decouples synchronization from application logic. This allows, for instance:
I don't see a win for programming with general event notification streams, and I've thought about this quite a bit. I would be very curious to hear if there is a practical reason for having such a standard built into Braid, other than "it's possible to do." Possible != good. To take the argument further, consider that it's also possible to implement a notification stream on top of state synchronization. Imagine we create some
Thus, although you might see a world in which Event Streams > State Synchronization, it's also equally valid to see State Synchronization > Event Streams. Do we need to build one on top of the other? I don't see a pragmatic benefit to doing so. But I would love to be shown something I've missed. |
(This is in response not to the last comment but still the one previous to it) Now, as to the side-tracked issue, I do not like the idea of "index" resources/containers that are semantically different from regular resources. I am having the same fight over at Solid. Any resource should have its own representations and should be a point from where you can define contained resource (the notion of indexing/containment is needed primarily for inherited access control). So here if I access The thing about the constraints is that you have the burden to prove that without the proposed constraint, the property that you desire cannot be effected (something Roy does rather admirably in his thesis). I do not think this qualifies! Now, I must admit I have a bit of an unfair advantage when reasoning about this (something we can speak about next time), which is why I am so confident. |
Can you be more specific with your problem in this paragraph? I don't understand. I would greatly appreciate examples.
The constraint is separating application logic (which defines the state transitions that are allowed, ie. the state machine) from the synchronization algorithm and protocol. If you don't decouple them, you lose the property of supporting dumb middleboxes (e.g. a CDN or proxy) that interpret patches, and store and serve state without implementing the application's state machine. Is that clear? Update: Consider that your POST example relies on a server to implement the state machine for the |
There are so many different threads of conversation here that I fear I cannot respond to all of them in the time before the @toomim bot responds. It will have to wait for me to disentangle all the issues. |
Feel free to use the mailing list, which supports actual threading. |
That can only deal with the space issue, not the time issue. (Also, I am watching a Cricket World Cup match on the side). |
Very interesting read. From what I gather, the rationale behind https://cxres.github.io/prep/draft-gupta-httpbis-per-resource-events.html is more about being able to observe requests made towards a server in a general way, so that you can decouple different systems ("oh, somebody made a POST against resource X, let me check if anything changed at X"). This looks to me like something which is easy for a server to add to enable such decoupling. While state syncing is much more integrated into the server and generally goes two ways. So I also do not see how those two things could be based on each other. |
Despite how the above discussion might appear, I believe that both @toomim (bot? ;)) and I see a lot of common ground. Synchronization will always need a notifications mechanism (ie you subscribe to a (set of) resource for updates). But does that mean the notification mechanism has to be specific to state sync or can a general notification mechanism (which can be used to listen for events in general) be constrained for it (and is still desirable). I very much think the latter is true. But this discussion has spread all over the place from the original discussion about terminology. |
A state only comes about due to events. Now there are three kinds of events:
That you might not want to include events of type 1 and 3 in Braid does not mean they go away or should be banished from a discussion about language itself. Actually, doing so will later just constrain your design space. Again, whether you choose to include events or changes in Braid, the Braid "notification" will always be a proper subset of general notifications. If you choose NOT to build a/build on top of a general notifications mechanism, that can only be justified on grounds of simplicity/efficiency/practicality, not because the two are distinct. All things said, this was a discussion about vocabulary and not design. Conflating the two in this discussion is an epistemic error. To address some specific comments:
I'll take a wager that it is not as long as HTTP exists (so at least another decade at a minimum). So I'll wait until you obsolete RFC9110.
You are actually adding (hypermedia) application semantics on top of HTTP by constraining Furthermore, the notion of having resources that are for "content" and resources that act as "index" gives me the creeps. It makes applications that support ordinary data and where schema of data is not known in advance (such as Syntropize) orders of magnitude harder, because I need to treat resources (in a logical grouping) in two different ways.
I support this in general. But in the example you have created in #102 (comment), this is not what you are doing. You are actually adding (hypermedia) application semantics on top of HTTP by constraining My larger claim: Nothing in adopting a general notifications mechanism (and then imposing constraining semantics) on it for state sync precludes this decoupling.
If you constrain this discussion for updating client (or a peer in client role) part and not the magic that server does with multiple versions in creating a consistent state before sending out updates, which is what the substance of this discussion is about, this claim is a logical impossibility (see the top of this comment)! At best (state sync <= general notifications) since updates for sync are a kind of notifications amongst other types of notifications. Also see above the problems highlighted in your example. It seems to me that you are conflating two orthogonal things as part of one mechanism -- creating consistent state and updating clients with that state. A design that conflates both this way will only create complexity and be less robust (This itself is a larger issue which needs a course and not a thread).
I am not assuming application semantics here. I am taking a rather common use case for POST and stating that I would like to be informed of it. I really don't understand this objection, so you might want to elaborate. Between listening to the resource and the As I worked through answering this, I stand corrected wrt my comment to @mitar. I believe the gap between us is greater than expected after all. This is not a good medium to bridge that gap, as it takes a lot of time to parse through and reply to (a luxury I do not have right now, given 10-100x costs of doing the same things as you, as you are well aware from our private discussions). Apart from the fact that the discussion hijacks the title/purpose of the issue (of ontology where I stand firm!), I also have a sense of being impulsively, even if very politely, bombarded with conflated issues which distract from the stated issue. I want to say more but for the sake of bandwidth and sanity I will refrain! |
NOTE: This is not asking to change the header "Version:" to something else, merely the concept relating to the "Version:" header.
I want to elevate the discussion here to a proper braid-spec issue.
I was writing some code that used the term "Version" like the Braid spec uses it, but it caused the existing references to "version" or "patches" in the code to become ambiguous. If possible, we need clearer concepts.
I'll summarize by quoting from the above PR discussion: