Supporting multiple API versions #2353

Closed


gravitystorm
Collaborator

I'd like to support multiple API versions, so that we can deploy API 0.7 in parallel to API 0.6. It's not something we've ever done before, but I think it's the only reasonable way to handle API version changes these days!

I'm opening this PR just to receive any initial feedback on this proposed approach to the code changes. So far this approach demonstrates the basic concepts, including:

  • Supporting different scenarios, e.g. dropping old api calls, adding new api calls, api calls that are the same in each version, api calls that are similar but return different results (e.g. different response codes)
  • Optimised for the assumption that almost all api calls are the same between different versions.
  • Being able to choose which versions are deployed. This allows the codebase to support multiple API versions during development and when running the tests, and so it avoids long-running branches.
  • Only code that is different between api versions is in any way duplicated. Otherwise the same code is used for each api version.
  • Establishes a naming convention for controllers that behave differently in different API versions. The general idea is that old code lives in specific controllers, e.g. API::V06::CapabilitiesController, and new code lives in the normal place e.g. API::CapabilitiesController. This makes future upgrades easier, since the assumption is that new code will be used in version N+1 too.
  • It's all designed to allow multiple, non-contiguous versions. For example, if we yank a future version 0.8, it'll cope fine with e.g. deployed_versions = ["0.6", "0.7", "0.9"]

If you are interested in seeing how it works, then the changes to "config/routes.rb" along with the output from "bundle exec rake routes" are the best way to see the overall idea.
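
As a rough illustration of the routing idea, the following plain-Ruby sketch (not the actual "config/routes.rb") maps every deployed version to a controller, preferring a version-specific controller where one exists and falling back to the shared one otherwise. The names `DEPLOYED_VERSIONS`, `VERSIONED_CONTROLLERS`, `controller_for` and `route_table` are assumptions made for the example; only `v_string` and the `api/v06/capabilities` naming appear in the PR itself.

```ruby
# Hypothetical list of deployed versions; non-contiguous lists are allowed.
DEPLOYED_VERSIONS = ["0.6", "0.7", "0.9"].freeze

# Hypothetical registry of resources that have a version-specific (old) controller.
VERSIONED_CONTROLLERS = { "0.6" => ["capabilities"] }.freeze

# Controller names cannot start with a digit or contain punctuation.
def v_string(version)
  "v#{version.delete('.')}"
end

# Old behaviour lives in api/vNN/..., everything else in the shared api/... controller.
def controller_for(resource, version)
  if VERSIONED_CONTROLLERS.fetch(version, []).include?(resource)
    "api/#{v_string(version)}/#{resource}"
  else
    "api/#{resource}"
  end
end

# Build one route entry per deployed version and resource.
def route_table(resources)
  DEPLOYED_VERSIONS.flat_map do |version|
    resources.map do |resource|
      { :path => "/api/#{version}/#{resource}",
        :controller => controller_for(resource, version),
        :api_version => version }
    end
  end
end

route_table(%w[capabilities permissions]).each do |route|
  puts "#{route[:path]} -> #{route[:controller]}"
end
```

Because the fallback is the shared controller, a newly added version needs no new code unless its behaviour actually differs.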

The biggest drawbacks so far are around the tests.

First, I think it makes sense that every API version that the codebase knows about is fully tested, i.e. even if the result is expected to be the same, a given api test should run once for each api version. This leads to the indentation of all the tests changing, since we need to add an "all_api_versions" loop around almost every test. So the diffs and git blame are horrible.

Secondly, the controller methods involve lots of changes like this:

- get :show
+ get :show, :params => { :api_version => version }

It leads to a lot of extra params => ... to skim read, which isn't ideal. I've tried working around this but the workarounds have their own drawbacks.
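
One conceivable way to cut down the repetition, sketched in plain Ruby rather than as a real controller test, is a small helper that merges the version into whatever params a test passes. `all_api_versions` is the loop name used in this PR; `versioned_params` and the version list are hypothetical, and this is not necessarily one of the workarounds already tried.

```ruby
# Versions the test suite should cover; the values here are assumed.
def all_api_versions
  ["0.6", "0.7"]
end

# Hypothetical helper: merge the version into whatever params a test passes.
def versioned_params(version, params = {})
  { :api_version => version }.merge(params)
end

# Stand-in for a controller test: build the same request once per version.
results = all_api_versions.map do |version|
  request = { :action => :show, :params => versioned_params(version, { :id => 1 }) }
  [version, request[:params]]
end

results.each { |version, params| puts "#{version}: #{params.inspect}" }
```

A helper like this only moves the noise around: each call still has to pass the version explicitly.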

The tests don't all pass yet, so this is not yet in shape to be committed, but I hope to get it there soon. In the meantime, and before I make any further changes that you might not be happy with, any feedback is very welcome!

@gravitystorm gravitystorm added the api and work-in-progress labels Aug 21, 2019
@mmd-osm
Contributor

mmd-osm commented Aug 21, 2019

I haven't looked through all the details yet, so please bear with me. I was wondering a bit how you would handle evolutionary or even revolutionary changes to the data model. Typically this is one of the most troubling and complicated aspects of new API versions, and I think it would be good to have some basic concept in place as well here.

@gravitystorm
Collaborator Author

I was wondering a bit how you would handle evolutionary or even revolutionary changes to the data model.

I can sum this up with the phrase "it depends"!

  • Different versions of the API can use different views, so if we added something to a future api (for example "node coordinate precision" or somesuch) it can be shown in the latest version and ignored in previous versions, or shown in a different way (e.g. as tags).
  • Different api versions can use different controllers too, as already demoed in this PR for the capabilities controller. This allows more substantial changes in behaviour between api versions, and changes like fixing response codes.
  • We could use different models too. I'm not sure why we'd want to, but e.g. api/0.7/node/1 could call a node_controller that fetches data from the ShinyNewNode model.
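
The first bullet could look roughly like the sketch below, which renders the same node differently depending on the API version. Everything in it is hypothetical: the `precision` attribute, the `coordinate_precision` tag key, the `NEW_FIELD_VERSIONS` list and the `render_node` helper exist only to illustrate the idea.

```ruby
# Hypothetical node with a field that only newer API versions know about.
Node = Struct.new(:id, :lat, :lon, :precision, :tags)

# Hypothetical list of versions whose output includes the new field directly.
NEW_FIELD_VERSIONS = ["0.7"].freeze

# Newer versions expose the field as an attribute; older versions see it as
# an ordinary tag (one of the options mentioned above).
def render_node(node, version)
  base = { :id => node.id, :lat => node.lat, :lon => node.lon }
  if NEW_FIELD_VERSIONS.include?(version)
    base.merge(:precision => node.precision, :tags => node.tags)
  else
    base.merge(:tags => node.tags.merge("coordinate_precision" => node.precision.to_s))
  end
end

node = Node.new(1, 51.5, -0.1, 7, { "name" => "Example" })
puts render_node(node, "0.6").inspect
puts render_node(node, "0.7").inspect
```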

But it all really depends. I think the underlying question might be "how do we handle drastically non-backwards-compatible changes, like converting all closed ways into an area type (or if we introduce an area type into API N+1, how does that work with API N clients); or how would a change like removing segments work with two api versions running in parallel?" I don't have an answer for these major structural changes, except to say that if it's logically possible at all to access the data between versions, the code approach in this PR will be able to handle it. And perhaps there'll be some change to the data structure that prevents parallel API versions and it'll trigger a hard cutover between versions.

But let's not get stuck on the biggest problem, and one that's not yet in hand. In the meantime, there's a ton of backwards-compatible but needs-API-bump changes that have been stuck for years, so we can at least sort out those ones 😄

@gravitystorm
Collaborator Author

I should have said, if there's specific API changes that you're thinking about, even if they are just to illustrate a point, let me know and I'll see how they fit.

@simonpoole
Contributor

simonpoole commented Aug 22, 2019

But let's not get stuck on the biggest problem, and one that's not yet in hand. In the meantime, there's a ton of backwards-compatible but needs-API-bump changes that have been stuck for years, so we can at least sort out those ones 😄

Likely making myself very unpopular:

  • where are all these "backward-compatible but needs-API-bump changes that have been stuck for years" changes documented?
  • in general I oppose all of this on the grounds that the GDPR related changes need to be implemented first and I'm actually surprised that "nice to have" API changes are even being given a second thought in the current situation.

@gravitystorm
Collaborator Author

gravitystorm commented Aug 22, 2019

where are all these "backward-compatible but needs-API-bump changes that have been stuck for years" changes documented?

Find your favourite API-0.7 wishlist, ignore the stuff about areas, and there's your list. 😄

Different people have created different lists over the years. I'm not intending to implement many changes, just the ones that I've been personally complaining about since API 0.6 was released over a decade ago (like incorrect http status codes, and plain-text responses, and stuff like that).

in general I oppose all of this on the grounds that the GDPR related changes need to be implemented first

I know that you want them first, but that doesn't mean that they need to be done first.

In particular, if the GDPR-related changes can be made without changing the API version then they can be implemented in parallel to this work, and so they are not interdependent. If they need to break API compatibility, then this work will need to come first anyway. So either way, they don't need to come first.

@bhousel
Member

bhousel commented Aug 22, 2019

I should have said, if there's specific API changes that you're thinking about, even if they are just to illustrate a point, let me know and I'll see how they fit.

Just off the top of my head, we'd see some immediate performance improvement on the iD side from:

(these require some coordination from the CGImap side too, but I don't think that's a blocker)

@simonpoole
Contributor

simonpoole commented Aug 22, 2019

Except, naturally, that neither #2221 nor #2348 depends on this PR.

Without it (this PR) we would have simply added them as a 0.6 feature, as we have done so many times before. Not doing that raises the whole question of versioning of the OSM API in general, which is a rabbit hole of its own.

This ensures that raw XML links point to the latest available version
This could be reworked with some meta programming to get the latest API version in future.
This is because the changesets api is now multi-version
@gravitystorm gravitystorm removed the work-in-progress label Aug 28, 2019
@gravitystorm gravitystorm changed the title from "[WIP] Supporting multiple API versions" to "Supporting multiple API versions" Aug 28, 2019
@gravitystorm
Collaborator Author

I've updated this PR so that the test suite passes, and it's now ready for review.

Currently only a few routes have been adapted for multiple API version support, namely:

  • capabilities
  • permissions
  • changesets

I intend to work on the rest of the routes in subsequent PRs. The settings in this PR ensure that only 0.6 is deployed by default, so it can be merged without any side effects.

@simonpoole
Contributor

Could we get it on the sandbox first?

@gravitystorm
Collaborator Author

Could we get it on the sandbox first?

First? Do you mean before code review?

Or if you want it before merging, to what end? So far this PR is just internal code refactoring, there's no changes to the API (other than dropping one line from the api/0.7/capabilities response) and even if this is merged there will still be no changes since 0.7 is disabled. So we can ask Tom to set up a sandbox "first", but I'm not sure what you would want to do with it?

Of course, it'll be worth having a sandbox available later on, but I don't think it's worthwhile effort at this stage.

If you still have concerns, let me know what I can do to help.

@simonpoole
Contributor

simonpoole commented Aug 28, 2019

Could we get it on the sandbox first?

First? Do you mean before code review?

Before deployment, which in our case implies before merger.

Or if you want it before merging, to what end? So far this PR is just internal code refactoring, there's no changes to the API (other than dropping one line from the api/0.7/capabilities response) and even if this is merged there will still be no changes since 0.7 is disabled.

Famous last words. In reality there are always things that might break, for example as when the authorisation refactoring was deployed.

Being able to test against a deployment, while not a panacea, at least gives us a fighting chance to ferret out any assumptions that no longer hold true and so on.

@mmd-osm
Contributor

mmd-osm commented Aug 30, 2019

I'm not intending to implement many changes, just the ones that I've been personally complaining about since API 0.6 was released over a decade ago (like incorrect http status codes, and plain-text responses, and stuff like that).

That's the bit I least like about the current multiple API version idea: it seems to focus on something I would describe as cosmetic changes only. That's ok, except that it only adds work to consumers of the API while offering no real value to them at all. Since they already have a working API integration, they would need to spend extra development time and effort to get back to the status quo.

Isn't there anything more compelling to do that would give API consumers more of an incentive to move to the latest and greatest version?


Unrelated question: do we want to support API clients using both 0.6 and 0.7 at the same time to support a gradual transition phase?

@gravitystorm
Collaborator Author

adds work to consumers of the API

existing consumers of the API. Some of these headaches and quirks need to be solved by every API consumer, and the total number of future API consumers yet to be written vastly exceeds the ones written so far. So the sooner we fix them, the better, and it's a shame they've been known about and unfixed for so long already.

Isn't there anything more compelling to do that would give API consumers more of an incentive to move to the latest and greatest version?

Maybe we will want to put some of the more compelling things in v0.8, or v0.9, or something. But I'm determined to avoid getting into the same situation as has happened over, and over, and over again, where the scope of v0.7 expands inexorably until it collapses under its own weight! I'd rather break up the logjam and work on smaller, more frequent improvements (e.g. every 18-36 months, rather than 10+ years and counting) to the API. And I'd rather keep the API changes small so that a developer can upgrade their app with a small amount of code changes that's feasible in a weekend of hacking, rather than some significant API upgrade that puts them off doing any of it and leaves them stuck on v0.6. And making life easier for the developers is the whole point of adding multiple version support, so that they can upgrade when it suits them best.

Anyway, let's talk about this further in a future issue, since we're burying the point of this PR ("is this the best code approach to multiple version support? Can you see a better way of coding the tests?") in wider discussions.

@simonpoole
Contributor

simonpoole commented Aug 30, 2019

@gravitystorm you might want to consider making a statement as to versioning of the API (for example if it will follow semver semantics going forward). Or will we have to continue to assume that every version change is breaking, as it is now (which is the real reason why the version is stuck at 0.6)?

PS: you probably have to consider decoupling data model version from the API version too, as changing the versions used for the former will likely break every tool out there.

@@ -7,27 +7,54 @@ class CapabilitiesControllerTest < ActionController::TestCase
    def test_routes
      assert_routing(
        { :path => "/api/capabilities", :method => :get },
-       { :controller => "api/capabilities", :action => "show" }
+       { :controller => "api/v06/capabilities", :action => "show", :api_version => "0.6" }
Contributor

Say we want to use semantic versioning like 1.2.0, how would this be reflected specifically in directory names like v06?

Contributor

@mmd-osm a reasonable assumption IMHO would be to only actually have different "directory" names for major versions (aka breaking changes), the client can then determine from the capabilities which minor version is actually supported and from that determine which backwards compatible features are present.
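
A minimal sketch of that client-side check follows. It assumes, hypothetically, that the capabilities response reports a full version string such as "1.2.0"; the feature names and the versions in which they appear are invented for the example, and `url_prefix` and `supports?` are illustrative helpers, not part of any existing client.

```ruby
require "rubygems"

# Hypothetical: version in which each backwards-compatible feature appeared.
FEATURE_INTRODUCED_IN = {
  "structured_errors" => Gem::Version.new("1.0.0"),
  "diary_entries" => Gem::Version.new("1.2.0")
}.freeze

# Only the major version would appear in the URL (and in directory names).
def url_prefix(server_version)
  major = Gem::Version.new(server_version).segments.first
  "/api/#{major}"
end

# A client compares the version reported by capabilities against the table.
def supports?(server_version, feature)
  Gem::Version.new(server_version) >= FEATURE_INTRODUCED_IN.fetch(feature)
end

puts url_prefix("1.2.0")
puts supports?("1.1.4", "diary_entries")
puts supports?("1.2.0", "diary_entries")
```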

@@ -10,7 +10,7 @@ OSM = {
    MAX_REQUEST_AREA: <%= Settings.max_request_area.to_json %>,
    SERVER_PROTOCOL: <%= Settings.server_protocol.to_json %>,
    SERVER_URL: <%= Settings.server_url.to_json %>,
-   API_VERSION: <%= Settings.api_version.to_json %>,
+   API_VERSION: <%= Settings.api_versions.min_by(&:to_f).to_json %>,
Contributor

Why is this value set to "min_by", and what are the implications of it? Does &:to_f play nice with semver (e.g. 1.2.0)?

Collaborator Author

@gravitystorm gravitystorm Aug 30, 2019

It's mainly just min_by to keep everything working for now, since it will pick 0.6 unless the site operator chooses to only deploy 0.7.

It's used for the bits of the website that talk to the API, like notes and changeset comments. Since there's no changes yet, it's more of a "pick either" situation. .max_by would work fine too.
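
To make the `&:to_f` question concrete, here is a small standalone check. The version lists are made up for the example; `Gem::Version` is used only as one readily available way to compare multi-part version strings, not as something this PR proposes.

```ruby
require "rubygems"

decimal_versions = ["0.7", "0.6"]
semver_versions = ["1.2.0", "1.10.0"]

# With two-part decimal versions, to_f orders them as expected.
puts decimal_versions.min_by(&:to_f) # prints 0.6

# With semver-style strings, "1.10.0".to_f is 1.1, which sorts below 1.2.
puts semver_versions.min_by(&:to_f) # prints 1.10.0

# Gem::Version compares segment by segment instead.
puts semver_versions.min_by { |v| Gem::Version.new(v) } # prints 1.2.0
```

So `min_by(&:to_f)` is fine for the current "0.6"/"0.7" style, but would need replacing if version strings ever gained a third component or a two-digit minor number.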

On a wider point, I'd rather work on refactoring those bits of the site to just be regular webpages like diary entry comments, but that's a different project!

# Our api versions are decimals, but controllers cannot start with a number
# or contain punctuation
def v_string(version)
  "v#{version.delete('.')}"
end
Contributor

Thinking of semver again, would it be better to replace "." by "_" maybe, so it's v0_7 and v1_2 ? How about patch version number? Ignore or include?

Collaborator Author

As discussed in the comments, semver would only use major version numbers here. So e.g. v7 or v124. But it's a great point to raise, thank you.
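
For illustration only, a variant of `v_string` along those lines might look like the sketch below. The rule that three-part strings are treated as semver while two-part strings keep the existing naming is an assumption made for this example, not something decided in the thread.

```ruby
# Sketch: keep the existing naming for two-part decimal versions ("0.6" -> "v06")
# and use only the major number for three-part semver strings ("1.2.0" -> "v1").
def v_string(version)
  parts = version.split(".")
  if parts.length == 3
    "v#{parts.first}"
  else
    "v#{version.delete('.')}"
  end
end

%w[0.6 0.7 1.2.0 124.0.3].each { |v| puts "#{v} -> #{v_string(v)}" }
```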


# simple diff to create a node way and relation using placeholders
diff = <<CHANGESET.strip_heredoc
<osmChange>
Contributor

At one point we'd also need to include a version number in <osmChange> (maybe not now)

Collaborator Author

Yeah, I noticed that during the refactoring, but let's leave that for now since we're not planning any changes to the osmchange format (afaik).

Contributor

As I pointed out above, we return the data model version in all kinds of places, not just in OSC format. That needs to be decoupled from the API version.

@gravitystorm
Collaborator Author

So, trying (desperately) to drag the conversation back to the contents of this PR - does anyone have any comments on handling the indentation of the tests? I really dislike PRs that combine widespread indentation with additional changes to the code, since I find it very hard to spot where the real changes are and which lines are just indented with no other change.

I've been considering whether I should split out the indentation into a separate PR, with a dummy indent method like

def indent
  yield
end

and then make a PR with the methods indented but with no other changes

  def test_something
-   code
-   code
-   post :foo
-   code
+   indent do
+     code
+     code
+     post :foo
+     code
+   end
  end

and then in the second PR it will be more obvious which of the lines have real changes.

  def test_other_thing
-   indent do
+   all_api_versions.each do |version|
      code
      code
-     post :foo
+     post :foo, :params => { etc }
      code
    end
  end

Or am I worrying about this too much, and the PR is fine the way it is now?

@iandees
Contributor

iandees commented Sep 4, 2019

@gravitystorm If it helps, one of the options in the GitHub pull request view is to ignore whitespace changes. It's in one of the hamburger menus when viewing the diffs.

@simonpoole
Contributor

@gravitystorm you are essentially making an argument for versioning the representation too. Up to now the abstract underlying data representation has essentially been documented (if you so want) by the XML representation and has been in lock step with that.

Now clearly we could change the representation in isolation, as your example suggests, without changing the API nor the underlying abstract data model and this would clearly require changing the API version too, because without that you would not know that you are getting a non-backwards compatible XML doc. But the ITU lies in the direction of doing and supporting that kind of change, so I hope we are not venturing there.

But the other way around does not imply the same thing; for example, all the changes that have been made to the API to date work just fine with 0.6 XML documents. And for example it would be -very- surprising if implementing #2348 and jacking up the API version would suddenly result in output that is not parseable by tools (at least those that do proper validation of their input) that would in principle work just fine if the gratuitous change to the "representation version" hadn't been made.

@gravitystorm
Collaborator Author

you are essentially making an argument for versioning the representation too.

No, I'm explaining that the version is the API response version, not a version of the data model. For example, nothing changed in the trace responses when we last changed the API version, but the responses to api/0.6/gpx/nnn are all <osm version="0.6"> and not 0.4 or 0.3 or whenever that part of the underlying data model changed. So in API 0.7, if they have the same underlying data model, or even the same representation, they will definitely have 0.7 in their responses.

So I'll say it again, the version in the documents indicates the version of the API used, not the version of any underlying data model.

Up to now the abstract underlying data representation has essentially been documented (if you so want) by the XML representation and has been in lock step with that.

No, it hasn't. There's been lots of changes to the underlying data model in the last 10 years, none of which have had a change in API version number - because even though the data model has changed, the API request/responses have only changed in backwards compatible ways. For example, we added notes. That was a backwards-incompatible change to the data model (you can't fit notes into a database that doesn't have a table for them, for example) but a backwards-compatible change to the API, hence no version number change.

And for example it would be -very- surprising if implementing #2348 and jacking up the API version

Nobody has suggested that adding a new backwards-compatible API call needs a new API version - except for you, when you wrote "Without it (this PR) we would have simply added them as a 0.6 feature". So I'm not going to rebut something of your own creation that you are now arguing against.

If you have another example, that would be helpful.

if the gratuitous change to the "representation version" hadn't been made.

I'm not proposing any "gratuitous" changes to the API version. The only reason I'm proposing to change the API version is to allow us to make backwards-incompatible changes that can't otherwise be made. The whole point of this PR is to allow us to run multiple API versions in parallel - it's even in the title! So any tools that can only parse one version can keep using that version. Gratuitous change would be to yank the old version when the new one is deployed, or to bump version numbers for backwards-compatible changes, and I'm proposing neither of those.

@gravitystorm
Collaborator Author

@gravitystorm If it helps, one of the options in the GitHub pull request view is to ignore whitespace changes. It's in one of the hamburger menus when viewing the diffs.

Thanks @iandees, that's really helpful. Took me a while to find it, even after you'd told me it was there somewhere!

@simonpoole
Contributor

simonpoole commented Sep 4, 2019

If you have another example, that would be helpful.

OK, you change the error responses to be returned in a structured format and not the current plain text.

@gravitystorm
Collaborator Author

@tomhughes I'd love to hear your feedback on this PR - for example on the indentation question, or the overall approach for how to support multiple versions

@mmd-osm
Contributor

mmd-osm commented Oct 20, 2019

I still very much disagree with the overall approach to tightly couple the OSM XML header version number to the version number that is part of the URL. I found that this blog post raises a valid point here:

Another conversation that I have sometimes is about how to relate format changes to API versions. In short, they should be completely separate; formats can have lives of their own, and to get the most value out of them, they should do so. It’s fine to say “Version 2 of the API requires the foo resource to support version 5 of the bar format,” of course.

Let's be fair, changing the OSM XML version header from 0.6 to 0.7 would cause some breakage for no good reason, where in reality you're maybe only trying to change the HTTP response codes or produce some nice error messages. In both cases there's no valid reason at all to also change the OSM XML version number.

I still haven't seen an answer to a point I raised earlier on: what the long-term evolution of new version numbers should look like and, in particular, for how long old versions will be supported. Without a clear strategy in place, you would keep on adding more and more versions over time to accommodate API consumers unable or unwilling to move to a newer version. Today, they don't think about "sun-setting" a version, but in the future they will have to, and they need to be aware of that up front.

Handling multiple versions in parallel will add some mental strain when working with the code, even when they share a large part of the same code base. Please keep that in mind so the additional complexity won't kill the code in the long run.

@pnorman
Contributor

pnorman commented Oct 22, 2019

for how long old versions will be supported

It'd have to depend on what is a reasonable time for clients to move to a new version, and how much dev work is involved in maintaining the old version. It might be necessary to make an API version read-only too.

@gravitystorm
Collaborator Author

to tightly couple the OSM XML header version number to the version number that is part of the URL

But these are, and always have been, the same number. The number in the response is the API version number, full stop. There's no such thing as an "OSM XML header version number" as an independent concept as you are describing, one that provides a "format version" that could stay the same as the API behaviour and version number changes.

Consider the "api/X/gpx/1" response. Every time the API version number changed, the number in the trace response changed, even when there's no other change in the XML. Therefore that number is the API version number, and not some independent "XML format version" which just happens to be 0.7 too.

Anyway, this is not a new concept being introduced by this PR, so I'm going to try to avoid debating it further here. Perhaps there is a need for an "OSM XML format version" (or, as previously discussed, an "OSM data model version", which is a third distinct concept), but that would be separate from the API version currently contained in the responses.

changing the OSM XML version header from 0.6 to 0.7 would cause some breakage for no good reason

Given that we haven't decided what the final list of features is in the next version, it's a bit premature to say that it's going to be for "no good reason"!

One of the points of this PR is to introduce a mechanism that allows us to implement whichever changes we see fit, independently of when they are deployed, by using a "feature flag" concept. That way we can implement a bunch of different improvements, and it gives us flexibility to decide when enough things are implemented to make the deployment of 0.7 "worth it". Until that point, it all lives behind the Settings.deployed_api_versions flag.
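
The flag behaviour described here can be pictured with the following sketch. `DEFAULT_DEPLOYED` stands in for `Settings.deployed_api_versions`, while `KNOWN_VERSIONS`, `status_for` and the choice of a 404 for undeployed versions are assumptions for the example rather than the PR's actual code.

```ruby
# Stand-in for Settings.deployed_api_versions; only 0.6 is deployed by default.
DEFAULT_DEPLOYED = ["0.6"].freeze

# All versions the codebase knows about (and that the test suite exercises).
KNOWN_VERSIONS = ["0.6", "0.7"].freeze

# Return an HTTP-like status for a request to /api/<version>/...
# A version must be both implemented and switched on to be reachable.
def status_for(version, deployed = DEFAULT_DEPLOYED)
  KNOWN_VERSIONS.include?(version) && deployed.include?(version) ? 200 : 404
end

puts status_for("0.6")
puts status_for("0.7")
puts status_for("0.7", ["0.6", "0.7"])
```

Under these assumptions, 0.7 code can be merged and tested continuously while staying unreachable until the operator adds it to the deployed list.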

Also, in reality few things are going to break when 0.7 is released, since they can keep working with 0.6. Again, one of the purposes of this PR is to allow us to run multiple versions side by side. So no clients will talk to 0.7 until the developers make them compatible with 0.7.

For software that is consuming OSM data without interacting with the API (e.g. osm2pgsql), sure, some of them wouldn't understand today what to do with e.g. an osm file saved from a api/0.7/map call. But that's fine, since we haven't decided what changes there will be yet! By the time we finally release 0.7, those changes could be either trivial or non-trivial to adapt to, and most software will either be adapted already, or users can get their data from a 0.6 source (e.g. api/0.6/map), or run it through an osm07to06 utility, which for some elements like nodes/ways/relations might be a noop, or for notes/traces/changesets might be non-trivial. Who knows yet?

So we'll need to see at that point whether it's for "no good reason" or not, it's not something I can decide on before we've even started implementing anything.

for how long old versions will be supported.

As Paul says, "that depends". I'm not going to debate here how long to keep 0.6 running after 0.7 is released, since at this rate we're never going to see 0.7 in the first place!

Handling multiple versions in parallel will add some mental strain when working with the code, even when they share a large part of the same code base.

Absolutely, that's a big factor I considered while implementing this PR. If you have any comments on the approach used in this PR, I'd like to hear them. I'm very much open to suggestions as to how to streamline having multiple versions in the codebase. I'm happy with the current approach but alternative suggestions are valuable.

@simonpoole
Contributor

Just as a data point: we have 100'000s of files, if not millions, on planet.openstreetmap.org that have nothing to do with the API that reference a version 0.6, not even to mention the ubiquitous PBF format that currently has an "OsmSchema-V0.6" field.

@gravitystorm
Collaborator Author

Just as a data point: we have 100'000s of files, if not millions, on planet.openstreetmap.org that have nothing to do with the API that reference a version 0.6

Yes, and there are files there that reference 0.5, 0.4, and even 0.3, so we've survived previous version changes.

I don't know what your question might be, and I'm not going to guess.

@joto

joto commented Oct 23, 2019

I have to agree with others here. Changing the version number in the XML and PBF files will break a lot of software, including mine. This software is out there and, even if we change new versions now to accept a new version number, old versions of this software will be out there for years. There is absolutely no way we can break the compatibility if it is not absolutely necessary.

@gravitystorm As you mentioned yourself, 0.6 has been around for a very long time, so comparing the situation to 0.5 and earlier versions doesn't make sense. When we changed from 0.5 to 0.6 OSM was much smaller. And writing an osm07to06 program that would just change the 0.7 to 0.6 so old software can read it doesn't sound like a sensible solution either. Of course this would be different if the files actually contain data in a format that needs to be different to transport this data, if we had areas or so. :-)

I think the only way forward here is to decouple API and file format versions. And I don't see why this is a problem really. Keep the 0.6 in the API URLs as "legacy", but the next number will be 1, 2, etc. (has already been discussed, we only need "major" version for API versions anyway). The next version of the files can be 0.7 or anything else we like, sometimes new API versions will mean new file versions, sometimes not. Maybe API version 3 will have an extra parameter to support version 0.6 and version 0.7 files or whatever. You could even have new file versions without new API versions if that specific file format isn't created by the API at all but only available as download.
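
The decoupling proposed here amounts to a lookup from API version to file format version, roughly as in the sketch below. The table contents, the constant `FORMAT_VERSION_FOR_API` and the helper `osm_header` are all hypothetical; no such numbering has been agreed in this thread.

```ruby
# Hypothetical mapping from API version (in the URL) to file format version
# (in the document header).
FORMAT_VERSION_FOR_API = {
  "0.6" => "0.6", # legacy URL
  "1" => "0.6",   # new API behaviour, unchanged file format
  "2" => "0.7"    # new API version that also needs a new file format
}.freeze

# Build the opening tag of a response document for a given API version.
def osm_header(api_version)
  format_version = FORMAT_VERSION_FOR_API.fetch(api_version)
  %(<osm version="#{format_version}">)
end

FORMAT_VERSION_FOR_API.each_key { |api| puts "api/#{api} -> #{osm_header(api)}" }
```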

We might also need to think not only about file versions and API versions but versions of the (abstract) data model behind it. I am not sure myself whether they are the same or what exactly their relationship is.

BTW: We already have some variants of the XML file format (JOSM, Overpass, ...) and they are not really compatible so should have gotten some kind of identifier, but that's a totally different issue again.

@gravitystorm
Collaborator Author

The next version of the files can be 0.7 or anything else we like, sometimes new API versions will mean new file versions, sometimes not

I'd like to check what you're proposing here. For example, let's say in the next version of the API, the nodes/ways/relations output is the same, but the format of the notes output is different[1]. Are you proposing

a) all endpoints should continue to output version=0.6
b) nodes/ways/relations should output version=0.6 but notes should output version=0.7
c) something else?
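For illustration, option (b) could be sketched in Ruby roughly as follows. This is an assumption-laden sketch rather than anything in the PR: `DOCUMENT_VERSIONS`, `response_version`, and `osm_root` are hypothetical names, and the document types `:elements` and `:notes` are placeholders. Option (a) would correspond to every value in the table being "0.6".

```ruby
# Hypothetical sketch of option (b): each document type carries its own
# version number, looked up per API version. Not taken from the codebase.
DOCUMENT_VERSIONS = {
  "0.6" => { elements: "0.6", notes: "0.6" },
  "0.7" => { elements: "0.6", notes: "0.7" } # only the notes format changed
}.freeze

# Version number to put in the response for a given API version and document type.
def response_version(api_version, document_type)
  DOCUMENT_VERSIONS.fetch(api_version).fetch(document_type)
end

# Builds the opening root tag that would carry that version attribute.
def osm_root(api_version, document_type)
  %(<osm version="#{response_version(api_version, document_type)}">)
end
```

Under this scheme a client reading nodes/ways/relations from the newer API would still see the old number, while a notes client would see the new one.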

Secondly, what if we introduce a new OSM document in the API? Until now, we've just used the existing number. For example, if we add a <diary_entry> API[2]. What number would that then output? Do you think it would be best to

a) start again from 0.1 for fresh OSM document types
b) start from the lowest existing number in use
c) start from the highest existing number in use
d) something else?

[1] This is a genuine plan of mine, to make notes have a description instead of a special first comment, so I'm not making a strawman
[2] Also something I genuinely want to add, but it's not dependent on having a new API version so could come before or after these changes.

@simonpoole
Contributor

IMHO if you start considering stuff outside of just the core OSM data and API it becomes tricky because there has never been any consensus on carving up the API so that essentially independent parts can actually be run independent of each other (see @zerebubuth suggestions way back). This is particularly noticeable with the Notes API which is really a third party service bolted on to the existing data and user API (btw just to make things complicated notes use a different XML document for on disk storage than what is returned from the API).

In any case if we really want to indulge in 2nd system syndrome, I would suggest separating at least the documents and potentially the APIs for core user data, osm data, notes and for any social media functionality. Starting with 0.6 for everything that exists right now.

@gravitystorm
Collaborator Author

IMHO if you start considering stuff outside of just the core OSM data and API it becomes tricky

Unfortunately that's what I have to do - everything under /api/ is affected by these discussions.

What do you consider as the limits of "core OSM data"? I assume at least nodes, ways, relations, and the output of the map call (e.g. the element). Would you also include the users and changesets output? Both are referenced from a map response but only by id.

carving up the API so that essentially independent parts can actually be run independent of each other

I haven't seen much discussion of that, if any. I know there were discussions in the past about running all of /api/ separately from other non-api stuff, but not about splitting the api itself into multiple projects. But in any case, I've already been told off in this thread for referencing things that happened a while ago, so I won't delve too deeply into historical discussions.

I don't think it would solve the problem at hand though. If we split the API up into multiple software projects, it could still be run transparently at /api/... with an Apache redirects config file. But then we'd still have all the same discussions about which number to put inside which responses.

In any case if we really want to indulge in 2nd system syndrome

Since splitting things up doesn't settle the question of what numbers to put in the responses, I'm going to avoid this "2nd system syndrome" and declare it out of scope for this PR.

@mmd-osm
Contributor

mmd-osm commented Oct 30, 2019

carving up the API so that essentially independent parts can actually be run independent of each other
I haven't seen much discussion of that, if any.

Yes, that concept sounds familiar, see my comment here: #2162 (comment)

Quoting myself: Today, we have so many different API endpoints under the umbrella of the API 0.6 that don't really belong there. Examples could be changeset discussions, gps traces, user data, and probably also map notes. Any incompatible changes here don't really warrant a new OSM API version in a strict sense IMHO, and this further complicates overall API evolution. Maybe that's worth exploring in another issue.

@gravitystorm
Collaborator Author

So it's clear that this PR has got stuck, and that I need to find some way to move this forward. I've been trying to figure out what that would be for the last few weeks.

This PR was (and still is) just a tiny first step towards general support for multiple versions, never mind any decisions as to how the versions will differ from each other when finally deployed. However, it's reasonable that people want to pitch in with their thoughts on the wider concepts around the API versions. Two recurring themes come up:

  • The topic of what the next version number in the url should be (e.g. 0.7 vs 7 and related concepts of semantic versioning for the API)
  • The topic of what version number(s) should be in the responses

The latter is particularly what has got this PR stuck. We can't have a fully informed decision on what number(s) to put into the responses until we see more examples of what's actually going to change. But we can't code any of the specific changes until I have implemented general support for multiple versions. And this is just the first in ~20 PRs that will be needed for the general support, never mind the ~?? PRs which will be about actually changing any API responses.

So my proposal is:

  • To create a new issue discussing what the next version number in the urls should be
  • To create a new issue for continuing the discussion on version numbers in the responses
  • Meanwhile, to open a fresh PR to replace this one, with conflicts resolved
  • To continue to work on the general support for multiple versions
  • To then work on specific changes to the API

I think this will allow us to unblock the general development work while the wider issues are discussed more fully.

@tomhughes
Member

I do have this in my queue for a technical review as I think you requested a while ago, but I kind of held off because it sort of blew up again.

I don't expect to find any major technical issues though - if support for multiple API versions in parallel is necessary then I'm sure this is broadly speaking the correct approach.

You and I disagree on whether support for multiple API versions in parallel is required - actually that's not quite true as I have no problem with the idea in principle but I just don't think there is any realistic way we can achieve it.

Separately, a lot of people think that the object model should be versioned separately from the API version, which you don't feel is necessary, but which I tend to agree with, I think. In fact that ties in very much with my thoughts about multiple API versions: I think there is no problem with multiple API versions where the object model remains the same, but what people really want to do is to enhance the object model, and I am very doubtful that we can support multiple versions of that in parallel.

@gravitystorm
Collaborator Author

enhance the object model, and I am very doubtful that we can support multiple versions of that in parallel

I agree there are changes that will be hard or impossible to support in parallel. But that shouldn't stop us from making all the other changes that can.

@gravitystorm
Collaborator Author

Closing this PR as per #2353 (comment)

@mmd-osm
Contributor

mmd-osm commented Oct 6, 2020

In their recent S2S meeting the OSMF Board discussed how they can support rewriting of the API ("Supporting API: confirm goals and identify next steps - supporting Andy Allen’s rewrite of the API (Paul, Joost, Guillaume)") -> https://wiki.osmfoundation.org/wiki/Board/Minutes/2020-10-S2S

Maybe this is referring to some previous @gravitystorm blog post on supporting multiple versions? Anyone have some insights what this is all about?

@simonpoole
Contributor

... Anyone have some insights what this is all about?

The board doing random stuff without coordination with anybody else? It's the norm, see also a budget without asking the WGs for their requirements and many other things.

@woodpeck
Contributor

woodpeck commented Oct 6, 2020

The board doing random stuff without coordination with anybody else?

Don't get carried away. You know as well as I do that it could very well be a harmless question discussed internally ("can/should we use resources on this") and only after a preliminary nod would the board approach third parties.

@mmd-osm
Contributor

mmd-osm commented Oct 6, 2020

In a first step, I really wanted to get some better understanding what the intended scope of those requirements were, irrespective of effort, feasibility, etc.

Agreed, communication could be a bit more transparent at times (otherwise I wouldn't be asking those questions), but this issue isn't a good place for that discussion.

@gravitystorm
Collaborator Author

In a first step, I really wanted to get some better understanding what the intended scope of those requirements were, irrespective of effort, feasibility, etc.

Just for clarity - I haven't spoken recently with any board members on this topic, so I don't have any insights as to this particular conversation or what was intended (or even what they mean by 'rewrite of the API'). But I am always happy to have more help, either on the narrow topic of the API or the wider topic of everything else covered by this repo.

@grischard
Contributor

Frederik supposes correctly. We're seeing how important Andy's work is and how progress there makes the work of others easier, and we discussed if and how we could support it. There was a confusion between rails port and API, and we ended up chatting mostly about the rails port. We haven't decided to fund any projects, or indeed even had a chat with Andy recently (hi!).

We're currently running the microgrants, three software Grant pilot projects (osm2pgsql, Nominatim, Potlatch) and paying Quincy to work on iD full-time. We intend experience from those to support a reactivated EWG who would be in charge of managing the projects and allocating a budget.

We're very interested in hearing from anyone who would like their projects to be supported by the OSMF, or anyone who would be interested in joining EWG.

@mmd-osm
Contributor

mmd-osm commented Oct 6, 2020

@grischard : thank you for the clarification, the meeting minutes make a whole lot more sense now.
