From 535c54bdc182ec50433c033b436ec0bf14957c93 Mon Sep 17 00:00:00 2001
From: Jake Low
Date: Tue, 20 Aug 2024 16:35:53 -0700
Subject: [PATCH 1/2] Refine high-level overview in README.md

---
 README.md | 30 +++++++++++++-----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/README.md b/README.md
index bb2c1f3..ca87ba1 100644
--- a/README.md
+++ b/README.md
@@ -1,22 +1,26 @@
 # osm-adiff-service
 
-Read the minutely replication files published by OpenStreetMap planet, and query changesets on Overpass to create full representations of changesets. It also posts the tag changes summary to the OSMCha API.
+This service reads the minutely replication files published by OpenStreetMap, and builds JSON documents which describe each changeset in detail (including information which is not included in the replication file). It publishes these JSON files to S3, and also POSTs a summary of tag changes to the OSMCha API.
 
-# Real OpenStreetMap Changesets
-
-When a changeset is pushed to OSM, this stack builds a representation of the exact change that happened:
+Each changeset JSON contains complete information about the changeset:
 
 * Changeset metadata - username, id, timestamp, comment etc.
 * Elements - each feature that was added, modified, or deleted in the changeset.
 * For each element, the current and previous version including geometry and metadata.
 
-#### Details
+## What is this, and why?
+
+OSMCha's purpose is to let users view a changeset in its entirety, including metadata about the changeset and the "before" and "after" versions of every OSM element that was changed.
+
+The OSM API publishes minutely [replication files](https://wiki.openstreetmap.org/wiki/Planet.osm/diffs) in [`.osc` format](https://wiki.openstreetmap.org/wiki/OsmChange) that contain some information about each edit that is made to OSM, but these files are optimized for small size and don't contain all of the details required by OSMCha. Specifically:
+
+- they do not include old ("before") versions of elements that changed
+- they don't include way geometries at all unless the geometry itself was edited (not just the tags)
+- they don't include bounding boxes
 
-* New changesets are pushed to `https://s3.amazonaws.com/mapbox/real-changesets/production/.json`
-* Augmented Diffs are pushed by Ovepass are pushed to `https://s3-ap-northeast-1.amazonaws.com/overpass-db-ap-northeast-1/augmented-diffs/`.
-* The latest state id is published here `https://s3-ap-northeast-1.amazonaws.com/overpass-db-ap-northeast-1/augmented-diffs/latest`
+A richer diff format called [augmented diff](https://wiki.openstreetmap.org/wiki/Overpass_API/Augmented_Diffs) addresses these limitations. [Overpass](https://wiki.openstreetmap.org/wiki/Overpass_API) is capable of producing this type of diff. The `osm-adiff-service` can be used to process a replication file from the OSM API, retrieve additional data about each change by getting an augmented diff from Overpass, and republish the resulting info as JSON. These JSON artifacts are then served by the [OSMCha backend](https://github.com/OSMCha/osmcha-django) and rendered in the browser using [changeset-map](https://github.com/osmlab/changeset-map).
 
-#### Example
+#### Example JSON changeset output
 
 ```json
 // 20170309131154
@@ -102,14 +106,6 @@ When a changeset is pushed to OSM, this stack builds a representation of the exa
 }
 ```
 
-## What is this, and why?
-
-A lot of processes around inspecting and searching for potentially bad edits on OpenStreetMap depend on being able to view a "changeset" in its entirety. This helps in gauging the context of an edit, see similar edits by the same user, and see edits in their "finished" state (i.e. not in between a changeset).
-
-Our primary tool for visualizing changesets has been [changeset-map](http://osmlab.github.io/changeset-map/). We depend on [augmented diffs](http://wiki.openstreetmap.org/wiki/Overpass_API/Augmented_Diffs) generated by Overpass to generate these changeset representations and visualizations.
-
-Augmented Diffs contains complete representations of changes in OSM for every minute. One can also query for a custom time range, and filter by bounding box or other attributes. These queries can be extremely slow, especially for large changesets, and were a major bottleneck in scaling up changeset reviewing processes.
-
 ### How to run
 
 #### JS library

From 9f6fa514c4961c1e9ecc5348e6ed3dbce4fbbb14 Mon Sep 17 00:00:00 2001
From: Wille Marcel
Date: Wed, 21 Aug 2024 10:03:50 -0300
Subject: [PATCH 2/2] Improve text and add some links

---
 README.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index ca87ba1..645044c 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # osm-adiff-service
 
-This service reads the minutely replication files published by OpenStreetMap, and builds JSON documents which describe each changeset in detail (including information which is not included in the replication file). It publishes these JSON files to S3, and also POSTs a summary of tag changes to the OSMCha API.
+This service reads the minutely replication files published by OpenStreetMap, and builds JSON documents which describe each changeset in detail (including information which is not included in the replication file). It publishes these JSON files to S3, and also POSTs a summary of tag changes to the [OSMCha API](https://osmcha.org/api-docs/).
 
 Each changeset JSON contains complete information about the changeset:
 
@@ -10,7 +10,7 @@ Each changeset JSON contains complete information about the changeset:
 
 ## What is this, and why?
 
-OSMCha's purpose is to let users view a changeset in its entirety, including metadata about the changeset and the "before" and "after" versions of every OSM element that was changed.
+[OSMCha](https://osmcha.org)'s purpose is to let users view a changeset in its entirety, including metadata about the changeset and the "before" and "after" versions of every OSM element that was changed.
 
 The OSM API publishes minutely [replication files](https://wiki.openstreetmap.org/wiki/Planet.osm/diffs) in [`.osc` format](https://wiki.openstreetmap.org/wiki/OsmChange) that contain some information about each edit that is made to OSM, but these files are optimized for small size and don't contain all of the details required by OSMCha. Specifically:
 
@@ -18,7 +18,9 @@ The OSM API publishes minutely [replication files](https://wiki.openstreetmap.or
 - they don't include way geometries at all unless the geometry itself was edited (not just the tags)
 - they don't include bounding boxes
 
-A richer diff format called [augmented diff](https://wiki.openstreetmap.org/wiki/Overpass_API/Augmented_Diffs) addresses these limitations. [Overpass](https://wiki.openstreetmap.org/wiki/Overpass_API) is capable of producing this type of diff. The `osm-adiff-service` can be used to process a replication file from the OSM API, retrieve additional data about each change by getting an augmented diff from Overpass, and republish the resulting info as JSON. These JSON artifacts are then served by the [OSMCha backend](https://github.com/OSMCha/osmcha-django) and rendered in the browser using [changeset-map](https://github.com/osmlab/changeset-map).
+A richer diff format called [augmented diff](https://wiki.openstreetmap.org/wiki/Overpass_API/Augmented_Diffs) addresses these limitations. [Overpass](https://wiki.openstreetmap.org/wiki/Overpass_API) is capable of producing this type of diff. The `osm-adiff-service` can be used to process a replication file from the OSM API, retrieve additional data about each change by getting an augmented diff from Overpass, and republish the resulting info as JSON.
+
+These JSON artifacts are named as [real-changesets](https://github.com/osmus/osmcha-charter-project/blob/main/real-changesets-docs.md), and currently the OSMCha's data pipeline is publishing the files in an [AWS Open Data S3 Bucket](https://registry.opendata.aws/real-changesets/). The `real-changesets` are used by OSMCha to provide the visualization of changesets to users. The component used to render it on the browser is the [changeset-map](https://github.com/osmlab/changeset-map).
 
 #### Example JSON changeset output
 