
some suggestions for lessons learned #1

Open
pvgenuchten opened this issue Mar 30, 2016 · 7 comments

@pvgenuchten
  • Registries, and especially base registries, need proper URI management. Records should be made available at a unique, persistent URI, so that other registries can use those URIs to create links between the registries.
  • Be aware that you don't just publish data on the web; by doing so you create data on the web. Others will start to link to your data, adding value to it.
  • Each data community (geospatial, linked data, governmental data, search engines, web/app developers, statistics) has its own data-sharing conventions. To serve each of them properly, separate endpoints for the same data are optimal from a consumption perspective. By doing so, however, you potentially create multiple identifiers for the same data object.
  • Content negotiation works well for serving multiple encodings from a single endpoint, but it becomes a challenge when you want to serve the data transformed into multiple schemas/ontologies. Linked data suggests annotating data with multiple ontologies within a single document, but this is barely accepted in other communities (search engines, ISO 19139, etc.).
  • The fact that search engines are a black box makes them unpredictable and labor-intensive to use as part of a study. It would be better to create a new search engine to demonstrate what search engines will be able to achieve in the future.
  • CSW, WFS, GML, ISO 19139, and INSPIRE conventions are quite fit to serve as a base for a linked-data-proxy approach; most of the challenges lie in how the standards are currently (often poorly) implemented.
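The content-negotiation point above can be sketched with a small helper that picks a representation from an HTTP Accept header. This is a minimal illustration, not a full implementation: the supported media types are assumptions standing in for a registry's real encodings, and a production negotiator would also weigh q-values per RFC 7231.

```python
# Minimal sketch of server-side content negotiation for a registry record.
# SUPPORTED is a hypothetical list of encodings; a real registry would plug
# in its own serializers (HTML, JSON-LD, GML, ...).
SUPPORTED = ["text/html", "application/ld+json", "application/gml+xml"]

def negotiate(accept_header, supported=SUPPORTED):
    """Return the first supported media type the client accepts,
    falling back to the first supported type (here: HTML)."""
    for part in accept_header.split(","):
        media = part.split(";")[0].strip()  # drop quality parameters
        if media in supported:
            return media
        if media in ("*/*", ""):
            return supported[0]
    return supported[0]

print(negotiate("application/ld+json, text/html;q=0.9"))  # application/ld+json
print(negotiate("*/*"))                                   # text/html
```

Note that even with negotiation in place, each representation still ends up reachable at its own URL in practice, which is exactly the multiple-identifiers issue the bullet above describes.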
@adbgnm
Contributor

adbgnm commented Mar 30, 2016

Thanks for this! I'll take them on board…

Regards,
Arnoud


@ndkv

ndkv commented Apr 22, 2016

... most of the challenges are in how the standards are currently (poorly) implemented

@pvgenuchten What recommendations do you have for data owners? What can they do today to improve the situation?

Context: I just installed the ldproxy Docker container and added RIVM's server. You won't be surprised to hear that the majority of feature types can't be parsed. I get empty pages like the one below.

[screenshot: empty ldproxy feature page, 2016-04-22]

This clearly isn't ldproxy's fault. The question is: how do we fix this? It might be as simple as pointing data owners to http://validatie.geostandaarden.nl/etf-webapp/testobjects and urging them to fix the reported errors. In any case, it should be mentioned explicitly in reports and 'lessons learned'.

@azahnen

azahnen commented Apr 22, 2016

@ndkv In addition to pointing data owners to ETF or CITE tests, it would be good if ldproxy tests the WFS functionality it depends on. That way the user could be notified of problems and ldproxy could behave accordingly instead of showing broken pages like in your example. I added an issue for this, see ldproxy/ldproxy#37.
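The pre-flight check described here could look roughly like the sketch below: before rendering pages, parse the capabilities document and verify it advertises the operations a client depends on. The inline XML is a trimmed, hypothetical capabilities fragment; a real client would fetch it via a GetCapabilities request.

```python
# Sketch: check a WFS 2.0 capabilities document for required operations
# so a client can warn the user instead of rendering broken pages.
import xml.etree.ElementTree as ET

OWS = "{http://www.opengis.net/ows/1.1}"

CAPS = """<Capabilities xmlns:ows="http://www.opengis.net/ows/1.1">
  <ows:OperationsMetadata>
    <ows:Operation name="GetCapabilities"/>
    <ows:Operation name="GetFeature"/>
  </ows:OperationsMetadata>
</Capabilities>"""

def missing_operations(caps_xml, required=("GetFeature", "DescribeFeatureType")):
    """Return the required operations not advertised in the capabilities."""
    root = ET.fromstring(caps_xml)
    advertised = {op.get("name") for op in root.iter(f"{OWS}Operation")}
    return [op for op in required if op not in advertised]

print(missing_operations(CAPS))  # ['DescribeFeatureType']
```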

@pvgenuchten
Author

Hi Simeon, did you have any JS errors? I will notify the developers about that WFS endpoint so they can do a bit more research.

In recent tests I've found quite a lot of errors at different levels: some related to metadata fields not being filled in properly, some related to app-schema not being properly configured, and some related to bugs in the WFS server itself. An example:

http://geodata.nationaalgeoregister.nl/omgevingswarmte/wfs?NAMESPACES=xmlns%28omgevingswarmte%2Chttp%3A%2F%2Fomgevingswarmte.geonovum.nl%29&STARTINDEX=0&COUNT=25&VERSION=2.0.0&TYPENAMES=omgevingswarmte%3Akoudeopenwkogem&OUTPUTFORMAT=application%2Fgml%2Bxml%3B+version%3D3.2&SERVICE=WFS&REQUEST=GetFeature
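A common failure mode behind requests like the one above is a WFS returning an OGC ExceptionReport with HTTP 200, which naive clients then try to parse as GML. A small classifier, sketched here against hypothetical sample bodies, makes that case easy to surface:

```python
# Sketch: classify a WFS GetFeature response body as an exception report,
# a (presumed) feature document, or unparseable XML.
import xml.etree.ElementTree as ET

def classify_wfs_response(body):
    """Return 'exception', 'features', or 'unparseable' for a WFS body."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return "unparseable"
    local_name = root.tag.split("}")[-1]  # strip the namespace URI
    return "exception" if local_name == "ExceptionReport" else "features"

print(classify_wfs_response(
    '<ows:ExceptionReport xmlns:ows="http://www.opengis.net/ows/1.1"/>'))
# exception
print(classify_wfs_response("<wfs:FeatureCollection"))
# unparseable (truncated body)
```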

It's quite interesting to see that these types of errors become more obvious when using an approach like ldproxy.

But it's hard to give recommendations for data publishers. Testing your setup using CITE and the Esdin Test Framework could be a good start (and make sure to test it every now and then; configurations tend to break for all sorts of reasons).


@cportele
Member

@ndkv @pvgenuchten

Regarding WFS there are a few comments and recommendations in our report: http://geo4web-testbed.github.io/topic4/#h.be0rojfbpalg

@ndkv

ndkv commented Apr 22, 2016

@azahnen Agree!

@pvgenuchten Yes, I get a this._northEast is undefined error, but that's because the returned GeoJSON is empty. As you point out, this is probably caused by faulty WFS/feature-type configurations and whatnot. It differs per endpoint + feature type, though. For example, inspi3:hospitals_2013 in http://inspire.rivm.nl/geoserver/wfs works fine. However, the layer after it (inspi3:drinking_water_2012) returns broken GeoJSON

{
    "type" : "FeatureCollection",
    "features" : [ {
    "type" : "Feature",
    "id" : "drinking_water_2012.1"

but working HTML, so I get an empty map while the features + attributes on the left are visible.
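Truncated responses like the one above fail ordinary JSON parsing, so broken feature output is cheap to detect before handing it to a map client. A minimal sketch (the sample strings below are hypothetical, shaped like the responses discussed here):

```python
# Sketch: detect truncated or malformed GeoJSON before rendering it.
import json

def is_valid_geojson(text):
    """True if text parses as JSON and looks like a GeoJSON document."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(doc, dict) and doc.get("type") in ("FeatureCollection", "Feature")

truncated = '{ "type": "FeatureCollection", "features": [ { "type": "Feature"'
complete = '{ "type": "FeatureCollection", "features": [] }'
print(is_valid_geojson(truncated))  # False
print(is_valid_geojson(complete))   # True
```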

It's quite interesting to see that these types of errors become more obvious when using an approach like ldproxy

Yes! I guess we're used to scripting/ETL-ing ourselves out of broken endpoints. Also, QGIS is apparently fairly fault-tolerant (or simply doesn't care about correctness), and stuff just works there.

@cportele Those are very insightful. Is @beheerPDOK close by? 😇 /cc @adbgnm

@cportele
Member

@ndkv Some are, including the services that we have used, but some of the services we tested had a few of those issues.
