ML-Schema with Neosemantics #26

Isha5 · 2022-01-25T06:09:40Z

Hi everyone,
I find ML-Schema very interesting for improving the interpretability of machine learning projects. I am trying to use this schema for my ML models, which are available on GitLab, and to query some information about these models.

I looked around but could not find any blogs or similar resources on how to achieve this. I would very much appreciate it if someone could point me to any sources.

joaquinvanschoren · 2022-01-25T08:28:44Z

There are no blogs about this as far as I know. I believe the paper gives some guidance: http://www.semantic-web-journal.net/content/ml-schema-interchangeable-format-description-machine-learning-experiments-0

A practical implementation to map models stored on OpenML to ML-Schema can be found here:
https://github.com/ML-Schema/openml-rdf (not a lot of documentation there, though).

That's all I know...

Isha5 · 2022-01-26T08:00:16Z

Thank you @joaquinvanschoren

I read that we could produce .ttl (turtle files) from some apps like this

Out of curiosity: on this GitHub page, you have published WEKA ML model provenance in a Turtle file. Could you please share which tool you used for that?

joaquinvanschoren · 2022-01-26T08:10:44Z

Probably best to ask Tommaso @mommi84

Isha5 · 2022-01-26T08:14:13Z

Hi Tommaso @mommi84, would be great if you could share info on this.

I read that we could produce .ttl (turtle files) from some apps like this

Out of curiosity: on this GitHub page, you have published WEKA logistic regression ML model provenance in a Turtle file. Could you please share which tool you used for that?

Isha5 · 2022-01-26T13:16:55Z

@joaquinvanschoren @mommi84 @agnieszkalawrynowicz this is for my academic project. It would be helpful if you could let me know whether I can map the ML models available in my GitLab to ML-Schema.
This README says that we can load ML-Schema into Protege and edit the schema. Could you point me to any resources on how to edit it?

Thanks in advance professor @joaquinvanschoren and the team :)

diegoesteves · 2022-01-27T11:53:14Z

Hi @Isha5 there is no straightforward way to do so.

This is related to the long-standing trade-off between ML frameworks and plain ML source code. If you use the former, you'll probably get that structured information for free (as it's usually a standard feature in any decent framework). If you want to map ML metadata generated by ML scripts outside such frameworks, there's currently no automatic way of doing so.

In the past I have explored a few different methods (everything open-source):

option 1: create your own ML framework with techniques such as interfaces/annotations/reflection. It adds an extra layer, but the code is cleaner this way. However, at that time the coverage wasn't great (it worked only for toy examples, for a number of reasons; read the paper if you want to know more)
2016, MEX-Interfaces: https://dl.acm.org/doi/10.1145/2993318.2993320

option 2: create a library that implements a logging mechanism to export the metadata directly into a pre-defined format (e.g., MEX, OntoDM, Expose, or whatever you prefer). Works decently, but at the cost of needlessly inflating the source code (from a purely ML point of view).
2017, LOG4MEX: A Library to Export Machine Learning Experiment
http://jens-lehmann.org/files/2017/wi_log4mex.pdf

option 3: create a REST API to receive the ML (input/output) parameters and export the metadata file. Cleaner and my preferred option so far. Still, that requires adapting your source code to communicate with this web interface.
2017, An Interoperable Service for the Provenance of Machine Learning Experiments
https://www.researchgate.net/profile/Diego-Esteves/publication/319051027_An_interoperable_service_for_the_provenance_of_machine_learning_experiments/links/59c17ca3a6fdcc69b92bc467/An-interoperable-service-for-the-provenance-of-machine-learning-experiments.pdf

1,2, and 3: https://github.com/mexplatform

option 4
4.1 - use an ML framework designed for that, if possible (e.g. OpenML https://www.openml.org/). Positive: you solve your problem. Negative: interoperability issues (source-code-wise).

4.2 - explore sequence-to-sequence methods to generate the metadata automatically, without the need to inflate/adapt your source code. This could be a great research topic.

Just keep in mind that there are many new data platforms (including robust ones like Databricks, GCP, Azure, etc.) where storing and exporting ML features and provenance is a very basic feature they already provide. So check whether investing a lot of time in recreating something from scratch makes sense for you.

Best,
Diego.

mommi84 · 2022-01-28T19:25:29Z

> out of curiosity, in this github page, you have published WEKA logistic regression ML model provenance in a turtle file, could you please share which tool did you use for that?

It was 6 years ago so I am just guessing, but I think that the examples were manually created on Protege as a proof of concept. As @diegoesteves pointed out, OpenML does provide a way to publish your Weka experiments (see https://docs.openml.org/Weka/) and export them to JSON, XML or RDF.

However, the RDF export of individual experiments is not complete and does not include all metadata (see for instance https://github.com/ML-Schema/openml-rdf/blob/master/examples/Run/476635.rdf).
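If you want to see for yourself what an export covers, a rough check is to list which ML-Schema classes actually occur in the file. The sketch below does this with the Python standard library only. It rests on several assumptions: that the export is RDF/XML, that it uses the `http://www.w3.org/ns/mls#` namespace, and that the classes in `EXPECTED` are the ones you care about (that list is my own pick, not an official completeness criterion). It also only handles plain striped RDF/XML, not `rdf:parseType` constructs, so a proper RDF parser is the better choice for anything serious.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MLS = "http://www.w3.org/ns/mls#"   # assumed namespace of the export

RDF_TYPE = f"{{{RDF}}}type"
RDF_RESOURCE = f"{{{RDF}}}resource"
MLS_TAG = f"{{{MLS}}}"

# Hypothetical checklist; adjust to the metadata you need.
EXPECTED = ("Run", "Implementation", "HyperParameterSetting",
            "ModelEvaluation", "Dataset")


def _walk(node, found):
    """Visit a node element, then the node elements nested in its properties."""
    if node.tag.startswith(MLS_TAG):
        # Typed node element such as <mls:Run rdf:about="...">
        found.add(node.tag[len(MLS_TAG):])
    for prop in node:
        if prop.tag == RDF_TYPE:
            # Explicit <rdf:type rdf:resource="...mls#Something"/>
            resource = prop.get(RDF_RESOURCE, "")
            if resource.startswith(MLS):
                found.add(resource[len(MLS):])
        for child in prop:
            _walk(child, found)


def mls_classes(rdf_xml):
    """Return the set of ML-Schema class names used in an RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    found = set()
    for node in root:
        _walk(node, found)
    return found


def missing_classes(rdf_xml, expected=EXPECTED):
    """Return the expected class names that do not occur in the document."""
    found = mls_classes(rdf_xml)
    return [name for name in expected if name not in found]
```

Usage would look like `missing_classes(open("476635.rdf", encoding="utf-8").read())` on a downloaded copy of the file; I have not run it against that file, so treat the result as a first hint only.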
