Example process
Create a representation of a simple machine learning modelling process, i.e.
- Open algorithm implementation (e.g. weka.J48)
- Load dataset (e.g. Iris.arff)
- Build model (e.g. decision tree)
- Store model (e.g. a .model file)
You can do this from the perspective of your favourite machine learning environment, but don't overspecialize. Include enough detail that anyone else can repeat the same process and has all the relevant information. Mention the core concepts that you need. Optionally, also state their relations (useful for later).
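As a starting point, the four steps listed above could be written down in an environment-neutral way roughly as follows. This is only a sketch: the `Step` and `Process` classes and the concept labels ("Implementation", "Dataset", "Model", "Model file") are illustrative assumptions, not terms fixed by this page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    concept: str  # core concept involved (hypothetical label)
    action: str   # what is done in this step
    example: str  # concrete example taken from the list above


@dataclass
class Process:
    name: str
    steps: List[Step] = field(default_factory=list)

    def describe(self) -> str:
        """Return a numbered, human-readable description of the process."""
        lines = [self.name]
        for i, step in enumerate(self.steps, start=1):
            lines.append(f"{i}. {step.action} [{step.concept}] e.g. {step.example}")
        return "\n".join(lines)


process = Process(
    name="Simple modelling process",
    steps=[
        Step("Implementation", "Open algorithm implementation", "weka.J48"),
        Step("Dataset", "Load dataset", "Iris.arff"),
        Step("Model", "Build model", "decision tree"),
        Step("Model file", "Store model", "a .model file"),
    ],
)

print(process.describe())
```

Relations between the concepts (for example, which dataset a model was built from) could later be added as extra fields on `Step`.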
OpenML starts by creating a task (e.g. classification on the Iris dataset):
http://www.openml.org/api_new/v1/task/59 (You need to log in on OpenML.org first.)
It also automatically gets a webpage: http://www.openml.org/t/59
The task contains the dataset, which is described as follows: http://www.openml.org/api_new/v1/data/61
Webpage: http://www.openml.org/d/61
You can then upload any algorithm, like this: http://www.openml.org/api_new/v1/flow/1720
Webpage: http://www.openml.org/f/1720
An example run (J48 on iris):
Webpage: http://www.openml.org/r/501579
XML description: http://openml.org/data/download/1745358/weka_generated_run7769371227014322202.xml
A run can also include the (instance-level) predictions:
http://openml.org/data/download/1745359/weka_generated_predictions5123198195447015004.arff
And the model (serialized and/or human-readable):
http://openml.org/data/download/1745360/WekaSerialized_weka.classifiers.trees.J487848219619191034547.model
http://openml.org/data/download/1745361/WekaModel_weka.classifiers.trees.J481306807653080981411.model
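The task, dataset, flow and run links above follow a regular pattern, which can be captured in a small helper. The sketch below only builds the URLs from the patterns visible in the examples above and does not contact the server. The function names are hypothetical, and no API path for runs is assumed because only the run webpage is shown here.

```python
API_BASE = "http://www.openml.org/api_new/v1"
WEB_BASE = "http://www.openml.org"

# API path segments seen in the examples above (runs are not shown, so omitted).
API_SEGMENTS = {"task": "task", "data": "data", "flow": "flow"}

# Short webpage path segments seen in the examples above.
WEB_SEGMENTS = {"task": "t", "data": "d", "flow": "f", "run": "r"}


def api_url(entity: str, entity_id: int) -> str:
    """Build the XML description URL for a task, dataset or flow."""
    if entity not in API_SEGMENTS:
        raise ValueError(f"no API pattern known for entity {entity!r}")
    return f"{API_BASE}/{API_SEGMENTS[entity]}/{entity_id}"


def web_url(entity: str, entity_id: int) -> str:
    """Build the webpage URL for a task, dataset, flow or run."""
    if entity not in WEB_SEGMENTS:
        raise ValueError(f"no webpage pattern known for entity {entity!r}")
    return f"{WEB_BASE}/{WEB_SEGMENTS[entity]}/{entity_id}"


# The identifiers used in the example above.
print(api_url("task", 59))
print(web_url("data", 61))
print(web_url("run", 501579))
```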
- iris-mex-output.ttl => mex file (metadata)
- iris-weka-output.txt => weka output
- iris.arff => dataset
- j48-iris.model => model file
- Workflow
- Model
- ModelRepresentation
RapidMiner workflow representation in XML
1. Level of specification:
- //Samples/data/Golf -> mls:Data
- //Local Repository/processes/ML-Schema-Example1 -> mls:Workflow
- Retrieve Golf -> mls:Implementation
- Select Attributes -> mls:Implementation
- SVM -> mls:Implementation
- Store -> mls:Implementation
2. Level of execution:
- SVM.model -> mls:Model
- golfModelSerialization -> mls:ModelRepresentation
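The mapping above can also be kept as plain data, which makes it easy to check which workflow elements share a term. This is a minimal sketch: the `MAPPING` list and the `terms_by_level` helper are hypothetical names, and the level keys "specification" and "execution" simply mirror the two numbered levels.

```python
# (level, RapidMiner element, ML-Schema term), copied from the mapping above.
MAPPING = [
    ("specification", "//Samples/data/Golf", "mls:Data"),
    ("specification", "//Local Repository/processes/ML-Schema-Example1", "mls:Workflow"),
    ("specification", "Retrieve Golf", "mls:Implementation"),
    ("specification", "Select Attributes", "mls:Implementation"),
    ("specification", "SVM", "mls:Implementation"),
    ("specification", "Store", "mls:Implementation"),
    ("execution", "SVM.model", "mls:Model"),
    ("execution", "golfModelSerialization", "mls:ModelRepresentation"),
]


def terms_by_level(mapping):
    """Group elements as {level: {term: [elements...]}}, keeping input order."""
    grouped = {}
    for level, element, term in mapping:
        grouped.setdefault(level, {}).setdefault(term, []).append(element)
    return grouped


grouped = terms_by_level(MAPPING)
for level, terms in grouped.items():
    print(f"Level of {level}:")
    for term, elements in terms.items():
        print(f"  {term}: {', '.join(elements)}")
```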
DMOP DM-Experiment Sandbox (Iris)
I plan to update this with a more detailed picture. The current one is just for the sake of a discussion.
Example representation with OntoDM terms