This repository has been archived by the owner on Jan 20, 2025. It is now read-only.

Squashed commit of pipeline creation improvements:
Note: This commit is for experimental purposes.

commit 8004dd2700e3e212e805ec134e0e227b727f2444
Author: Antonis Klironomos <[email protected]>
Date:   Sat Apr 22 12:26:06 2023 +0200

    Refactor ExeKG.add_task() method

commit e435f9cb4ba111541440f0bc85c73091f60f3f34
Author: Antonis Klironomos <[email protected]>
Date:   Sat Apr 22 12:10:39 2023 +0200

    Add support for using different KG schemata in the same pipeline

commit 4079234ec9d906148b8d16182f1bd4e3b736d327
Author: Antonis Klironomos <[email protected]>
Date:   Mon Apr 10 15:55:22 2023 +0200

    Update poetry.lock

commit bf1426642e01b4d22eeb3b54bcd10624d3272a56
Author: Antonis Klironomos <[email protected]>
Date:   Mon Apr 10 15:45:21 2023 +0200

    Update codebase and documentation
AntonisKl committed Apr 28, 2023
1 parent 9a5bdc9 commit f50e155
Showing 13 changed files with 217 additions and 244 deletions.
5 changes: 2 additions & 3 deletions README.md
@@ -1,5 +1,4 @@
-# ExeKGLib
 # ExeKGLib

 ![PyPI](https://img.shields.io/pypi/v/exe-kg-lib)
 ![Python](https://img.shields.io/badge/python-v3.8+-blue.svg)
@@ -29,7 +28,7 @@ _Klironomos A., Zhou B., Tan Z., Zheng Z., Gad-Elrab M., Paulheim H., Kharlamov

 Detailed information (installation, documentation etc.) about **ExeKGLib** can be found in [its website](https://boschresearch.github.io/ExeKGLib/) and basic information is shown below.

-To download, run `pip install exe-kg-lib`.
+## Installation

 [//]: # (--8<-- [start:installation])
 To install, run `pip install exe-kg-lib`.
@@ -43,7 +42,7 @@ For detailed installation instructions, refer to the [installation page](https:/
 <details>
 <summary>Click to expand</summary>

-[//]: # (--8<-- [start:supportedmethods])
+<!-- --8<-- [start:supportedmethods] -->
 | KG schema (abbreviation) | Task | Method | Properties | Input (data structure) | Output (data structure) | Implemented by Python class |
 | ------------------------ | ------------------------- | ---------------------------- | --------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------------- |
 | Machine Learning (ML) | Train | KNNTrain | \- | DataInTrainX (Matrix or Vector)<br>DataInTrainY (Matrix or Vector) | DataOutPredictedValueTrain (Matrix or Vector)<br>DataOutTrainModel (SingleValue) | TrainKNNTrain |
4 changes: 2 additions & 2 deletions docs/installation.md
@@ -51,7 +51,7 @@ before ranting! :pray:
 conda deactivate
 ```

-### Step 1: Dependency Installation
+## Step 1: Dependency Installation

 The installation of the project's dependencies should be piece of :cake: in most cases by running

@@ -75,7 +75,7 @@ from within the project directory.
 | _"I get a `ConnectionError`"_ | Maybe you have proxy issues. |
 | _"I destroyed my poetry environment"_ | Delete the `.venv` folder and create a new env. |

-### Step 2: Pre-commit Git Hooks Installation
+## Step 2: Pre-commit Git Hooks Installation

 To ensure compatibility of each future commit with the project's conventions (e.g. code format), some predefined git hooks should be installed by running the following commands.

61 changes: 33 additions & 28 deletions examples/ml_pipeline_creation.py
@@ -4,7 +4,7 @@
 from exe_kg_lib import ExeKG

 if __name__ == "__main__":
-    exe_kg = ExeKG(kg_schema_name="Machine Learning")
+    exe_kg = ExeKG()
     feature_columns = ["feature_1", "feature_2", "feature_3", "feature_4", "feature_5"]
     label_column = "label"

@@ -33,20 +33,22 @@
     )

     concatenate_task = exe_kg.add_task(
-        task_type="Concatenation",
+        kg_schema_short="ml",
+        task="Concatenation",
         input_data_entity_dict={"DataInConcatenation": feature_data_entities},
-        method_type="ConcatenationMethod",
-        data_properties={},
+        method="ConcatenationMethod",
+        properties_dict={},
     )

     data_splitting_task = exe_kg.add_task(
-        task_type="DataSplitting",
+        kg_schema_short="ml",
+        task="DataSplitting",
         input_data_entity_dict={
             "DataInDataSplittingX": [concatenate_task.output_dict["DataOutConcatenatedData"]],
             "DataInDataSplittingY": [label_data_entity],
         },
-        method_type="DataSplittingMethod",
-        data_properties={"hasSplitRatio": 0.8},
+        method="DataSplittingMethod",
+        properties_dict={"hasSplitRatio": 0.8},
     )

     train_x = data_splitting_task.output_dict["DataOutSplittedTrainDataX"]
@@ -55,80 +57,83 @@
     test_real_y = data_splitting_task.output_dict["DataOutSplittedTestDataY"]

     knn_train_task = exe_kg.add_task(
-        task_type="Train",
+        kg_schema_short="ml",
+        task="Train",
         input_data_entity_dict={
             "DataInTrainX": [train_x],
             "DataInTrainY": [train_real_y],
         },
-        method_type="KNNTrain",
-        data_properties={},
+        method="KNNTrain",
+        properties_dict={},
     )
     model = knn_train_task.output_dict["DataOutTrainModel"]
     train_predicted_y = knn_train_task.output_dict["DataOutPredictedValueTrain"]

     knn_test_task = exe_kg.add_task(
-        task_type="Test",
+        kg_schema_short="ml",
+        task="Test",
         input_data_entity_dict={
             "DataInTestModel": [model],
             "DataInTestX": [test_x],
         },
-        method_type="KNNTest",
-        data_properties={},
+        method="KNNTest",
+        properties_dict={},
     )
     test_predicted_y = knn_test_task.output_dict["DataOutPredictedValueTest"]

     performance_calc_task = exe_kg.add_task(
-        task_type="PerformanceCalculation",
+        kg_schema_short="ml",
+        task="PerformanceCalculation",
         input_data_entity_dict={
             "DataInTrainRealY": [train_real_y],
             "DataInTrainPredictedY": [train_predicted_y],
             "DataInTestRealY": [test_real_y],
             "DataInTestPredictedY": [test_predicted_y],
         },
-        method_type="PerformanceCalculationMethod",
-        data_properties={},
+        method="PerformanceCalculationMethod",
+        properties_dict={},
     )
     train_error = performance_calc_task.output_dict["DataOutMLTrainErr"]
     test_error = performance_calc_task.output_dict["DataOutMLTestErr"]

     canvas_task = exe_kg.add_task(
-        task_type="CanvasTask",
+        kg_schema_short="visu",
+        task="CanvasTask",
         input_data_entity_dict={},
-        method_type="CanvasMethod",
-        data_properties={"hasCanvasName": "MyCanvas", "hasLayout": "1 1"},
-        visualization=True,
+        method="CanvasMethod",
+        properties_dict={"hasCanvasName": "MyCanvas", "hasLayout": "1 1"},
     )

     train_error_lineplot_task = exe_kg.add_task(
-        task_type="PlotTask",
+        kg_schema_short="visu",
+        task="PlotTask",
         input_data_entity_dict={
             "DataInVector": [train_error],
         },
-        method_type="ScatterplotMethod",
-        data_properties={
+        method="ScatterplotMethod",
+        properties_dict={
             "hasLegendName": "Train error",
             "hasLineStyle": "o",
             "hasScatterStyle": "o",
             "hasLineWidth": 1,
             "hasScatterSize": 1,
         },
-        visualization=True,
     )

     test_error_lineplot_task = exe_kg.add_task(
-        task_type="PlotTask",
+        kg_schema_short="visu",
+        task="PlotTask",
         input_data_entity_dict={
             "DataInVector": [test_error],
         },
-        method_type="ScatterplotMethod",
-        data_properties={
+        method="ScatterplotMethod",
+        properties_dict={
             "hasLegendName": "Test error",
             "hasLineStyle": "o",
             "hasScatterStyle": "o",
             "hasLineWidth": 1,
             "hasScatterSize": 1,
         },
-        visualization=True,
     )

     exe_kg.save_created_kg(f"./pipelines/{pipeline_name}.ttl")
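
Note on the `add_task()` changes in `examples/ml_pipeline_creation.py`: every call follows the same renaming pattern. `task_type` becomes `task`, `method_type` becomes `method`, `data_properties` becomes `properties_dict`, and the `visualization=True` flag is replaced by an explicit `kg_schema_short`. The sketch below only illustrates that mapping. The helper `migrate_add_task_kwargs` is hypothetical and not part of ExeKGLib, and it assumes that calls without the old flag belong to the `"ml"` schema, which holds for this example but may not hold for other pipelines.

```python
from typing import Any, Dict

# Keyword renames visible in this diff (old name -> new name).
RENAMED_KWARGS = {
    "task_type": "task",
    "method_type": "method",
    "data_properties": "properties_dict",
}


def migrate_add_task_kwargs(old_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Translate pre-commit add_task() keyword arguments to the new names.

    Hypothetical helper for illustration only; not part of ExeKGLib.
    """
    # Assumption: the old boolean flag maps to the "visu" schema,
    # and every other task in this example maps to "ml".
    is_visualization = old_kwargs.get("visualization", False)
    new_kwargs: Dict[str, Any] = {
        "kg_schema_short": "visu" if is_visualization else "ml",
    }
    for key, value in old_kwargs.items():
        if key == "visualization":
            continue  # replaced by kg_schema_short above
        new_kwargs[RENAMED_KWARGS.get(key, key)] = value
    return new_kwargs


old_canvas_call = {
    "task_type": "CanvasTask",
    "input_data_entity_dict": {},
    "method_type": "CanvasMethod",
    "data_properties": {"hasCanvasName": "MyCanvas", "hasLayout": "1 1"},
    "visualization": True,
}
new_canvas_call = migrate_add_task_kwargs(old_canvas_call)
print(new_canvas_call)
```

Applied to the old `canvas_task` arguments, the result matches the keyword arguments on the added lines of the diff.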
130 changes: 65 additions & 65 deletions examples/pipelines/MLPipeline.ttl
@@ -3,17 +3,39 @@
 @prefix visu: <https://raw.githubusercontent.com/nsai-uio/ExeKGOntology/main/visu_exeKGOntology.ttl#> .
 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

-ml:MLPipeline a ds:Pipeline ;
+ds:MLPipeline a ds:Pipeline ;
     ds:hasInputDataPath "./examples/data/dummy_data.csv"^^xsd:string ;
     ds:hasStartTask ml:Concatenation1 .

-ml:CanvasMethod1 a visu:CanvasMethod .
+ds:feature_1 a ds:DataEntity ;
+    ds:hasDataSemantics ds:TimeSeries ;
+    ds:hasDataStructure ds:Vector ;
+    ds:hasSource "feature_1"^^xsd:string .

-ml:CanvasTask1 a visu:CanvasTask ;
-    ds:hasNextTask ml:PlotTask1 ;
-    visu:hasCanvasMethod ml:CanvasMethod1 ;
-    visu:hasCanvasName "MyCanvas"^^xsd:string ;
-    visu:hasLayout "1 1"^^ds:intPair .
+ds:feature_2 a ds:DataEntity ;
+    ds:hasDataSemantics ds:TimeSeries ;
+    ds:hasDataStructure ds:Vector ;
+    ds:hasSource "feature_2"^^xsd:string .
+
+ds:feature_3 a ds:DataEntity ;
+    ds:hasDataSemantics ds:TimeSeries ;
+    ds:hasDataStructure ds:Vector ;
+    ds:hasSource "feature_3"^^xsd:string .
+
+ds:feature_4 a ds:DataEntity ;
+    ds:hasDataSemantics ds:TimeSeries ;
+    ds:hasDataStructure ds:Vector ;
+    ds:hasSource "feature_4"^^xsd:string .
+
+ds:feature_5 a ds:DataEntity ;
+    ds:hasDataSemantics ds:TimeSeries ;
+    ds:hasDataStructure ds:Vector ;
+    ds:hasSource "feature_5"^^xsd:string .
+
+ds:label a ds:DataEntity ;
+    ds:hasDataSemantics ds:TimeSeries ;
+    ds:hasDataStructure ds:Vector ;
+    ds:hasSource "label"^^xsd:string .

 ml:Concatenation1 a ml:Concatenation ;
     ds:hasInput ml:DataInConcatenation1_1,
@@ -28,25 +50,25 @@ ml:Concatenation1 a ml:Concatenation ;
 ml:ConcatenationMethod1 a ml:ConcatenationMethod .

 ml:DataInConcatenation1_1 a ml:DataInConcatenation ;
-    ds:hasReference ml:feature_1 .
+    ds:hasReference ds:feature_1 .

 ml:DataInConcatenation1_2 a ml:DataInConcatenation ;
-    ds:hasReference ml:feature_2 .
+    ds:hasReference ds:feature_2 .

 ml:DataInConcatenation1_3 a ml:DataInConcatenation ;
-    ds:hasReference ml:feature_3 .
+    ds:hasReference ds:feature_3 .

 ml:DataInConcatenation1_4 a ml:DataInConcatenation ;
-    ds:hasReference ml:feature_4 .
+    ds:hasReference ds:feature_4 .

 ml:DataInConcatenation1_5 a ml:DataInConcatenation ;
-    ds:hasReference ml:feature_5 .
+    ds:hasReference ds:feature_5 .

 ml:DataInDataSplittingX1_1 a ml:DataInDataSplittingX ;
     ds:hasReference ml:DataOutConcatenatedData1 .

 ml:DataInDataSplittingY1_1 a ml:DataInDataSplittingY ;
-    ds:hasReference ml:label .
+    ds:hasReference ds:label .

 ml:DataInTestModel1_1 a ml:DataInTestModel ;
     ds:hasReference ml:DataOutTrainModel1 .
@@ -94,36 +116,13 @@ ml:PerformanceCalculation1 a ml:PerformanceCalculation ;
         ml:DataInTestRealY1_1,
         ml:DataInTrainPredictedY1_1,
         ml:DataInTrainRealY1_1 ;
-    ds:hasNextTask ml:CanvasTask1 ;
+    ds:hasNextTask visu:CanvasTask1 ;
     ds:hasOutput ml:DataOutMLTestErr1,
         ml:DataOutMLTrainErr1 ;
     ml:hasPerformanceCalculationMethod ml:PerformanceCalculationMethod1 .

 ml:PerformanceCalculationMethod1 a ml:PerformanceCalculationMethod .

-ml:PlotTask1 a visu:PlotTask ;
-    ds:hasInput visu:DataInVector1_1 ;
-    ds:hasNextTask ml:PlotTask2 ;
-    visu:hasLegendName "Train error"^^xsd:string ;
-    visu:hasLineStyle "o"^^xsd:string ;
-    visu:hasLineWidth "1"^^xsd:int ;
-    visu:hasPlotMethod ml:ScatterplotMethod1 ;
-    visu:hasScatterSize "1"^^xsd:int ;
-    visu:hasScatterStyle "o"^^xsd:string .
-
-ml:PlotTask2 a visu:PlotTask ;
-    ds:hasInput visu:DataInVector2_1 ;
-    visu:hasLegendName "Test error"^^xsd:string ;
-    visu:hasLineStyle "o"^^xsd:string ;
-    visu:hasLineWidth "1"^^xsd:int ;
-    visu:hasPlotMethod ml:ScatterplotMethod2 ;
-    visu:hasScatterSize "1"^^xsd:int ;
-    visu:hasScatterStyle "o"^^xsd:string .
-
-ml:ScatterplotMethod1 a visu:ScatterplotMethod .
-
-ml:ScatterplotMethod2 a visu:ScatterplotMethod .
-
 ml:Test1 a ml:Test ;
     ds:hasInput ml:DataInTestModel1_1,
         ml:DataInTestX1_1 ;
@@ -139,42 +138,43 @@ ml:Train1 a ml:Train ;
         ml:DataOutTrainModel1 ;
     ml:hasTrainMethod ml:KNNTrain1 .

-ml:feature_1 a ds:DataEntity ;
-    ds:hasDataSemantics ds:TimeSeries ;
-    ds:hasDataStructure ds:Vector ;
-    ds:hasSource "feature_1"^^xsd:string .
-
-ml:feature_2 a ds:DataEntity ;
-    ds:hasDataSemantics ds:TimeSeries ;
-    ds:hasDataStructure ds:Vector ;
-    ds:hasSource "feature_2"^^xsd:string .
-
-ml:feature_3 a ds:DataEntity ;
-    ds:hasDataSemantics ds:TimeSeries ;
-    ds:hasDataStructure ds:Vector ;
-    ds:hasSource "feature_3"^^xsd:string .
-
-ml:feature_4 a ds:DataEntity ;
-    ds:hasDataSemantics ds:TimeSeries ;
-    ds:hasDataStructure ds:Vector ;
-    ds:hasSource "feature_4"^^xsd:string .
+visu:CanvasMethod1 a visu:CanvasMethod .

-ml:feature_5 a ds:DataEntity ;
-    ds:hasDataSemantics ds:TimeSeries ;
-    ds:hasDataStructure ds:Vector ;
-    ds:hasSource "feature_5"^^xsd:string .
-
-ml:label a ds:DataEntity ;
-    ds:hasDataSemantics ds:TimeSeries ;
-    ds:hasDataStructure ds:Vector ;
-    ds:hasSource "label"^^xsd:string .
+visu:CanvasTask1 a visu:CanvasTask ;
+    ds:hasNextTask visu:PlotTask1 ;
+    visu:hasCanvasMethod visu:CanvasMethod1 ;
+    visu:hasCanvasName "MyCanvas"^^xsd:string ;
+    visu:hasLayout "1 1"^^ds:intPair .

 visu:DataInVector1_1 a visu:DataInVector ;
     ds:hasReference ml:DataOutMLTrainErr1 .

 visu:DataInVector2_1 a visu:DataInVector ;
     ds:hasReference ml:DataOutMLTestErr1 .

+visu:PlotTask1 a visu:PlotTask ;
+    ds:hasInput visu:DataInVector1_1 ;
+    ds:hasNextTask visu:PlotTask2 ;
+    visu:hasLegendName "Train error"^^xsd:string ;
+    visu:hasLineStyle "o"^^xsd:string ;
+    visu:hasLineWidth "1"^^xsd:int ;
+    visu:hasPlotMethod visu:ScatterplotMethod1 ;
+    visu:hasScatterSize "1"^^xsd:int ;
+    visu:hasScatterStyle "o"^^xsd:string .
+
+visu:PlotTask2 a visu:PlotTask ;
+    ds:hasInput visu:DataInVector2_1 ;
+    visu:hasLegendName "Test error"^^xsd:string ;
+    visu:hasLineStyle "o"^^xsd:string ;
+    visu:hasLineWidth "1"^^xsd:int ;
+    visu:hasPlotMethod visu:ScatterplotMethod2 ;
+    visu:hasScatterSize "1"^^xsd:int ;
+    visu:hasScatterStyle "o"^^xsd:string .
+
+visu:ScatterplotMethod1 a visu:ScatterplotMethod .
+
+visu:ScatterplotMethod2 a visu:ScatterplotMethod .
+
 ml:DataOutConcatenatedData1 a ds:DataEntity .

 ml:DataOutMLTestErr1 a ds:DataEntity .
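
Note on `examples/pipelines/MLPipeline.ttl`: after this commit, the `ds:hasNextTask` links cross schema namespaces, so the `ml:` performance-calculation task hands over to tasks under `visu:`. The sketch below walks such a chain in plain Python. It is only an illustration: the links are hard-coded from the hunks visible in this diff (links inside collapsed hunks are not included), and `task_chain` is a hypothetical helper, not an ExeKGLib function.

```python
from typing import Dict, List

# ds:hasNextTask links visible in this diff (subject -> object).
# Links that sit in collapsed hunks are deliberately left out.
NEXT_TASK: Dict[str, str] = {
    "ml:PerformanceCalculation1": "visu:CanvasTask1",
    "visu:CanvasTask1": "visu:PlotTask1",
    "visu:PlotTask1": "visu:PlotTask2",
}


def task_chain(start: str, next_task: Dict[str, str]) -> List[str]:
    """Follow hasNextTask links from `start` until a task has no successor."""
    chain = [start]
    seen = {start}
    while chain[-1] in next_task:
        successor = next_task[chain[-1]]
        if successor in seen:
            # Guard against malformed graphs that loop back on themselves.
            raise ValueError(f"cycle detected at {successor}")
        chain.append(successor)
        seen.add(successor)
    return chain


chain = task_chain("ml:PerformanceCalculation1", NEXT_TASK)
print(" -> ".join(chain))
# prints: ml:PerformanceCalculation1 -> visu:CanvasTask1 -> visu:PlotTask1 -> visu:PlotTask2
```

The prefix of each entry shows which schema a task belongs to, which is the information the removed `visualization=True` flag used to carry.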