Skip to content

Commit

Permalink
Added jupyter notebook examples
Browse files Browse the repository at this point in the history
* Creating a collection
* Selecting data based on tags and changing the datafile
* Creating a custom feature and adding it to the datafile
  • Loading branch information
MartimChaves authored Aug 12, 2022
1 parent 8688d6a commit 30564cb
Show file tree
Hide file tree
Showing 14 changed files with 1,338 additions and 0 deletions.
342 changes: 342 additions & 0 deletions examples/jupyter_nb/changing_datafile_with_tags_example.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,342 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Changing datafile using Tags"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Table of Contents (TOC) <a class=\"anchor\" id=\"toc\"></a>\n",
"\n",
"- [1. Imports](#first-bullet)\n",
"- [2. Zegami Client, Workspace, and Collection](#second-bullet)\n",
"- [3. Chaging Datafile Value of One Item](#third-bullet)\n",
" - [3.1. Creating a Tag](#creating-tag)\n",
" - [3.2. Converting Fahrenheit values to Celsius](#fah_to_cel)\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Imports <a class=\"anchor\" id=\"first-bullet\"></a> <span style=\"font-size:0.5em;\">[(Back to TOC)](#toc)</span>\n",
"\n",
"First we'll start by importing all of the needed libraries."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from zegami_sdk.client import ZegamiClient"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. Zegami client, workspace, and collection <a class=\"anchor\" id=\"second-bullet\"></a> <span style=\"font-size:0.5em;\">[(Back to TOC)](#toc)</span>\n",
"\n",
"Here we'll initialize the Zegami client so that we can have access to our collections.\n",
"\n",
"If you haven't intialized your client before, you'll to have to provide your username and password, in order to create a security token. You only have to do this once. After that, you can initialize the client without providing login details, as the security token will be used instead!"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Used token from '/home/martim-zegami/zegami_com.zegami.token'.\n",
"Client initialized successfully, welcome .\n",
"\n"
]
}
],
"source": [
"zc = ZegamiClient()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After initializing the client, we can retrieve our workspace and collection."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[<Workspace id=GUDe4kRY name=Martim Chaves>]\n"
]
}
],
"source": [
"workspaces_lst = zc.workspaces\n",
"print(workspaces_lst)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# Get workspace using the ID\n",
"WORKSPACE_ID = zc.workspaces[0].id # id=GUDe4kRY\n",
"workspace = zc.get_workspace_by_id(WORKSPACE_ID)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Collections in 'Martim Chaves' (3):\n",
"6271698f81e4bccb640d6e24 : Flags of the world\n",
"62792f0488e4d3d7181d8256 : nations_of_the_world\n",
"627e12a31bdd62bb88d6afb8 : X-ray-analysis\n"
]
}
],
"source": [
"workspace.show_collections()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<CollectionV1 id=627e12a31bdd62bb88d6afb8 name=X-ray-analysis>\n"
]
}
],
"source": [
"# Let's get the X-ray-analysis collection, which is the one we're currently working with\n",
"collection = workspace.get_collection_by_name('X-ray-analysis')\n",
"print(collection)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Chaging datafile value of an item using Tags <a class=\"anchor\" id=\"third-bullet\"></a> <span style=\"font-size:0.5em;\">[(Back to TOC)](#toc)</span>\n",
"\n",
"When we were investigating the data using the Zegami platform, we noticed that one datapoint didn't have the temperature data in the right unit (Fahrenheit, instead of Celsius). This was causing an issue where the temperature axis didn't have an appropriate scale, as there was an anomaly. We can change the values in the datafile using the SDK!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"./images/weird_axis.png\" width=\"1000\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.1. Creating a Tag <a class=\"anchor\" id=\"creating-tag\"></a> <span style=\"font-size:0.5em;\">[(Back to TOC)](#toc)</span>\n",
"\n",
"In this case, we're only looking at one sample, so we could manually change the value in the datafile and reupload it. But, for the sake of demonstrating something that is more scalable, let's imagine that there are several samples that have this issue. Using the platform, we can easily create a **Tag** associated with these samples, using the selection tool."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"./images/tag_wrong_unit.png\" width=\"1000\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now change the values of the samples belonging to that Tag."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.2. Converting Fahrenheit values to Celsius <a class=\"anchor\" id=\"fah_to_cel\"></a> <span style=\"font-size:0.5em;\">[(Back to TOC)](#toc)</span>\n",
"\n",
"First, we'll start by creating the function that we'll use to modify the values. Then, we'll get a copy of the dataframe containing the samples (rows) belonging to the Tag that we created. We'll alter that copy, and then we'll use it to update the datafile."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def fahrenheit_to_celsius(temp: float) -> float:\n",
" temp -= 32\n",
" temp *= 5/9.\n",
" return temp"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"collec_temp_unit_tag = collection.get_rows_by_tags(['wrong_temp_unit']).copy()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"collec_temp_unit_tag['temperature'] = collec_temp_unit_tag['temperature'].apply(fahrenheit_to_celsius)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"collection.rows.update(collec_temp_unit_tag)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"collection.replace_data(collection.rows)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When doing these sorts of operations (uploading data), we should check collection.status. Changes may take some time."
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'changed_at': 'Mon, 16 May 2022 19:13:47 GMT',\n",
" 'progress': 1.0,\n",
" 'status': 'completed'}"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"collection.status"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"108 35.0\n",
"Name: temperature, dtype: float64"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"collection.get_rows_by_tags(['wrong_temp_unit']).temperature"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that the value has been updated, let's check Zegami's platform!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"./images/fixed_temp_axis.png\" width=\"1000\"/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Much better!"
]
}
],
"metadata": {
"interpreter": {
"hash": "17361232fb9bcb0ba5939612f19d0269bf9334e24dd707504afe9bf985e11e05"
},
"kernelspec": {
"display_name": "Python 3.9.12 ('env': venv)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
Loading

0 comments on commit 30564cb

Please sign in to comment.