-
Notifications
You must be signed in to change notification settings - Fork 447
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add BiparteGraphSage tutorial to show use of categorical_attrs_desc (#…
…691)
- Loading branch information
Showing
7 changed files
with
551 additions
and
9 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
File renamed without changes.
268 changes: 268 additions & 0 deletions
268
tutorials/9_unsupervised_learning_with_bipartite_graphsage.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,268 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Unsupervised Graph Learning with BipartiteGraphSage\n", | ||
"\n", | ||
"\n", | ||
"Bipartite graphs are very common in e-commerce recommendation. In this tutorial, we demostrate how GraphScope trains a model with BipartiteGraphSage on bipartite graph.\n", | ||
"\n", | ||
"The task is link prediction, which estimates the probability of links between user and item nodes in a graph.\n", | ||
"\n", | ||
"In this task, we use our implementation of BipartiteGraphSage algorithm to build a model that predicts user-item links in the [U2I](http://graph-learn-dataset.oss-cn-zhangjiakou.aliyuncs.com/u2i.zip) dataset. In which nodes can represents user node and item node. The task can be treated as a unsupervised link prediction on a homogeneous link network.\n", | ||
"\n", | ||
"In this task, BipartiteGraphSage algorithm would compress both structural and attribute information in the graph into low-dimensional embedding vectors on each node. These embeddings can be further used to predict links between nodes.\n", | ||
"\n", | ||
"This tutorial has following steps:\n", | ||
"- Creating session and loading graph\n", | ||
"- Launching the learning engine and attaching to loaded graph.\n", | ||
"- Defining train process with builtin GraphSage model and hyperparameters\n", | ||
"- Training and evaluating\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"First, let's create a session and load the dataset as a graph." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import os\n", | ||
"import graphscope\n", | ||
"\n", | ||
"k8s_volumes = {\n", | ||
" \"data\": {\n", | ||
" \"type\": \"hostPath\",\n", | ||
" \"field\": {\n", | ||
" \"path\": \"/testingdata\",\n", | ||
" \"type\": \"Directory\"\n", | ||
" },\n", | ||
" \"mounts\": {\n", | ||
" \"mountPath\": \"/home/jovyan/datasets\",\n", | ||
" \"readOnly\": True\n", | ||
" }\n", | ||
" }\n", | ||
"}\n", | ||
"\n", | ||
"# create session\n", | ||
"graphscope.set_option(show_log=True)\n", | ||
"sess = graphscope.session(k8s_volumes=k8s_volumes)\n", | ||
"\n", | ||
"# loading u2i graph\n", | ||
"graph = sess.g()\n", | ||
"graph = graph.add_vertices(\n", | ||
" Loader(\"/home/jovyan/datasets/u2i/node.csv\", delimiter=\"\\t\"),\n", | ||
" label=\"u\",\n", | ||
" properties=[(\"feature\", \"str\")],\n", | ||
" vid_field=\"id\"\n", | ||
")\n", | ||
"graph = graph.add_vertices(\n", | ||
" Loader(\"/home/jovyan/datasets/u2i/node.csv\", delimiter=\"\\t\"),\n", | ||
" label=\"i\",\n", | ||
" properties=[(\"feature\", \"str\")],\n", | ||
" vid_field=\"id\"\n", | ||
")\n", | ||
"graph = graph.add_edges(\n", | ||
" Loader(\"/home/jovyan/datasets/u2i/edge.csv\", delimiter=\"\\t\"),\n", | ||
" label=\"u-i\",\n", | ||
" properties=[\"weight\"],\n", | ||
" src_label=\"u\",\n", | ||
" dst_label=\"i\",\n", | ||
" src_field=\"src_id\",\n", | ||
" dst_field=\"dst_id\"\n", | ||
")\n", | ||
"graph = graph.add_edges(\n", | ||
" Loader(\"/home/jovyan/datasets/u2i/edge.csv\", delimiter=\"\\t\"),\n", | ||
" label=\"u-i_reverse\",\n", | ||
" properties=[\"weight\"],\n", | ||
" src_label=\"i\",\n", | ||
" dst_label=\"u\",\n", | ||
" src_field=\"dst_id\",\n", | ||
" dst_field=\"src_id\"\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Launch learning engine \n", | ||
"Then, we need to define a feature list for training. The training feature list should be seleted from the vertex properties. In this case, we choose the `feature` property as the training features.\n", | ||
"\n", | ||
"With the featrue list, next we launch a learning engine with the learning method of session. (You may find the detail of the method on [Session](https://graphscope.io/docs/reference/session.html).)\n", | ||
"\n", | ||
"In this case, we specify the BipartiteGraphSage training over `user` and `item` nodes and `u-i` edges.\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"\n", | ||
"# launch a learning engine.\n", | ||
"lg = sess.learning(\n", | ||
" graph,\n", | ||
" nodes=[(\"u\", [\"feature\"]), (\"i\", [\"feature\"])],\n", | ||
" edges=[((\"u\", \"u-i\", \"i\"), [\"weight\"]), ((\"i\", \"u-i_reverse\", \"u\"), [\"weight\"])],\n", | ||
")\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"\n", | ||
"We use the builtin BipartiteGraphSage model to define the training process. You can find more detail about all the builtin learning models on [Graph Learning Model](https://graphscope.io/docs/learning_engine.html#data-model)\n", | ||
"\n", | ||
"In the example, we use tensorflow as NN backend trainer.\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"\n", | ||
"import numpy as np\n", | ||
"from graphscope.learning.examples import BipartiteGraphSage\n", | ||
"from graphscope.learning.graphlearn.python.model.tf.trainer import LocalTFTrainer\n", | ||
"from graphscope.learning.graphlearn.python.model.tf.optimizer import get_tf_optimizer\n", | ||
"\n", | ||
"# unsupervised GraphSage.\n", | ||
"\n", | ||
"def train(config, graph):\n", | ||
" def model_fn():\n", | ||
" return BipartiteGraphSage(graph,\n", | ||
" config['batch_size'],\n", | ||
" config['hidden_dim'],\n", | ||
" config['output_dim'],\n", | ||
" config['hops_num'],\n", | ||
" config['u_neighs_num'],\n", | ||
" config['i_neighs_num'],\n", | ||
" u_features_num=config['u_features_num'],\n", | ||
" u_categorical_attrs_desc=config['u_categorical_attrs_desc'],\n", | ||
" i_features_num=config['i_features_num'],\n", | ||
" i_categorical_attrs_desc=config['i_categorical_attrs_desc'],\n", | ||
" neg_num=config['neg_num'],\n", | ||
" use_input_bn=config['use_input_bn'],\n", | ||
" act=config['act'],\n", | ||
" agg_type=config['agg_type'],\n", | ||
" need_dense=config['need_dense'],\n", | ||
" in_drop_rate=config['drop_out'],\n", | ||
" ps_hosts=config['ps_hosts'])\n", | ||
" trainer = LocalTFTrainer(model_fn,\n", | ||
" epoch=config['epoch'],\n", | ||
" optimizer=get_tf_optimizer(\n", | ||
" config['learning_algo'],\n", | ||
" config['learning_rate'],\n", | ||
" config['weight_decay']))\n", | ||
"\n", | ||
" trainer.train()\n", | ||
"\n", | ||
" u_embs = trainer.get_node_embedding(\"u\")\n", | ||
" np.save('u_emb', u_embs)\n", | ||
" i_embs = trainer.get_node_embedding(\"i\")\n", | ||
" np.save('i_emb', i_embs)\n", | ||
"\n", | ||
"# define hyperparameters\n", | ||
"config = {'batch_size': 128,\n", | ||
" 'hidden_dim': 128,\n", | ||
" 'output_dim': 128,\n", | ||
" 'u_features_num': 1,\n", | ||
" 'u_categorical_attrs_desc': {\"0\":[\"u_id\",10000,64]},\n", | ||
" 'i_features_num': 1,\n", | ||
" 'i_categorical_attrs_desc': {\"0\":[\"i_id\",10000,64]},\n", | ||
" 'hops_num': 1,\n", | ||
" 'u_neighs_num': [10],\n", | ||
" 'i_neighs_num': [10],\n", | ||
" 'neg_num': 10,\n", | ||
" 'learning_algo': 'adam',\n", | ||
" 'learning_rate': 0.001,\n", | ||
" 'weight_decay': 0.0005,\n", | ||
" 'epoch': 10,\n", | ||
" 'use_input_bn': True,\n", | ||
" 'act': tf.nn.leaky_relu,\n", | ||
" 'agg_type': 'gcn',\n", | ||
" 'need_dense': True,\n", | ||
" 'drop_out': 0.0,\n", | ||
" 'ps_hosts': None\n", | ||
" }" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Run training process\n", | ||
"\n", | ||
"After define training process and hyperparameters,\n", | ||
"\n", | ||
"Now we can start the traning process with learning engine `lg` and the hyperparameters configurations." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"train(config, lg)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Finally, don't forget to close the session." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"sess.close()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.9.6" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 4 | ||
} |
File renamed without changes.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.