The TensorFlow implementation of the paper "Inductive Representation Learning on Large Graphs" (GraphSAGE) is inductive, but the PyTorch version, graphSAGE-pytorch (https://github.com/twjiang/graphSAGE-pytorch), is transductive.
The reason is that the training graph there still contains validation and test nodes as neighbors of training nodes, so part of the test set is already visible during training.
So I changed it from transductive to inductive.
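A quick way to see the leak is to list, for each training node, the neighbors that fall outside the training set. The sketch below assumes adj_lists maps a node id to a set of neighbor ids, like the adj_lists used in the rest of this README; the helper name leaked_neighbors is made up for illustration and is not part of either repository.

```python
def leaked_neighbors(adj_lists, train_nodes):
    """Map each training node to its neighbors that are NOT training nodes."""
    train = set(train_nodes)
    leaks = {}
    for node in train:
        outside = set(adj_lists.get(node, ())) - train
        if outside:
            leaks[node] = outside
    return leaks

# Toy graph (an assumption for illustration): nodes 0-2 are train, node 3 is test.
adj_lists = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
leaks = leaked_neighbors(adj_lists, train_nodes=[0, 1, 2])
print(leaks)  # {2: {3}}
```

An empty result means no training node can reach a validation or test node in one hop.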
I use Cora as the dataset. When building adj_lists, an edge can be added directly for a node in the validation or test set. For a node in the training set, an edge is added only when both endpoints are in the training set, so that training never uses validation or test nodes.
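# Validation/test nodes: add the edge to their adj_list directly
if node_map[pair[0]] not in train_index:
    adj_lists[node_map[pair[0]]].add(node_map[pair[1]])
if node_map[pair[1]] not in train_index:
    adj_lists[node_map[pair[1]]].add(node_map[pair[0]])
# Training nodes: neighbors may only be other training nodes
if node_map[pair[0]] in train_index and node_map[pair[1]] in train_index:
    adj_lists[node_map[pair[0]]].add(node_map[pair[1]])
    adj_lists[node_map[pair[1]]].add(node_map[pair[0]])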
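The rule above can be sketched as a small standalone function. This is a minimal illustration, not the code from the scripts: the function name, the edge-list input and the toy graph are assumptions, and node ids stand in for node_map[pair[0]] and node_map[pair[1]].

```python
from collections import defaultdict

def build_inductive_adj_lists(edges, train_nodes):
    """Build neighbor sets in which training nodes only see training nodes."""
    train_nodes = set(train_nodes)  # set membership is O(1)
    adj_lists = defaultdict(set)
    for u, v in edges:
        u_train, v_train = u in train_nodes, v in train_nodes
        # A validation/test endpoint may keep any neighbor.
        if not u_train:
            adj_lists[u].add(v)
        if not v_train:
            adj_lists[v].add(u)
        # A training endpoint keeps the edge only if both ends are training nodes.
        if u_train and v_train:
            adj_lists[u].add(v)
            adj_lists[v].add(u)
    return adj_lists

# Toy path graph 0-1-2-3-4 with nodes 0, 1, 2 in the training set.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
adj = build_inductive_adj_lists(edges, train_nodes=[0, 1, 2])
```

Here node 2 ends up with the single neighbor 1, while test node 3 still sees both 2 and 4. Converting train_index to a set first also avoids a linear scan on every `in` check.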
This approach can cause a problem: if all neighbors of a training node are in the validation or test set, the node becomes isolated, with no neighbors at all.
Isolated nodes are a problem because computing the loss requires positive and negative sampling for each node, and positive sampling is impossible for an isolated node.
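To make the failure concrete, here is a simplified sketch of positive sampling. It assumes positives are collected by short random walks from the node, which is a common choice for unsupervised GraphSAGE; the function name, parameters and toy graph are hypothetical and do not mirror the actual loss code.

```python
import random

def positive_samples(adj_lists, node, num_walks=5, walk_len=3, seed=0):
    """Collect nodes visited by short random walks starting at `node`."""
    rng = random.Random(seed)
    positives = set()
    for _ in range(num_walks):
        cur = node
        for _ in range(walk_len):
            neighbors = sorted(adj_lists.get(cur, ()))
            if not neighbors:
                break  # nowhere to walk: an isolated node stops here
            cur = rng.choice(neighbors)
            if cur != node:
                positives.add(cur)
    return positives

# Node 3 is isolated: all of its neighbors were filtered out of the training graph.
adj_lists = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}
connected = positive_samples(adj_lists, 1)
isolated = positive_samples(adj_lists, 3)
```

For node 3 the walk stops immediately and the positive set stays empty, so there is no positive pair to put into the loss.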
One fix is to add a self-loop to each node when building adj_lists, so that every node has at least itself as a neighbor.
if node_map[pair[0]] not in adj_lists: adj_lists[node_map[pair[0]]].add(node_map[pair[0]])
if node_map[pair[1]] not in adj_lists: adj_lists[node_map[pair[1]]].add(node_map[pair[1]])
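The effect of the self-loop can be checked on a toy example. This sketch assumes adj_lists is a defaultdict(set); the helper name and the node ids are made up for illustration.

```python
from collections import defaultdict

def add_self_loops(adj_lists, nodes):
    """Make every node its own neighbor so no neighbor set is empty."""
    for n in nodes:
        adj_lists[n].add(n)
    return adj_lists

# Node 2 has no neighbors left after filtering.
adj_lists = defaultdict(set, {0: {1}, 1: {0}})
add_self_loops(adj_lists, nodes=[0, 1, 2])
```

After the call node 2 has the neighbor set {2}, so sampling from it always returns the node itself.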
Alternatively, delete the isolated nodes. First, initialize each node's neighbor set to an empty set.
if node_map[pair[0]] not in adj_lists: adj_lists[node_map[pair[0]]] = set()
if node_map[pair[1]] not in adj_lists: adj_lists[node_map[pair[1]]] = set()
Then remove every isolated node from both adj_lists and train_index:
i = 0
while i < len(train_index):
    if len(adj_lists[train_index[i]]) == 0:
        # Isolated node: drop its adj_list entry and its index,
        # then stay at position i, which now holds the next node.
        del adj_lists[train_index[i]]
        train_index = np.delete(train_index, i)
    else:
        i += 1
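The same cleanup can also be done in one pass, which avoids copying the index array on every np.delete call. This is a sketch with plain Python lists; the function name is hypothetical, and if train_index is a NumPy array the result can be wrapped with np.array(...).

```python
def drop_isolated(adj_lists, train_index):
    """Remove isolated training nodes from adj_lists and return the kept indices."""
    kept = []
    for n in train_index:
        if adj_lists.get(n):
            kept.append(n)          # node has at least one neighbor
        else:
            adj_lists.pop(n, None)  # isolated: drop its empty entry
    return kept

# Node 2 is isolated in this toy example.
adj_lists = {0: {1}, 1: {0}, 2: set()}
train_index = drop_isolated(adj_lists, [0, 1, 2])
print(train_index)  # [0, 1]
```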
myGraphSAGE_inductive_delete.py : inductive version of GraphSAGE that deletes isolated nodes
myGraphSAGE_inductive_selfloop.py : inductive version of GraphSAGE that adds self-loops
myGraphSAGE_transductive.py : the original transductive version of GraphSAGE
xxx_sampleCloseCentrality.py : sampling by closeness centrality (top 5 and bottom 5)
xxx_sampleDegreeCentrality.py : sampling by degree centrality (top 10)
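As a rough illustration of centrality-based selection, the sketch below scores nodes by degree centrality (degree divided by n - 1, the usual normalization) and picks the highest and, optionally, the lowest ranked candidates. It is written from scratch under the assumption that the scores are used to rank candidate nodes; the function names and the top/bottom parameters are hypothetical and not taken from the scripts.

```python
def degree_centrality(adj_lists):
    """Degree of each node divided by (n - 1); self-loops are ignored."""
    n = len(adj_lists)
    scale = 1.0 / (n - 1) if n > 1 else 0.0
    return {node: len(set(neigh) - {node}) * scale
            for node, neigh in adj_lists.items()}

def pick_by_centrality(candidates, scores, top=10, bottom=0):
    """Return the `top` highest-scoring candidates plus the `bottom` lowest."""
    # Sort by descending score; ties are broken by node id for determinism.
    ranked = sorted(candidates, key=lambda c: (-scores[c], c))
    chosen = ranked[:top]
    if bottom:
        rest = ranked[top:]  # avoid picking the same node twice
        chosen += rest[-bottom:]
    return chosen

adj_lists = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
scores = degree_centrality(adj_lists)
top_two = pick_by_centrality(adj_lists, scores, top=2)
extremes = pick_by_centrality(adj_lists, scores, top=1, bottom=1)
```

Closeness centrality could be plugged in the same way by replacing the scoring function.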
10 epochs with Mean aggregator
| | supervised | unsupervised | sup+unsup |
|---|---|---|---|
| inductive_delete | 0.835920 | 0.545455 | 0.761641 |
| inductive_selfloop | 0.853659 | 0.579823 | 0.808204 |
| transductive | 0.851441 | 0.701774 | 0.853659 |

| | supervised | unsupervised | sup+unsup |
|---|---|---|---|
| inductive_delete | 0.843681 | 0.558758 | 0.819290 |
| inductive_selfloop | 0.859202 | 0.576497 | 0.812639 |
| transductive | 0.868071 | 0.729490 | 0.853659 |

| | supervised | unsupervised | sup+unsup |
|---|---|---|---|
| inductive_delete | 0.848115 | 0.705100 | 0.783814 |
| inductive_selfloop | 0.873614 | 0.686253 | 0.786031 |
| transductive | 0.862528 | 0.742794 | 0.862528 |
mkdir outputFiles
python3 xxx.py