Graph Embedding and Visualization

This repository combines graph embedding methods with dimension reduction methods to produce 2D layouts of graph data.

Instructions

Install Python 3.9 (other versions may work but have not been tested).
Clone the repository, either with the Clone Git Repository... option in an empty VS Code window or with the following command in a terminal:

git clone https://github.com/Charlie-XIAO/embedding-visualization-test.git

Create and activate a Python virtual environment with the following commands in a terminal:

python -m venv myvenv
(For Windows) myvenv\Scripts\activate
(For Mac/Linux) source myvenv/bin/activate

Install the required packages in the virtual environment with the following command in a terminal:

pip install -r requirements.txt

Run main.py with the following command in a terminal:

python main.py --data <dataset_name> --embed <embedding_method> --vis <visualization_method>
(Example) python main.py --data wiki --embed deepwalk --vis t-sne

To run the program on the large datasets used in Experiment 2 of the essay, download the zipped datasets from this Google Drive link and unzip them into the datasets folder.
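
The three flags in the run step above could be parsed along the following lines. This is only a minimal sketch using the standard argparse module; build_parser is a hypothetical name, and the actual main.py may define its options, defaults, and allowed values differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three flags shown in the run command."""
    parser = argparse.ArgumentParser(description="Embed a graph and project it to 2D.")
    parser.add_argument("--data", required=True, help="dataset name, e.g. wiki")
    parser.add_argument("--embed", required=True, help="embedding method, e.g. deepwalk")
    parser.add_argument("--vis", required=True, help="visualization method, e.g. t-sne")
    return parser


# Parse the example command from the instructions instead of sys.argv.
args = build_parser().parse_args(["--data", "wiki", "--embed", "deepwalk", "--vis", "t-sne"])
print(args.data, args.embed, args.vis)
```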

Graph Embedding

Previous Works

Method	Paper	Note
DeepWalk	[KDD 2014] DeepWalk: Online Learning of Social Representations	[Graph Embedding] DeepWalk: Algorithm Principles, Implementation, and Applications
Node2Vec	[KDD 2016] Node2Vec: Scalable Feature Learning for Networks	[Graph Embedding] Node2Vec: Algorithm Principles, Implementation, and Applications
LE	[NIPS 2001] Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering	[Graph Embedding] LE (Laplacian Eigenmaps) Feature Extraction Method
GLEE	[arXiv 2019] GLEE: Geometric Laplacian Eigenmap Embedding
SDNE	[KDD 2016] Structural Deep Network Embedding	[Graph Embedding] SDNE: Algorithm Principles, Implementation, and Applications

DeepWalk

model = DeepWalk(self.graph, walk_length=10, num_walks=80, workers=1)
model.train(embed_size=128, window_size=5, iter=3)
embeddings = pd.DataFrame.from_dict(model.get_embeddings())
self.embeddings = embeddings.T
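
For intuition about walk_length and num_walks, the sketch below generates the kind of uniform random walks that DeepWalk feeds to a skip-gram model. It is an illustration only, not the repository's implementation: random_walks and the toy adjacency list are hypothetical, and it assumes num_walks means the number of walks started from each node.

```python
import random


def random_walks(adj, walk_length=10, num_walks=80, seed=0):
    """Return num_walks uniform random walks per node, each up to walk_length nodes long."""
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start nodes in a fresh order on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Toy undirected graph as an adjacency list (hypothetical data).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = random_walks(adj, walk_length=10, num_walks=80)
print(len(walks), walks[0])
```

Each walk is then treated like a sentence whose words are node IDs, which is where embed_size and window_size come in during training.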

Node2Vec

model = Node2Vec(self.graph, walk_length=10, num_walks=80, p=0.25, q=4, workers=1)
model.train(embed_size=128, window_size=5, iter=3)
embeddings = pd.DataFrame.from_dict(model.get_embeddings())
self.embeddings = embeddings.T
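
The p and q arguments bias each step of the walk. The sketch below shows the unnormalized transition weights from the node2vec paper for an unweighted graph; transition_weights, node2vec_step, and the toy graph are hypothetical names for illustration, and the repository's Node2Vec class may compute them differently (for example with precomputed alias tables).

```python
import random


def transition_weights(adj, prev, cur, p=0.25, q=4.0):
    """Unnormalized weights for the next node after a step prev -> cur."""
    prev_neighbors = set(adj[prev])
    weights = {}
    for x in adj[cur]:
        if x == prev:              # distance 0 from prev: step back
            weights[x] = 1.0 / p
        elif x in prev_neighbors:  # distance 1 from prev: stay nearby
            weights[x] = 1.0
        else:                      # distance 2 from prev: move outward
            weights[x] = 1.0 / q
    return weights


def node2vec_step(adj, prev, cur, p=0.25, q=4.0, rng=random):
    """Sample the next node in proportion to the weights above."""
    weights = transition_weights(adj, prev, cur, p, q)
    return rng.choices(list(weights), weights=list(weights.values()), k=1)[0]


# Toy undirected graph as an adjacency list (hypothetical data).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(transition_weights(adj, prev=0, cur=2))
```

With p=0.25 and q=4, stepping back is weighted 16 times more heavily than moving outward, so walks tend to stay in a node's local neighborhood.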

LE

model = LEE(self.graph)
embeddings = pd.DataFrame.from_dict(model.get_embeddings(embed_size=128, iter=100))
self.embeddings = embeddings.T
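
As a rough illustration of what a Laplacian Eigenmaps embedding computes, the sketch below solves the generalized eigenproblem L y = lambda D y with a dense NumPy solver on a small connected graph. The function laplacian_eigenmaps and the toy graph are hypothetical; the repository's class takes an iter argument, which suggests an iterative solver, so treat this as a conceptual sketch rather than its actual code.

```python
import numpy as np


def laplacian_eigenmaps(A, dim=2):
    """Embed nodes with the eigenvectors of the smallest nonzero eigenvalues.

    Assumes A is the symmetric adjacency matrix of a connected graph
    with no isolated nodes.
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    L_sym = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue 0) and map back to the
    # generalized problem via y = D^{-1/2} u.
    return d_inv_sqrt[:, None] * eigvecs[:, 1:dim + 1]


# Two triangles joined by a single edge (hypothetical toy graph).
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[u, v] = A[v, u] = 1.0
Y = laplacian_eigenmaps(A, dim=2)
print(Y.shape)
```

On this toy graph the first coordinate takes opposite signs on the two triangles, which is the cluster-separating behavior the method is known for.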

GLEE

model = GLEE(self.graph)
embeddings = pd.DataFrame.from_dict(model.get_embeddings(embed_size=128, iter=100))
self.embeddings = embeddings.T

SDNE

model = SDNE(self.graph, hidden_size=[256, 128])
model.train(batch_size=3000, epochs=40, verbose=2)
embeddings = pd.DataFrame.from_dict(model.get_embeddings())
self.embeddings = embeddings.T
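
The embedding snippets above all end with the same two lines, which turn a dictionary of per-node vectors into a table with one row per node. The small example below shows why the transpose is needed; the dictionary raw is hypothetical stand-in data for what get_embeddings() returns.

```python
import pandas as pd

# Hypothetical output of get_embeddings(): node ID -> embedding vector.
raw = {"a": [0.1, 0.2, 0.3], "b": [0.4, 0.5, 0.6]}

embeddings = pd.DataFrame.from_dict(raw)  # one column per node: shape (3, 2)
embeddings = embeddings.T                 # one row per node: shape (2, 3)
print(embeddings)
```

Passing orient="index" to pd.DataFrame.from_dict would give the row-per-node layout directly, without the transpose.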

Our Contributions

SP

model = ShortestPath(self.graph)
embeddings = pd.DataFrame.from_dict(model.get_embeddings(embed_size=128, sampling="random"))
self.embeddings = embeddings.T

SPLEE

model = SPLEE(self.graph)
embeddings = pd.DataFrame.from_dict(model.get_embeddings(embed_size=128, iter=10, shape="gaussian", epsilon=6.0, threshold=5))
self.embeddings = embeddings.T

Graph Visualization

Previous Works

Method	Paper	Note
PCA	[WCS 2010] Principal Component Analysis	[Dimension Reduction] Principal Component Analysis (PCA): Principles Explained in Detail
t-SNE	[JMLR 2008] Visualizing Data Using t-SNE	[Dimension Reduction] Dimension Reduction Methods: t-SNE

PCA

model = PCA(n_components=2, random_state=0)
self.projections = model.fit_transform(self.X)
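
To make the projection step concrete, here is a minimal NumPy sketch of what a two-component PCA computes: center the data, then project onto the top two right singular vectors. The function pca_2d and the random matrix X are hypothetical stand-ins; library implementations may flip the sign of each component, so coordinates can differ by a reflection.

```python
import numpy as np


def pca_2d(X):
    """Project the rows of X onto the two directions of largest variance."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:2].T


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 128))  # stand-in for a 128-dimensional embedding matrix
projections = pca_2d(X)
print(projections.shape)
```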

t-SNE

model = TSNE(n_components=2, verbose=2, random_state=0)
self.projections = model.fit_transform(self.X)

Our Contributions

t-SGNE

model = TSGNE(perplexity=30, n_components=2, verbose=2, random_state=0, knn_matrix=self.knn_matrix, mode="distance")
self.projections = model.fit_transform(self.X)

Datasets

Usage

In the datasets folder, create a folder named after the dataset. In this folder, put the edgelist file and, optionally, the labels file, named <dataset_name>_edgelist.txt and <dataset_name>_labels.txt respectively. The program automatically reads the graph and label data from these locations.
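
The naming convention above can be illustrated with a small loader. This is a sketch under assumptions, not the repository's reader: load_dataset is a hypothetical helper, and it assumes each edgelist line holds two whitespace-separated node IDs and each labels line holds a node ID followed by a label.

```python
from pathlib import Path


def load_dataset(root, name):
    """Read <root>/<name>/<name>_edgelist.txt and, if present, <name>_labels.txt."""
    folder = Path(root) / name
    edges = []
    for line in (folder / f"{name}_edgelist.txt").read_text().splitlines():
        parts = line.split()
        if len(parts) >= 2:  # skip blank or malformed lines
            edges.append((parts[0], parts[1]))

    labels = None  # the labels file is optional
    labels_path = folder / f"{name}_labels.txt"
    if labels_path.exists():
        labels = {}
        for line in labels_path.read_text().splitlines():
            parts = line.split()
            if len(parts) >= 2:
                labels[parts[0]] = parts[1]
    return edges, labels
```

For example, load_dataset("datasets", "wiki") would look for datasets/wiki/wiki_edgelist.txt and datasets/wiki/wiki_labels.txt.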

Source

Index of Complex Networks (colorado.edu)
Stanford Large Network Dataset Collection
Network Repository: An Interactive Scientific Network Data Repository
