TrainerFlow

A trainerflow is an abstraction of a predesigned workflow that trains and evaluates a model on a given dataset for a specific use case. It must contain a unique training mechanism that involves loss calculation and a specific sampler (which samples whatever the loss calculation needs).

Once the model and the task are selected, the function get_trainerflow selects the matching trainerflow. A customized trainerflow therefore needs to be added to this function.

Included Object:

task : Task
model : Model (built through given args.model)
optimizer : torch.optim.Optimizer
dataloader (if mini_batch_flag is True) :
- torch.utils.data.DataLoader
- dgl.dataloading

Method:

train()
- decorated with @abstractmethod, so it must be overridden.
_full_train_step()
- train with a full-batch graph
_mini_train_step()
- train with a mini-batch graph sampled from seed nodes
_test_step()
- evaluate in training/validation/testing
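
The method layout above can be sketched in plain Python. This is only an illustration under stated assumptions: SketchBaseFlow and ToyFlow are hypothetical names defined in the snippet, not the library's BaseFlow, and the step methods return strings instead of running a real model.

```python
from abc import ABC, abstractmethod


class SketchBaseFlow(ABC):
    """Hypothetical stand-in for a base flow; not the library's actual class."""

    def __init__(self, mini_batch_flag=False):
        self.mini_batch_flag = mini_batch_flag

    @abstractmethod
    def train(self):
        """Must be overridden; otherwise the subclass cannot be instantiated."""

    def _full_train_step(self):
        raise NotImplementedError

    def _mini_train_step(self):
        raise NotImplementedError

    def _test_step(self):
        raise NotImplementedError


class ToyFlow(SketchBaseFlow):
    """Toy flow that overrides train() and both step methods."""

    def train(self):
        # Pick the step that matches the batching mode.
        if self.mini_batch_flag:
            return self._mini_train_step()
        return self._full_train_step()

    def _full_train_step(self):
        return "full-batch step"

    def _mini_train_step(self):
        return "mini-batch step"

    def _test_step(self):
        return "evaluation step"


flow = ToyFlow(mini_batch_flag=True)
print(flow.train())
```

Because train() is abstract, calling SketchBaseFlow() directly raises TypeError, which is the behaviour the @abstractmethod note describes.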

Supported trainerflow

Node classification flow
- Supported Model: HAN/MAGNN/GTN
- The task: node classification
  - The task.dataset must include the split masks (train/valid/test).
  - The task.dataset supplies the input dimension, which is assigned to args.in_dim.
- The sampler in this flow is provided by dgl.dataloading.
- This flow is the most common one for GNNs, because most GNN models target semi-supervised node classification. Here the task is to classify the nodes of a HIN (Heterogeneous Information Network).
- Note: args.out_dim is set to num_classes if the two are not equal.
Link prediction
- The same as the node classification flow, except that it is used for link prediction.
- Supported Model: RGCN/CompGCN/RSHN
- Supported Task: link prediction
HetGNN trainerflow
NSHE trainerflow
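
The note in the node classification flow about args.out_dim can be illustrated with a small sketch. The helper align_out_dim and the use of SimpleNamespace for args are assumptions made for this example; the actual flow may perform the check differently.

```python
from types import SimpleNamespace


def align_out_dim(args, num_classes):
    """Hypothetical helper: force args.out_dim to match the number of classes."""
    if getattr(args, "out_dim", None) != num_classes:
        args.out_dim = num_classes
    return args


# Assumed example values; a real run would take these from the config and dataset.
args = SimpleNamespace(in_dim=64, out_dim=16)
align_out_dim(args, num_classes=3)
print(args.out_dim)  # prints 3
```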

How to build a new trainerflow

Create a class your_trainerflow that inherits from BaseFlow and register it with @register_flow(str).
The function train() is decorated with @abstractmethod, so it must be overridden; otherwise your_trainerflow cannot be instantiated. Besides train(), both __init__() and _test_step() should be implemented. At least one of _full_train_step() and _mini_train_step() must be implemented.
Add your_trainerflow to the function get_trainerflow.
Add an entry to the dict SUPPORTED_FLOWS in trainerflow/__init__.py.
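
The steps above can be sketched as a self-contained registry pattern. Everything here is a local stand-in defined in the snippet: FLOW_REGISTRY, the simplified register_flow and get_trainerflow, and the minimal BaseFlow are assumptions for illustration, and the library's real signatures and constructor arguments may differ.

```python
from abc import ABC, abstractmethod

FLOW_REGISTRY = {}  # hypothetical registry: flow name -> flow class


def register_flow(name):
    """Local stand-in for a registration decorator."""
    def decorator(cls):
        if name in FLOW_REGISTRY:
            raise ValueError(f"flow '{name}' is already registered")
        FLOW_REGISTRY[name] = cls
        return cls
    return decorator


class BaseFlow(ABC):
    """Minimal stand-in base class with an abstract train()."""

    @abstractmethod
    def train(self):
        ...


@register_flow("your_trainerflow")
class YourTrainerFlow(BaseFlow):
    def __init__(self, epochs=2):
        self.epochs = epochs

    def train(self):
        # Run the full-batch step once per epoch, then evaluate.
        for _ in range(self.epochs):
            self._full_train_step()
        return self._test_step()

    def _full_train_step(self):
        pass  # loss calculation and optimizer step would go here

    def _test_step(self):
        return "evaluated"


def get_trainerflow(name, **kwargs):
    """Local stand-in lookup: build the flow registered under `name`."""
    if name not in FLOW_REGISTRY:
        raise KeyError(f"unknown trainerflow: {name}")
    return FLOW_REGISTRY[name](**kwargs)


flow = get_trainerflow("your_trainerflow", epochs=1)
print(flow.train())
```

Keeping the lookup in one function means a new flow only has to be registered under a name to become selectable.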
