- First prepare the figment dataset: http://cistern.cis.lmu.de/figment/
- Download the entity dataset and the entity embeddings (around 2 GB) into `data/figment`. Be sure to unzip the entity dataset.
- `python data/preprocess_figment.py`
- Prepare the bibtex dataset: http://mulan.sourceforge.net/datasets.html
- `python data/preprocess_bibtex.py`
- Download the bookmarks dataset from here.
- Place it in `data/bookmarks`. There is no need to run any preprocessing script.
python -m mains.infnet --config configs/figment.json
python -m mains.infnet --config configs/bibtex.json
python -m mains.infnet --config configs/bookmarks.json
- `base/`: Contains the base model and base trainer. The models and trainers are inherited from here.
- `configs/`: Contains the configuration files, stored in JSON format. All hyper-parameters are stored here.
- `data/`: Contains scripts that process the data files and store them in pickle format.
- `data_loader/`: Contains the class DataGenerator, which is used to get data from the pipeline. Since most of our models are small, a naive implementation was fine; for bigger datasets it might be worth looking into the tensorflow dataset API. (A sketch of such a generator follows this list.)
- `mains/`: Contains the main file to be called, `infnet.py`. It takes the configuration file and uses it to initialize which model, trainer, and hyper-parameters to use, which parameters to save for tensorboard, etc. (A sketch of this driver flow also follows this list.)
- `models/`: Contains the model definitions, each of which is a class. There are four such classes: EnergyNet, InferenceNet, FeatureNet, and Spen. The first three are simple feed-forward networks; the last combines the different networks into the actual model that is used.
- `trainers/`: Contains the trainer, which schedules training, evaluation, and tensorboard logging, among other things.
- `utils/`: Contains utility functions such as process_config, which parses the configuration file.
- `analysis.py`, `generate_configs.py`, `run.py`: All used for hyper-parameter tuning.
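To make the layout concrete, here is a minimal sketch of the kind of driver flow `mains/infnet.py` implements. The module paths and class names below (`utils.config.process_config`, `data_loader.data_generator.DataGenerator`, `models.spen.Spen`, `trainers.trainer.Trainer`) are illustrative assumptions, not the repository's exact identifiers; check the actual files for the real names.

```python
# Illustrative sketch only: module paths and class names are assumptions,
# not the repository's exact identifiers.
import tensorflow as tf

from utils.config import process_config               # parses the JSON config (assumed path)
from data_loader.data_generator import DataGenerator  # naive in-memory batcher (assumed path)
from models.spen import Spen                          # combines feature/energy/inference nets (assumed path)
from trainers.trainer import Trainer                  # schedules training, eval, tensorboard (assumed path)


def main():
    config = process_config("configs/bibtex.json")  # all hyper-parameters live in the config
    data = DataGenerator(config)
    sess = tf.Session()
    model = Spen(config)
    trainer = Trainer(sess, model, data, config)
    trainer.train()                                 # runs pretraining and the later stages


if __name__ == "__main__":
    main()
```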
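And here is the kind of naive in-memory batching the `data_loader/` note refers to: a minimal sketch assuming the preprocessing scripts pickle the inputs and labels as numpy arrays. The class name, attribute names, and pickle layout are assumptions and may differ from the repository's DataGenerator.

```python
import pickle

import numpy as np


class NaiveDataGenerator:
    """Minimal in-memory batcher; assumes preprocessing pickled (inputs, labels) arrays."""

    def __init__(self, pickle_path, batch_size):
        with open(pickle_path, "rb") as f:
            self.inputs, self.labels = pickle.load(f)  # assumed pickle layout
        self.batch_size = batch_size

    def next_batch(self):
        # Sample a random batch; fine for small datasets, use tf.data for larger ones.
        idx = np.random.choice(len(self.inputs), self.batch_size, replace=False)
        return self.inputs[idx], self.labels[idx]
```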
- `exp_name`: Name of the experiment.
- `data`: Contains info about the data.
  - `dataset`: Name of the dataset.
  - `data_dir`: Path to the top-level data directory.
  - `splits`: Splits for train, validation, and test.
  - `embeddings`: True/False. True if pre-trained embeddings are available.
  - `vocab`: Same as above.
  - `data_generator`: Name of the data generator defined in `data_loader.py`.
- `tensorboard_train`: Set to True to save tensorboard summaries in stage 2.
- `tensorboard_infer`: Same as above for stage 3.
- `feature_size`: Size of the hidden layer in the feature network.
- `label_measurements`: Same as above for the energy network.
- `type_vocab_size`: Number of output labels.
- `entities_vocab_size`: Size of the lookup table for embeddings.
- `embeddings_tune`: Set to true if the embedding vectors should be updated during training.
- `max_to_keep`: Required by the tensorflow saver used for saving checkpoints.
- `num_epochs`: Number of epochs in each stage.
- `train`: Info about how to train.
  - `diff_type`: The \nabla operator in the paper.
  - `batch_size`: Batch size for training.
  - `state_size`: Not required. Kept for historical reasons.
  - `hidden_units`: Hidden units in the inference and feature nets (depends on whether `embeddings` is true or false).
  - `lr_*`: Learning rate for optimizing the corresponding variable.
  - `lambda_*`: Regularization weight for the corresponding variable.
  - `lambda_pretrain_bias`: How much to weigh the pretrained network (another term in the paper).
  - `wgan_mode`: Whether to use the improved WGAN penalty (a sketch of this penalty follows the configuration list).
  - `lamb_wgan`: Regularization weight for the WGAN penalty.
- `ssvm`: Implementation of SPEN (Belanger and McCallum, 2016). Not complete, since we could not find an implementation of entropic gradient descent.
  - `enable`: To be ssvm or not to be.
  - `steps`: Number of optimization steps in ssvm inference.
  - `eval`: True if ssvm inference is to be used.
  - `lr_inference`: Learning rate for ssvm inference.
- `eval_print`: What to print during evaluation.
  - `f1`: F1 score.
  - `accuracy`: Accuracy.
  - `energy`: Energy.
    - `pretrain`: Energy / loss of the pretrained network.
    - `infnet`: Energy / loss of the inference network.
  - `f1_score_mode`: Set to `examples` to average the F1 score over examples, or `label` to average it over labels. The paper averages over examples.
  - `threshold`: Threshold adjusted on the validation set.
  - `time_taken`: For timing evaluation. Only the training/inference step time, not the whole run.
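For concreteness, the snippet below builds a hypothetical configuration with these fields and writes it out as JSON. All values are made up for illustration (they are not copied from `configs/*.json`), and `lr_energy` / `lambda_energy` are placeholders for whatever per-variable `lr_*` / `lambda_*` keys the real configs use.

```python
import json

# Hypothetical configuration; values are illustrative, not taken from configs/*.json.
example_config = {
    "exp_name": "bibtex_demo",
    "data": {
        "dataset": "bibtex",
        "data_dir": "data/bibtex",
        "splits": [0.8, 0.1, 0.1],
        "embeddings": False,          # no pre-trained embeddings for this dataset
        "vocab": False,
        "data_generator": "DataGenerator",
    },
    "tensorboard_train": True,        # log tensorboard summaries in stage 2
    "tensorboard_infer": True,        # log tensorboard summaries in stage 3
    "feature_size": 200,              # hidden layer size of the feature network
    "label_measurements": 100,        # hidden layer size of the energy network
    "type_vocab_size": 159,           # number of output labels
    "entities_vocab_size": 0,
    "embeddings_tune": False,
    "max_to_keep": 5,
    "num_epochs": 30,
    "train": {
        "diff_type": "squared",       # placeholder for the \nabla operator choice
        "batch_size": 64,
        "hidden_units": 150,
        "lr_energy": 1e-3,            # placeholder for the lr_* keys
        "lambda_energy": 1e-3,        # placeholder for the lambda_* keys
        "lambda_pretrain_bias": 1.0,
        "wgan_mode": True,
        "lamb_wgan": 10.0,
    },
    "ssvm": {"enable": False, "steps": 30, "eval": False, "lr_inference": 0.1},
    "eval_print": {
        "f1": True,
        "accuracy": True,
        "energy": True,
        "f1_score_mode": "examples",
        "threshold": 0.5,
        "time_taken": True,
    },
}

with open("configs/my_experiment.json", "w") as f:
    json.dump(example_config, f, indent=2)
```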
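The `wgan_mode` / `lamb_wgan` options refer to an improved-WGAN-style gradient penalty on the energy network. Below is a minimal TF1-style sketch of how such a penalty is commonly computed; `energy_fn` is a hypothetical callable standing in for the repository's EnergyNet, and this is not the project's exact implementation.

```python
import tensorflow as tf


def wgan_gradient_penalty(energy_fn, x, y_true, y_pred, lamb_wgan):
    """Improved-WGAN-style gradient penalty on interpolated label vectors.

    energy_fn(x, y) is a hypothetical callable returning a per-example energy;
    y_true are gold labels and y_pred are inference-network outputs,
    both of shape [batch_size, num_labels].
    """
    eps = tf.random_uniform([tf.shape(y_true)[0], 1], minval=0.0, maxval=1.0)
    y_interp = eps * y_true + (1.0 - eps) * y_pred      # random interpolation of label vectors
    energy = energy_fn(x, y_interp)
    grads = tf.gradients(energy, [y_interp])[0]         # d(energy)/d(y_interp)
    grad_norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=1) + 1e-12)
    return lamb_wgan * tf.reduce_mean(tf.square(grad_norm - 1.0))
```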
Many thanks to Lifu Tu and Kevin Gimpel (authors of the paper we have implemented) for sharing their Theano code and responding promptly to our queries about the paper. We also thank David Belanger for the Bookmarks dataset and his original SPEN implementation.
We also thank the authors of the Tensorflow project template https://github.com/MrGemy95/Tensorflow-Project-Template, which served as a starting point for our project.
Lifu Tu and Kevin Gimpel. Learning approximate inference networks for structured prediction. CoRR, abs/1803.03376, 2018. URL http://arxiv.org/abs/1803.03376