TrainDSMs

Background

Research code. Under active development.

Usage

The user defines multiple jobs (e.g. which DSM to train on which corpus) and submits each job to one of 8 machines owned by the UIUC Learning & Language Lab. To do so, we use Ludwig, a command line interface for communicating the job submission system. To use Ludwig, you must be a member of the lab.

DSM Architectures

We examined a number of distributional semantic models (DSMs), including:

W2Vec
Simple RNN, LSTM
Transformer
LON, CTN (graphical)

Evaluation

Currently, we are using the MissingAdjunct corpus to evaluate the ability of models to infer a missing instrument. This ability requires compositional generalization, given that the model has never seen the correct answer during training, but is provided all the components to make the correct (i.e. structurally licensed) inference.

We assign a hit every time a model predicts the structurally licensed instrument, given a verb phrase (VP).

There are many conditions, such as verb type, theme type, etc.

All evaluations are pooled across model replications (random seed used to sample from the corpus), and items (VPs) of the same type.

Advanced

If you update the training data, e.g. MissingAdjunct, make sure to move the data to the file server:

ludwig -r10 -e ../MissingAdjunct/missingadjunct ../MissingAdjunct/items

Compatibility

Developed using Python 3.7.9 on Ubuntu 18.04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

TrainDSMs

Background

Usage

DSM Architectures

Evaluation

Advanced

Compatibility

Files

README.md

Latest commit

History

README.md

File metadata and controls

TrainDSMs

Background

Usage

DSM Architectures

Evaluation

Advanced

Compatibility