Fixed readme.
Bryan Plummer committed Feb 3, 2018
1 parent 4e5ff1f commit e711008
Showing 2 changed files with 3 additions and 3 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -14,14 +14,14 @@ This code was tested on an Ubuntu 16.04 system using Tensorflow 1.2.1.
### Phrase Localization Evaluation Demo
After you download our precomputed features/model you can test it using:

-python main.py --test --spatial --name runs/cite_spatial_k4/model_best
+python main.py --test --spatial --resume runs/cite_spatial_k4/model_best

### Training New Models
Our code contains everything required to train or test models using precomputed features. You can train a new model on Flickr30K Entities using:

python main.py --name <name of experiment>

-When it completes training it will output the localization accuracy using the best model on the testing and validation sets. You can see a listing and description of many tuneable parameters with:
+When it completes training it will output the localization accuracy using the best model on the testing and validation sets. Note that the above does not use the spatial features we used in our paper (needs the `--spatial` flag). You can see a listing and description of many tuneable parameters with:

python main.py --help

2 changes: 1 addition & 1 deletion data_processing_example/README.md
@@ -6,7 +6,7 @@ The code currently assumes datasets are divided into three hdf5 files named `<sp
2. phrases: array of #num_phrase strings corresponding to the phrase features
3. pairs: 3 x M matrix where each column contains a string representation for the `[image name, phrase, pair identifier]` pairs in the split.
4. Each `<image name>` should return a #num_boxes x feature_dimensional matrix of the visual features. The features should contain the visual representation as well as the spatial features for the box followed by its coordinates (i.e. the precomputed features we released are 4096 (VGG) + 5 (spatial) + 4 (box coordinates) = 4105 dimensional).
-5. Each `<image name>_<phrase>_<pair identifier> should contain a vector containing the intersection over union with the ground truth box followed by the box's coordinates (i.e. for N boxes the vector should be N + 4 dimensional).
+5. Each `<image name>_<phrase>_<pair identifier>` should contain a vector containing the intersection over union with the ground truth box followed by the box's coordinates (i.e. for N boxes the vector should be N + 4 dimensional).
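The dimensionalities described above can be sketched with numpy (a minimal illustration only, assuming 3 boxes; the variable names and zero-valued features are hypothetical, not produced by the repo's code):

```python
import numpy as np

num_boxes = 3
# Per-image feature matrix: visual representation, then spatial
# features, then box coordinates (4096 + 5 + 4 = 4105 dimensional).
vgg = np.zeros((num_boxes, 4096), dtype=np.float32)    # VGG visual features
spatial = np.zeros((num_boxes, 5), dtype=np.float32)   # spatial features
coords = np.zeros((num_boxes, 4), dtype=np.float32)    # box coordinates
image_feats = np.concatenate([vgg, spatial, coords], axis=1)
print(image_feats.shape)  # (3, 4105)

# Per-pair vector: IoU with the ground truth box for each of the
# N boxes, followed by the ground truth box's 4 coordinates.
iou = np.zeros(num_boxes, dtype=np.float32)
gt_coords = np.zeros(4, dtype=np.float32)
pair_vec = np.concatenate([iou, gt_coords])
print(pair_vec.shape)  # (7,)
```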


The example script uses the [pl-clc](https://github.com/BryanPlummer/pl-clc) repo for parsing and computing features of the Flickr30K Entities dataset. It assumes the built-in MATLAB PCA function is used, not the one in the `toolbox` external module.
