Fixed readme.
Bryan Plummer committed Feb 3, 2018
1 parent 4e5ff1f commit e711008
Showing 2 changed files with 3 additions and 3 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -14,14 +14,14 @@ This code was tested on an Ubuntu 16.04 system using Tensorflow 1.2.1.
### Phrase Localization Evaluation Demo
After you download our precomputed features/model you can test it using:

-python main.py --test --spatial --name runs/cite_spatial_k4/model_best
+python main.py --test --spatial --resume runs/cite_spatial_k4/model_best

### Training New Models
Our code contains everything required to train or test models using precomputed features. You can train a new model on Flickr30K Entities using:

python main.py --name <name of experiment>

-When it completes training it will output the localization accuracy using the best model on the testing and validation sets. You can see a listing and description of many tuneable parameters with:
+When it completes training it will output the localization accuracy using the best model on the testing and validation sets. Note that the above does not use the spatial features we used in our paper (needs the `--spatial` flag). You can see a listing and description of many tuneable parameters with:

python main.py --help

2 changes: 1 addition & 1 deletion data_processing_example/README.md
@@ -6,7 +6,7 @@ The code currently assumes datasets are divided into three hdf5 files named `<sp
2. phrases: array of #num_phrase strings corresponding to the phrase features
3. pairs: 3 x M matrix where each column contains a string representation for the `[image name, phrase, pair identifier]` pairs in the split.
4. Each `<image name>` should return a #num_boxes x feature_dimensional matrix of the visual features. The features should contain the visual representation as well as the spatial features for the box followed by its coordinates (i.e. the precomputed features we released are 4096 (VGG) + 5 (spatial) + 4 (box coordinates) = 4105 dimensional).
-5. Each `<image name>_<phrase>_<pair identifier> should contain a vector containing the intersection over union with the ground truth box followed by the box's coordinates (i.e. for N boxes the vector should be N + 4 dimensional).
+5. Each `<image name>_<phrase>_<pair identifier>` should contain a vector containing the intersection over union with the ground truth box followed by the box's coordinates (i.e. for N boxes the vector should be N + 4 dimensional).
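The dimensionalities described above can be sketched with numpy (a minimal illustration only, assuming 3 boxes; the variable names and zero-valued features are hypothetical, not produced by the repo's code):

```python
import numpy as np

num_boxes = 3
# Per-image feature matrix: visual representation, then spatial
# features, then box coordinates (4096 + 5 + 4 = 4105 dimensional).
vgg = np.zeros((num_boxes, 4096), dtype=np.float32)    # VGG visual features
spatial = np.zeros((num_boxes, 5), dtype=np.float32)   # spatial features
coords = np.zeros((num_boxes, 4), dtype=np.float32)    # box coordinates
image_feats = np.concatenate([vgg, spatial, coords], axis=1)
print(image_feats.shape)  # (3, 4105)

# Per-pair vector: IoU with the ground truth box for each of the
# N boxes, followed by the ground truth box's 4 coordinates.
iou = np.zeros(num_boxes, dtype=np.float32)
gt_coords = np.zeros(4, dtype=np.float32)
pair_vec = np.concatenate([iou, gt_coords])
print(pair_vec.shape)  # (7,)
```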


The example script uses the [pl-clc](https://github.com/BryanPlummer/pl-clc) repo for parsing and computing features of the Flickr30K Entities dataset. It assumes the built-in MATLAB PCA function is used, not the one in the `toolbox` external module.
