Training and evaluating your own dataset
In this Wiki, we will explore how you can train and run inference on your own custom dataset. There are a number of options available in LENS to tune a model to best fit your data. LENS can be used with a range of DVS cameras, which simply need a few parameters modified.
Before starting, please ensure you have the repo downloaded and the necessary requirements installed.
# Get the repo
git clone git@github.com:AdamDHines/LENS.git
# Install dependencies (run from the repo root)
cd LENS
pip install -r requirements.txt
Our model trains on event frames, which count the number of events detected over a certain time window. The time window you select should be based on your data and the kind of event frame representation you wish to create. It is best to ensure that you are counting a sufficient number of events per pixel; an average of 50-100 events per pixel is usually a good starting point.
Counting the events and generating an event frame depends on the DVS camera you have, so while we cannot provide a generic script to do this, please take a look at the example in ./lens/tools/manual_eventframe_generator.py. This should give you an idea of how to modify it based on your camera dimensions and the type of event data that is output.
All we are doing in manual_eventframe_generator.py is looping through each event collected from our SPECK™ and mapping the x and y coordinates onto an empty data tensor of zeros, incrementing (+= 1) that coordinate for each event. In our case, we consider both positive and negative events, but you could easily filter by polarity using this method. At the moment, only square images are usable in LENS (i.e. the height and width are the same) due to the 2D convolutional filtering used later on.
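As a rough illustration, here is a minimal sketch of that accumulation loop, assuming your camera's SDK yields events as (x, y, polarity, timestamp) tuples; the events list below is a hypothetical stand-in for one time window of real data:

import numpy as np
from PIL import Image

WIDTH = HEIGHT = 80  # LENS currently requires square frames

# Hypothetical events for illustration: (x, y, polarity, timestamp)
events = [(10, 12, 1, 0.001), (10, 12, -1, 0.002), (40, 40, 1, 0.003)]

frame = np.zeros((HEIGHT, WIDTH), dtype=np.uint32)
for x, y, polarity, timestamp in events:
    # Count both positive and negative events; filter on polarity here
    # if you only want one
    frame[y, x] += 1

# Scale the counts to 8-bit grayscale and save as .png
frame = (255 * frame / max(int(frame.max()), 1)).astype(np.uint8)
Image.fromarray(frame).save("frame_0000.png")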
If you create a script for a specific camera and would like to contribute it for others to use, please submit a pull request.
Generated event frames should be in 8-bit grayscale .png format and stored in the ./lens/dataset/ folder. Please follow the directory convention as shown below:
--dataset
|--dataset1
|--camera1
|experiment001
|experiment002
|--camera2
|experiment003
|experiment004
By following this convention, a single dataset evaluated using different DVS cameras can be neatly organized and easily referenced by LENS.
Once you have created your dataset, you can generate the necessary .csv file for the torch model by using the ./lens/tools/create_data_csv.py script. Simply modify the path to your dataset and it will automatically create the file for you.
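For reference, a minimal sketch of the same idea is below; the header row is illustrative only, since the exact columns the torch model expects are defined in create_data_csv.py:

import csv
from pathlib import Path

# Point this at your own dataset subfolder
dataset_dir = Path("./lens/dataset/office-building/davis346/traverse001")

with open("office-building.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # Illustrative header; check create_data_csv.py for the exact columns
    writer.writerow(["image_name"])
    for image in sorted(dataset_dir.glob("*.png")):
        writer.writerow([image.name])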
To train your model, starting with the default hyperparameters, this is the base of what we pass in a console:
# Ensure you are in the LENS folder
cd LENS
# Train a model with the default hyperparameters
python main.py \
--train_model \
--dataset <your dataset> \
--camera <your camera> \
--reference <your reference experiment> \
--reference_places <number of reference images>
Modify the dataset, camera, reference, and reference_places arguments to the relevant names for the images you just created. For example, let's say we created a new dataset called "office-building" using a DAVIS346, and after running an experiment we generated 100 images. The data would be stored as follows:
--dataset
|--office-building
|--davis346
|--traverse001
To train this dataset, we would pass the following in a terminal:
python main.py \
--train_model \
--dataset office-building \
--camera davis346 \
--reference traverse001 \
--reference_places 100
However, there is a parameter we would need to modify first. Our model used 80x80 images, but this might be too big or too small for your data. We need to pass in another parameter, --dims, which we can increase or decrease based on our image size. For our DAVIS346 example, let's say we generate frames that are 20x20:
python main.py \
--train_model \
--dataset office-building \
--camera davis346 \
--reference traverse001 \
--reference_places 100 \
--dims 20
This will modify the network architecture, changing the number of input neurons as well as the 2D convolutional layer that selects out pixels within these images.
We can also further modify the network architecture by changing the size of the Feature layer via the --feature_multiplier argument. This float value sets how large our Feature layer is relative to the Input layer size:
python main.py \
--train_model \
--dataset office-building \
--camera davis346 \
--reference traverse001 \
--reference_places 100 \
--dims 20 \
--feature_multiplier 4.0
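To make the scaling concrete, here is a back-of-the-envelope calculation, assuming one input neuron per pixel and a Feature layer sized as the Input layer times the multiplier:

dims = 20
feature_multiplier = 4.0

input_neurons = dims * dims  # one neuron per pixel: 20x20 -> 400
feature_neurons = int(input_neurons * feature_multiplier)  # 400 * 4.0 -> 1600

print(input_neurons, feature_neurons)  # 400 1600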
The preset hyperparameters in LENS were set for our specific data and will likely not be ideal for your datasets, so we recommend tuning your model using our optimizer.py script. Please see the Setting up and using the optimizer wiki for more information on how to use this tool; it also explains how to modify and pass different hyperparameters in your model.
Once a model has been trained, we can quickly evaluate visually whether or not the features in the images were learnt properly. The easiest way to do this is to run your training data through the evaluation to check that it robustly learnt each unique place. To run the inference model, we need to pass the reference information so that it loads the right model, and specify which dataset we'd like to run inference on using --query and --query_places.
To visually evaluate the training, we can pass --sim_mat, which will generate a similarity matrix; when the query is the same as the reference, a strong diagonal indicates that each place was learnt distinctly:
python main.py \
--dataset office-building \
--camera davis346 \
--reference traverse001 \
--reference_places 100 \
--dims 20 \
--query traverse001 \
--query_places 100 \
--sim_mat
If satisfied, you can go on to use this model to run inference on any experiment you'd like. The generated model is named only by the reference experiment, in this case traverse001, so it is also possible to evaluate results against another camera or dataset.
We also provide a way to evaluate the Recall@N and Precision-Recall of your dataset using your own ground truth (GT) data. The GT should be a .npy file containing a binary np.array matrix in which each corresponding reference-to-query match == 1 and everything else == 0. The GT file should be named <reference>_<query>_GT.npy, where reference is the name of your reference experiment and query is your query experiment. Store this file in your dataset subfolder:
--dataset
|--office-building
|--davis346
traverse001_traverse002_GT.npy
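As a sketch of how such a matrix could be built, assuming rows index reference places, columns index query places, and query image i corresponds to reference place i (a simple repeated traverse):

import numpy as np

reference_places = 100
query_places = 100

# Binary GT matrix; adapt the indexing if your query traverse visits
# places in a different order to the reference
GT = np.zeros((reference_places, query_places), dtype=np.uint8)
np.fill_diagonal(GT, 1)

np.save("./lens/dataset/office-building/davis346/traverse001_traverse002_GT.npy", GT)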
To use this file for matching during inference, pass the --matching flag:
python main.py \
--dataset office-building \
--camera davis346 \
--reference traverse001 \
--reference_places 100 \
--dims 20 \
--query traverse002 \
--query_places 100 \
--sim_mat \
--matching
We can also generate a Precision-Recall curve and its data, output as a .json file, by passing --PR_curve.
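To plot the curve from that output, something like the following should work; the file name and the "precision"/"recall" keys here are assumptions, so inspect the .json LENS writes for the exact names:

import json
import matplotlib.pyplot as plt

# Hypothetical file name and keys; check the actual .json output
with open("PR_curve.json") as f:
    data = json.load(f)

plt.plot(data["recall"], data["precision"])
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.show()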
If you're experiencing issues with training your own datasets, please raise an issue.
Written by Adam Hines ([email protected] - https://www.qut.edu.au/about/our-people/academic-profiles/adam.hines)