Setting up and using the optimizer
This guide will show you how to use `optimizer.py` with your custom dataset to fine-tune model hyperparameters to best fit your data. We assume you have already successfully trained and evaluated your own dataset; if not, please visit the Wiki page to learn how to set up a custom dataset.
The following will be required for this guide:
- A custom dataset with at least one reference and one query set of images has been generated
- A ground truth (GT) `.npy` file is available for your reference and query sets
- An account with Weights & Biases
If you haven't already, set up an account with Weights & Biases and follow their Quickstart instructions for logging in through your Python environment. Once signed in through the API, you no longer need to log in again.
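As a quick reference, logging in can be done either from the shell or from Python using the standard W&B API; this is a minimal sketch, and your API key comes from your Weights & Biases account settings:

```python
# Option 1: log in once from the terminal (you will be prompted for your API key):
#   wandb login

# Option 2: log in from Python
import wandb

# Reads a previously stored key, or prompts for one the first time it is run
wandb.login()
```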
Instead of calling `main.py`, we will use `optimizer.py`. Set up your reference and query datasets as you normally would in a new terminal, excluding flags like `--PR_curve`, `--sim_mat`, and `--matching`:
```
python optimizer.py \
    --dataset <your dataset> \
    --camera <your camera> \
    --reference <your reference images> \
    --reference_places <number of references> \
    --query <your query images> \
    --query_places <number of queries>
```
Also ensure that your `reference_query_GT.npy` file is stored in the correct location:

```
--dataset
  |--office-building
    |--davis346
        reference_query_GT.npy
```
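As a quick sanity check before launching a sweep, you can verify that the ground truth file loads correctly. This is a small sketch, not part of `optimizer.py`, and the path below simply follows the `office-building`/`davis346` example layout above; adapt it to your own dataset and camera names:

```python
import numpy as np

# Hypothetical path following the example layout above; adjust to your dataset/camera
gt_path = "./dataset/office-building/davis346/reference_query_GT.npy"

# Load the ground truth matrix and confirm its shape matches your
# number of reference and query places
GT = np.load(gt_path)
print("Ground truth shape:", GT.shape)
```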
Once this is done, we can look at the hyperparameters that can be tuned.
There are several hyperparameters that can be tuned for your model. We allow users to set different hyperparameters for the Input → Feature and Feature → Output layers for better flexibility in training models. Below is a description of each of them:
- `--epoch_feat` - Number of epochs to train the Input → Feature layer for
- `--epoch_out` - Number of epochs to train the Feature → Output layer for
- `--thr_l_feat` - Lower bound of the firing threshold for the Feature layer
- `--thr_h_feat` - Upper bound of the firing threshold for the Feature layer
- `--fire_l_feat` - Lower bound of the firing rate for the Feature layer
- `--fire_h_feat` - Upper bound of the firing rate for the Feature layer
- `--ip_rate_feat` - Intrinsic plasticity learning rate for the Feature layer
- `--stdp_rate_feat` - STDP learning rate for the Feature layer
- `--thr_l_out` - Lower bound of the firing threshold for the Output layer
- `--thr_h_out` - Upper bound of the firing threshold for the Output layer
- `--fire_l_out` - Lower bound of the firing rate for the Output layer
- `--fire_h_out` - Upper bound of the firing rate for the Output layer
- `--ip_rate_out` - Intrinsic plasticity learning rate for the Output layer
- `--stdp_rate_out` - STDP learning rate for the Output layer
- `--f_exc` - Excitatory connection probability between the Input → Feature layers
- `--f_inh` - Inhibitory connection probability between the Input → Feature layers
- `--o_exc` - Excitatory connection probability between the Feature → Output layers
- `--o_inh` - Inhibitory connection probability between the Feature → Output layers
All of the hyperparameters are `float` values in the range [0, 1], except for the epoch arguments (`--epoch_feat`, `--epoch_out`), which are integers.
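For example, a run that fixes a few of these values from the command line might look like the following, assuming the flags listed above are accepted by `optimizer.py` as regular command-line arguments; the specific numbers here are purely illustrative, not recommended settings:

```
python optimizer.py \
    --dataset <your dataset> \
    --camera <your camera> \
    --reference <your reference images> \
    --reference_places <number of references> \
    --query <your query images> \
    --query_places <number of queries> \
    --epoch_feat 4 \
    --thr_l_feat 0 \
    --f_exc 0.1 \
    --f_inh 0.5
```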
Whilst there are no hard and fast rules for what the hyperparameters should be set to, there are some general good practice starting points to help you tune them.
- Inhibitory connection probabilities should generally be higher than excitatory ones, if not fully connected
- For the Output layer, keeping the firing rate and firing threshold at or around 0.5 is ideal, since the delta learning rule is based on this spike amplitude
- A lower number of epochs for your Input → Feature layer can sometimes be helpful so as not to overfit the reference data
- Keep the lower threshold bounds (`--thr_l_feat`, `--thr_l_out`) at 0
If in doubt, submit an issue and we can help point you in the right direction. Alternatively, you can keep the hyperparameters at their defaults and modify them one at a time to see what most influences your network's behavior.
To set up your model to iterate over different hyperparameters with Wandb, you need to define the parameters in a dictionary inside the `optimizer.py` script. The dictionary to edit is the `parameters_dict` variable, found at line 64:
```python
parameters_dict = {
    'fire_l_feat': {'values': [0.1, 0.2, 0.3]},
    'fire_h_feat': {'values': [0.6, 0.7, 0.8]},
    'thr_h_feat': {'values': [0.3, 0.4, 0.5]},
}
```
Here, we're looking at how the firing rate and threshold affect model performance between the Input and Feature layers. Each `'values'` entry must be a `list` of values. You can also easily set up a range of values to test over by using `np.linspace`:
```python
parameters_dict = {
    'fire_l_feat': {'values': list(np.linspace(0.1, 0.49, 3))},
    'fire_h_feat': {'values': list(np.linspace(0.5, 1.0, 3))},
    'thr_h_feat': {'values': list(np.linspace(0.1, 0.5, 3))},
}
```
Just note that there is a limit to how many hyperparameter combinations Wandb can sweep over, so it is best not to overload it with every single hyperparameter at once. A good strategy is to focus on tuning one layer's parameters at a time to keep the total number of combinations down.
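As a quick way to gauge the size of a sweep before launching it, you can count how many combinations a grid search over your `parameters_dict` would produce; this is a small sketch, not part of `optimizer.py`:

```python
import math

# Example parameters_dict from above
parameters_dict = {
    'fire_l_feat': {'values': [0.1, 0.2, 0.3]},
    'fire_h_feat': {'values': [0.6, 0.7, 0.8]},
    'thr_h_feat': {'values': [0.3, 0.4, 0.5]},
}

# Total number of combinations a grid search would run
n_combinations = math.prod(len(v['values']) for v in parameters_dict.values())
print(f"Grid search would run {n_combinations} combinations")  # 3 * 3 * 3 = 27
```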
Now that we've set the parameters and values we want to iterate over, we have to set the `config` for Wandb as follows in the `wandsearch()` function:
```python
# Initialize w&b search
def wandsearch(config=None):
    with wandb.init(config=config):
        # Initialize config
        config = wandb.config

        # Set arguments for the sweep (modify based on what you want to search)
        args.fire_l_feat = config.fire_l_feat
        args.fire_h_feat = config.fire_h_feat
        args.thr_h_feat = config.thr_h_feat
```
On each iteration, Wandb trains and tests a new model, setting the model's `args` to one of the value combinations defined in `parameters_dict`.
And that is it. All that's left to do is let it run and follow the link printed in your terminal to the web-based interface to track and monitor how your model is performing!
By default, we set the search algorithm to `random`, but if you want to use a grid search instead you can modify the `sweep_config` variable:
```python
# For random searching (default)
sweep_config = {'method': 'random'}

# For grid searching
sweep_config = {'method': 'grid'}
```
It's also a good idea to set a unique project name in the `sweep_id` variable to separate out your optimizer sweeps:
```python
# Start sweep controller
sweep_id = wandb.sweep(sweep_config, project="random-sweep-001")
```
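For reference, once the sweep controller is created, the sweep runs by handing the `wandsearch` function to a W&B agent. This is a minimal sketch of that standard pattern; the `count` value is just an illustrative cap on the number of runs:

```python
# The agent repeatedly calls wandsearch() with a new set of
# hyperparameters from parameters_dict on each run
wandb.agent(sweep_id, function=wandsearch, count=27)
```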
With the provided GT file, we run a Recall@N evaluation with N = [1, 5, 10, 15, 20, 25]. The Recall values are returned from the inference model, and we compute the area under the curve (AUC) using `np.trapz` to get a single value that evaluates the overall performance of the model. Wandb will plot the best model performance and the corresponding hyperparameters used to achieve it in the web-based viewer.
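As an illustration of this metric, the AUC over the Recall@N curve can be computed as follows; the `R_all` values shown here are made-up numbers purely for demonstration:

```python
import numpy as np

# N values used for the Recall@N evaluation
N = [1, 5, 10, 15, 20, 25]

# Hypothetical recall values returned from inference (one per N)
R_all = [0.62, 0.78, 0.85, 0.89, 0.92, 0.94]

# Area under the Recall@N curve, used as the value the sweep maximizes
auc = np.trapz(R_all, N)
print("AUC:", auc)
```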
However, if you want to optimize for a stricter measure of performance, you can always modify the `metric`. For example, if we wanted to maximize Recall@1, we could just use the first indexed value from the `R_all` variable that is returned from the inference model:
```python
# Modify the metric variable for Recall@1
metric = {'name': 'R1', 'goal': 'maximize'}

# Change the wandb logging
wandb.log({"R1": R_all[0]})
print("R1: ", R_all[0])
```
If you're experiencing issues with using the optimizer, please raise an issue.
Written by Adam Hines ([email protected] - https://www.qut.edu.au/about/our-people/academic-profiles/adam.hines)