Goal: solve Witness puzzles using Levin Tree Search (LTS) in a way that generates a curriculum of puzzles. We compare the ordering of puzzles to Jonathan Blow's hand-crafted curriculum.
For more information on LTS: https://arxiv.org/abs/1811.10928
Overview: The LTS solver is parameterized by a neural network (NN) whose initial weights are randomly sampled.
The program runs a number of iterations over the dataset of puzzles.
In each iteration, we invoke LTS to solve as many puzzles as possible within the current search budget.
Each time we solve a puzzle, we remove it from the set.
If at least one puzzle is solved in the current iteration, we train the NN on the set of solved puzzles.
Otherwise, if no puzzles are solved in the current iteration, we double the budget and try again.
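For concreteness, here is a minimal Python sketch of this bootstrap loop. It is illustrative only, not the actual `bootstrap.py` implementation; `solve`, `train`, and `snapshot` are hypothetical callables standing in for the LTS solver, the NN update, and a weight-snapshot routine.

```python
# Illustrative sketch of the bootstrap loop (not the real src/bootstrap.py).
# `solve`, `train`, and `snapshot` are hypothetical callables.
def bootstrap(puzzles, solve, train, snapshot, budget, max_iterations):
    unsolved = set(puzzles)
    weights_per_iteration = []          # NN weights recorded after every iteration
    for _ in range(max_iterations):
        if not unsolved:
            break
        # Try to solve every remaining puzzle under the current search budget.
        solved_now = {p for p in unsolved if solve(p, budget)}
        if solved_now:
            train(solved_now)           # train the NN on the puzzles solved this iteration
            unsolved -= solved_now
        else:
            budget *= 2                 # nothing solved: double the budget and try again
        weights_per_iteration.append(snapshot())
    return weights_per_iteration
```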
To obtain an ordering, we compute the cosine similarity and dot product between the NN weights at each iteration and the final NN weights.
Once all the iterations are done, we look at the cosine and dot-product measurements.
If the cosines or dot products for an iteration are very high, it means that the iteration's set of solved puzzles aligns with the final NN weights (which can solve all puzzles).
Therefore, that iteration's puzzles are important and should be included in the curriculum.
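The snippet below sketches how these measurements can be made; it is an illustration, not the actual `compute_cosines.py`. Weight snapshots are assumed to be lists of NumPy arrays.

```python
import numpy as np

def flatten(weights):
    """Concatenate a list of weight arrays into a single 1-D vector."""
    return np.concatenate([w.ravel() for w in weights])

def similarity(iter_weights, final_weights):
    """Cosine similarity and dot product between two weight snapshots."""
    a, b = flatten(iter_weights), flatten(final_weights)
    dot = float(a @ b)
    cosine = dot / (np.linalg.norm(a) * np.linalg.norm(b))
    return cosine, dot

def order_iterations(weights_per_iteration, final_weights):
    """One possible ordering: rank iterations by how strongly their weights
    align (cosine similarity) with the final NN weights."""
    cosines = [similarity(w, final_weights)[0] for w in weights_per_iteration]
    return sorted(range(len(cosines)), key=lambda i: cosines[i], reverse=True)
```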
Experiments are conducted as follows:
- Run the program once and store the NN weights after every iteration of the algorithm.
- Rerun the program a second time and compute the metrics needed for evaluation (cosine similarity, dot products, etc.).
- Evaluate and plot the results.
Case I: working from a Compute Canada remote machine:
1. Log into Compute Canada:
   ssh cedar   # log into the Cedar cluster
2. To submit your experiment, run:
   sbatch ./FD_get_parameters_BFS.sh
   Important: If you are submitting the first round of the experiment, make sure that `FD_get_parameters_BFS.sh` does not have the `--load_debug_data` flag. Otherwise (i.e., for the second round), add the `--load_debug_data` flag.
3. In `FD_get_parameters_BFS.sh`, change `source ~/my_env/bin/activate` to activate your own virtual environment.
4. Once the jobs have finished, you will find the results in the folders `solved_puzzles/` and `logs_large/`.
5. To plot the results, `cd` into `solved_puzzles/` and run `open_pkl_files_Batch_data.py`.
   If `./FD_get_parameters_BFS.sh` was not run with the default arguments, you will need to pass the corresponding arguments to `open_pkl_files_Batch_data.py`; the flags and their descriptions are provided in `open_pkl_files_Batch_data.py`. When you run this script, it prints a line saying "The plots are found in ...", telling you where the plots were saved.
Case II: You are working locally.
1. Instead of using sbatch to submit a job, you can run `main.py` directly:
   python main.py --learn
   Important: If you are running the first round of the experiment, do not add the `--load_debug_data` flag. Otherwise (i.e., for the second round), add the `--load_debug_data` flag.
   This runs the experiment with the default arguments. If you wish to select different arguments, see `src/parameter_parser.py`, where the parameter names, flags, and descriptions are listed.
2. Same as step #4 above.
3. Same as step #5 above.
If using Compute Canada, go to `./FD_get_parameters_BFS.sh` and set the following variables for your experiment:
- `loss="CrossEntropyLoss"`: the loss function that you want to use.
- `algorithm="Levin"`: the search algorithm ("Levin" stands for Levin Tree Search).
- `domain_name="Witness"`: the domain of the puzzles.
- `problems_dir="problems/witness/puzzles_4x4/"`: the directory where the puzzles are stored.
- `size_puzzle="4x4"`: the dimensions of the puzzles.
- `output="output_test_witness_4x4/"`: the directory where the outputs generated by sbatch are saved (generally `.out` files).

You can also set other variables, such as `search_budget`, `gradient_steps`, etc. For the sake of my preliminary experiments, I kept these fixed.
This directory (`src/`) contains the most important code that you need to run:
- `main.py`: parses the parameters that you decided on and runs the experiment.
- `parameter_parser.py`: parses the parameters you decided on.
- `bootstrap.py`: the main algorithm; it iterates through all the puzzles found in `problems_dir/` and calls the LTS solver. Cosine and dot-product data are computed and stored in `solved_puzzles/`.
- `bootstrap_no_debug_data.py`: the main algorithm without debug data; it iterates through all the puzzles found in `problems_dir/` and calls the LTS solver, but no cosine or dot-product measurements are computed.
- `compute_cosines.py`: script that computes the cosines, dot products, and any other measurements that we wish to record.
- `game_state.py`: script that performs all the operations needed to determine the state of the game.
Other directories in the repository:
- `solved_puzzles/`: contains all the measurements made: cosines, dot products, ordering of puzzles, etc.
- The model-weights directory: contains the NN weights.
- `problems/`: contains the folders that store the puzzles. If you wish to run a small toy experiment, I suggest using `puzzles_small/`.
- `logs_large/`: stores logging information on the number of puzzles solved in each iteration, the current LTS budget, the NN loss, etc.