# Task Planning with Pretrained Prior

Consider the problem of determining the next action $a_t$ of an agent trying to accomplish a goal $g$, given the current and previous observations $o_{1:t}$ as well as the previous actions $a_{1:t-1}$:

$$p(a_t|g, o_{1:t}, a_{1:t-1})$$

We can factor out a prior by applying Bayes' rule (dropping the normalizing constant, which does not depend on $a_t$):

$$p(a_t|g, o_{1:t}, a_{1:t-1}) \propto p(g, o_{1:t-1}|a_{1:t-1}, a_t, o_t)\,p(a_t|a_{1:t-1}, o_t)$$

where the first term is the likelihood and the second is the prior, which can be a pretrained language model if $a_{1:t-1}$ and $o_t$ are encoded as text. Observe that the likelihood is discriminative, i.e. it is trained on binary labels and outputs the probability that $a_t$ is the correct action. Assume that all goals can be unified relatively well under the same prior, so that training data for fine-tuning the prior can be aggregated from multiple goals. We can then sample actions as follows: take the top $k$ choices from the prior and reweight them using the likelihood, as sketched below.
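A minimal sketch of this sampling scheme in PyTorch, assuming hypothetical `prior_model` and `likelihood_model` wrappers (a prior that scores a discrete action vocabulary given the text-encoded history, and a classifier that returns the probability an action is correct):

```python
import torch

def sample_action(prior_model, likelihood_model, context, k=5):
    """Sample the next action: take the top-k actions under the prior,
    then reweight each candidate by the likelihood."""
    # Prior distribution over the action vocabulary given the history.
    prior_logits = prior_model(context)               # shape: (num_actions,)
    prior_probs = torch.softmax(prior_logits, dim=-1)

    # Keep only the top-k candidate actions from the prior.
    topk_probs, topk_actions = prior_probs.topk(k)

    # Likelihood: probability that each candidate is the correct action
    # (each call is assumed to return a scalar tensor).
    lik = torch.stack([likelihood_model(context, a) for a in topk_actions])

    # Reweight: posterior over candidates is proportional to likelihood * prior.
    posterior = lik * topk_probs
    posterior = posterior / posterior.sum()

    # Sample an action from the reweighted distribution.
    idx = torch.multinomial(posterior, num_samples=1).item()
    return topk_actions[idx].item()
```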

In our case, we use BabyAI as the environment. The prior is GPT2 fine-tuned on many goals, with $a_{1:t-1}$ and $o_t$ represented as text. The likelihood is a classifier that takes BabyAI's 7x7 grid, the aggregated actions and observations, and the goal as input. We generate negative samples by changing the last action of a goal's action sequence, which can be viewed as analogous to teacher forcing; see the sketch below.
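A hypothetical sketch of this negative-sampling step, assuming trajectories are stored as dicts with an `actions` list of integer action IDs (BabyAI's standard action space has 7 discrete actions):

```python
import random

NUM_ACTIONS = 7  # BabyAI's discrete action space

def make_negative(trajectory):
    """Build a negative example from a positive trajectory by replacing
    the final action with a different, incorrect one."""
    neg = dict(trajectory)                      # shallow copy; goal/observations unchanged
    neg['actions'] = list(trajectory['actions'])
    last = neg['actions'][-1]
    # Sample a wrong action uniformly from the remaining choices.
    neg['actions'][-1] = random.choice(
        [a for a in range(NUM_ACTIONS) if a != last])
    return neg
```

The classifier is then trained with label 1 on the original trajectories and label 0 on these perturbed copies.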

The prior achieves decent accuracy (~70%), and each likelihood achieves very high accuracy (~93%) with little data (a few hundred examples) trained only on its given goal.

View detailed experiment logs here.
