Does Language Affect Reasoning Tasks in Robotic Navigation?

This is the repository of our AI701 course project at MBZUAI.

This repository explores the role of language in robotic navigation tasks, focusing on reasoning and action planning using advanced language models. It examines the performance of Arabic and English in Vision-Language Navigation (VLN) tasks.

Prerequisites

Microsoft Azure account

A Microsoft Azure account is required to set up and run the experiments in this project. Once you have set up your account, visit the Azure AI portal and deploy your model.

You will need the endpoint URL and API key for your deployed model. Add them to your environment variables:


  # Set your endpoint URL and API key (Linux)
  export ENDPOINT_URL="{ENDPOINT-URL}"
  export API_KEY="{API-KEY}"

  # Set your endpoint URL and API key (Windows cmd)
  # No quotes here: cmd would store them as part of the value
  set ENDPOINT_URL={ENDPOINT-URL}
  set API_KEY={API-KEY}
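
Before launching a run, it can help to confirm that both variables are visible to Python. The following is a minimal sketch, not the repository's own loading code: the function name load_azure_credentials is hypothetical, and the only names taken from this README are ENDPOINT_URL and API_KEY.

```python
import os


def load_azure_credentials(env=os.environ):
    """Return (endpoint_url, api_key), raising if either variable is unset.

    Hypothetical helper for illustration; the project code may read
    these variables differently.
    """
    missing = [name for name in ("ENDPOINT_URL", "API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Drop stray double quotes, which Windows cmd keeps when the value
    # is written as set NAME="value".
    return env["ENDPOINT_URL"].strip('"'), env["API_KEY"].strip('"')
```

Calling load_azure_credentials() with no arguments reads the current process environment; passing a plain dict makes the check easy to try without touching real credentials.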

Installation

Run the following to set up the conda environment and install the requirements:


  conda create --name NavGPT python=3.9
  conda activate NavGPT
  pip install -r requirements.txt

Usage

To run the experiments, use the command below with these placeholders:

{model_name}: your deployed model name. Use "custom-gpt" for any OpenAI model.
{output_folder_of_model}: the name of the folder where the output results are saved.
{number_of_trajectories}: the number of trajectories that the robot will take from the map.
{--translated}: pass --translated to use the Arabic-translated dataset; omit it to use the English dataset.


  cd nav_src
  python NavGPT.py --llm_model_name {model_name} \
    --output_dir ../datasets/R2R/exprs/{output_folder_of_model} \
    --val_env_name R2R_val_unseen_instr \
    --iters {number_of_trajectories} {--translated}
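
The placeholders map onto the command line as sketched here. This is an illustrative helper, not part of the repository: build_navgpt_command and its parameter names are hypothetical, while the flags and the ../datasets/R2R/exprs/ prefix are copied from the command above. The point to note is that --translated is a bare switch that is either present or absent.

```python
import shlex


def build_navgpt_command(model_name, output_folder, iters, translated=False,
                         val_env_name="R2R_val_unseen_instr"):
    """Assemble the NavGPT.py argument list (hypothetical helper)."""
    args = [
        "python", "NavGPT.py",
        "--llm_model_name", model_name,
        "--output_dir", f"../datasets/R2R/exprs/{output_folder}",
        "--val_env_name", val_env_name,
        "--iters", str(iters),
    ]
    if translated:
        # Bare switch: append it for the Arabic dataset, leave it out for English.
        args.append("--translated")
    return args


# Print a copy-pasteable command; run it from inside nav_src.
print(shlex.join(build_navgpt_command("custom-gpt", "gpt_en", 100)))
```

The returned list can also be passed directly to subprocess.run(..., cwd="nav_src") if you prefer to launch runs from a script. The output folder name gpt_en is only an example value.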

Here is an example experiment with the Llama 3 8B model on the Arabic dataset. It uses R2R_val_unseen_instr_100, the shortened version of the annotations directory that contains the translated scenes for inference:


  cd nav_src
  python NavGPT.py --llm_model_name custom-llama_3_8B \
    --output_dir ../datasets/R2R/exprs/llama_3_8B_ar \
    --val_env_name R2R_val_unseen_instr_100 \
    --iters 100 --translated

Experiments

For our paper, we ran the following experiments to make consistent comparisons:

Experiment Name	LLM (API access from https://ai.azure.com/)	Dataset
custom-gpt	GPT-4o mini	English and Arabic
custom-llama_3_8B	Llama 3 8B Instruct	English and Arabic
custom-phi	Phi medium 14B Instruct 128K	English and Arabic
custom-jais	Jais 30B	English and Arabic
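
The table amounts to a grid of four models by two datasets, or eight runs. The sketch below enumerates that grid; it is an illustration under stated assumptions, not the script used for the paper. The model names come from the table, while the language tags "en" and "ar" and the function name experiment_grid are hypothetical.

```python
from itertools import product

# Values for --llm_model_name, taken from the table above.
MODELS = ["custom-gpt", "custom-llama_3_8B", "custom-phi", "custom-jais"]

# Hypothetical language tag -> whether --translated should be passed.
DATASETS = {"en": False, "ar": True}


def experiment_grid():
    """Return one dict per (model, dataset) combination."""
    runs = []
    for model, (lang, translated) in product(MODELS, DATASETS.items()):
        runs.append({"llm_model_name": model,
                     "language": lang,
                     "translated": translated})
    return runs


for run in experiment_grid():
    flag = " --translated" if run["translated"] else ""
    print(f'{run["llm_model_name"]} [{run["language"]}]{flag}')
```

Each entry can then be turned into a NavGPT.py invocation with the same --iters and --val_env_name values for every run, which keeps the comparison between languages consistent.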

Acknowledgement

A large part of the code is adapted from NavGPT. Many thanks to its authors for their wonderful work.

baheytharwat/Language_In_RobNav

Folders and files

Latest commit

History

Repository files navigation

Does Language Affect Reasoning Tasks in Robotic Navigation?

Prerequisites

Microsoft Azure account

Installation

Usage

Experiments

Acknowledgement

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages