This repository contains the `R` code related to the study "Bayesian Optimization for breeding"

ut-biomet/bayesianOptimizationForBreeding

Bayesian optimization of breeding schemes

Introduction

This repository contains the R code related to the study "Bayesian Optimization for breeding", submitted to Frontiers in Plant Science by DIOT Julien and IWATA Hiroyoshi. The study presents the use of Bayesian optimization to optimize breeding schemes through breeding simulations, using the R packages breedSimulatR (breeding simulation) and mlrMBO (Bayesian optimization).

This README is intended to help you reproduce the results or adapt the method to a different use case.

Repository structure

  • runRepeatOpt.Rmd: Main code running repeated Bayesian and Random optimization.
  • src: Folder containing all the R scripts defining the functions used in runRepeatOpt.Rmd.
  • data: Location of the raw genotype data of the initial breeding population.
  • simSetups: Location of the saved "simulation setup" files. These files can be reused to run another optimization with the same simulation setup (e.g. same breeding simulation function, same heritability, same initial population...).
  • optSetups: Location of the saved "optimization setup" files. These files can be reused to run another optimization with the same optimization setup.
  • output-optimization: Location of the optimization results.
  • output-resultRepetition: Location of the repeated breeding scheme simulations run after an optimization.
  • aggregatedResults: Folder containing the Bayesian and random optimization results aggregated into single .rds files.
  • aggregateResults.R: Script aggregating the results of the Bayesian and random optimizations into single .rds files.
  • createFigures.R: Script generating the figures presented in the paper.
  • figures: Location of the figures generated by createFigures.R.
  • misc: Miscellaneous folder for files that are of no specific interest for this repo but might not be worth deleting:
    • misc/errorMsg.txt: an example of the error messages that can appear when running the Bayesian optimization.
    • misc/getMissingResults.R: a script that gets the IDs of the optimization runs that generated errors (based on the missing results files).
    • misc/seedList_all.csv: list of the IDs of all the results.
    • misc/singleSimulation.R: Script to simulate one breeding scheme according to the breeding scheme parameters returned by the optimization.
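
As an illustration of how the aggregated `.rds` files could be loaded for further analysis, here is a minimal base-R sketch. The helper `readAggregated` is hypothetical and not part of the repository, and the demo file and its columns are made up so that the example runs on its own; the actual file names and contents in `aggregatedResults` are not described in this README.

```r
# Hypothetical helper: read every .rds file of a folder into a named list.
readAggregated <- function(dir = "aggregatedResults") {
  files <- list.files(dir, pattern = "\\.rds$", full.names = TRUE)
  results <- lapply(files, readRDS)
  # Name each element after its file (without the .rds extension)
  names(results) <- tools::file_path_sans_ext(basename(files))
  results
}

# Self-contained demo in a temporary folder (made-up content, for illustration only)
demoDir <- file.path(tempdir(), "aggregatedResults")
dir.create(demoDir, showWarnings = FALSE)
saveRDS(data.frame(method = "bayesian", value = 1),
        file.path(demoDir, "demo.rds"))

res <- readAggregated(demoDir)
str(res)
```

From the project folder, calling `readAggregated()` with its default argument would target the `aggregatedResults` folder listed above.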

Install dependencies

All computations are done in R (v4.1.1), and the renv package is used to manage the library dependencies.

To install the necessary libraries, open an R console in the project folder and run:

renv::restore()
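
Since the computations were done with R v4.1.1, it may be worth checking which version is running before executing the command above. A minimal base-R sketch (the variable names are illustrative and not part of the repository):

```r
# R version stated in this README
expectedVersion <- "4.1.1"

# Version of the R session currently running
currentVersion <- as.character(getRversion())

if (currentVersion != expectedVersion) {
  message("Running R ", currentVersion,
          " while this README states R ", expectedVersion,
          "; the restored packages may behave differently.")
}
```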

Optimization workflow

The main optimization workflow is detailed in the file runRepeatOpt.Rmd. It is an R Markdown file that can be run directly from RStudio or with the shell command:

R -e "rmarkdown::render('runRepeatOpt.Rmd')"

Note: The computation can take a long time.

The workflow can be summarized as follows:

  1. Create the "Simulation setup". This setup contains all the information necessary to run a breeding simulation (e.g. genotypes of the initial population, marker effects, heritability, budget...).
  2. Create the "Optimization setup". This setup contains all the information necessary to run an optimization (e.g. number of iterations, kernel, acquisition function...).
  3. Run the optimizations and repeat the breeding simulation with the optimized parameters:
    • Bayesian optimization
    • Repeat the breeding simulation using the "bayesian optimized" parameters
    • Random optimization
    • Repeat the breeding simulation using the "random optimized" parameters
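
To make the structure of these steps concrete, here is a minimal base-R sketch of the random-optimization arm only (setup creation, random search, then repetition with the selected parameters). It does not use the repository's code, breedSimulatR, or mlrMBO: `toySimulation`, the setup fields, and the parameter names `selIntensity` and `budgetShare` are all hypothetical stand-ins chosen for illustration.

```r
# Hypothetical stand-in for a breeding simulation: returns a noisy score
# for a given set of scheme parameters (higher is better).
toySimulation <- function(params, simSetup) {
  signal <- -(params$selIntensity - 0.3)^2 - (params$budgetShare - 0.6)^2
  signal + rnorm(1, sd = simSetup$noiseSd)
}

# 1. "Simulation setup": what the simulation needs (hypothetical field)
simSetup <- list(noiseSd = 0.05)

# 2. "Optimization setup": what the optimizer needs (hypothetical fields)
optSetup <- list(
  nIter = 50,
  bounds = list(selIntensity = c(0.01, 0.99), budgetShare = c(0, 1))
)

# 3a. Random optimization: sample parameters uniformly within the bounds
#     and keep the set with the best observed score.
randomOptimization <- function(simSetup, optSetup) {
  best <- list(value = -Inf, params = NULL)
  for (i in seq_len(optSetup$nIter)) {
    params <- lapply(optSetup$bounds, function(b) runif(1, b[1], b[2]))
    value <- toySimulation(params, simSetup)
    if (value > best$value) {
      best <- list(value = value, params = params)
    }
  }
  best
}

# 3b. Repeat the simulation with the selected parameters
repeatSimulation <- function(params, simSetup, nRep = 30) {
  vapply(seq_len(nRep),
         function(i) toySimulation(params, simSetup),
         numeric(1))
}

set.seed(1)
best <- randomOptimization(simSetup, optSetup)
reps <- repeatSimulation(best$params, simSetup)
summary(reps)
```

Repeating the simulation with the selected parameters yields a distribution of outcomes instead of the single noisy value observed during the search. In the repository the same pattern is applied to both the Bayesian and the random optimization results.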
