Check compatibility between two SNP coordinate data sets and keep only the genotyped SNPs
If the user_SNPcoord data is missing some SNPs defined in the .vcf file, an error is raised. If the user_SNPcoord data has additional SNPs, those SNPs are removed.
If the data in the two data sets are inconsistent, an error is raised.
checkAndFilterSNPcoord(user_SNPcoord, vcf_SNPcoord)
Argument | Description |
---|---|
user_SNPcoord |
SNPs coordinates data coming from the user (.csv file) |
vcf_SNPcoord |
SNPs coordinates data coming from the .vcf file |
The filtered user_SNPcoord data.frame
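The three rules above (error on missing SNPs, drop extra SNPs, error on inconsistent data) can be sketched in base R. This is a hypothetical stand-in, not the package's implementation: the function name `check_and_filter_sketch`, the example data and the choice to compare only the `chr` column are assumptions made for illustration.

```r
# Hypothetical re-implementation for illustration only; not the package's code.
check_and_filter_sketch <- function(user_SNPcoord, vcf_SNPcoord) {
  # SNPs present in the .vcf but absent from the user data are an error
  missing_snps <- setdiff(vcf_SNPcoord$SNPid, user_SNPcoord$SNPid)
  if (length(missing_snps) > 0) {
    stop("SNPs missing from user data: ", paste(missing_snps, collapse = ", "))
  }
  # Drop the user SNPs that are not genotyped, and follow the .vcf order
  filtered <- user_SNPcoord[match(vcf_SNPcoord$SNPid, user_SNPcoord$SNPid), ]
  # Shared columns must agree between the two data sets (only chr is checked here)
  if (!identical(as.character(filtered$chr), as.character(vcf_SNPcoord$chr))) {
    stop("Inconsistent chromosome between the two data sets")
  }
  rownames(filtered) <- NULL
  filtered
}

user <- data.frame(chr = c("1", "1", "2"), linkMapPos = c(0.1, 0.2, 0.3),
                   SNPid = c("s1", "s2", "s3"), stringsAsFactors = FALSE)
vcf <- data.frame(chr = c("1", "2"), SNPid = c("s1", "s3"),
                  stringsAsFactors = FALSE)
filtered <- check_and_filter_sketch(user, vcf)
```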
Check individuals in the crossing table are in the haplotype data
checkIndNamesConsistency(crossTable, haplo)
Argument | Description |
---|---|
crossTable |
the crossing table |
haplo |
the haplotype data given by the function readPhasedGeno |
NULL. Raises an error if missing individuals are detected.
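A minimal sketch of this kind of check, assuming the haplotype columns follow the individualName_1 / individualName_2 naming convention and that the crossing table has two parent columns. The column names `ind1` and `ind2` and the example data are hypothetical.

```r
# Hypothetical data; haplotype column names use the _1 / _2 suffix convention
haplo_cols <- c("ind1_1", "ind1_2", "ind2_1", "ind2_2")
crossTable <- data.frame(ind1 = c("ind1", "ind1"), ind2 = c("ind2", "ind3"),
                         stringsAsFactors = FALSE)

# Strip the trailing _1 / _2 to recover the individual names
haplo_inds <- unique(sub("_[12]$", "", haplo_cols))
# Individuals used as parents but absent from the haplotype data
missing_inds <- setdiff(unique(c(crossTable$ind1, crossTable$ind2)), haplo_inds)
```

A non-empty `missing_inds` is the situation in which an error would be raised.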
Initialise the simulation
initializeSimulation(haplotypes, SNPcoord)
Argument | Description |
---|---|
haplotypes |
haplotypes of the parents: data.frame with genotype values in rows and individuals' haplotypes in columns. The column names should be individualName_1 and individualName_2 for the first/second haplotype of the individual named individualName (the list item named haplo returned by the function readPhasedGeno ) |
SNPcoord |
SNP coordinates, data.frame of 4 columns: - chr : Name of the chromosome holding the SNP - physPos : SNP physical position on the chromosome - linkMapPos : SNP linkage map position on the chromosome in Morgan - SNPid : SNP's IDs |
breedSimulatR's population object
Main function to raise an engineError (i.e. an expected error)
This function is similar to R's stop function and should be used instead of it to ensure we raise expected errors.
engineError(message, extra = list(), n_skip_caller = 1)
Argument | Description |
---|---|
message |
error message |
extra |
list of extra information that will be included in the error |
n_skip_caller |
(int, default 1) This is to catch the function where the error happens. 0 will show this function, 1 will show the function calling this one and so on. |
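The general technique, a custom condition class that callers can catch separately from unexpected errors, can be sketched in base R as follows. The function `raise_engine_error_sketch` and the code `"FILE_NOT_FOUND"` are hypothetical; the package's engineError also handles `n_skip_caller`, which is not reproduced here.

```r
# Illustrative stand-in, not the package's engineError(): builds and signals
# a condition object carrying a custom class and extra fields.
raise_engine_error_sketch <- function(message, extra = list()) {
  cond <- structure(
    class = c("engineError", "error", "condition"),
    list(message = message, call = NULL, extra = extra)
  )
  stop(cond)
}

res <- tryCatch(
  raise_engine_error_sketch("file not found", extra = list(code = "FILE_NOT_FOUND")),
  engineError = function(e) e  # only errors of class "engineError" are caught here
)
```

Because the handler is registered on the class "engineError", an ordinary `stop("...")` raised elsewhere would not be caught by it.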
Helper function to raise an engineError for a "bad argument"
To be used inside functions to check their arguments. The error message has the form: "`<arg>` must be <must_be> not `<not>`"
(adapted from https://adv-r.hadley.nz/conditions.html#signalling)
bad_argument(
arg,
must_be,
not = NULL,
errType = "value",
class = "engineError",
extra = NULL,
n_skip_caller = 2
)
Argument | Description |
---|---|
arg |
(character) tested argument name |
must_be |
(character) expected value or type |
not |
(any) provided value |
errType |
"type" is the only recognised value. Use it to report an error about the type of the argument |
class |
use "engineError" (default) to raise an expected "engineError" |
extra |
(list, default NULL) extra information to pass to engineError |
n_skip_caller |
(int, default 2) see engineError |
x = "a"
bad_argument("x", must_be = 42, not = x)
# will stop with the error msg: "`x` must be 42 not `a`"
bad_argument("x", must_be = "numeric", not = x, "type")
# will stop with the error msg: "`x` must be numeric not `character`"
List of error codes as a function, so that an unexpected error code raises an error
errorCode(code)
Filter gwas results
filterGWAS(gwas, filter_pAdj = 1, filter_nPoints = Inf, filter_quant = 1)
Argument | Description |
---|---|
gwas |
[data.frame] output of the gwas function |
filter_pAdj |
[numeric] threshold to remove points with pAdj > filter_pAdj from the plot (default 1, i.e. no filtering) |
filter_nPoints |
[numeric] threshold to keep only the filter_nPoints with the lowest p-values for the plot (default no filtering) |
filter_quant |
[numeric] threshold to keep only the filter_quant*100 % of the points with the lowest p-values for the plot (default no filtering) |
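To make the three filters concrete, here is a minimal base R sketch. It is not the package's filterGWAS: the function name `filter_sketch`, the column names `p` and `p_adj`, and the choice to take the quantile relative to the unfiltered table are all assumptions.

```r
# Hypothetical GWAS result table; only the p-value columns matter here
gwas <- data.frame(id = paste0("snp", 1:10),
                   p = c(1e-6, 1e-4, 0.001, 0.01, 0.02, 0.05, 0.2, 0.4, 0.6, 0.9),
                   stringsAsFactors = FALSE)
gwas$p_adj <- p.adjust(gwas$p, method = "bonferroni")

filter_sketch <- function(gwas, filter_pAdj = 1, filter_nPoints = Inf, filter_quant = 1) {
  n_total <- nrow(gwas)
  gwas <- gwas[order(gwas$p), ]
  # keep points whose adjusted p-value does not exceed the threshold
  gwas <- gwas[gwas$p_adj <= filter_pAdj, ]
  # keep at most filter_nPoints points and at most filter_quant * 100 % of all points
  n_keep <- min(nrow(gwas), filter_nPoints, floor(n_total * filter_quant))
  head(gwas, n_keep)
}

kept_default <- filter_sketch(gwas)
kept <- filter_sketch(gwas, filter_pAdj = 0.5, filter_nPoints = 3)
```

With the default values nothing is removed, and with several rules the result is the one given by the strictest rule.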
Fit a GS model using the gaston package
fit_with_gaston(pheno, K)
Argument | Description |
---|---|
pheno |
1 dimensional vector of phenotypic values |
K |
list of 1 or 2 variance covariance matrices to consider for the random effect. |
This is a wrapper around gaston::lmm.aireml() ; see ?gaston::lmm.aireml() for more information.
list of 3 elements:
- logL : value of the log-likelihood of the model (cf. ?gaston::lmm.aireml() )
- blups : length(pheno) by length(K) matrix of the BLUPs of the individuals (one random effect per column)
- intercept : value of the intercept of the model
Fit a GS model using the RAINBOWR package
fit_with_rainbowr(pheno, K)
Argument | Description |
---|---|
pheno |
1 dimensional vector of phenotypic values |
K |
list of 1 or 2 variance covariance matrices used for the random effect. |
This is a wrapper around RAINBOWR::EM3.general() ; see ?RAINBOWR::EM3.general() for more information.
list of 3 elements:
- logL : value of the log-likelihood of the model (cf. ?RAINBOWR::EM3.general() )
- blups : length(pheno) by length(K) matrix of the BLUPs of the individuals (one random effect per column)
- intercept : value of the intercept of the model
Calculate marker effects from the BLUPs, the genetic matrix and the variance-covariance matrix of the BLUPs (i.e. the genetic relationship matrix)
calc_mark_eff(geno_mat, rel_mat, blups)
Argument | Description |
---|---|
geno_mat |
n_inds by n_marker genomic matrix |
rel_mat |
n_inds by n_inds relationship matrix associated to geno_mat |
blups |
n_inds vector of the individuals blups effect. |
The calculation uses:
- $X$ the genetic matrix
- $n_{mark}$ the number of markers (i.e. the number of columns of $X$)
- $Z$ the relationship matrix of $X$
- $B$ the BLUPs of the individuals
vector of the estimated markers effects
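One standard way to back-solve marker effects from BLUPs, assuming the relationship matrix is $Z = XX'/n_{mark}$, is $\frac{1}{n_{mark}} X' Z^{-1} B$. Whether calc_mark_eff uses exactly this expression is an assumption; the sketch below only illustrates the technique on simulated (hypothetical) data.

```r
set.seed(1)
n_inds <- 5
n_mark <- 20
# Hypothetical numeric genetic matrix (individuals in rows, markers in columns)
X <- matrix(rnorm(n_inds * n_mark), nrow = n_inds)
# Assumed relationship matrix built from X
Z <- tcrossprod(X) / n_mark
# BLUPs generated from arbitrary marker effects, for the sake of the example
B <- as.vector(X %*% rnorm(n_mark))

# Back-solved marker effects: (1 / n_mark) * X' Z^{-1} B
mark_eff <- as.vector(crossprod(X, solve(Z, B))) / n_mark

# The recovered effects reproduce the BLUPs when applied to X
fitted_blups <- as.vector(X %*% mark_eff)
```

The effects recovered this way are not the simulated ones (there are more markers than individuals), but they give back the same BLUPs.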
Train GS model optionally with dominance
train_gs_model(pheno, geno, with_dominance = FALSE)
Argument | Description |
---|---|
pheno |
1 column data.frame of the phenotype |
geno |
bed.matrix returned by readGenoData() |
with_dominance |
(default = FALSE) control if the model should include dominance effects |
IMPORTANT! It returns the marker effects for genotypes encoded:
- in allele dose (0, 1, 2) for the additive effects
- as (0, 1, 0) for the dominance effects
list of 2 elements:
- intercept : the intercept of the model
- eff : data.frame of 2 columns, additive and dominance , with their corresponding marker effects
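The two encodings can be illustrated on a small hypothetical genotype matrix. The `model` list below only mimics the shape of the returned list, and the prediction formula (intercept plus additive and dominance contributions) is an assumption about how such effects are typically combined.

```r
# Hypothetical allele-dose matrix (2 individuals x 3 markers)
geno_add <- matrix(c(0, 1, 2,
                     2, 1, 0), nrow = 2, byrow = TRUE)
# Dominance encoding: 1 for heterozygotes, 0 for both homozygotes
geno_dom <- (geno_add == 1) * 1

# Hypothetical model output shaped like the list described above
model <- list(intercept = 10,
              eff = data.frame(additive = c(0.5, -1, 2), dominance = c(0, 3, 0)))

# Assumed prediction: intercept + additive part + dominance part
pred <- as.vector(model$intercept +
                    geno_add %*% model$eff$additive +
                    geno_dom %*% model$eff$dominance)
```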
Make GS predictions
predict_gs_model(geno, estim_mark_eff)
Argument | Description |
---|---|
geno |
bed.matrix returned by readGenoData() on which we want to make predictions |
estim_mark_eff |
(list, output of train_gs_model() ) estimated markers effects and intercept. |
data.frame of one column containing the prediction
Repeated K-Folds cross-validation of a GS model
cross_validation_evaluation(
pheno,
geno,
with_dominance = TRUE,
n_folds = 10,
n_repetitions = 5
)
Argument | Description |
---|---|
pheno |
1 column data.frame of the phenotype |
geno |
bed.matrix returned by readGenoData() |
with_dominance |
(default = TRUE) controls whether the model should include dominance effects |
n_folds |
(default 10 ) number of folds for each repetition |
n_repetitions |
(default 5 ) number of repetitions |
list of 2 elements:
- predictions : data.frame of the predicted values during the cross-validation. Available columns:
  - ind : individual id
  - actual : phenotype value in the training data
  - predicted : predicted values
  - fold : cross-validation fold id
  - repetition : cross-validation repetition id
- metrics : data.frame of the calculated model metrics during the cross-validation. Available columns are:
  - fold : cross-validation fold id
  - repetition : cross-validation repetition id
  - ... : values of the metrics returned by get_model_metrics() (one column per element of the returned list, with the same name)
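As an illustration of how the fold and repetition ids relate to the individuals, the sketch below builds a repeated K-fold assignment in base R. It is a generic construction on hypothetical sizes, not the package's code.

```r
set.seed(42)
n_inds <- 23
n_folds <- 10
n_repetitions <- 5

# One row per (repetition, individual), with a randomly assigned fold id
folds <- do.call(rbind, lapply(seq_len(n_repetitions), function(r) {
  data.frame(repetition = r,
             ind = seq_len(n_inds),
             fold = sample(rep_len(seq_len(n_folds), n_inds)))
}))

# For a given repetition and fold: train on the other folds, test on this one
test_idx <- folds$ind[folds$repetition == 1 & folds$fold == 1]
train_idx <- setdiff(seq_len(n_inds), test_idx)
```

Each individual is predicted exactly once per repetition, so the predictions table has n_inds * n_repetitions rows in this setup.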
Calculate model metrics
get_model_metrics(actual, predictions)
Argument | Description |
---|---|
actual |
vector of actual values |
predictions |
vector of model's predicted values |
list of metrics values:
- rmse : root mean square error
- corel_pearson : Pearson's correlation coefficient
- corel_spearman : Spearman's correlation coefficient
- r2 : R squared
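These metrics can be computed with base R as sketched below. The function name `metrics_sketch` is hypothetical, and taking R squared as the squared Pearson correlation is an assumption (other definitions exist).

```r
metrics_sketch <- function(actual, predictions) {
  list(
    rmse = sqrt(mean((actual - predictions)^2)),
    corel_pearson = cor(actual, predictions, method = "pearson"),
    corel_spearman = cor(actual, predictions, method = "spearman"),
    # Assumption: R squared taken as the squared Pearson correlation
    r2 = cor(actual, predictions)^2
  )
}

# Predictions shifted by a constant 0.5: perfectly correlated, rmse of 0.5
m <- metrics_sketch(actual = c(1, 2, 3, 4), predictions = c(1.5, 2.5, 3.5, 4.5))
```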
Check if a dominance model is applicable with the provided genetic data
check_dominance_model_is_applicable(dominance, homozygous_threshold = 0.95)
Argument | Description |
---|---|
dominance |
list returned by |
homozygous_threshold |
A threshold used to identify individuals or SNPs with a homozygosity proportion exceeding this value. These will be counted and reported in the error message if applicable. This parameter does not influence the function's behavior, only the error message it can raise. |
The dominance model needs the dominance relationship matrix to be invertible in order to calculate the dominance effects. This function raises an error if that is not the case. The error contains information about the homozygosity of the markers and individuals, because data with too many homozygous individuals/markers are probably not suited to a dominance model.
TRUE , or raises an engineError
Perform GWAS analysis
gwas(
data,
trait,
test,
fixed = 0,
response = "quantitative",
thresh_maf,
thresh_callrate
)
Argument | Description |
---|---|
data |
List returned by the prepareDta function |
trait |
Character of length 1, name of the trait to analyze. Could be a column name of the phenotypic file |
test |
Which test to use. Either "score" , "wald" or "lrt" . For binary phenotypes, test = "score" is mandatory. For more information about this parameter see: ??gaston::association.test |
fixed |
Number of Principal Components to include in the model with fixed effect (for test = "wald" or "lrt" ). Default value is 0. For more information about this parameter see: ??gaston::association.test |
response |
Character of length 1, Either "quantitative" or "binary". Is the trait a quantitative or a binary phenotype? Default value is "quantitative" |
thresh_maf |
Threshold for filtering markers. Only markers with minor allele frequency > thresh_maf will be kept. |
thresh_callrate |
Threshold for filtering markers. Only markers with a callrate > thresh_callrate will be kept. |
For the calculation, the genetic relationship matrix needs to be calculated. This is done based on the genetic matrix standardized by the genetic mean "mu" and the genetic variance "sigma", after filtering according to thresh_maf and thresh_callrate .
data.frame
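The marker filtering controlled by thresh_maf and thresh_callrate can be illustrated in base R on an allele-dose matrix. The package works on gaston objects, so this is only a sketch of the filtering rule itself; the data and threshold values are hypothetical.

```r
# Hypothetical allele-dose matrix (individuals x markers), NA = missing call
geno <- matrix(c(0, 1, 2, NA,
                 0, 0, 2, NA,
                 0, 1, 1, 2,
                 0, 2, 2, NA), nrow = 4, byrow = TRUE,
               dimnames = list(NULL, c("m1", "m2", "m3", "m4")))

# Proportion of non-missing calls per marker
callrate <- colMeans(!is.na(geno))
# Frequency of the counted allele, then minor allele frequency
alt_freq <- colMeans(geno, na.rm = TRUE) / 2
maf <- pmin(alt_freq, 1 - alt_freq)

thresh_maf <- 0.05
thresh_callrate <- 0.9
kept <- colnames(geno)[maf > thresh_maf & callrate > thresh_callrate]
```

Here m1 is dropped because it is monomorphic (MAF of 0) and m4 because its call rate is 0.25.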
Adjust P-values for Multiple Comparisons
adjustPval(p, adj_method, thresh_p = NULL)
Argument | Description |
---|---|
p |
vector of p-values |
adj_method |
correction method: "holm", "hochberg", "bonferroni", "BH", "BY", "fdr", "none" (see ?p.adjust for more details) |
thresh_p |
optional value of the p-value significance threshold (default NULL) |
The method "hommel" is not implemented because it takes too long to compute.
list of two elements: "p_adj" , the vector of adjusted p-values, and "thresh_adj" , the adjusted threshold (if thresh_p is provided, NULL if not)
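The correction methods listed above are those of stats::p.adjust. A short example of what the adjustment does to a vector of p-values (the way the adjusted threshold is derived by adjustPval is not shown here, since it is not described on this page):

```r
p <- c(0.01, 0.02, 0.5)

# Bonferroni multiplies each p-value by the number of tests, capped at 1
p_bonf <- p.adjust(p, method = "bonferroni")

# Benjamini-Hochberg controls the false discovery rate instead
p_bh <- p.adjust(p, method = "BH")
```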
R6 class used to log messages in this engine's functions
## ------------------------------------------------
## Method `Logger$new`
## ------------------------------------------------
mylogger <- Logger$new(context = NULL)
Run GWAS analysis
run_gwas(
genoFile = NULL,
phenoFile = NULL,
genoUrl = NULL,
phenoUrl = NULL,
trait,
test,
fixed = NULL,
response = "quantitative",
thresh_maf,
thresh_callrate,
outFile = tempfile(fileext = ".json")
)
Argument | Description |
---|---|
genoFile |
path of the geno data file (.vcf or .vcf.gz file) |
phenoFile |
path of the phenotypic data file (csv file). Individuals' name should be the first column of the file and no duplication is allowed. |
genoUrl |
url of the geno data file (.vcf or .vcf.gz file) |
phenoUrl |
url of the phenotypic data file (csv file) Individuals' name should be the first column of the file and no duplication is allowed. |
trait |
Character of length 1, name of the trait to analyze. Must be a column name of the phenotypic file. |
test |
Which test to use. Either "score" , "wald" or "lrt" . For binary phenotypes, test = "score" is mandatory. For more information about this parameter see: ??gaston::association.test |
fixed |
Number of Principal Components to include in the model with fixed effect (for test = "wald" or "lrt" ). Default value is 0. For more information about this parameter see: ??gaston::association.test |
response |
Character of length 1, Either "quantitative" or "binary". Is the trait a quantitative or a binary phenotype? Default value is "quantitative" |
thresh_maf |
Threshold for filtering markers. Only markers with minor allele frequency > thresh_maf will be kept. |
thresh_callrate |
Threshold for filtering markers. Only markers with a callrate > thresh_callrate will be kept. |
outFile |
path of the output file. If NULL , the output will not be written to any file. By default, writes to a temporary .json file. |
list with 3 elements: gwasRes for the results of the gwas analysis in json, metadata a list of metadata of the analysis, and file the path of the json file containing the results
Draw a Manhattan Plot
draw_manhattanPlot(
gwasFile = NULL,
gwasUrl = NULL,
adj_method = "bonferroni",
thresh_p = 0.05,
chr = NA,
interactive = TRUE,
filter_pAdj = 1,
filter_nPoints = Inf,
filter_quant = 1,
outFile = tempfile()
)
Argument | Description |
---|---|
gwasFile |
path of the gwas result data file (json file) |
gwasUrl |
url of the gwas result data file (json file) |
adj_method |
correction method: "holm", "hochberg", "bonferroni", "BH", "BY", "fdr", "none" (see ?p.adjust for more details) |
thresh_p |
p value significant threshold (default 0.05) |
chr |
name of the chromosome to show (show all if NA) |
interactive |
[bool] should the plot be interactive (the default) |
filter_pAdj |
[numeric] threshold to remove points with pAdj > filter_pAdj from the plot (default 1, i.e. no filtering) |
filter_nPoints |
[numeric] threshold to keep only the filter_nPoints with the lowest p-values for the plot (default no filtering) |
filter_quant |
[numeric] threshold to keep only the filter_quant*100 % of the points with the lowest p-values for the plot (default no filtering) |
outFile |
path of the file containing the plot. If NULL , the output will not be written to any file. By default, writes to a temporary file. |
If several filtering rules are given, they are applied sequentially (this gives the same result as if only the strictest rule were given). The number of points kept for the plot is displayed in the plot title.
plotly graph if interactive is TRUE, or NULL if not.
Adjust GWAS p-values
run_resAdjustment(
gwasFile = NULL,
gwasUrl = NULL,
adj_method = "bonferroni",
filter_pAdj = 1,
filter_nPoints = Inf,
filter_quant = 1,
outFile = tempfile(fileext = ".json")
)
Argument | Description |
---|---|
gwasFile |
path of the gwas result data file (json file) |
gwasUrl |
url of the gwas result data file (json file) |
adj_method |
correction method: "holm", "hochberg", "bonferroni", "BH", "BY", "fdr", "none" (see ?p.adjust for more details) |
filter_pAdj |
[numeric] threshold to remove points with pAdj > filter_pAdj from the plot (default 1, i.e. no filtering) |
filter_nPoints |
[numeric] threshold to keep only the filter_nPoints with the lowest p-values for the plot (default no filtering) |
filter_quant |
[numeric] threshold to keep only the filter_quant*100 % of the points with the lowest p-values for the plot (default no filtering) |
outFile |
path of the output file. If NULL , the output will not be written to any file. By default, writes to a temporary .json file. |
list with 3 elements: gwasAdjusted for the results of the gwas analysis in json with adjusted p-values, metadata a list of metadata of the gwas analysis in json with adjusted p-values, and file the path of the json file containing the results (if outFile is not NULL )
Draw an LD Plot
draw_ldPlot(
genoFile = NULL,
genoUrl = NULL,
from,
to,
n_max = 50,
outFile = tempfile(fileext = ".png")
)
Argument | Description |
---|---|
genoFile |
path of the geno data file (.vcf or .vcf.gz file) |
from |
lower bound of the range of SNPs for which the LD is computed |
to |
upper bound of the range of SNPs for which the LD is computed |
n_max |
maximum number of markers to show (to avoid an unreadable plot) |
outFile |
path of the png file to save the plot. If NULL , the image file will not be created. By default, writes to a temporary .png file. |
path of the created file (or NULL if outFile is NULL)
Calculate pedigree relationship matrix
calc_pedRelMat(
pedFile = NULL,
pedUrl = NULL,
unknown_string = "",
header = TRUE,
outFile = tempfile(fileext = ".csv"),
outFormat = tools::file_ext(outFile)
)
Argument | Description |
---|---|
pedFile |
path of the pedigree data file (csv file). |
pedUrl |
url of the pedigree data file (csv file). |
unknown_string |
[default: ""] a character vector of strings which are to be interpreted as "unknown parent". By default: missing value in the file. |
header |
[default: TRUE] a logical value indicating whether the file contains the names of the variables as its first line. The default value is TRUE. In any cases, the column 1 will be interpreted as the individual id, column 2 as the first parent, column 3 as the second parent. |
outFile |
path of the output file. If NULL , the output will not be written to any file. By default, writes to a temporary .csv file. |
outFormat |
Format of the output file, either csv or json . By default, the file extension of outFile is used. |
For csv output, the file will include some metadata lines (starting with a # symbol), a header and the row ids in its first column.
list with 3 elements: relMat the relationship matrix, metadata a list of metadata of the analysis (pedigree fingerprint, number of individuals, creation time) and file the path of the file containing the results.
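A csv file with this layout (comment lines starting with #, then a header, then row ids in the first column) can be read back with base R's read.csv. The file content below is hypothetical and only mimics the described structure; the actual metadata keys written by the engine may differ.

```r
# Write a small file shaped like the described csv output (contents are hypothetical)
f <- tempfile(fileext = ".csv")
writeLines(c("# n_individuals: 2",
             "# creation_time: (hypothetical)",
             '"","ind1","ind2"',
             '"ind1",1,0.5',
             '"ind2",0.5,1'), f)

# Lines starting with "#" are skipped, the first column becomes the row names
relMat <- as.matrix(read.csv(f, comment.char = "#", row.names = 1,
                             check.names = FALSE))
```

Note that read.csv does not treat # as a comment character by default, so comment.char has to be set explicitly.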
Calculate genomic relationship matrix
calc_genoRelMat(
genoFile = NULL,
genoUrl = NULL,
outFile = tempfile(fileext = ".csv"),
outFormat = tools::file_ext(outFile)
)
Argument | Description |
---|---|
genoFile |
path of the geno data file (.vcf or .vcf.gz file) |
genoUrl |
url of the geno data file (.vcf or .vcf.gz file) |
outFile |
path of the output file. If NULL , the output will not be written to any file. By default, writes to a temporary .csv file. |
outFormat |
Format of the output file, either csv or json . By default, the file extension of outFile is used. |
For csv output, the file will include some metadata lines (starting with a # symbol), a header and the row ids in its first column.
list with 3 elements: relMat the relationship matrix, metadata a list of metadata of the analysis (pedigree fingerprint, number of individuals, creation time) and file the path of the file containing the results.
Combined (pedigree + genomic) Relationship Matrix
Correct a pedigree relationship matrix using a genomic relationship matrix.
calc_combinedRelMat(
pedRelMatFile = NULL,
pedRelMatUrl = NULL,
genoRelMatFile = NULL,
genoRelMatUrl = NULL,
method = "Legarra",
tau = NULL,
omega = NULL,
outFile = tempfile(fileext = ".csv"),
outFormat = tools::file_ext(outFile)
)
Argument | Description |
---|---|
pedRelMatFile |
path of a pedigree relationship matrix generated by the engine. |
pedRelMatUrl |
url of a pedigree relationship matrix generated by the engine. |
genoRelMatFile |
path of a genomic relationship matrix generated by the engine. |
genoRelMatUrl |
url of a genomic relationship matrix generated by the engine. |
method |
method to use, either "Legarra" or "Martini" |
tau |
tau parameter of Martini's method |
omega |
omega parameter of Martini's method |
outFile |
path of the output file. If NULL , the output will not be written to any file. By default, writes to a temporary .csv file. |
outFormat |
Format of the output file, either csv or json . By default, the file extension of outFile is used. |
This method corrects the pedigree relationship matrix with the genomic relationship matrix. Therefore, individuals in the genomic relationship matrix that are not in the pedigree relationship matrix will be ignored.
Using Martini's method with tau=1 and omega=1 is equivalent to Legarra's method.
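For context, the single-step inverse studied in the Martini et al. reference is usually written as below, with $A$ the pedigree relationship matrix, $A_{22}$ its block for the genotyped individuals and $G$ the genomic relationship matrix. This is the commonly cited form of the scaled inverse and is given here as background; how calc_combinedRelMat computes it internally is not described on this page.

$$H_{\tau,\omega}^{-1} = A^{-1} + \begin{pmatrix} 0 & 0 \\ 0 & \tau G^{-1} - \omega A_{22}^{-1} \end{pmatrix}$$

Setting $\tau = \omega = 1$ leaves $G^{-1} - A_{22}^{-1}$ in the lower-right block, which is the unscaled single-step form associated with Legarra's method.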
For csv output, the file will include some metadata lines (starting with a # symbol), a header and the row ids in its first column.
list with 3 elements: relMat the relationship matrix, metadata a list of metadata of the analysis (pedigree fingerprint, number of individuals, creation time) and file the path of the file containing the results.
Martini, JW, et al. (2018). The effect of the H-1 scaling factors tau and omega on the structure of H in the single-step procedure. Genetics Selection Evolution 50(1), 16.
Legarra, A, et al. (2009). A relationship matrix including full pedigree and genomic information. Journal of Dairy Science 92, 4656–4663.
Draw a heatmap of a relationship matrix
draw_relHeatmap(
relMatFile = NULL,
relMatUrl = NULL,
format = NULL,
interactive = TRUE,
outFile = tempfile()
)
Argument | Description |
---|---|
relMatFile |
path of a file generated by the function saveRelMat |
relMatUrl |
url of a file generated by the function saveRelMat |
interactive |
[bool] should the plot be interactive (the default) or not |
outFile |
path of the file containing the plot. If NULL , the output will not be written to any file. By default, writes to a temporary file. |
plotly graph if interactive is TRUE, or NULL if not.
Draw interactive pedigree network
draw_pedNetwork(
pedFile = NULL,
pedUrl = NULL,
unknown_string = "",
header = TRUE,
outFile = tempfile(fileext = ".html")
)
Argument | Description |
---|---|
pedFile |
path of the pedigree data file (csv file). |
pedUrl |
url of the pedigree data file (csv file). |
unknown_string |
[default: ""] a character vector of strings which are to be interpreted as "unknown parent". By default: missing value in the file. |
header |
[default: TRUE] a logical value indicating whether the file contains the names of the variables as its first line. The default value is TRUE. In any cases, the column 1 will be interpreted as the individual id, column 2 as the first parent, column 3 as the second parent. |
outFile |
path of the file containing the plot. If NULL , the output will not be written to any file. By default, writes to a temporary file. |
plotly graph if interactive is TRUE, or NULL if not.
Simulate the genotypes of offspring given the parent genotypes.
crossingSimulation(
genoFile = NULL,
genoUrl = NULL,
crossTableFile = NULL,
crossTableUrl = NULL,
SNPcoordFile = NULL,
SNPcoordUrl = NULL,
nCross = 30,
outFile = tempfile(fileext = ".vcf.gz")
)
Argument | Description |
---|---|
genoFile |
phased VCF file path (ext .vcf or .vcf.gz ) |
genoUrl |
url of a phased VCF file path (ext .vcf or .vcf.gz ) |
crossTableFile |
path of the crossing table data file (csv file of 2 or 3 columns). It must contain the names of the variables as its first line. The column 1 and 2 will be interpreted as the parents ids. The optional third column will be interpreted as the offspring base name. |
crossTableUrl |
URL of a crossing table file |
SNPcoordFile |
path of the SNPs coordinates file (csv file). This .csv file should have 4 named columns: - chr : Chromosome holding the SNP (mandatory) - physPos : SNP physical position on the chromosome - linkMapPos : SNP linkage map position on the chromosome in Morgan (mandatory) - SNPid : SNP's IDs |
SNPcoordUrl |
URL of a SNP coordinate file |
nCross |
number of crosses to simulate for each parent pair defined in the crossing table. |
outFile |
path of the .vcf.gz file containing the simulated genotypes of the offspring. It must end with .vcf.gz . By default, writes to a temporary file. |
For SNPcoordFile/Url, the column physPos is optional except in one particular case (described at the end of this paragraph). If this column is provided (or contains only missing values), it should exactly match the physical positions of the SNPs specified in the VCF file. If the SNPid column is missing or has missing values, the SNPid will be automatically imputed using the convention chr@physPos ; the columns chr and physPos should therefore not have any missing values in this case.
path of the .vcf.gz file containing the simulated genotypes of the offspring.
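The chr@physPos imputation convention for missing SNP ids can be illustrated on a small hypothetical coordinate table (a sketch of the naming rule only, not the engine's code):

```r
# Hypothetical SNP coordinates with two missing ids
SNPcoord <- data.frame(chr = c("1", "1", "2"),
                       physPos = c(100, 250, 42),
                       SNPid = c("snpA", NA, NA),
                       stringsAsFactors = FALSE)

# Missing ids are built as "<chr>@<physPos>"
to_impute <- is.na(SNPcoord$SNPid)
SNPcoord$SNPid[to_impute] <- paste0(SNPcoord$chr[to_impute], "@",
                                    SNPcoord$physPos[to_impute])
```

This also shows why chr and physPos must be complete in that case: a missing value in either would produce ids such as "NA@250".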
Calculate progenies BLUPs variance and expected values based on parents' genotype and markers effects
calc_progenyBlupEstimation(
genoFile = NULL,
genoUrl = NULL,
crossTableFile = NULL,
crossTableUrl = NULL,
SNPcoordFile = NULL,
SNPcoordUrl = NULL,
markerEffectsFile = NULL,
markerEffectsUrl = NULL,
outFile = tempfile(fileext = ".json")
)
Argument | Description |
---|---|
genoFile |
phased VCF file path (ext .vcf or .vcf.gz ) |
genoUrl |
url of a phased VCF file path (ext .vcf or .vcf.gz ) |
crossTableFile |
path of the crossing table data file (csv file of 2 or 3 columns). It must contain the names of the variables as its first line. The column 1 and 2 will be interpreted as the parents ids. The optional third column will be interpreted as the offspring base name. |
crossTableUrl |
URL of a crossing table file |
SNPcoordFile |
path of the SNPs coordinates file (csv file). This .csv file should have 4 named columns: - chr : Chromosome holding the SNP - physPos : SNP physical position on the chromosome - linkMapPos : SNP linkage map position on the chromosome in Morgan - SNPid : SNP's IDs |
SNPcoordUrl |
URL of a SNP coordinate file |
markerEffectsFile |
path of the marker effects file (csv or json file). |
markerEffectsUrl |
URL of a marker effect file |
outFile |
.json file path where to save the data. If the file already exists, it will be overwritten. |
For SNPcoordFile/Url, the column physPos is optional except in one particular case (described at the end of this paragraph). If this column is provided (or contains only missing values), it should exactly match the physical positions of the SNPs specified in the VCF file. If the SNPid column is missing or has missing values, the SNPid will be automatically imputed using the convention chr@physPos ; the columns chr and physPos should therefore not have any missing values in this case.
data.frame containing the calculations results
Draw a plot of the progenies BLUPs' expected values with error bars
X axis is the crosses, and Y axis the blups. The points are located at the expected value and the error bar length is the standard deviation.
draw_progBlupsPlot(
progEstimFile = NULL,
progEstimUrl = NULL,
errorBarInterval = 0.95,
y_axisName = "Genetic values",
sorting = "alpha",
trait = NULL,
outFile = tempfile(fileext = ".html")
)
Argument | Description |
---|---|
progEstimFile |
path of the progeny BLUP estimation file generated by r-geno-tools-engine containing the blup estimations of the progenies of some crosses (json file). |
progEstimUrl |
URL of a progeny BLUP estimation file |
errorBarInterval |
length of XX% interval of interest represented by the error bars (default=0.95) |
y_axisName |
The Y axis name (default = "Genetic values") |
sorting |
method to sort the individuals (X axis) can be: - "asc": sort the BLUP expected value in ascending order (from left to right) - "dec": sort the BLUP expected value in decreasing order (from left to right) - any other value will sort the individuals in alphabetical order (from left to right) |
outFile |
path of the file containing the plot. If NULL , the output will not be written to any file. By default, writes to a temporary file. |
plotly graph
Draw a plotly graph of blups data for 2 traits
The points are located at the expected value and the size of the ellipses represents the confidenceLevel prediction interval.
draw_progBlupsPlot_2traits(
progEstimFile = NULL,
progEstimUrl = NULL,
x_trait,
y_trait,
confidenceLevel = 0.95,
x_suffix = "",
y_suffix = "",
ellipses_npoints = 100,
outFile = tempfile(fileext = ".html")
)
Argument | Description |
---|---|
progEstimFile |
path of the progeny BLUP estimation file generated by r-geno-tools-engine containing the blup estimations of the progenies of some crosses (json file). |
progEstimUrl |
URL of a progeny BLUP estimation file |
x_trait |
name of the trait to show on the x axis |
y_trait |
name of the trait to show on the y axis |
confidenceLevel |
level of the prediction ellipses (default 0.95, i.e. 95 % ellipses) |
x_suffix |
suffix to add to the x axis's name |
y_suffix |
suffix to add to the y axis's name |
ellipses_npoints |
number of points used to draw the ellipses (default 100) |
outFile |
path of the file containing the plot. If NULL , the output will not be written to any file. By default, writes to a temporary file. |
plotly graph
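A prediction ellipse of a given level can be computed from an expected value and a covariance matrix as sketched below, assuming the two traits follow a bivariate normal distribution. The function `ellipse_points` and the example values are hypothetical; the plotting code of the engine is not reproduced.

```r
ellipse_points <- function(center, covar, level = 0.95, npoints = 100) {
  # Squared Mahalanobis radius containing `level` of a bivariate normal
  r2 <- qchisq(level, df = 2)
  theta <- seq(0, 2 * pi, length.out = npoints)
  unit_circle <- rbind(cos(theta), sin(theta))
  # t(chol(covar)) maps the unit circle onto the covariance ellipse
  pts <- t(center + sqrt(r2) * t(chol(covar)) %*% unit_circle)
  colnames(pts) <- c("x", "y")
  pts
}

# Hypothetical expected values and (co)variances of two traits for one cross
center <- c(10, 5)
covar <- matrix(c(4, 1.2, 1.2, 1), nrow = 2)
pts <- ellipse_points(center, covar, level = 0.95, npoints = 100)
```

Every returned point lies at the same Mahalanobis distance from the center, which is what makes the curve a constant-probability contour.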
Train GS model
train_gs_model_main(
genoFile = NULL,
phenoFile = NULL,
genoUrl = NULL,
phenoUrl = NULL,
trait,
with_dominance,
thresh_maf,
outFile = tempfile(fileext = ".json")
)
Argument | Description |
---|---|
genoFile |
path of the geno data file (.vcf or .vcf.gz file) |
phenoFile |
path of the phenotypic data file (csv file). Individuals' name should be the first column of the file and no duplication is allowed. |
genoUrl |
url of the geno data file (.vcf or .vcf.gz file) |
phenoUrl |
url of the phenotypic data file (csv file) Individuals' name should be the first column of the file and no duplication is allowed. |
trait |
Character of length 1, name of the trait to analyze. Must be a column name of the phenotypic file. |
with_dominance |
should the model include dominance effects |
thresh_maf |
threshold to keep only markers with minor allele frequency greater than thresh_maf . |
outFile |
path of the .json file where to save the model's estimated marker effects. |
Make phenotypic predictions using marker effects
predict_gs_model_main(
genoFile = NULL,
genoUrl = NULL,
markerEffectsFile = NULL,
markerEffectsUrl = NULL,
outFile = tempfile(fileext = ".csv")
)
Argument | Description |
---|---|
genoFile |
path of the geno data file (.vcf or .vcf.gz file) |
genoUrl |
url of the geno data file (.vcf or .vcf.gz file) |
markerEffectsFile |
path of the marker effects file (csv or json file). |
markerEffectsUrl |
URL of a marker effect file |
outFile |
.csv file path where to save the predictions. If the file already exists, it will be overwritten. |
Evaluate a model with repeated cross validation
cross_validation_evaluation_main(
genoFile = NULL,
phenoFile = NULL,
genoUrl = NULL,
phenoUrl = NULL,
trait,
n_folds = 10,
n_repetitions = 5,
with_dominance,
thresh_maf,
outFile = tempfile(fileext = ".json")
)
Argument | Description |
---|---|
genoFile |
path of the geno data file (.vcf or .vcf.gz file) |
phenoFile |
path of the phenotypic data file (csv file). Individuals' name should be the first column of the file and no duplication is allowed. |
genoUrl |
url of the geno data file (.vcf or .vcf.gz file) |
phenoUrl |
url of the phenotypic data file (csv file) Individuals' name should be the first column of the file and no duplication is allowed. |
trait |
Character of length 1, name of the trait to analyze. Must be a column name of the phenotypic file. |
n_folds |
number of folds for each cross-validation |
n_repetitions |
number of cross-validation repetitions |
with_dominance |
should the model include dominance effects |
thresh_maf |
threshold to keep only markers with minor allele frequency greater than thresh_maf . |
outFile |
path of the .json file where to save the evaluation results |
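
To make the roles of `n_folds` and `n_repetitions` concrete, here is a minimal sketch of how individuals could be assigned to folds in a repeated cross-validation. It only illustrates the general technique; the number of individuals and the helper objects are assumptions, and the function's internal splitting may differ.

```r
set.seed(42) # for a reproducible illustration
n_ind <- 20 # hypothetical number of phenotyped individuals
n_folds <- 10
n_repetitions <- 5

# For each repetition, shuffle a balanced vector of fold labels.
fold_ids <- lapply(seq_len(n_repetitions), function(rep_i) {
  sample(rep(seq_len(n_folds), length.out = n_ind))
})

# In repetition 1, fold 1: held-out individuals versus training individuals.
test_idx <- which(fold_ids[[1]] == 1)
train_idx <- setdiff(seq_len(n_ind), test_idx)
```

Each repetition yields `n_folds` train/test splits, so a model is fitted `n_folds * n_repetitions` times in total.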
Draw the evaluation result plot
draw_evaluation_plot(
evaluationFile = NULL,
outFile = tempfile(fileext = ".html")
)
Argument | Description |
---|---|
evaluationFile |
path of the evaluation result file generated by save_GS_evaluation function |
outFile |
path of the file containing the plot. If NULL , the output will not be written in any file. By default write in a temporary file. |
Create Manhattan plot
manPlot(
gwas,
adj_method,
thresh_p = 0.05,
chr = NA,
title = "Manhattan Plot",
filter_pAdj = 1,
filter_nPoints = Inf,
filter_quant = 1,
interactive = TRUE
)
Argument | Description |
---|---|
gwas |
[data.frame] output of the gwas function |
adj_method |
correction method: "holm", "hochberg", "bonferroni", "BH", "BY", "fdr", "none" (see ?p.adjust for more details) |
thresh_p |
p-value significance threshold (default 0.05) |
chr |
[char] name of the chromosome to show (show all if NA) |
title |
[char] Title of the plot. Default is "Manhattan Plot" |
filter_pAdj |
[numeric] threshold to remove points with pAdj > filter_pAdj from the plot (default 1, i.e. no filtering) |
filter_nPoints |
[numeric] threshold to keep only the filter_nPoints with the lowest p-values for the plot (default no filtering) |
filter_quant |
[numeric] threshold to keep only the filter_quant*100 % of the points with the lowest p-values for the plot (default no filtering) |
interactive |
[bool] should the plot be interactive (the default) or not |
If several filtering rules are given, they are applied sequentially (this gives the same result as if only the strictest rule were given). Moreover, the number of points kept for the plot is displayed in the plot title.
plotly graph if interactive is TRUE, or NULL if not.
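The sequential filtering can be sketched in base R as follows. This is an illustration under assumptions, not the function's code: the toy `gwas` table, the column names `p` and `p_adj`, and the choice to compute the `filter_quant` quota from the total number of points are all hypothetical.

```r
# Toy GWAS results: p-values for 10 hypothetical markers.
gwas <- data.frame(
  id = paste0("SNP", 1:10),
  p = c(1e-8, 0.5, 0.03, 0.2, 1e-4, 0.9, 0.04, 0.6, 0.001, 0.7),
  stringsAsFactors = FALSE
)
gwas$p_adj <- p.adjust(gwas$p, method = "bonferroni")

filter_pAdj <- 1
filter_nPoints <- 5
filter_quant <- 0.3

# Rule 1: keep points whose adjusted p-value does not exceed the threshold.
kept <- gwas[gwas$p_adj <= filter_pAdj, ]
# Rule 2: keep at most filter_nPoints points with the lowest p-values.
kept <- kept[order(kept$p), ]
kept <- head(kept, filter_nPoints)
# Rule 3: keep at most filter_quant * 100 % of all points (assumed quota).
kept <- head(kept, round(filter_quant * nrow(gwas)))
print(kept$id)
```

Here the quantile rule is the strictest one, so the final selection is the same as if it had been applied alone.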
Compute r2 Linkage Disequilibrium (LD) between given SNPs and return a plot
LDplot(geno, from, to, file = tempfile(fileext = ".png"))
Argument | Description |
---|---|
geno |
[bed.matrix] geno data returned by function readGenoData or downloadGenoData . |
from |
lower bound of the range of SNPs for which the LD is computed |
to |
upper bound of the range of SNPs for which the LD is computed |
file |
path of the png file to save the plot. If NULL , the image file will not be created. By default write in a temporary .png file. |
from should be lower than to, and the maximum range size is 50 (in order to get a readable image). If file is not NULL, the function writes the plot to a png file; otherwise it displays the plot.
NULL if file is NULL, else the path of the png file.
Heatmap of a relationship matrix
relMatHeatmap(relMat, interactive = TRUE)
Argument | Description |
---|---|
relMat |
relationship matrix returned by pedRelMat |
interactive |
[bool] should the plot be interactive (the default) or not |
plotly graph if interactive is TRUE, or NULL if not.
Draw an interactive Pedigree network
pedNetwork(ped)
Argument | Description |
---|---|
ped |
List returned by readPedData function |
a forceNetwork object (htmlwidget)
Draw a plotly graph of blups data for 1 trait
X axis is the crosses, and Y axis the blups. The points are located at the expected value and the error bar length is the standard deviation.
plotBlup_1trait(
blupDta,
sorting = "alpha",
y_axisName = NULL,
errorBarInterval = 0.95,
trait
)
Argument | Description |
---|---|
blupDta |
list of data.frame of 4 columns: "ind1", "ind2", "blup_exp", "blup_var" |
sorting |
method to sort the individuals (X axis) can be: - "asc": sort the BLUP expected value in ascending order (from left to right) - "dec": sort the BLUP expected value in decreasing order (from left to right) - any other value will sort the individuals in alphabetical order (from left to right) |
y_axisName |
Name of the Y axis (default = trait ) |
errorBarInterval |
length of XX% interval of interest represented by the error bars (default=0.95) |
trait |
name of the trait to plot. This should be a name of the blupDta list. (optional if only one trait in blupDta , it will be set to the name of this trait) |
plotly graph
Draw a plotly graph of blups data for 2 traits
The points are located at the expected value and the size of the ellipses represents the confidenceLevel prediction interval.
plotBlup_2traits(
blupDta,
x_trait,
y_trait,
confidenceLevel = 0.95,
x_suffix = "",
y_suffix = "",
ellipses_npoints = 100
)
Argument | Description |
---|---|
blupDta |
list returned by calc_progenyBlupEstimation |
x_trait |
name of the trait to show on the x axis |
y_trait |
name of the trait to show on the y axis |
confidenceLevel |
level of the prediction ellipses (default 0.95, i.e. 95 % ellipses) |
x_suffix |
suffix to add to the x axis's name |
y_suffix |
suffix to add to the y axis's name |
ellipses_npoints |
number of points used to draw the ellipses (default 100) |
plotly graph
Draw a plotly graph of a GS model cross-validation evaluation
This plot is composed of several subplots:
- Observed vs Predicted scatter plot for all the cross-validation folds
- Horizontal box plots of models metrics calculated during the cross-validation
evaluation_plot(evaluation_results)
Argument | Description |
---|---|
evaluation_results |
list returned by cross_validation_evaluation() |
plotly graph
Calculate the recombination rate matrix for each pair of SNPs
The recombination rate is calculated using the "haldane inverse" function
calcRecombRate(SNPcoord)
Argument | Description |
---|---|
SNPcoord |
SNP coordinate data.frame returned by readSNPcoord |
named list of matrices. List names are the chromosomes' names. Matrices' row and columns names are the SNP ids (of the corresponding chromosome). Matrices values are the recombination rate between the corresponding SNPs.
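For reference, the inverse of Haldane's mapping function converts a genetic distance d (in Morgan) into a recombination rate r = (1 - exp(-2d)) / 2. The sketch below applies it to a toy single-chromosome table; the data are made up, and the real function additionally splits the result by chromosome.

```r
# Toy SNP coordinates for one chromosome; linkMapPos is in Morgan.
SNPcoord <- data.frame(
  chr = "chr1",
  linkMapPos = c(0, 0.1, 0.5),
  SNPid = c("snp1", "snp2", "snp3"),
  stringsAsFactors = FALSE
)

# Pairwise genetic distances (in Morgan) between SNPs.
d <- abs(outer(SNPcoord$linkMapPos, SNPcoord$linkMapPos, "-"))

# Inverse of Haldane's mapping function.
r <- 0.5 * (1 - exp(-2 * d))
dimnames(r) <- list(SNPcoord$SNPid, SNPcoord$SNPid)
print(round(r, 4))
```

The rate is 0 for a SNP with itself and approaches 0.5 (independent segregation) as the distance grows.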
Calculate the genetic variance-covariance matrix of the progenies of 2 given parents
calcProgenyGenetCovar(SNPcoord, r, haplo, p1.id, p2.id)
Argument | Description |
---|---|
SNPcoord |
SNP coordinate data.frame returned by readSNPcoord |
r |
recombination rate matrices returned by calcRecombRate |
haplo |
haplotypes of individuals ("haplotypes" element of the list returned by readPhasedGeno function) |
p1.id |
id of the first parent |
p2.id |
id of the second parent |
named list of matrices. List names are the chromosomes' names. Matrices' row and columns names are the SNP ids (of the corresponding chromosome). Matrices values are the genetic covariance between the corresponding SNPs for the progeny of the given parents.
Calculate the BLUP variance of the progeny
calcProgenyBlupVariance(SNPcoord, markerEffects, geneticCovar)
Argument | Description |
---|---|
SNPcoord |
SNP coordinate data.frame returned by readSNPcoord |
geneticCovar |
list of the genetic variance covariance matrices returned by calcProgenyGenetCovar |
markerEffects |
(output of extract_additive_effects function) list of 2 elements: - intercept : named vector of the intercepts - SNPeffects : data.frame of the additive effects, 1 column per phenotype with the marker ids as row names. |
numeric
Calculate the BLUP expected values of the progeny
calcProgenyBlupExpected(SNPcoord, haplo, p1.id, p2.id, markerEffects)
Argument | Description |
---|---|
SNPcoord |
SNP coordinate data.frame returned by readSNPcoord |
haplo |
haplotypes of individuals ("haplotypes" element of the list returned by readPhasedGeno function) |
p1.id |
id of the first parent |
p2.id |
id of the second parent |
markerEffects |
(output of extract_additive_effects function) list of 2 elements: - intercept : named vector of the intercepts - SNPeffects : data.frame of the additive effects, 1 column per phenotype with the marker ids as row names. |
list with, for each trait, a list of 2 elements:
- sum : the global expected value for the trait (taking into account the intercept)
- by_chr : list of the expected values for each chromosome (NOT taking into account the intercept)
Calculate the variance covariance matrix of the progeny's blups for several traits
calcProgenyBlupCovariance(
SNPcoord,
r,
haplo,
p1.id,
p2.id,
markerEffects,
blupExpectedValues
)
Argument | Description |
---|---|
SNPcoord |
SNP coordinate data.frame returned by readSNPcoord |
r |
recombination rate matrices returned by calcRecombRate |
haplo |
haplotypes of individuals ("haplotypes" element of the list returned by readPhasedGeno function) |
p1.id |
id of the first parent |
p2.id |
id of the second parent |
blupExpectedValues |
output of calcProgenyBlupExpected |
markerEffects |
(output of extract_additive_effects function) list of 2 elements: - intercept : named vector of the intercepts - SNPeffects : data.frame of the additive effects, 1 column per phenotype with the marker ids as row names. |
matrix
Download geno data
downloadGenoData(url)
Argument | Description |
---|---|
url |
url of the geno data file (.vcf.gz file) |
gaston::bed.matrix
Download phased geno data
downloadPhasedGeno(url)
Argument | Description |
---|---|
url |
url of the geno data file (.vcf.gz file) |
list of 2: haplotypes, a matrix of the individuals' haplotypes, and SNPcoord, a data frame of the SNP coordinates.
Download phenotypic data
downloadPhenoData(url)
Argument | Description |
---|---|
url |
url of the phenotypic data file (csv file) |
The individuals' names must be on the first column. No duplication is allowed.
data.frame
Download and prepare data for GWAS analysis
downloadData(genoUrl, phenoUrl)
Argument | Description |
---|---|
genoUrl |
url of the geno data file (.vcf.gz file) |
phenoUrl |
url of the phenotypic data file (csv file) |
List
Download GWAS results
downloadGWAS(url)
Argument | Description |
---|---|
url |
url of the result data file (json file) |
data.frame
Download pedigree data
downloadPedData(url, unknown_string = "", header = TRUE)
Argument | Description |
---|---|
url |
url of the pedigree data file (csv file). |
unknown_string |
[default: ""] a character vector of strings which are to be interpreted as "unknown parent". By default: missing value in the file. |
header |
[default: TRUE] a logical value indicating whether the file contains the names of the variables as its first line. The default value is TRUE. In any case, column 1 will be interpreted as the individual id, column 2 as the first parent, column 3 as the second parent. |
List of 2: data (pedigree data) and graph ("igraph" object of the pedigree graph).
Download relationship matrix
downloadRelMat(url, format = tools::file_ext(url))
Argument | Description |
---|---|
url |
url of the result data file (csv or json file) |
format |
format of the input file. Either "csv" or "json" (optional, by default it will use the file extension of the url). |
matrix
Download crossing table
downloadCrossTable(url, header = TRUE)
Argument | Description |
---|---|
url |
url of the crossing table data file (csv file of 2 or 3 columns). |
header |
[default: TRUE] a logical value indicating whether the file contains the names of the variables as its first line. The default value is TRUE. In any case, columns 1 and 2 will be interpreted as the parents' ids. The optional third column will be interpreted as the offspring base name. |
data.frame with the crossing table information.
Download SNP coordinates .csv file
downloadSNPcoord(url)
Argument | Description |
---|---|
url |
url of the SNPs coordinates file (csv file). This .csv file can have 4 named columns: - chr : Chromosome holding the SNP - physPos : SNP physical position on the chromosome - linkMapPos : SNP linkage map position on the chromosome in Morgan - SNPid : SNP's IDs. If the SNPid column is missing or has missing values, the SNPid will be automatically imputed using the convention chr@physPos ; therefore, columns chr and physPos should not have any missing values. |
data.frame of 4 columns: 'chr', 'physPos', 'linkMapPos', 'SNPid'
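The `chr@physPos` convention for missing SNP ids can be illustrated with a few lines of base R. The table below is toy data and the code is only a sketch of the convention, not the function's implementation.

```r
# Toy SNP coordinates with two missing SNP ids.
SNPcoord <- data.frame(
  chr = c("1", "1", "2"),
  physPos = c(100, 250, 75),
  linkMapPos = c(0.01, 0.03, 0.02),
  SNPid = c("snpA", NA, NA),
  stringsAsFactors = FALSE
)

# Impute the missing ids as "<chr>@<physPos>".
missing_id <- is.na(SNPcoord$SNPid)
SNPcoord$SNPid[missing_id] <- paste0(
  SNPcoord$chr[missing_id], "@", SNPcoord$physPos[missing_id]
)
print(SNPcoord$SNPid)
```

This is why `chr` and `physPos` must be complete whenever some ids are missing: a missing value in either column would produce an unusable id.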
Download marker effects file
downloadMarkerEffects(url)
Argument | Description |
---|---|
url |
url of the marker effects file (csv or json file). |
data.frame of 1 column named effects with the marker ids as row names.
Download progeny BLUP estimation file
downloadProgBlupEstim(url)
Argument | Description |
---|---|
url |
url of the progeny BLUP estimation file generated by r-geno-tools-engine containing the blup estimations of the progenies of some crosses (json file). |
data.frame of 4 columns named "ind1", "ind2", "blup_var", "blup_exp"
Read geno data from a file
readGenoData(file)
Argument | Description |
---|---|
file |
VCF file path (ext .vcf or .vcf.gz ) |
gaston::bed.matrix
Read phased genetic data from a file
readPhasedGeno(file)
Argument | Description |
---|---|
file |
phased VCF file path (ext .vcf or .vcf.gz ) |
list of 2: haplotypes, a matrix of the individuals' haplotypes, and SNPcoord, a data frame of the SNP coordinates.
Read phenotypic data file
readPhenoData(file, ind.names = 1, ...)
Argument | Description |
---|---|
file |
file path |
ind.names |
[default 1] a single number giving the column of the table which contains the individuals' names. |
... |
Further arguments to be passed to read.csv |
Any duplication in the phenotypic file is forbidden.
data.frame
Read and prepare data for GWAS result
readData(genoFile, phenoFile)
Argument | Description |
---|---|
genoFile |
path of the geno data file (.vcf or .vcf.gz file) |
phenoFile |
path of the phenotypic data file (csv file) |
List
Read and prepare pedigree data
readPedData(file, unknown_string = "", header = TRUE)
Argument | Description |
---|---|
file |
path of the pedigree data file (csv file). |
unknown_string |
[default: ""] a character vector of strings which are to be interpreted as "unknown parent". By default: missing value in the file. |
header |
[default: TRUE] a logical value indicating whether the file contains the names of the variables as its first line. The default value is TRUE. In any case, column 1 will be interpreted as the individual id, column 2 as the first parent, column 3 as the second parent. |
We consider here only allo-fecundation or auto-fecundation. For auto-fecundation, use the parental individual id in both columns 2 and 3. Doubled haploids cannot be interpreted, please avoid them in the file.
Please be sure that all individual ids in columns 2 and 3 are defined in column 1. If columns 2 and/or 3 contain ids of individuals that are not in the first column, a warning will be raised and these individuals will be added to the pedigree with unknown parents as founder individuals.
List of 2: data (pedigree data) and graph ("igraph" object of the pedigree graph).
Read a relationship matrix file
readRelMat(file, format = tools::file_ext(file))
Argument | Description |
---|---|
file |
path of the file generated by the function saveRelMat containing relationship matrix |
format |
format of the input file. Either "csv" or "json" (optional, by default it will use the file extension). |
The metadata of the file are not kept.
matrix
Read crossing table
readCrossTable(file, header = TRUE)
Argument | Description |
---|---|
file |
path of the crossing table data file (csv file of 2 or 3 columns). |
header |
[default: TRUE] a logical value indicating whether the file contains the names of the variables as its first line. The default value is TRUE. In any case, columns 1 and 2 will be interpreted as the parents' ids. The optional third column will be interpreted as the offspring base name. |
data.frame with the crossing table information.
Read SNP coordinates .csv file
readSNPcoord(file)
Argument | Description |
---|---|
file |
path of the SNPs coordinates file (csv file). This .csv file can have 4 named columns: - chr : Chromosome holding the SNP - physPos : SNP physical position on the chromosome - linkMapPos : SNP linkage map position on the chromosome in Morgan (mandatory) - SNPid : SNP's IDs. Column physPos is optional except in some particular cases (see below). If the SNPid column is missing or has missing values, the SNPid will be automatically imputed using the convention chr@physPos ; therefore, columns chr and physPos should not have any missing values. |
data.frame of 4 columns: 'chr', 'physPos', 'linkMapPos', 'SNPid'
Read GWAS analysis result file (.json)
readGWAS(file)
Argument | Description |
---|---|
file |
path of the json file generated by the function gwas containing GWAS result |
list of 2 elements: gwas (data.frame) and metadata (list)
Read marker effects file
readMarkerEffects(file)
Argument | Description |
---|---|
file |
path of the marker effects file (csv or json file) |
For .csv, the file should have 2 named columns:
- SNPid : Marker id
- effects : effect of the corresponding marker. One can specify the intercept using "--INTERCEPT--" as SNPid.
For .json, the file should have 2 key-value pairs:
- intercept : a number with the value of the intercept.
- coefficients : a nested object with SNPids as keys and their corresponding effects as values. For example: {"intercept": 100, "coefficients": {"SNP01": 1.02e-06, "SNP02": 0.42, "SNP03": 0.0}}
list of 2 elements:
- intercept : the value of the intercept,
- effects : data.frame of 1 column named SNPeffects with the marker ids as row names.
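To show how the documented .csv layout maps onto the returned list, the sketch below parses a small in-memory CSV with base R. It is a hypothetical re-creation for illustration (the variable names and the parsing steps are assumptions), not the package's implementation.

```r
# A small marker effects CSV following the documented layout.
csv_text <- "SNPid,effects
--INTERCEPT--,100
SNP01,1.02e-06
SNP02,0.42
SNP03,0.0"

raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Separate the intercept row from the marker rows.
is_intercept <- raw$SNPid == "--INTERCEPT--"
markerEffects <- list(
  intercept = raw$effects[is_intercept],
  effects = data.frame(
    SNPeffects = raw$effects[!is_intercept],
    row.names = raw$SNPid[!is_intercept]
  )
)
str(markerEffects)
```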
Read marker effects CSV file
readMarkerEffects_csv(file)
Argument | Description |
---|---|
file |
path of the marker effects file (csv file). This .csv file should have 2 named columns: - SNPid : Marker id - effects : effect of the corresponding marker. One can specify the intercept using "--INTERCEPT--" as SNPid. |
list of 2 elements:
- intercept : the value of the intercept,
- effects : data.frame of 1 column named SNPeffects with the marker ids as row names.
Read marker effects JSON file
readMarkerEffects_json(file)
Argument | Description |
---|---|
file |
path of the marker effects file (json file). This .json file should have 2 key-value pairs: - intercept : a number with the value of the intercept. - coefficients : a nested object with SNPids as keys and their corresponding effects as values. For example: {"intercept": 100, "coefficients": {"SNP01": 1.02e-06, "SNP02": 0.42, "SNP03": 0.0}} |
list of 2 elements:
- intercept : the value of the intercept,
- effects : data.frame of 1 column named SNPeffects with the marker ids as row names.
Read progeny BLUP estimation file
readProgBlupEstim(file)
Argument | Description |
---|---|
file |
path of the progeny BLUP estimation file generated by r-geno-tools-engine containing the blup estimations of the progenies of some crosses (json file). |
data.frame of 4 columns named "ind1", "ind2", "blup_var", "blup_exp"
Save GWAS result in a temporary file
saveGWAS(gwasRes, metadata, dir = NULL, file = NULL)
Argument | Description |
---|---|
gwasRes |
data.frame returned by gwas function |
metadata |
list of metadata of the gwas results |
dir |
if file is NULL, directory where to save the data, by default it is a temporary directory |
file |
file path where to save the data. If the file already exists, it will be overwritten. Default NULL |
path of the created file
Save relationship matrix in file
saveRelMat(
relMat,
metadata = NULL,
dir = NULL,
file = NULL,
format = tools::file_ext(file)
)
Argument | Description |
---|---|
relMat |
relationship matrix created with pedRelMat |
metadata |
list of metadata of the relationship matrix (optional). |
dir |
if file is NULL, directory where to save the data, by default it is a temporary directory |
file |
file path where to save the data. If the file already exists, it will be overwritten. Default NULL (it will create a new "csv" file) |
format |
format of the output file. Either "csv" or "json". (optional, by default it will use the file extension, if file is NULL, "csv"). |
path of the created file
Filter individuals and remove monomorphic markers
prepareData(gDta, pDta)
Argument | Description |
---|---|
gDta |
output of downloadGenoData or readGenoData functions |
pDta |
output of downloadPhenoData or readPhenoData functions |
The function removes the monomorphic markers and filters the individuals.
List of 2 elements: genoData (a bed matrix), phenoData (a data.frame)
Save phased genotypes of simulated population to vcf.gz file
saveVcf(file, pop, SNPcoord)
Argument | Description |
---|---|
file |
file path where to save the data. If the file already exists, it will be overwritten. |
pop |
simulated population (breedSimulatR 's population) |
SNPcoord |
SNP coordinates of the genotypes (data.frame with chr , physPos , and SNPid columns) |
Save an R data.frame as a json file
save_dataFrame_as_json(df, file)
Argument | Description |
---|---|
df |
data.frame |
file |
file path where to save the data. If the file already exists, it will be overwritten. |
path of the created file
Save an R blupVarExp as a json file
save_blupVarExp_as_json(blupVarExp, file)
Argument | Description |
---|---|
blupVarExp |
list (cf. calc_progenyBlupEstimation() ) |
file |
file path where to save the data. If the file already exists, it will be overwritten. |
path of the created file
Save a plotly graph
save_plotly(plot, file)
Argument | Description |
---|---|
plot |
graph generated with plotly |
file |
file path where to save the html graph. If the file already exists, it will be overwritten. |
path of the created file
Save a GS model
save_GS_model(model, trait_name, file)
Argument | Description |
---|---|
model |
GS model generated with |
trait_name |
name of the phenotypic trait predicted by this model |
file |
file path where to save the model. If the file already exists, it will be overwritten. |
path of the created file
Format markerEffects to be used by calcProgenyBlupCovariance function.
extract_additive_effects(markerEffects)
Argument | Description |
---|---|
markerEffects |
list |
list of 2 elements:
- intercept : named vector of the intercepts
- SNPeffects : data.frame of the additive effects, 1 column per phenotype with the marker ids as row names.
Save evaluation result in .json file
save_GS_evaluation(evaluation, trait_name, file)
Argument | Description |
---|---|
evaluation |
output of cross_validation_evaluation function |
trait_name |
trait name |
file |
file path where to save the evaluation results. If the file already exists, it will be overwritten. |
Read evaluation results file
read_GS_evaluation(file)
Argument | Description |
---|---|
file |
path of the evaluation result file generated by save_GS_evaluation function |
Pedigree Relationship Matrix calculation
pedRelMat(ped)
Argument | Description |
---|---|
ped |
List returned by readPedData function |
matrix
Hiroyoshi Iwata, Julien Diot
Additive Genomic Matrix
calc_additive_geno(geno, standardized = TRUE)
Argument | Description |
---|---|
geno |
gaston::bed.matrix returned by readGenoData function |
standardized |
boolean (default TRUE) control if the returned genetic matrix should be standardized. |
The standardization is made with: (x - 2p) / sqrt(2p*(1 - p)) with x the genetic values in alleles dose and p the allelic frequency.
matrix
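The standardization formula above can be checked on a toy allele dose matrix. In this sketch the data are made up, and p is estimated as half the column mean, which is an assumption about how the allele frequency is obtained.

```r
# Toy allele dose matrix (0, 1, 2): individuals in rows, markers in columns.
geno <- matrix(
  c(0, 1, 2, 2,
    1, 1, 0, 2),
  nrow = 4,
  dimnames = list(paste0("ind", 1:4), c("SNP01", "SNP02"))
)

# Allele frequency of each marker (assumed estimator: mean dose / 2).
p <- colMeans(geno) / 2

# (x - 2p) / sqrt(2p * (1 - p)), applied column by column.
std <- sweep(geno, 2, 2 * p, "-")
std <- sweep(std, 2, sqrt(2 * p * (1 - p)), "/")
print(round(std, 3))
```

A monomorphic marker (p equal to 0 or 1) would lead to a division by zero, which is one reason such markers are removed beforehand.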
Additive Genomic Relationship Matrix calculation
calc_additive_rel_mat(geno, standardized = TRUE)
Argument | Description |
---|---|
geno |
gaston::bed.matrix returned by readGenoData function |
standardized |
boolean (default TRUE) control if the calculation should be done on the standardized additive genetic matrix. |
list of 2 elements:
- geno_mat : the genetic matrix used for calculating the relationship matrix
- rel_mat : the relationship matrix
Dominance Genomic Matrix
calc_dominance_geno(geno, standardized = TRUE)
Argument | Description |
---|---|
geno |
gaston::bed.matrix returned by readGenoData function |
standardized |
boolean (default TRUE) control if the returned genetic matrix should be standardized. |
The standardization is made with the matrix with entries p/(1 - p), -1, (1 - p)/p according to the values 0, 1, 2 in the genetic matrix, with p the allele frequency (cf. ?gaston::DM)
matrix
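The recoding described above can be sketched in base R on a toy dose matrix. The data are made up and p is assumed to be the frequency of the counted allele, estimated as half the column mean; the package itself relies on gaston for this step.

```r
# Toy allele dose matrix (0, 1, 2): individuals in rows, markers in columns.
geno <- matrix(
  c(0, 1, 2, 2,
    1, 1, 0, 2),
  nrow = 4,
  dimnames = list(paste0("ind", 1:4), c("SNP01", "SNP02"))
)
p <- colMeans(geno) / 2 # assumed allele frequency estimator

dom <- geno
for (j in seq_len(ncol(geno))) {
  pj <- unname(p[j])
  # Values 0, 1, 2 are recoded as p/(1 - p), -1, (1 - p)/p respectively.
  recode <- c(pj / (1 - pj), -1, (1 - pj) / pj)
  dom[, j] <- recode[geno[, j] + 1]
}
print(round(dom, 3))
```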
Dominance Genomic Relationship Matrix calculation
calc_dominance_rel_mat(geno, standardized = TRUE)
Argument | Description |
---|---|
geno |
gaston::bed.matrix returned by readGenoData function |
standardized |
boolean (default TRUE) control if the calculation should be done on the standardized dominance genetic matrix. |
list of 2 elements:
- geno_mat : the genetic matrix used for calculating the relationship matrix
- rel_mat : the relationship matrix
Combined (pedigree + genomic) Relationship Matrix
Correct a pedigree relationship matrix by using genomic relationship matrix.
combinedRelMat(ped_rm, geno_rm, method = "Legarra", tau = NULL, omega = NULL)
Argument | Description |
---|---|
ped_rm |
pedigree relationship matrix (matrix from function pedRelMat) |
geno_rm |
genomic relationship matrix (matrix from function genoRelMat) |
method |
method to use, either "Legarra" or "Martini" |
tau |
tau parameter of Martini's method |
omega |
omega parameter of Martini's method |
matrix
Hiroyoshi Iwata, Julien Diot
Do not show a specific warning message
supThisWarning(expr, warnMessage)
Argument | Description |
---|---|
expr |
expression to evaluate. |
warnMessage |
warning message to catch |
This is based on the base::suppressWarnings function.
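One way to silence a single known warning while letting all others through is a calling handler that muffles only the matching message. The function below is a hypothetical re-implementation for illustration (named `sup_this_warning` to avoid confusion with the package function); whether the package matches the message exactly or by pattern is not stated here, so exact matching is an assumption.

```r
# Hypothetical sketch: muffle only warnings whose message equals warnMessage.
sup_this_warning <- function(expr, warnMessage) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (identical(conditionMessage(w), warnMessage)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Toy function that raises a known warning and returns a value.
noisy <- function() {
  warning("known harmless warning")
  42
}

result <- sup_this_warning(noisy(), "known harmless warning")
print(result)
```

Any warning with a different message is left untouched and reaches the caller as usual.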