What's the format of input data? #1

baif666 · 2021-02-21T09:51:31Z

I couldn't find a detailed description of the input data format in your tutorial. Is there a function we can use to convert our own data to the required format?

Ten thousand SNP genotypes were filtered for high quality, relatively high coverage, and a low number of missing values. Individual metadata and sample ages are provided as separate objects. The ages were defined as the average of the 95.4% date range in calBP provided with the database. The command below shows how to load the data in the R environment and which groups have been included in the data set.

francoio · 2021-02-21T13:58:06Z

Genotypes must be loaded in R as a matrix, with individuals as rows and the genotypes at each locus as columns. Ages must be loaded in R as a numeric vector, with age = 0 corresponding to present-day individuals. The time unit is not important, as the dates are internally calibrated. Missing data are not allowed.
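For readers who want to check their own objects against the description above, here is a minimal base R sketch. The toy data, the 0/1/2 genotype coding, and the helper name `check_input` are assumptions made for illustration only; none of them come from the package.

```r
# Toy data (illustrative only): 4 individuals (rows) x 5 loci (columns).
# The 0/1/2 allele-count coding is an assumption, not stated in this thread.
set.seed(1)
genotype <- matrix(sample(0:2, 20, replace = TRUE), nrow = 4, ncol = 5)

# One age per individual, in the same row order; 0 = present-day individual.
age <- c(4500, 3200, 1100, 0)

# Hypothetical helper: stops with an error if the layout does not match
# the description (numeric matrix, numeric ages, one age per row, no NA).
check_input <- function(genotype, age) {
  stopifnot(
    is.matrix(genotype),
    is.numeric(genotype),
    is.numeric(age),
    nrow(genotype) == length(age),
    !anyNA(genotype),
    !anyNA(age)
  )
  invisible(TRUE)
}

check_input(genotype, age)
```

If the check fails, the usual causes are a data frame where a matrix is expected (`as.matrix()` converts it), remaining `NA` values, or an age vector whose length differs from the number of rows.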

If your data are in ".vcf", ".geno" or ".ped" format, convert them with vcftools or with the R package LEA. LEA's impute() function can also impute the missing genotypes. Send me a private email if you need more on preprocessing/filtering the data; I could send our script.
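As a rough sketch of the LEA route, the function below converts a VCF file, imputes missing genotypes and returns a matrix. It follows the usage pattern in the LEA documentation as I understand it, so please check it against your installed version. The wrapper name `impute_with_lea`, the file name, and the values of `K` and `repetitions` are placeholders chosen for illustration; this is not the authors' preprocessing script.

```r
# Hypothetical wrapper around LEA (Bioconductor); not part of any package.
# K and repetitions are placeholders and should be chosen for your own data.
impute_with_lea <- function(vcf_file, K = 4, repetitions = 5) {
  if (!requireNamespace("LEA", quietly = TRUE)) {
    stop("The LEA package is required (available from Bioconductor).")
  }
  # Convert VCF to LEA's lfmm format (individuals in rows, 9 = missing).
  lfmm_file <- LEA::vcf2lfmm(vcf_file)

  # Fit an ancestry model; impute() uses it to predict missing genotypes.
  project <- LEA::snmf(lfmm_file, K = K, entropy = TRUE,
                       repetitions = repetitions, project = "new")
  best <- which.min(LEA::cross.entropy(project, K = K))

  # Writes "<lfmm_file>_imputed.lfmm" next to the input file.
  LEA::impute(project, lfmm_file, method = "mode", K = K, run = best)

  # Read the imputed genotypes back as a matrix (rows = individuals).
  LEA::read.lfmm(paste0(lfmm_file, "_imputed.lfmm"))
}

# Usage (not run; "mydata.vcf" is a placeholder path):
# genotype <- impute_with_lea("mydata.vcf", K = 4)
```

The returned matrix should already have individuals as rows, but it is worth confirming that no value of 9 (LEA's missing-data code) remains before using it.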

baif666 · 2021-02-22T06:03:54Z

Thanks! I will try LEA. It might be worth adding this to your tutorial to make it clearer.

baif666 closed this as completed Feb 25, 2021

baif666 reopened this Feb 26, 2021
