What's the format of input data? #1

Open
baif666 opened this issue Feb 21, 2021 · 2 comments
Comments

@baif666

baif666 commented Feb 21, 2021

I couldn't find a detailed description of the input data format in your tutorial. Is there a function we can use to convert our own data to the required format?

Ten thousand SNP genotypes were filtered for high quality, relatively high coverage, and a low number of missing values. Individual metadata and sample ages are provided as separate objects. The ages were defined as the averages of the 95.4% date ranges in cal BP provided with the database. The command below shows how to load the data in the R environment and which groups have been included in the data set.

@francoio
Contributor

Genotypes must be loaded in R as a matrix, with individuals as rows and the genotypes at each locus as columns. Ages must be loaded in R as a numeric vector, with ages = 0 corresponding to present-day individuals. The time unit is not important, as the dates are calibrated internally. Missing data are not allowed.
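As a rough illustration of the layout described above, here is a minimal sketch in base R. The helper `check_input()` is hypothetical and not part of the package; the 0/1/2 genotype coding, the toy dimensions, and the non-negative ages are assumptions made only for the example.

```r
# Hypothetical helper (not from the package): checks the layout described above.
check_input <- function(genotypes, ages) {
  stopifnot(
    is.matrix(genotypes), is.numeric(genotypes),  # individuals in rows, loci in columns
    is.numeric(ages),
    length(ages) == nrow(genotypes),              # one age per individual
    !anyNA(genotypes), !anyNA(ages),              # no missing data allowed
    all(ages >= 0)                                # assumption: 0 = present day, older samples > 0
  )
  invisible(TRUE)
}

# Toy data: 6 individuals, 10 loci, genotypes coded 0/1/2 (assumed coding).
set.seed(1)
n_ind <- 6
n_loci <- 10
genotypes <- matrix(sample(0:2, n_ind * n_loci, replace = TRUE),
                    nrow = n_ind, ncol = n_loci)

# Two present-day individuals (age 0) and four ancient samples.
ages <- c(0, 0, 1500, 3200, 4800, 7000)

check_input(genotypes, ages)
```

If the check fails, the error message from `stopifnot()` names the first condition that was violated, which is usually enough to see whether the matrix is transposed or still contains missing values.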

If your data are in ".vcf", ".geno" or ".ped" format, convert them with vcftools or with the R package LEA. LEA's impute() function can also impute missing genotypes. Send me a private email if you need more on preprocessing or filtering the data; I could send our script.
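For the LEA route, a conversion plus imputation could look roughly like the sketch below. This is not the script mentioned above; the wrapper name `vcf_to_imputed_matrix()`, the file paths, and the values of `K` and `repetitions` are placeholders, and the name of the imputed output file follows the pattern used in LEA's own examples, so please check it against your installed version.

```r
# Hypothetical wrapper; paths, K and repetitions are placeholders to adapt.
vcf_to_imputed_matrix <- function(vcf_path, K = 3, repetitions = 5) {
  if (!requireNamespace("LEA", quietly = TRUE)) {
    stop("The LEA package (Bioconductor) is required for this sketch.")
  }

  # Convert the VCF file to LEA's lfmm format (individuals x loci, 9 = missing).
  lfmm_path <- sub("\\.vcf$", ".lfmm", vcf_path)
  LEA::vcf2lfmm(vcf_path, output.file = lfmm_path)

  # Fit ancestry models; impute() needs an snmf project to predict genotypes.
  project <- LEA::snmf(lfmm_path, K = K, entropy = TRUE,
                       repetitions = repetitions, project = "new")

  # Use the run with the lowest cross-entropy for imputation.
  best <- which.min(LEA::cross.entropy(project, K = K))
  LEA::impute(project, lfmm_path, method = "mode", K = K, run = best)

  # Assumption: impute() writes "<input>_imputed.lfmm" next to the input file.
  LEA::read.lfmm(paste0(lfmm_path, "_imputed.lfmm"))
}
```

Usage would be something like `genotypes <- vcf_to_imputed_matrix("my_data.vcf", K = 3)`, where `my_data.vcf` is a placeholder file name. The returned matrix has individuals as rows and loci as columns, which matches the layout required above.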

@baif666
Author

baif666 commented Feb 22, 2021

Thanks! I will try LEA. I think adding this to your tutorial would make it clearer.

@baif666 baif666 closed this as completed Feb 25, 2021
@baif666 baif666 reopened this Feb 26, 2021