Add new SuppMaterial_3 returning missing geno_key
Add the SuppMaterial_3 file, which outputs the missing geno_key file linking genotypes to their family IDs and to their individual IDs within each family.
jinkog authored Jul 8, 2024
1 parent ad2bd2a commit c1598f3
Showing 1 changed file with 21 additions and 3 deletions.
24 changes: 21 additions & 3 deletions SimRVseq/SupplementaryMaterial_3.Rmd
@@ -71,7 +71,8 @@ We start by reading the SLiM simulation output, \texttt{SLiM\_output.txt}, into
library(Matrix) #this package is required throughout this document
# Read the text file to R.
# Note: Change the path for the file as necessary.
-exData <- readLines("D:/SFU_Vault/SLiM_Output/SLiM_output.txt")
+#exData <- readLines("D:/SFU_Vault/SLiM_Output/SLiM_output.txt")
+exData <- readLines("../Zenodo/SLiM_output.txt")
```

Next, we select rare SNVs based on their population derived (mutated) allele frequencies, as described in the next subsection.
@@ -328,7 +329,8 @@ To identify the RVs that lie on the pathway of interest, we use the \texttt{iden
# Load the output generated from the previous code chunk.
# Note: Change the path for the file as necessary.
-load("Chromwide.Rdata")
+#load("Chromwide.Rdata")
+load("../Zenodo/Chromwide.Rdata")
#----------------------#
# Identify Pathway SNVs #
@@ -821,7 +823,8 @@ Familial cRVs are sampled on the basis of their population derived-allele freque
```{r}
# Load all 150 pedigrees.
# Note: Change the path for the file as necessary.
-study_peds <- read.table("study_peds.txt", header=TRUE, sep= " ")
+#study_peds <- read.table("study_peds.txt", header=TRUE, sep= " ")
+study_peds <- read.table("../Zenodo/study_peds.txt", header=TRUE, sep= " ")
# Collect list of FamIDs.
FamIDs <- unique(study_peds$FamID)
@@ -1312,6 +1315,21 @@ for(i in 1:22){
}
```

+Next, we create a dataframe of IDs to link genotypes to individuals. The IDs for each RV-haplotype are in the dataframe \texttt{haplo\_map} returned by the chromosome-by-chromosome gene drop. We save the IDs for each genotype in a dataframe called \texttt{geno\_key},
+which has one row per genotype, in the same order as the rows of the \texttt{.geno} files, and columns for the family ID and the individual ID.
+
+```{r}
+odd_inds <- seq(from=1,to=nrow(study_seq[[1]]$haplo_map),by=2)
+geno_key <- study_seq[[1]]$haplo_map[odd_inds,c("FamID","ID")]
+```
+
+We then write the \texttt{geno\_key} dataframe to a text file, \texttt{geno\_key.txt}. The text file can be found in our Zenodo repository.
+
+```{r,eval=FALSE}
+write.table(geno_key, "geno_key.txt", row.names=FALSE, quote = FALSE)
+```
+
+
## \texttt{.var} files

A \texttt{.var} file contains information about the RVs in the columns of the associated \texttt{.geno} file.
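
As an illustration of the \texttt{geno\_key} step earlier in this hunk, the following is a minimal sketch on a toy stand-in for \texttt{study\_seq[[1]]\$haplo\_map}. It assumes that each individual occupies two consecutive haplotype rows with identical \texttt{FamID} and \texttt{ID}, which is what selecting the odd-numbered rows relies on. The toy values, the temporary file and the name \texttt{geno\_key\_in} are made up for the example and are not part of the commit.

```r
# Toy stand-in for study_seq[[1]]$haplo_map (hypothetical values):
# two consecutive haplotype rows per individual.
haplo_map <- data.frame(
  FamID = c(1, 1, 1, 1, 2, 2),
  ID    = c(1, 1, 2, 2, 1, 1)
)

# Keep the first haplotype row of each individual, as in the chunk above.
odd_inds <- seq(from = 1, to = nrow(haplo_map), by = 2)
geno_key <- haplo_map[odd_inds, c("FamID", "ID")]

# Sanity check of the assumption: the dropped (even) rows carry the same IDs.
even_inds <- odd_inds + 1
stopifnot(all(haplo_map$FamID[even_inds] == geno_key$FamID),
          all(haplo_map$ID[even_inds] == geno_key$ID))

# Round trip through a text file with the same write.table() arguments.
key_file <- tempfile(fileext = ".txt")
write.table(geno_key, key_file, row.names = FALSE, quote = FALSE)
geno_key_in <- read.table(key_file, header = TRUE)
stopifnot(nrow(geno_key_in) == nrow(geno_key))
```

Under that assumption, row $i$ of \texttt{geno\_key} labels row $i$ of each genotype matrix. The check on the even-numbered rows is a cheap guard that could also be run on the real \texttt{haplo\_map} before writing the key.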