diff --git a/episodes/01-background.md b/episodes/01-background.md
index bc74905f..da7d331e 100644
--- a/episodes/01-background.md
+++ b/episodes/01-background.md
@@ -43,7 +43,7 @@ We are going to use a long-term sequencing dataset from a population of *Escheri
 
 ### View the metadata
 
-We will be working with three sample events from the **Ara-3** strain of this experiment, one from 5,000 generations, one from 15,000 generations, and one from 50,000 generations. The population changed substantially during the course of the experiment, and we will be exploring how (the evolution of a **Cit+** mutant and **hypermutability**) with our variant calling workflow. The metadata file associated with this lesson can be [downloaded directly here](https://raw.githubusercontent.com/datacarpentry/wrangling-genomics/gh-pages/files/Ecoli_metadata_composite.csv) or [viewed in Github](https://github.com/datacarpentry/wrangling-genomics/blob/gh-pages/files/Ecoli_metadata_composite.csv). If you would like to know details of how the file was created, you can look at [some notes and sources here](https://github.com/datacarpentry/wrangling-genomics/blob/gh-pages/files/Ecoli_metadata_composite_README.md).
+We will be working with three sample events from the **Ara-3** strain of this experiment, one from 5,000 generations, one from 15,000 generations, and one from 50,000 generations. The population changed substantially during the course of the experiment, and we will be exploring how it changed (the evolution of a **Cit+** mutant and **hypermutability**) with our variant calling workflow. The metadata file associated with this lesson can be [downloaded directly here](files/Ecoli_metadata_composite.csv) or [viewed on GitHub](https://github.com/datacarpentry/wrangling-genomics/blob/main/episodes/files/Ecoli_metadata_composite.csv). If you would like to know details of how the file was created, you can look at [some notes and sources here](https://github.com/datacarpentry/wrangling-genomics/blob/main/episodes/files/Ecoli_metadata_composite_README.md).
 
 This metadata describes information on the *Ara-3* clones and the columns represent:
 
@@ -81,8 +81,6 @@ Based on the metadata, can you answer the following questions?
 2. 62 rows, 12 columns
 3. 10 citrate+ mutants
 4. 6 hypermutable mutants
-
-
 
 :::::::::::::::::::::::::
 
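The counts in the solution above (10 citrate+ mutants, 6 hypermutable mutants) can be spot-checked from the shell once the metadata file has been downloaded. Below is a minimal sketch, not part of the episode itself; it assumes the composite CSV uses header names like `cit` and `mutator` and the value `plus` to flag those clones, so check the header row first, since the real column names and encodings may differ.

```bash
# Print the header row so you can confirm the real column names before counting.
head -n 1 Ecoli_metadata_composite.csv

# Count how many data rows have a given value in a named column.
# Usage: count_column_value <csv> <column_name> <value>
count_column_value () {
  awk -F',' -v col="$2" -v val="$3" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
    c && $c == val { n++ }
    END { print n + 0 }
  ' "$1"
}

count_column_value Ecoli_metadata_composite.csv cit plus      # citrate-using clones (assumed column/value)
count_column_value Ecoli_metadata_composite.csv mutator plus  # hypermutable clones (assumed column/value)
```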
diff --git a/episodes/05-automation.md b/episodes/05-automation.md
index 9fc16cb9..5e6057fd 100644
--- a/episodes/05-automation.md
+++ b/episodes/05-automation.md
@@ -19,7 +19,7 @@ exercises: 15
 
 ## What is a shell script?
 
-You wrote a simple shell script in a [previous lesson](https://www.datacarpentry.org/shell-genomics/05-writing-scripts/) that we used to extract bad reads from our
+You wrote a simple shell script in a [previous lesson](https://www.datacarpentry.org/shell-genomics/05-writing-scripts) that we used to extract bad reads from our
 FASTQ files and put them into a new file.
 
 Here is the script you wrote:
@@ -76,7 +76,6 @@ above, and during the 'for' loop lesson). Assign any name and the value using
 the assignment operator: '='. You can check the current
 definition of your variable by typing into your script: echo $variable\_name.
 
-
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
 In this lesson, we will use two shell scripts to automate the variant calling analysis: one for FastQC analysis (including creating our summary file), and a second for the remaining variant calling. To write a script to run our FastQC analysis, we will take each of the commands we entered to run FastQC and process the output files and put them into a single file with a `.sh` extension. The `.sh` is not essential, but serves as a reminder to ourselves and to the computer that this is a shell script.
@@ -229,10 +228,10 @@ replace SRR2584866_fastqc/Icons/fastqc_icon.png? [y]es, [n]o, [A]ll, [N]one, [r]
 
 We can extend these principles to the entire variant calling workflow. To do this, we will take all of the individual commands that we wrote before, put them into a single file, add variables so that the script knows to iterate through our input files and write to the appropriate output files. This is very similar to what we did with our `read_qc.sh` script, but will be a bit more complex.
 
-Download the script from [here](https://raw.githubusercontent.com/datacarpentry/wrangling-genomics/gh-pages/files/run_variant_calling.sh). Download to `~/dc_workshop/scripts`.
+Download the script from [here](files/run_variant_calling.sh). Download it to `~/dc_workshop/scripts`.
 
 ```bash
-curl -O https://raw.githubusercontent.com/datacarpentry/wrangling-genomics/gh-pages/files/run_variant_calling.sh
+curl -O https://datacarpentry.org/wrangling-genomics/files/run_variant_calling.sh
 ```
 
 Our variant calling workflow has the following steps:
@@ -408,7 +407,6 @@ It is a good idea to add comments to your code so that you (or a collaborator) c
 Look through your existing script. Discuss with a neighbor where you should add comments.
 Add comments (anything following a `#` character will be interpreted as a comment, bash will not try to run these comments as code).
 
-
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
 Now we can run our script:
@@ -445,8 +443,6 @@ For SRR2589044 from generation 5000 there were 10 mutations, for SRR2584863 from
 and SRR2584866 from generation 50000 there were 766 mutations. In the last generation, a hypermutable
 phenotype had evolved, causing this strain to have more mutations.
 
-
-
 :::::::::::::::::::::::::
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
@@ -459,7 +455,6 @@ If you have time after completing the previous exercise, use `run_variant_callin
 on the full-sized trimmed FASTQ files. You should have a copy of these already in `~/dc_workshop/data/trimmed_fastq`,
 but if you do not, there is a copy in `~/.solutions/wrangling-solutions/trimmed_fastq`. Does the number of variants change per sample?
 
-
 ::::::::::::::::::::::::::::::::::::::::::::::::::
 
 :::::::::::::::::::::::::::::::::::::::: keypoints
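The per-sample mutation counts quoted in the solution above (10, 25, and 766) come from the VCF files that `run_variant_calling.sh` writes. One way to tally them, as a minimal sketch assuming the final VCFs land in `~/dc_workshop/results/vcf/` with a `_final_variants.vcf` suffix (adjust the path and suffix to whatever your copy of the script actually writes):

```bash
# Count variant records in each final VCF.
# Lines starting with "#" are VCF header lines, not variants, so exclude them.
for vcf in ~/dc_workshop/results/vcf/*_final_variants.vcf
do
    echo "$(basename "$vcf"): $(grep -c -v "^#" "$vcf") variants"
done
```

Running the same loop after repeating the workflow on the full-sized trimmed FASTQ files makes it easy to see whether the number of variants per sample changes.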