
Native Installation

Robert J. Gifford edited this page Nov 27, 2024 · 5 revisions

Offline HCV-GLUE

The HCV-GLUE resource can be used "offline" to organise and analyse sequence data on a private computer. Offline HCV-GLUE takes the form of a GLUE project build (a linked dataset and set of analysis functions). This project build can be loaded into an instance of the GLUE engine.

Installing and using Offline HCV-GLUE requires some experience with the Unix command line. Please follow the instructions below to set up and use Offline HCV-GLUE on your computer. A working installation of Offline HCV-GLUE supports a range of analysis functions. Some examples are given below; the GLUE engine website documents how other functions may be accessed.

  1. Install the GLUE engine
  2. Download and install the HCV-GLUE project build
  3. Check the project build is working
  4. Example use: Basic analysis of deep sequencing data
  5. Example use: Combined genotyping and drug resistance analysis reports

Please contact Josh Singer or post on the GLUE support forum with any questions about Offline HCV-GLUE.


Install the GLUE engine

Please follow the GLUE installation instructions. We strongly recommend a Docker-based installation of GLUE.


Download and install the HCV-GLUE project build

Note: This step will erase anything that was previously in your database.

Docker-based GLUE

  1. Make sure the gluetools-mysql container is running.

  2. Run the following Unix command to install the HCV-GLUE project build:

    $ docker exec gluetools-mysql installGlueProject.sh ncbi_hcv_glue
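The two steps above can be combined into one guarded command. The following is a minimal sketch, not part of the official GLUE tooling: the function name `install_hcv_glue` is hypothetical, and it assumes the container is named `gluetools-mysql` exactly as above.

```shell
# Hypothetical helper: run the install only if the gluetools-mysql
# container is currently running.
install_hcv_glue() {
  # "docker ps" lists running containers only. The name filter matches
  # substrings, so grep -x is used to require an exact name match.
  if docker ps --filter "name=gluetools-mysql" --format '{{.Names}}' \
      | grep -qx 'gluetools-mysql'; then
    # Same install command as in step 2 above (erases the existing database).
    docker exec gluetools-mysql installGlueProject.sh ncbi_hcv_glue
  else
    echo "gluetools-mysql is not running; start the container first" >&2
    return 1
  fi
}
```

After defining the function in your shell, run `install_hcv_glue`. It returns a non-zero status without touching the database if the container is not running.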
    
    

Native GLUE

  1. Download ncbi_hcv_glue.sql.gz

  2. Load the build into your MySQL database using a Unix command line, adjusting the details according to your system:

    $ gunzip -c ncbi_hcv_glue.sql.gz | /usr/local/mysql/bin/mysql --user=gluetools --password=glue12345 GLUE_TOOLS
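Because loading the build erases the existing database, it may be worth checking that the download is intact first. The sketch below is one way to do this under stated assumptions: the function name `load_hcv_build` and the `MYSQL_BIN` environment variable are hypothetical, and the database name and user are taken from the command above. It uses `-p` so that the MySQL client prompts for the password instead of taking it on the command line.

```shell
# Hypothetical helper: verify the gzip archive before piping it into MySQL.
# MYSQL_BIN is an assumed override for the path to the mysql client.
load_hcv_build() {
  hcv_build="${1:-ncbi_hcv_glue.sql.gz}"
  hcv_mysql="${MYSQL_BIN:-/usr/local/mysql/bin/mysql}"

  # gzip -t tests archive integrity without writing any output.
  if ! gzip -t "$hcv_build"; then
    echo "Refusing to load: $hcv_build is missing or corrupt" >&2
    return 1
  fi

  # Stream the decompressed SQL into the GLUE_TOOLS database.
  gunzip -c "$hcv_build" | "$hcv_mysql" --user=gluetools -p GLUE_TOOLS
}
```

With the function defined, `load_hcv_build ncbi_hcv_glue.sql.gz` performs the check and then the load. Adjust the user, database name and client path to match your system.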
    
    

Check the project build is working

  1. Download the files resistanceGeno1.fasta and ngsData.bam to a local directory, e.g. /home/fred/glue_data

  2. Start the GLUE command line, ensuring that your local directory is accessible to GLUE:

    Docker-based GLUE

    $ docker run --rm -it --name gluetools -v /home/fred/glue_data:/home/fred/glue_data -w /home/fred/glue_data --link gluetools-mysql cvrbioinformatics/gluetools:latest
    
    

    Native GLUE

    $ cd /home/fred/glue_data
    $ gluetools.sh
    
    
  3. Use list project to check that the HCV project is there.

    Mode path: /
    GLUE> list project
    +======+================================+
    | name | description                    |
    +======+================================+
    | hcv  | Hepatitis C variation analysis |
    +======+================================+
    Projects found: 1

  4. We will use the maxLikelihoodGenotyper module within HCV-GLUE to assign a genotype and subtype for the sequence in the FASTA file. Enter the GLUE commands below and check the output.

    GLUE> project hcv
    OK
    Mode path: /project/hcv
    GLUE> module maxLikelihoodGenotyper genotype file --fileName resistanceGeno1.fasta
    +===========+====================+===================+
    | queryName | genotypeFinalClade | subtypeFinalClade |
    +===========+====================+===================+
    | EF407428  | AL_1               | AL_1a             |
    +===========+====================+===================+
    
    
  5. We will use the phdrFastaSequenceReporter module within HCV-GLUE to translate the NS5A gene of this sequence to amino acids. Enter the GLUE command below and check the output. Use 'Q' to exit the interactive table.

    Mode path: /project/hcv
    GLUE> module phdrFastaSequenceReporter amino-acid -i resistanceGeno1.fasta -r REF_MASTER_NC_004102 -f NS5A -t REF_1a_M62321 -a AL_UNCONSTRAINED
    +==========+=========+==========+==========+===========+===========+===========+
    |codonLabel| queryNt | relRefNt | codonNts | aminoAcid |definiteAas|possibleAas|
    +==========+=========+==========+==========+===========+===========+===========+
    |1         | 6164    | 6258     | TCC      | S         |S          |S          |
    |2         | 6167    | 6261     | GGC      | G         |G          |G          |
    |3         | 6170    | 6264     | TCC      | S         |S          |S          |
    |4         | 6173    | 6267     | TGG      | W         |W          |W          |
    |5         | 6176    | 6270     | CTA      | L         |L          |L          |
    |6         | 6179    | 6273     | AGG      | R         |R          |R          |
    |7         | 6182    | 6276     | GAC      | D         |D          |D          |
    |8         | 6185    | 6279     | ATC      | I         |I          |I          |
    |9         | 6188    | 6282     | TGG      | W         |W          |W          |
    |10        | 6191    | 6285     | GAC      | D         |D          |D          |
    |11        | 6194    | 6288     | TGG      | W         |W          |W          |
    |12        | 6197    | 6291     | ATA      | I         |I          |I          |
    |13        | 6200    | 6294     | TGC      | C         |C          |C          |
    |14        | 6203    | 6297     | GAG      | E         |E          |E          |
    |15        | 6206    | 6300     | GTG      | V         |V          |V          |
    |16        | 6209    | 6303     | TTG      | L         |L          |L          |
    |17        | 6212    | 6306     | AGC      | S         |S          |S          |
    |18        | 6215    | 6309     | GAC      | D         |D          |D          |
    |19        | 6218    | 6312     | TTT      | F         |F          |F          |
    +==========+=========+==========+==========+===========+===========+===========+
    Rows 1 to 19 of 448 [F:first, L:last, P:prev, N:next, Q:quit]
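The amino-acid table from step 5 can be collapsed into a plain residue string for a quick visual check. This is a rough sketch that assumes the table text has been copied from the console into a file; the function name `aa_string` is hypothetical, and the column layout is taken from the output shown above. Only the rows present in the saved text are used, so a single page of the interactive table yields only those residues.

```shell
# Hypothetical helper: concatenate the aminoAcid column of a saved
# phdrFastaSequenceReporter table into one string.
aa_string() {
  awk -F'|' '
    # Data rows split into 9 fields on "|": an empty leading field,
    # the seven columns, and an empty trailing field. Border lines and
    # the "Rows 1 to ..." footer contain no "|" and are skipped.
    NF == 9 && $2 !~ /codonLabel/ {
      gsub(/ /, "", $6)      # strip padding around the aminoAcid cell
      printf "%s", $6
    }
    END { print "" }         # finish with a newline
  ' "$1"
}
```

For example, if the 19 rows shown above are saved as `ns5a_table.txt` (a hypothetical file name), `aa_string ns5a_table.txt` prints `SGSWLRDIWDWICEVLSDF`.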
    
    

Example use: Basic analysis of deep sequencing data

Offline HCV-GLUE can be used for minority variant analysis of deep sequencing data in the form of SAM/BAM files. We assume here that a suitable sequence assembly method has been used to generate the SAM/BAM file from the raw sequencing data.

We will use the phdrSamReporter module within HCV-GLUE to translate the reads in the file that map to the NS5A gene. The function reports the relative frequencies of the different amino acid residues observed at each codon position within this gene.

Enter the GLUE command below and check the output. Use 'Q' to exit the interactive table.

GLUE> module phdrSamReporter amino-acid -i ngsData.bam -r REF_MASTER_NC_004102 -f NS5A -p -a AL_UNCONSTRAINED
+============+==========+==========+===========+=============+============+
| codonLabel | samRefNt | relRefNt | aminoAcid | readsWithAA | pctAaReads |
+============+==========+==========+===========+=============+============+
| 1          | 6147     | 6258     | Y         | 1           | 0.32       |
| 1          | 6147     | 6258     | S         | 314         | 99.37      |
| 1          | 6147     | 6258     | P         | 1           | 0.32       |
| 2          | 6150     | 6261     | S         | 1           | 0.32       |
| 2          | 6150     | 6261     | R         | 1           | 0.32       |
| 2          | 6150     | 6261     | G         | 313         | 99.37      |
| 3          | 6153     | 6264     | S         | 315         | 100.00     |
| 4          | 6156     | 6267     | W         | 315         | 100.00     |
| 5          | 6159     | 6270     | L         | 314         | 100.00     |
| 6          | 6162     | 6273     | R         | 312         | 99.36      |
| 6          | 6162     | 6273     | K         | 1           | 0.32       |
| 6          | 6162     | 6273     | G         | 1           | 0.32       |
| 7          | 6165     | 6276     | D         | 314         | 100.00     |
| 8          | 6168     | 6279     | I         | 274         | 100.00     |
| 9          | 6171     | 6282     | W         | 274         | 100.00     |
| 10         | 6174     | 6285     | D         | 253         | 100.00     |
| 11         | 6177     | 6288     | *         | 1           | 0.40       |
| 11         | 6177     | 6288     | W         | 252         | 99.60      |
| 12         | 6180     | 6291     | I         | 197         | 100.00     |
+============+==========+==========+===========+=============+============+
Rows 1 to 19 of 753 [F:first, L:last, P:prev, N:next, Q:quit]
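To pick out residues above a chosen read percentage from a table like the one above, the text can be filtered with awk. This is a minimal sketch, assuming the table has been copied from the console into a file; the function name `filter_by_pct` is hypothetical, and the column positions are taken from the output shown above.

```shell
# Hypothetical helper: print codonLabel, aminoAcid and pctAaReads
# (tab-separated) for rows whose pctAaReads is at least the given minimum.
# Usage: filter_by_pct MIN_PCT TABLE_FILE
filter_by_pct() {
  awk -F'|' -v min="$1" '
    # Data rows split into 8 fields on "|": an empty leading field,
    # the six columns, and an empty trailing field.
    NF == 8 && $2 !~ /codonLabel/ {
      gsub(/ /, "", $2)      # codonLabel
      gsub(/ /, "", $5)      # aminoAcid
      gsub(/ /, "", $7)      # pctAaReads
      if ($7 + 0 >= min + 0) print $2 "\t" $5 "\t" $7
    }
  ' "$2"
}
```

For example, `filter_by_pct 15 ngs_table.txt` (with `ngs_table.txt` a hypothetical file holding the saved table) lists only residues seen in at least 15% of reads at their position. Only the rows present in the saved text are considered.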

Example use: Combined genotyping and drug resistance analysis reports

Offline HCV-GLUE contains modules which run combined genotyping and drug resistance analysis procedures, and generate detailed reports.

This first example reads the FASTA file resistanceGeno1.fasta and produces an HTML report file, resistanceGeno1.html. Click here for a preview of the report.

Mode path: /project/hcv
GLUE> module phdrReportingController invoke-function reportFastaAsHtml resistanceGeno1.fasta resistanceGeno1.html

The second example reads the BAM file ngsData.bam and produces a new HTML report, ngsData.html, using a minimum threshold of 15% of reads for reporting polymorphisms. Click here for a preview of the report.

Mode path: /project/hcv
GLUE> module phdrReportingController invoke-function reportBamAsHtml ngsData.bam 15.0 ngsData.html
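When several FASTA files need reports, the corresponding GLUE commands can be generated rather than typed by hand. The sketch below only prints command text in the form used above; the function name `glue_report_commands` is hypothetical, and it assumes each input file name ends in `.fasta`. How best to feed these commands to GLUE in bulk is not covered here; entering them at the `GLUE>` prompt in the /project/hcv mode is the approach shown on this page.

```shell
# Hypothetical helper: print one reportFastaAsHtml command per FASTA file,
# naming each HTML report after its input file.
glue_report_commands() {
  for fasta in "$@"; do
    # ${fasta%.fasta} strips the .fasta suffix to build the report name.
    printf 'module phdrReportingController invoke-function reportFastaAsHtml %s %s.html\n' \
      "$fasta" "${fasta%.fasta}"
  done
}
```

For example, `glue_report_commands *.fasta` run in the data directory prints one command line for each FASTA file found there.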

HCV-GLUE can also generate files containing genotyping and drug resistance reporting data in a machine-readable format (XML or JSON). These could be used, for example, for integration into a bioinformatics pipeline. Contact the team if this is of interest.