Welcome to GenomeGit's help page. To execute any of the commands, type: genomegit <command> --options <mandatory_arguments>.
GenomeGit utilizes git and its commands, as well as the following custom commands:
add --threads --no-update --aligner --ggw --additional_args*** <input_file>
Parses <input_file> into git-compatible objects and adds it to the repository. The number of threads used for liftover can
optionally be specified (default is --threads=1 or --t=1). If the optional parameter --no-update is specified, no automatic
coordinate migration of VCF, GFF and SAM files is performed. With the optional parameter --aligner or --a, the user can
choose between MashMap + Nucmer (--aligner=1/--a=1) and Nucmer only (--aligner=2/--a=2). The optional parameter --ggw can be
used when adding a new assembly if at least two versions of an assembly are already present in the repository. This
option ensures that the new assembly is aligned to all the assemblies in the repository, so that the link rearrangement
track in GenomeGitWeb is available between the new assembly and all the previous assemblies.
***Additional arguments can be passed to MashMap, including segment length (--segLength/--s), k-mer size (--kmer/--k), and
percent identity (--pi), so that you can tune the aligner for speed and accuracy. In addition, you can choose to handle merges
by setting the flag --ms to --ms=2.
Nucmer also accepts the --b or --breaklen and --c or --clustermin arguments. All other values are left at their defaults.
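As an illustration of the options above, the following shell sketch assembles an add command line and prints it instead of running it. The file name assembly_v2.fasta and the chosen values are hypothetical; the --option=value form is the one shown for --threads and --aligner above.

```shell
#!/bin/sh
# Dry-run sketch: build a "genomegit add" command line and print it.
# assembly_v2.fasta is a hypothetical input file used only for illustration.
input_file="assembly_v2.fasta"
threads=4     # number of threads for liftover (--threads or --t)
aligner=2     # 1 = MashMap + Nucmer, 2 = Nucmer only

add_cmd="genomegit add --threads=${threads} --aligner=${aligner} ${input_file}"

# Print the command; run the printed line inside a GenomeGit repository to execute it.
echo "$add_cmd"
```

Printing the command first makes it easy to check the option values before starting a potentially long alignment.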
list
Creates a list of the files contained in the repository at a given time for each dataset (Genome, Alignment, Variants and
Annotation). It also reports their size in MB.
get --dataset --sequence --region --commit-hash --message <file_name>
Reconstructs the current version of <file_name> as stored in the repository. To check the names of the files
currently stored in the repository, use genomegit list. The optional parameter --dataset can be used to select all files
contained in a specific dataset (valid input: Genome, Annotation, Alignment and Variants). The optional parameters
--sequence and --region can be used to indicate a sequence name and a region of interest to be extracted. The
region must be specified as two integers separated by a "-". The optional parameters --commit-hash and --message can
be used to retrieve data from a specific version of interest by indicating the commit hash or the commit message,
respectively.
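The region format described above can be checked before calling get. The sketch below is only an illustration: the helper is_valid_region, the sequence name chr1 and the file name are hypothetical, and passing --sequence and --region in --option=value form is an assumption based on the syntax shown for --threads.

```shell
#!/bin/sh
# Return 0 if the argument is two non-negative integers separated by a single "-".
# is_valid_region is a hypothetical helper, not part of GenomeGit.
is_valid_region() {
    case "$1" in
        ""|*[!0-9-]*|-*|*-|*-*-*) return 1 ;;   # empty, bad characters, or misplaced/extra "-"
    esac
    case "$1" in
        *-*) return 0 ;;                        # exactly one "-" between two integers
        *)   return 1 ;;                        # digits only, no separator
    esac
}

region="1000-5000"
sequence="chr1"                 # hypothetical sequence name
file_name="assembly_v2.fasta"   # hypothetical file name

if is_valid_region "$region"; then
    # Assumed --option=value syntax; the command is printed, not executed.
    get_cmd="genomegit get --sequence=${sequence} --region=${region} ${file_name}"
    echo "$get_cmd"
else
    echo "invalid region: $region" >&2
fi
```

The check only covers the format; whether the start must be smaller than the end is not stated above and is not tested here.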
discarded <file_name>
Returns the discarded entries of <file_name> (only available after a repository update).
diff --threads --message --busco --lineage_dataset <version1> <version2>
Shows the genome assembly update summary and produces a report file for the two genome versions specified by the commit
hashes <version1> and <version2>. The optional parameter --threads sets the number of threads to
use if an alignment must be performed; the default is --threads=1 (only considered if an alignment is necessary). The optional
parameter --message can be used if commit messages are provided instead of commit hashes. The optional parameter --busco
can be used to run a BUSCO analysis on both assembly versions. --lineage_dataset can be used alongside --busco to
specify the clade-specific information to be used in the analysis. Without --lineage_dataset, automated lineage
selection is used.
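A comparison of two versions could be prepared as in the sketch below. The helper build_diff_cmd and the hashes a1b2c3d and e4f5a6b are hypothetical placeholders; real hashes can be looked up with genomegit log. The command is printed rather than executed.

```shell
#!/bin/sh
# Dry-run sketch: print a "genomegit diff" command for two commit hashes.
# Usage: build_diff_cmd <version1> <version2> [threads]
# build_diff_cmd is a hypothetical helper, not part of GenomeGit.
build_diff_cmd() {
    version1="$1"
    version2="$2"
    threads="${3:-1}"   # --threads=1 is the documented default
    echo "genomegit diff --threads=${threads} ${version1} ${version2}"
}

# Placeholder hashes for illustration only.
build_diff_cmd a1b2c3d e4f5a6b 4
```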
report --busco --lineage_dataset --commit-hash --message
Creates a report summarising the information contained in the repository at any given moment. The optional parameters
--commit-hash and --message can be used to create a report for a specific version of interest by indicating
the commit hash or the commit message, respectively. The optional parameter --busco can be used to run a BUSCO analysis
on the assembly. --lineage_dataset can be used alongside --busco to specify the clade-specific information
to be used in the analysis. Without --lineage_dataset, automated lineage selection is used.
delete <file_name>
Deletes the specified <file_name> from the staging area. Use this command with extreme caution, as
data loss might occur.
Below you can see the list of the most common git commands that can be used in GenomeGit (other git commands may be used as well):
init                     Initialize a repository.
clone <remote_address>   Clone an existing remote repository.
commit -m <message>      Commit current changes. Commit message must be specified.
status                   Show the status of the current working tree.
log                      Show the commit log.
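A first session combining the git commands above with the custom add command might look like the sketch below. The file name assembly_v1.fasta and the commit message are hypothetical, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Dry-run sketch of a first GenomeGit session; each line is printed, not run.
# assembly_v1.fasta and the commit message are hypothetical.
workflow=$(cat <<'EOF'
genomegit init
genomegit add assembly_v1.fasta
genomegit commit -m "Add first assembly"
genomegit status
genomegit log
EOF
)

echo "$workflow"
```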
For documentation on other git commands, see https://git-scm.com/doc or run "git help".