NanoVar Changelog
Release Summary:
Version 1.5.1 - Dec *, 2023
* BND SVs are now represented as one entry in VCF
* BND SVs now have an additional INFO attribute, CHR2, that records the name of the second chromosome of the BND
* Fixed cytocad import bug
* Fixed intersect_debug bug
* Fixed SVLEN float value bug
* Fixed missing FASTA entries bug
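
As a rough illustration of reading the new CHR2 attribute, the sketch below splits a VCF INFO string into a dictionary and looks the attribute up. The helper name parse_info and the example INFO string are hypothetical and made up for illustration; they are not taken from NanoVar or from real NanoVar output.

```python
def parse_info(info):
    """Split a VCF INFO string such as 'A=1;B=2;FLAG' into a dict."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Flag-type entries carry no '=' and are stored as True.
        fields[key] = value if sep else True
    return fields


# Hypothetical INFO column of a single-entry BND record (illustrative only).
example_info = "SVTYPE=BND;CHR2=chr5"
info = parse_info(example_info)

# .get() returns None for records written before CHR2 existed.
print(info["SVTYPE"], info.get("CHR2"))
```
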
Version 1.5.0 - Sept 8, 2023
* Fixed font warning during report generation
* NanoVar no longer relies on HS-BLASTn (This should fix multiple run errors)
* Suppressed keras tensorflow model warning
* Fixed nv_detect_algo import error due to Cython 3.0.2 Namespace update
* Reworked DUP calling to increase accuracy (Previously INS miscalled as DUP)
* Added MAPQ score filter against alignments with 0 < MAPQ < 30
* HTML report is now a standalone file
* ins_seq.fa is now output by default; it contains the inserted sequences of all INS SVs
* Removed ">" in SVLEN for one-sided INS SVs
* Temporarily removed option to run CNV through CopyCAD (work in progress)
* Removed progress spinner and modified stdout status reporting
* Output no longer produces total.vcf file, unless --debug specified
* genome.sizes removed from output
* Modified paired INV breakends size calculation to that of single INV breakend (coord2-coord1)
* INS size adjusted (Taking median INS size across reads, instead of largest size)
* Simplified verbose run status output
* Changed SVLEN placeholder for translocations and transpositions from '.' to 0.
* Changed translocation abbreviation from TLO to TRA
* Modified clustering from mean to median SV coordinates amongst reads
* Added NanoVar execution command into VCF header
* Included svtype in clustering filter
* Improved DUP calling by sequence mapping
* Improved SV genotyping
* Improved phasing of SVs sharing the same coordinates
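
The switch from the largest to the median INS size noted above can be pictured with a small sketch. This is only an illustration of why a median is less sensitive to a single outlier read, not NanoVar's actual code; the function name summarize_sizes and the read sizes are hypothetical.

```python
from statistics import median


def summarize_sizes(sizes):
    """Return (largest, median) of the per-read insertion sizes."""
    return max(sizes), median(sizes)


# Hypothetical insertion sizes (bp) reported by the reads supporting one INS.
read_ins_sizes = [298, 301, 305, 310, 950]

largest, typical = summarize_sizes(read_ins_sizes)

# The single 950 bp read dominates the maximum but barely moves the median.
print(largest, typical)
```

The same reasoning applies to taking the median rather than the mean of SV coordinates during clustering: one misplaced read shifts a mean but usually not a median.
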
Version 1.4.1 - Oct 8, 2021
* Bump version for Bioconda build for various Python versions
Version 1.4.0 - Sept 1, 2021
* Implemented a large cytogenetic variation detection algorithm through CytoCAD (Add the parameter "--cnv hg38" during run)
* Added LINE (L1) and SINE (Alu) novel insertion detection functionality (NanoVar screens the sequence of INS SVs
for L1 and Alu elements and outputs the results in the INFO column of the VCF file, e.g. TE=L1HS)
* Updated curated hg38 filter file (added all N regions)
* Expanded the CIGAR reading values to include '=' and 'X'
* Improved breakpoint clustering algorithm and rectified bugs
* Modified setup.py to state compatibility with python3.8
* Fixed Numpy VisibleDeprecationWarning in nv_report.py
* Added '--pickle' argument for debugging purposes (Hidden option)
* Added '--archivefasta' argument for debugging purposes (Hidden option)
* Added '--blastout' argument for debugging purposes (Hidden option)
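
To show why '=' and 'X' matter when reading CIGAR strings, here is a minimal sketch that tallies how many reference and read bases a CIGAR covers. It follows the operation semantics of the SAM specification ('=' and 'X' consume both reference and read, like 'M'); the function name cigar_lengths is hypothetical and this is not NanoVar's own parser.

```python
import re

# Operations that advance along the reference / the read, per the SAM specification.
REF_OPS = set("MDN=X")
QUERY_OPS = set("MIS=X")
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_lengths(cigar):
    """Return (reference_length, query_length) covered by a CIGAR string."""
    ref_len = query_len = 0
    for count, op in CIGAR_TOKEN.findall(cigar):
        n = int(count)
        if op in REF_OPS:
            ref_len += n
        if op in QUERY_OPS:
            query_len += n
    return ref_len, query_len


# 10 matches, 1 mismatch, 5 bp insertion, 3 bp deletion, 4 bp soft clip.
print(cigar_lengths("10=1X5I3D4S"))  # (14, 20)
```

A parser that only recognised 'M' would skip the '=' and 'X' runs above and underestimate both lengths.
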
Version 1.3.9 - Mar 24, 2021
* Fixed nv_detect_algo insertion and deletion large size bug
* Added pysam >=0.15.3 into bioconda meta.yaml as prerequisite
* Added pybedtools >=0.8.2 prerequisite to fix the RuntimeWarning buffering=1 error (Refer to https://github.com/daler/pybedtools/issues/322)
* Prevent repeated read-indexes by adjusting seed (Thanks to Geoffrey Woodland)
* Improve read cluster exception message (Thanks to Geoffrey Woodland)
* Unique ID of breakpoints identified by BLAST shortened to four characters to prevent mixing with minimap2 breakpoints
* Adjusted breakend filtering during mm clustering
* Improved breakpoint clustering algorithm to increase accuracy
* Added newline to last line of genome.sizes file
* Added genome check for BAM (Thanks to oneillkza, https://github.com/cytham/nanovar/issues/19#issuecomment-791599629)
* Modified argparse "usage" format
* Suppressed BAM index missing warning
* Suppressed Tensorflow INFO and WARNING logs
* Migrated to tensorflow-cpu/tensorflow-mkl to prevent cuda_driver.cc error
* Fixed FixedLocator warning
Version 1.3.8 - May 24, 2020
* Fixed file type detection (Thanks to jiadong324, https://github.com/cytham/nanovar/issues/9#issuecomment-626579853)
* Fixed negative coordinates in VCF
Version 1.3.7 - May 23, 2020
* Changed version import approach in setup.py
* All SV classes except deletions now undergo secondary analysis by hsblast alignment
* Allowed clustering of Nov_Ins to other SV classes
* Added depth of coverage information in VCF header
* Fixed SV index duplications
* Added BND limitation in README.md
* Removed "number_of_maps" from confidence score equation in nv_nn.py to capture SVs consisting of repetitive elements
Version 1.3.6 - Apr 18, 2020
* Fixed Exception bug in nv_cluster.py (Thanks to jiadong324, https://github.com/cytham/nanovar/issues/9#issuecomment-609146494)
* Added mincov filter early in nv_cluster.py for faster computation
* Re-ordered writing of debug files in nv_characterize.py
* Fixed novel insertion length bug in nv_cluster.py
* Fixed duplication coord directional issue in nv_cluster.py
* Size correction for INV and DUP SVs in nv_vcf.py
* Add 400 bp buffer to in-built gap files (0-based)
* Fixed bug in breakend record in nv_vcf.py
* Added output table in README.md
Version 1.3.5 - Apr 1, 2020
* Fixed deletion sv length threshold and double negative bug in nv_vcf.py
* Changed relative input paths to full paths in nv_vcf.py
* Fixed left and right coord allocations in nv_cluster.py
* Corrected nv_input.py typo
* Corrected README.md typo
Version 1.3.4 - Mar 19, 2020
* Fixed missing nanovar script
Version 1.3.3 - Mar 19, 2020
* Upgraded model (ONT (Guppy) - v1, PacBio (CLR or CCS) - v1)
* Modified normal read breakpoint buffer from 400 to 100 to include shorter alignments as breakend-opposing reads
* Survey invalid symbols in contig ids and ignore reads mapping to these contigs (Thanks to Simone, https://github.com/cytham/nanovar/issues/6#issuecomment-595851018)
* Added a new debug file 'detect.tsv'
* Removed blast table intermediate file when not debugging
* Added limitation section in README.md
Version 1.3.2 - Mar 4, 2020
* Fixed VCF header unclosed quotes (Thanks to Scott, https://github.com/cytham/nanovar/issues/5#issuecomment-592961341)
* Improved clustering algorithm (Clustering left and right breakends of each SV separately)
* Updated README.md
Version 1.3.1 - Feb 29, 2020
* Fixed cython compilation bug (Thanks to Scott, https://github.com/cytham/nanovar/issues/5#issuecomment-592766203)
* Updated requirements.txt
Version 1.3.0 - Feb 28, 2020
* Tool requirement changes:
    * Added SAMTools
    * Added Minimap2
* Algorithm and pipeline changes:
    * NanoVar now uses both minimap2 and hs-blastn for increased sensitivity
    * Upgraded model (v5)
    * Improved time complexity for clustering algorithm
    * Minor tuning of SV confidence scores by normal read coverage and number of alignments
    * Default SV read support -c changed from 1 to 2
    * Added progress information command-line feed during the run.
* Input/Output changes:
    * NanoVar can now take a BAM file (tested BAM from minimap2) as input instead of FASTQ/FASTA reads
    * Added ONT/PacBio option (For future use, currently both technologies use the same model)
    * Added custom model option
    * Added --debug option
    * Added Minimap2 executable path option
    * Added SAMTools executable path option
* VCF format changes:
    * "END" coordinate for BND SVs now shows the POS+1 coordinate instead of "." to be usable in IGV (Thanks to Scott,
      https://github.com/cytham/nanovar/issues/5#issue-559151039)
Version 1.2.7 - Dec 15, 2019
* Upgraded model (v4)
* Changed default score threshold to 1.0
* Added --mincov argument
* Added genotype prediction thresholds --homo --hetero
* Fixed make_interp_spline bug (Thanks to Asma, https://github.com/cytham/nanovar/issues/1#issue-536971382)
* Fixed logging oversize
* Fixed HTML figures and file path link
* Disabled sorting by bitscore
Version 1.2.6 - Nov 28, 2019
* Fixed svread-overlap.tsv file formatting in nv_nn.py
* Fixed operation bug in nv_valid.py l201
Version 1.2.5 - Nov 25, 2019
* Python 3.5 is no longer supported
Version 1.2.4 - Nov 25, 2019
* Replaced keras with tf.keras (>=2.0.0) to avoid compatibility errors
* Fixed progress spinner overflow by adding boolean
Version 1.2.3 - Nov 25, 2019
* Fixed progress spinner by increasing sleep time
Version 1.2.2 - Nov 25, 2019
* Updated dependency versions
* Changed spline in nv_cov_upper to make_interp_spline due to scipy update
Version 1.2.1 - Nov 24, 2019
* Added running progress spinner
* Added bedtools as a pre-requisite for pybedtools
* Added installation of dependencies if using conda
* Updated MANIFEST.in
* Updated README.md, added badges
* Tested in Python 3.5, 3.6, 3.7
Version 1.2-alpha - Nov 21, 2019
* Migrated the program entirely to Python 3.7
* Distributed as a python PyPI package and a conda package
* Added requirements of Blast binaries from NCBI-BLAST Version 2.3.0+ and HS-BLASTN v0.0.5+
* Added new parameters: --minalign, --buffer, --force
* Changed input FASTQ/FASTA, reference genome and working directory to positional arguments
* Changed default minimum SV len to 25 bp
* Removed short-read support and bowtie2 requirements
* Improved VCF file formatting
* Added FORMAT column
* Added genotype, read-depth and allele read-depth information
* For BND SV type, ALT column now shows a breakend record as specified in VCFv4.2
* For BND SV type, INFO field "SV2" is added to indicate translocation (TLO) or transposition (TPO)
* Changed INFO field naming: LCOV to SR, PROB to NN
* Removed INFO fields: SCOV, SVRATIO
* SV len is now estimated for tandem duplications
* FILTER column now shows "PASS" if SV score above/equal to score threshold, or "FAIL" if SV score below score threshold
* Replaced read_name in each SV ID with an arbitrary SV number
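
The FILTER rule described above (PASS when the SV score is above or equal to the score threshold, FAIL otherwise) amounts to a single comparison. The sketch below is only an illustration of that rule; the function name filter_label and the example scores and threshold are hypothetical.

```python
def filter_label(score, threshold):
    """Return the FILTER value for an SV given its score and the cutoff."""
    # A score exactly equal to the threshold still passes.
    return "PASS" if score >= threshold else "FAIL"


# Hypothetical SV scores checked against an arbitrary example threshold.
for score in (0.4, 1.0, 2.7):
    print(score, filter_label(score, threshold=1.0))
```
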
Version 1.1.1 - Aug 29, 2019 [Archived]
* Added gcc, ldd library requirements
* Added Tensorflow installation check
Version 1.1.0 - June 23, 2019 [Archived]
* Migration from Python 2 to Python 3
Version 1.0.1 - June 21, 2019 [Archived]
* Fixed chromosome naming bug
* Fixed python pip installation of updated packages
Version 1.0 - May 12, 2019 [Archived]
* Initial release