-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathCHANGELOG.txt
176 lines (163 loc) · 7.03 KB
/
CHANGELOG.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
# TEFingerprint
# A Python 3 library and command line tool for transposon fingerprinting
v0.3.2
- alpha
- changes
- rename cluster classes and methods IDBCAN and SIDBCAN to DBICAN and SDBICAN respectively
- minor changes for compatibility with python 3.4
- added Makefile
- bug fixes
- simplify and fix travis CI
- documentation
- add MIT licence
- improve method docs
- add graphics to method docs #74
- document requirements
v0.3.1
- alpha
- new features
- maximum count proportion # 115
- added 'max_count_proportion' field to output (was already used for determining colour)
- renamed '--no-colour' switch to '--no-max-count-proportion'
- changes
- package version now stored in single location 'version.py'
v0.3.0
- alpha
- new features
- informative reads
- option to include soft-clipped tips in place of their mate informative read
- option to exclude full-length informative reads completely
- output formats
- outputs with the .gz extension are compressed with block-gzip rather than regular gzip #104
- option to output tab-delimited plain text files suitable for tabix indexing when block-gzipped #103
- extract-informative temporary files #106
- temporary files are now written to the same location as the output by default
- option to not remove these files
- option to write them to a temp location provided by the operating system
- changes
- added Biopython as a dependency to support block-gzip compression
- refactor submodules to hide application specific code
- refactor clustering classes/methods to use new names
- non-hierarchical method named IDBCAN (Interval Density Based Clustering of Applications with Noise)
- conservative-hierarchical method named SIDBCAN (Splitting-IDBCAN)
- aggressive-hierarchical method named SIDBCAN-aggressive and deprecated
- bug fixes
- python 3 shebang lines #98
- add MANIFEST.in to allow for github install with pip #89
- documentation
- named clustering algorithms loosely based on DBSCAN
- non-hierarchical method named IDBCAN (Interval Density Based Clustering of Applications with Noise)
- hierarchical method named SIDBCAN (Splitting-IDBCAN)
- many corrections to documentation of clustering algorithms #99
- specify tabix command to index block-gzipped output
v0.2.0
- alpha
- new features
- new 'conservative' clustering method #91
- changes
- new 'conservative' clustering method used by default #91
- simplified clustering/splitting method selection with flag '--splitting-method'
- changed library imports to include loci, fingerprint, fingerprintio and cluster
- tidied cluster.py submodule code
- bug fixes
- fixed miscalculation of initial cluster support #90
- fixed miscalculation of parent vs child support #95
- fixed name collision between sub module cluster and function loci.cluster
- documentation
- updated method.rst with new clustering method
- updated usage.rst with new clustering method
- updated CLI help text
- updated cluster.py docstrings with new method
- added metadata for PyPI
v0.1.4
- alpha
- Command-line-tool names and arguments have changed
- Renamed tools
- remove tef wrapper as it is likely to have a name collision with something
- tef fingerprint/compare are combined into single tool tefingerprint #63
- tef preprocess is now tef-extract-informative
- tef filter-gff is now tef-filter-gff (note: this tools behavior has changes substantially to handle the new gff output)
- Refactored modules
- core sub-module loci.py re-written to be more flexible #82
- single data structure for for representing collections of loci
- arbitrary string length limit removed (use of python string in place numpy strings)
- split tool logic into separate sub-module fingerprint.py
- New Features
- tefingerprint
- trim buffered clusters to extent of read tips
- count n most common elements per sample in each bin #63 #81
- use gff annotation for tagging known elements #80
- join paired clusters using gff annotation #78
- output files (gff, csv) are optionally pipe-able
- output files (gff, csv) contain more detailed data
- can read(anotation gff)/write files compressed with gzip or bz2 #86
- escape special characters in gff files with percent encodings
- tef-filter-gff
- changed to handle new gff output #84
- use of --any and --all contexts for combining filters
- unix style wild cards for matching multiple fields
- read and write gz abd bz2 compressed files #86
- escape special characters with percent encodings
- read gff from standard in
v0.1.3
- alpha
- Performance improvements:
- Significantly reduced memory usage and improved speed for preprocess #66 #70
- Reduced memory usage for fingerprint/compare io operations #68
- Reduced memory usage for filter-gff #67
- Fixes
- Corrected feature-csv output on large arrays #58 #72
- Documentation
- Split readme into multiple documents
- Switched to .rst for documentation
- Added description of methods
v0.1.2
- alpha
- Faster output of CSV files #58
- Tidied up CLI arguments #55 #62
- Replace underscores with dash
- Use of flags for Boolean arguments
- Updated Readme with changes
- Tests
- Integration tests for fingerprint, compare and filter-gff
- Further tests for preprocess
v0.1.1
- alpha
- Minor fixes for for cluster support calculation #53
- Default selection of starting epsilon value now matches Campello et al 2015
- Update terminology to reflect that used in Campello et al 2015
- Tests
v0.1.0
- alpha
- Pre-Processing
- Corrections for reverse-complemented reads when extracting dangler reads
- Inclusion of soft-clipped sections from the outer end of proper-mapped pairs
- Separated mapping reads to repeats from pre-processing script
- Fingerprinting/Compare
- Quality score filtering options
- New output format options
- Other
- Bump version to 0.1.0
- Change status to alpha
- Updated documentation
- Additional tests
v0.0.3
- pre-alpha
- Nicer data structures for use as a library #42
- Easy inter-op with Numpy, Pandas and Plotting libraries in python #42
- Improved buffering of comparative bins
- Multi-process pipelines return results to parent process rather than printing directly to file
- GFF output is sorted by chromosome, start position, stop position #32
- Added better examples to readme #40
- improved testing of fingerprint and compare data methods #6
- Compare output is no longer nested and contains normalised read counts #4 and is coloured by read count proportions #43
- filter_gff program is simplified (no longer deals with nested gff)
- renamed module to `tefingerprint` and CLI to `tef` #3
v0.0.2
- pre-alpha
- implemented a faster hierarchical clustering method
v0.0.1-pre-alpha
- pre-alpha
- initial release
- legacy hierarchical clustering method
- tagged for reference reasons