v0.5.2 (2021-02-24)
-------------------
- Allow manual control of view split (auto/vertical/horizontal) by pressing
the 'z' key.
- Updated contrib libraries.
- Replaced hash map with a faster one.
v0.5.1 (2018-05-06)
-------------------
- Performance improvements.
- Removed needless checks for file existence.
- Parallel initialization of archives in galaxy mode.
- Browser won't do things twice (e.g. thread list redraw).
- Browser will calculate top and bottom message index without screen
redraw.
- Browser won't "scroll" through messages when going to a specific one.
- Score file is loaded on a separate thread.
- Visited messages list is loaded on a separate thread.
- Browser won't write last open archive file, if it was just read from
disk.
- Updated pdcurses to a recent version (includes resizing of the Windows console).
v0.5 (2017-09-04)
-----------------
- Qt browser has been removed.
- Use robin hood hashing for Message ID/lexicon/galaxy data.
- Disk space requirements have been reduced.
- Data access pattern is now linear, which increases speed of operation,
especially due to reduction of page fault rate.
- These changes require removal of lexopt utility and merge of lexhash into
lexicon utility.
- Use custom string compression algorithm for Message IDs. This results in
nearly 50% disk space savings.
- Improvements in extract-msgid utility:
- Relaxed Message ID validation. If a non-conforming ID can be handled
without any hassle, it is.
- Message ID scanning procedure is much stricter now.
- Scan is only performed in headers block.
- Message ID now must be in the same line as the 'Message-ID:' field
header.
- Completely broken Message IDs, or messages without the 'Message-ID:' field
are now assigned a custom, unique Message ID.
- The same Message ID validation is also performed in kill-duplicates utility.
- Improvements in filter-spam utility:
- Zstd data is now also filtered, if present. LZ4 data is still required.
- Optional mode for filtering against message size.
- Optional mode for filtering messages by provided Message ID list.
- Optional mode for threaded spam removal.
- Replaced libcrm114 spam classifier with terminator library.
- Display Message ID and date in spam training mode.
- Increase message preview size in spam training mode.
- Spam score is now displayed in first column in removed messages listing.
- Spam score is now colorized, according to spam/ham/unsure classification.
- Improvements in connectivity utility:
- Handle missing 'Date:' field.
- Add support for ancient '12-Jan-87' date format.
- Improvements to text mode browser:
- Very long (more than 1024 characters) message lines are now properly
handled.
- Add decorations to *bold*, _underlined_ and /italics/ text segments.
- Add decorations to URLs.
- "Go to message" functionality now accepts negative numbers to go to the
most recent messages.
- Group activity chart has been added.
- Bump package version to 3.
- Required by changes to the hashing algorithm.
- Required to save Message ID compression codebook.
- libuat now refuses to open old archives. Package utility can still
handle them.
- Improvements in verify utility:
- Check if all messages can be accessed using their Message IDs.
- Display info if lexicon distance data is missing.
- Check for empty 'Subject:' field in extract-msgmeta utility.
- Extraction of parent message from 'References:' field is now more relaxed.
- Improvements to time stamp handling:
- While 'Received:' field exists primarily in mailing list messages, usenet
messages use 'NNTP-Received-Date:' or 'Injection-Date:' headers. These are
now also handled.
- Validate date against all server date headers, not only the first one.
- Multi-threaded compression in repack-lz4 utility.
- Improved memory efficiency of small temporary allocations in various
utilities.
- Improvements in threadify utility:
- Multi-threaded message matching.
- Lexicon data is now properly transformed after the messages are matched.
Lexicon utility no longer needs to be run twice.
- Added second mode of operation - thread grouping.
- Don't use irrelevant rank metrics.
- Search improvements:
- Change search in headers to search in 'From:' or 'Subject:' fields.
- Give better score to messages that have more hits for search terms.
- General adjustments to search ranking weights.
- Take into account word distances between all words in the query.
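For readers unfamiliar with the robin hood hashing adopted in this release for
Message ID/lexicon/galaxy data, the following Python sketch illustrates the
general technique. It is not the project's code: the class name RobinHoodMap,
the in-memory slot layout, the power-of-two capacity and the growth policy are
all assumptions made for illustration.

```python
class RobinHoodMap:
    """Open-addressing hash map using linear probing with robin hood insertion.

    Illustrative only; names and layout are hypothetical.
    """

    def __init__(self, capacity=8):
        # Capacity is assumed to be a power of two so that masking works.
        # Each slot is None or a (key, value, probe_distance) tuple.
        self._slots = [None] * capacity
        self._size = 0

    def __len__(self):
        return self._size

    def put(self, key, value):
        # Grow before the load factor would exceed 0.75 (assumed policy).
        if (self._size + 1) * 4 > len(self._slots) * 3:
            self._grow()
        self._insert(key, value)

    def get(self, key, default=None):
        mask = len(self._slots) - 1
        idx = hash(key) & mask
        dist = 0
        while True:
            slot = self._slots[idx]
            # An empty slot, or a resident closer to its home slot than we
            # have travelled, proves that the key is absent.
            if slot is None or slot[2] < dist:
                return default
            if slot[0] == key:
                return slot[1]
            idx = (idx + 1) & mask
            dist += 1

    def _insert(self, key, value):
        mask = len(self._slots) - 1
        idx = hash(key) & mask
        dist = 0
        while True:
            slot = self._slots[idx]
            if slot is None:
                self._slots[idx] = (key, value, dist)
                self._size += 1
                return
            k, v, d = slot
            if k == key:
                # Existing key: overwrite the value in place.
                self._slots[idx] = (key, value, d)
                return
            if d < dist:
                # The resident is closer to home than the entry being
                # inserted: take its slot and carry the resident onward.
                self._slots[idx] = (key, value, dist)
                key, value, dist = k, v, d
            idx = (idx + 1) & mask
            dist += 1

    def _grow(self):
        old = self._slots
        self._slots = [None] * (len(old) * 2)
        self._size = 0
        for slot in old:
            if slot is not None:
                self._insert(slot[0], slot[1])
```

Because entries far from their home slot displace entries that are close to
theirs, probe lengths stay short and even, and a lookup can stop as soon as it
meets an entry that is closer to home than the probe has travelled.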
v0.4 (2017-07-19)
-----------------
- Qt browser is now marked as deprecated.
- Use command line interface in query utility.
- Allow scanning for the last message index on usenet servers (Xref field).
- Add update-zstd utility.
- Add nntp-get utility.
- Further speed improvements in lexdist utility.
- Store Levenshtein distance in lexdist data. Use this information to modify
search score.
- Search for indirect links between messages in galaxy mode. An indirect link
is a reply sent to group A to a message sent to group B. This is not a
crosspost, but messages obviously should be cross-referenced.
- Text mode browser improvements:
- Internal changes to message buffer ownership, resulting in decreased
memory usage and fixing of a nasty bug.
- Support for filtering in galaxy archive open list. Requires further work.
- Search results now can be sorted by date in ascending or descending order.
The default order is still determined by search score.
- Search results preview now displays properly formatted message context and
highlights most relevant search terms.
- Display 'to:' field, which is used instead of 'newsgroups:' in mailing
list messages.
- Adjust search results display to work around performance problems of some
terminal emulators.
- Display thread preview in galaxy warp mode.
- Long lines are now properly broken into multiple lines on word boundaries.
- Search improvements:
- Fuzzy search can be disabled for individual words.
- Some words may be required to be present.
- Some words may be required to be absent.
- Limited search term wildcard expansion.
- Ability to restrict search to headers.
- Provide more search metadata.
- The 'wrote' message preamble is now detected heuristically and assigned
its own lexicon context type.
- Speed optimization (at the cost of memory).
- Words existing in only one message are no longer stored in lexicon.
- Changes in threadify utility:
- Search context has been limited to 16 lines.
- Don't check for matches in low-ranked messages.
- Only match messages that are in a reasonable chronological order.
- Improve end-of-message detection in import-source-mbox utility.
- Handle messages without 'subject:' field. Such messages conform to
RFC requirements.
- Improvements to time stamp handling:
- Search for date in 'received:' field, if the message contains malformed
'date:' field. Still try to parse the 'date:' field, if search fails.
- Detect time traveling messages ('date:' and 'received:' fields too far
apart in time).
- Bump package version to 2.
- Store list of subject prefixes common to the archive.
- Additional lexicon context types invalidate old lexicon data.
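The Levenshtein distance stored in lexdist data counts the single-character
insertions, deletions and substitutions needed to turn one word into another.
The Python sketch below shows the standard dynamic-programming computation.
The fuzzy_score function and its penalty formula are hypothetical and only
indicate how such a distance could lower a search score; the actual weighting
used by the project is not described here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the working rows as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_score(base_score: float, query_word: str, matched_word: str) -> float:
    """Hypothetical penalty: exact matches keep the full score."""
    return base_score / (1 + levenshtein(query_word, matched_word))
```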
v0.3 (2017-01-29)
-----------------
- Added google-groups utility.
- Added sort utility.
- Text mode browser improvements:
- On launch thread tree is focused on the last viewed message.
- Message viewing history traversal.
- Deep thread trees (more than ~55 levels) are now properly drawn.
- ROT13 support.
- Thread tree is now filled on-demand. This reduces startup time.
- Long lines in messages are now broken into multiple on-screen lines.
- Highlight tree of children of the current message.
- Support for going to a specific message.
- Only sorted archives are supported now. This greatly reduces memory usage.
- Basic support for message search.
- Give visual indication of folded subtrees.
- Added help.
- Added ability to display group charter.
- Deep thread trees are now displayed in a condensed mode.
- Change coloring of quotations. Previously whole line color was determined
by the quotation level. Now the color changes, as the quotation level
progresses.
- Add one more level of quote coloring.
- Various improvements to status lines.
- Minimal score file implementation. Messages can be classified as neutral,
positive or negative.
- Allow switching to another archive from within browser.
- Allow going to message by its index in the archive.
- Archive verification changes:
- Check if archive is sorted.
- Check for availability of proper archive name.
- Display notification, if archive description files are missing.
- Detection of broken connectivity data.
- Check total children counts.
- Check for message duplicates.
- ANSI color support.
- Optional quiet mode (only failures are displayed).
- Visited message list is not checked for external changes on every query
(half second grace time).
- Improvements to time stamp handling:
- Support for reading broken Deja News 'date:' field.
- Handle 'date:' field with tabulation separator.
- Recurse into subdirectories in import-source-maildir utility.
- Unix packaging support:
- Added uat dispatcher utility.
- Added man page documentation.
- Added global makefile.
- Added Arch PKGBUILD file.
- Ignore carriage return character in maildir import utilities.
- Improvements in utf8ize utility:
- Detect if message is already encoded as UTF-8.
- Translate ISO8859-2/CP1250 encoded as UTF-8 to proper UTF-8 codepoints.
- Fix threading loops in threadify utility.
- Fix self-referencing message handling.
- Fix total children count method in connectivity utility.
- Message ID and lexicon hash tables now target 0.75 load factor, instead of
using hardcoded size.
- Allow changing selectivity level in repack-zstd utility.
- Search improvements:
- Adjust score by calculating search terms distance in messages.
- Do not require all search terms to be present in results.
- Fuzzy search.
- Various critical improvements to lexdist utility speed. This enables actual
use of this utility.
- Bump package version to 1.
- Required by variable hash table sizes.
- Packing of lexicon distance data, if available.
- Introduction of galaxy mode. This is an optional metadata collection
generated from multiple archives.
- List of groups contained in galaxy is available to browser, which allows
archive selection from a list, not by file name.
- Unique Message ID collection lists all messages in all archives.
- Each message is assigned to one or more archives.
- Browser can display crossposts.
- Additional processing is used to determine whether the children or parent
messages differ between the archives containing the message. This
indicates that additional discussion took place.
- Galaxy warp functionality allows quick switching to another archive, which
contains the crossposted message.
- Use memory mapped zstd dictionary without making a copy.
- Delay initialization of zstd context until it's really needed.
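As an illustration of sizing a hash table from a target load factor instead of
a hardcoded size, the Python sketch below picks a slot count for a given number
of items. The function name table_size and the power-of-two growth are
assumptions; only the 0.75 target comes from the entry above.

```python
def table_size(item_count: int, max_load: float = 0.75) -> int:
    """Smallest power-of-two slot count with item_count / size <= max_load."""
    size = 1
    while item_count > size * max_load:
        size *= 2
    return size
```

With this rule a small archive gets a small table, while a large one still
keeps at least a quarter of its slots free.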
v0.2 (2016-11-17)
-----------------
- Added relative-complement utility.
- Added export-messages utility.
- Added verify utility.
- Qt browser improvements:
- Each user name will now have semi-unique color.
- Number of threads is displayed next to total number of messages when
opening new archive.
- Search scroll is now reset after performing new search.
- Notification window is displayed when loading archive.
- Ability to go to a specific date.
- Message viewing history has been added, with ability to go back to
previously visited messages.
- Thread list is now filled on demand, which results in performance
improvements and significant memory usage reduction.
- Application state is now held between browsing sessions:
- Last browsed archive is remembered and reopened on application
launch.
- Last viewed message is saved and restored when opening archive.
- Browser remembers which messages were already viewed and displays
them accordingly.
- Initial implementation of text mode browser.
- Design is based on slrn news reader.
- Message thread tree can be displayed.
- Messages can be viewed.
- Make sure all messages are accessible in browsers.
- Break message reference loops.
- Use less memory-hungry hash table in lexicon utility.
- Allow configuration of compression options in repack-zstd utility.
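Breaking message reference loops can be pictured as follows. This Python
sketch assumes each message has at most one parent, stored in a plain dict,
and cuts one link in every cycle so that each message leads to a thread root.
The function name, the data layout and the choice of which link to cut are
hypothetical and do not describe the project's actual procedure.

```python
def break_reference_loops(parent: dict) -> dict:
    """Return a copy of the parent links with one link cut in every cycle."""
    result = dict(parent)
    done = set()  # messages already known to lead to a root
    for start in parent:
        path = []
        on_path = set()
        node = start
        while node is not None and node not in done:
            if node in on_path:
                # We walked back onto the current path: cut the loop here,
                # which turns this message into a thread root.
                result[node] = None
                break
            on_path.add(node)
            path.append(node)
            node = result.get(node)  # a missing parent ends the walk
        done.update(path)
    return result
```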
v0.1 (2016-08-23)
-----------------
- Initial release.
- Package version 0.
- Qt-based browser.
- Many bugs.