# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]
## [1.0.0] - 2024-12-02
## qsv v1.0.0 is here! 🎉
After over 3 years of development, nearly 200 releases, and 11,000+ commits, qsv has finally reached v1.0.0!
What started as a hobby project to learn Rust during COVID has evolved into a powerful data wrangling tool used in multiple datHere products, open source projects, and even in several mission-critical production environments!
To mark this major milestone, this larger-than-usual release includes major performance improvements, new features, and various optimizations!
---
### Added
* `joinp`: add `--ignore-case` option https://github.com/dathere/qsv/pull/2287
* `py`: add ability to load python expression from file https://github.com/dathere/qsv/pull/2295
* `replace`: add `--not-one` flag (resolves #2305) by @rzmk in https://github.com/dathere/qsv/pull/2307
* `slice`: add `--invert` option https://github.com/dathere/qsv/pull/2298
* `stats`: add dataset-level stats https://github.com/dathere/qsv/pull/2297
* `sqlp`: auto-decompression of gzip, zstd & zlib compressed csv files with `read_csv` table function (implements suggestion from @wardi in #2301) https://github.com/dathere/qsv/pull/2315
* `template`: add lookup support https://github.com/dathere/qsv/pull/2313
* added `ui` feature to make it easier to make a headless build of qsv https://github.com/dathere/qsv/pull/2289
* added better panic handling https://github.com/dathere/qsv/pull/2304
* added new benchmark for `template` command https://github.com/dathere/qsv/commit/cd7e480de5ff1e2766a16b8d21767b76fbf10d35
* added 📚 `lookup support` legend https://github.com/dathere/qsv/commit/b46de73f57ba35ee08581a4f20809a5f581d461b
### Changed
* move qsv from personal GitHub repo to datHere GitHub org https://github.com/dathere/qsv/pull/2317
* `template`: parallelized template rendering for significant speedups https://github.com/dathere/qsv/pull/2273
* simplify input format check https://github.com/dathere/qsv/pull/2309
* bump embedded `luau` from 0.650 to 0.653 https://github.com/dathere/qsv/commit/986a1d3b4e60f15c25ef8a157c7e9e205ae8e7a9
* deps: Switch back to `simple-home-dir` from `simple-expand-tilde` https://github.com/dathere/qsv/pull/2319
* deps: Add minijinja contrib https://github.com/dathere/qsv/pull/2276
* deps: bump pyo3 down to 0.21.2 because polars-mem-engine is not compatible with pyo3 0.23.x yet https://github.com/dathere/qsv/commit/7f9fc8a6cfe94a104d33e895ecae11e2f40274ee
* build(deps): bump base62 from 2.0.2 to 2.0.3 by @dependabot in https://github.com/dathere/qsv/pull/2281
* build(deps): bump bytemuck from 1.19.0 to 1.20.0 by @dependabot in https://github.com/dathere/qsv/pull/2299
* build(deps): bump bytes from 1.8.0 to 1.9.0 by @dependabot in https://github.com/dathere/qsv/pull/2314
* build(deps): bump file-format from 0.25.0 to 0.26.0 by @dependabot in https://github.com/dathere/qsv/pull/2277
* build(deps): bump hashbrown from 0.15.1 to 0.15.2 by @dependabot in https://github.com/dathere/qsv/pull/2310
* build(deps): bump itoa from 1.0.11 to 1.0.12 by @dependabot in https://github.com/dathere/qsv/pull/2300
* build(deps): bump itoa from 1.0.12 to 1.0.13 by @dependabot in https://github.com/dathere/qsv/pull/2302
* build(deps): bump itoa from 1.0.13 to 1.0.14 by @dependabot in https://github.com/dathere/qsv/pull/2311
* build(deps): bump mlua from 0.10.0 to 0.10.1 by @dependabot in https://github.com/dathere/qsv/pull/2280
* build(deps): bump mlua from 0.10.1 to 0.10.2 by @dependabot in https://github.com/dathere/qsv/pull/2316
* build(deps): bump serial_test from 3.1.1 to 3.2.0 by @dependabot in https://github.com/dathere/qsv/pull/2279
* build(deps): bump minijinja from 2.4.0 to 2.5.0 by @dependabot in https://github.com/dathere/qsv/pull/2284
* build(deps): bump minijinja-contrib from 2.3.1 to 2.5.0 by @dependabot in https://github.com/dathere/qsv/pull/2283
* build(deps): bump rfd from 0.15.0 to 0.15.1 by @dependabot in https://github.com/dathere/qsv/pull/2291
* build(deps): bump sanitize-filename from 0.5.0 to 0.6.0 by @dependabot in https://github.com/dathere/qsv/pull/2275
* build(deps): bump serde from 1.0.214 to 1.0.215 by @dependabot in https://github.com/dathere/qsv/pull/2286
* build(deps): bump serde_json from 1.0.132 to 1.0.133 by @dependabot in https://github.com/dathere/qsv/pull/2292
* build(deps): bump tempfile from 3.13.0 to 3.14.0 by @dependabot in https://github.com/dathere/qsv/pull/2278
* build(deps): bump tokio from 1.41.0 to 1.41.1 by @dependabot in https://github.com/dathere/qsv/pull/2274
* build(deps): bump url from 2.5.3 to 2.5.4 by @dependabot in https://github.com/dathere/qsv/pull/2306
* applied several clippy suggestions
* bumped numerous indirect dependencies to latest versions
* bumped MSRV to latest Rust stable (1.83.0)
* bumped Rust nightly from 2024-11-01 to 2024-11-28, the same version used by Polars
### Fixed
* fix `get_stats_records()` helper to handle input files with embedded spaces (fixes #2294) https://github.com/dathere/qsv/pull/2296
* added better panic handling (fixes #2301) https://github.com/dathere/qsv/pull/2304
* implement simple format check for input files (fixes #2308) https://github.com/dathere/qsv/pull/2308
### Removed
* removed `simple-expand-tilde` dependency in favor of `simple-home-dir` https://github.com/dathere/qsv/pull/2318
* removed patched fork of `indicatif` now that 0.17.9 is released, fixing GH unmaintained advisory for `instant` https://github.com/dathere/qsv/commit/33fa54a1651ce29d286c0e1ff4f3d77bbbd2ffd5
* removed `clipboard` command from `qsvlite` binary variant https://github.com/dathere/qsv/commit/9c663d84da49cbbe53d7c9df6bd747cad0d9ba24
**Full Changelog**: https://github.com/dathere/qsv/compare/0.138.0...1.0.0
## [0.138.0] - 2024-11-05
## Highlights:
* __:star: New `template` command for rendering templates with CSV data.__
This should allow users to generate very complex documents (Form letters, JSON/XML files, etc.) with the powerful [MiniJinja template engine](https://docs.rs/minijinja/latest/minijinja/) ([Example template](https://github.com/jqnatividad/qsv/blob/master/scripts/template.tpl)).
* __:star: New `lookup` module for fetching reference data from remote and local files.__
In addition to the typical `http`/`https` schemes for remote files, qsv adds two additional schemes - `CKAN://` and `datHere://`, fetching lookup data from a CKAN site or [datHere maintained](https://data.dathere.com) [reference data](https://github.com/dathere/qsv-lookup-tables) respectively. The lookup module has simple file-based caching as well to minimize repeated fetching of typically static reference data (default cache age: 600 seconds).
The `lookup` module is now being used by the `luau` (for its [`qsv_register_lookup`](https://github.com/jqnatividad/qsv/blob/9036430b1902701eaf60058afce7823810968099/src/cmd/luau.rs#L2034-L2070) helper) and `validate` (for its [`dynamicEnum`](https://github.com/jqnatividad/qsv/blob/9036430b1902701eaf60058afce7823810968099/src/cmd/validate.rs#L35-L72) custom JSON Schema keyword) commands. More commands will take advantage of this module over time (e.g. `apply`, `geocode`, `template`, `sqlp`, etc.) to do extended lookups (e.g. lookup Census information given spatiotemporal data - like demographic info of a Census tract).
* __:sparkles: Enhanced `fetchpost` with MiniJinja templating for payload construction.__
Previously, `fetchpost` was limited to posting url-encoded HTML Form data. Now with the `--payload-tpl` and `--content-type` options, users can render request bodies with MiniJinja and post them with other content types as well (typically `application/json`, `text/plain`, `multipart/form-data`).
* __:sparkles: Improved Polars integration with automatic schema detection__
The `joinp` and `sqlp` commands now use qsv's stats cache to automatically determine column data types, rather than having Polars scan a sample of rows. This provides two key benefits:
1. Faster execution by skipping Polars' schema inference step
2. More accurate data type detection since the stats cache analyzes the entire dataset, not just a sample
* __:running: `fast-float2` crate for faster float parsing__
Casting string/bytes to float is now much faster ([2 to 8x faster than Rust's standard library](https://github.com/Alexhuszagh/fast-float-rust?tab=readme-ov-file#performance)) with `fast-float2`.
* __:muscle: Major dependency updates including [Polars 0.44.2](https://github.com/pola-rs/polars/releases/tag/rs-0.44.2), [Luau 0.650](https://github.com/luau-lang/luau/releases/tag/0.650), [mlua 0.10.0](https://github.com/mlua-rs/mlua/releases/tag/v0.10.0) and [jsonschema 0.26.1](https://github.com/Stranger6667/jsonschema/releases/tag/rust-v0.26.1)__
These core crates underpin much of qsv's functionality. Using the latest versions of these crates allows qsv to stay true to its goal of being the [fastest and most comprehensive data-wrangling toolkit](https://github.com/jqnatividad/qsv?tab=readme-ov-file#goals--non-goals).
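To illustrate why a full-dataset scan can give more accurate types than a sample, here is a minimal sketch. It is not how Polars or qsv infers types; the `infer_type` helper, its type names, and the example column are assumptions made up for illustration.

```python
def infer_type(values):
    """Return the narrowest of Integer, Float, or String that fits every value."""
    def all_parse(cast):
        # True only if every value can be converted with `cast`.
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "Integer"
    if all_parse(float):
        return "Float"
    return "String"


# Hypothetical column: 1,000 integer-looking values, then one stray text value.
column = [str(i) for i in range(1000)] + ["N/A"]

sample_guess = infer_type(column[:100])  # what a 100-row sample would suggest
full_scan = infer_type(column)           # what a pass over every row finds
```

In this sketch the sample suggests an integer column while the full pass falls back to a string column, which is the kind of mismatch a sample-based guess can run into.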
---
### Added
* added lookup module - enabling fetching and caching of reference data from remote and local files https://github.com/jqnatividad/qsv/pull/2262
* `fetchpost`: add `--payload-tpl <file>` and `--content-type` options to construct payload using MiniJinja with the appropriate content-type https://github.com/jqnatividad/qsv/pull/2268 https://github.com/jqnatividad/qsv/commit/592149867997da6ac56d20a7e7f84252b2baeb2a
* `joinp`: derive polars schema from stats cache https://github.com/jqnatividad/qsv/commit/86fe22ee4e3677dc702eaf21175c60ceb8166001
* `sqlp`: derive polars schema from stats cache https://github.com/jqnatividad/qsv/pull/2256
* `template`: new command to render MiniJinja templates with CSV data https://github.com/jqnatividad/qsv/pull/2267
* `validate`: add `dynamicEnum` lookup support https://github.com/jqnatividad/qsv/pull/2265
* `contrib(completions)`: add template command and update fetchpost by @rzmk in https://github.com/jqnatividad/qsv/pull/2269
* add `fast-float2` dependency for faster bytes to float conversion https://github.com/jqnatividad/qsv/commit/7590e4ed171eeb6804845e1b54bec0fa26cca706 https://github.com/jqnatividad/qsv/commit/3ca30aa878ed3c4dc58944d46f53fb0c4b955356
* added more benchmarks for new/updated commands https://github.com/jqnatividad/qsv/commit/f8a1d4fff11d78860c102c1375653822ee95ca58 https://github.com/jqnatividad/qsv/commit/cd7e480de5ff1e2766a16b8d21767b76fbf10d35
### Changed
* `luau`: adapt to mlua 0.10 API changes https://github.com/jqnatividad/qsv/commit/268cb45a04a49360befb81af76cc1cddd6307286
* `luau`: refactored stage management https://github.com/jqnatividad/qsv/commit/31ef58a82b8f80fe0b29260f9170f10220c73714
* `luau`: now uses the lookup module https://github.com/jqnatividad/qsv/commit/2f4be3473a90252df4fd559a5f3b38246a3da696
* `stats`: minor perf refactoring https://github.com/jqnatividad/qsv/commit/6cdd6ea94adbae063e7fb6d9da71dac0c86adc12
* build(deps): bump actions/setup-python from 5.2.0 to 5.3.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2243
* build(deps): bump azure/trusted-signing-action from 0.4.0 to 0.5.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2239
* build(deps): bump bytes from 1.7.2 to 1.8.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2231
* build(deps): bump cached from 0.53.1 to 0.54.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2272
* build(deps): bump flexi_logger from 0.29.3 to 0.29.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/2229
* build(deps): bump flexi_logger from 0.29.4 to 0.29.5 by @dependabot in https://github.com/jqnatividad/qsv/pull/2261
* build(deps): bump flexi_logger from 0.29.5 to 0.29.6 by @dependabot in https://github.com/jqnatividad/qsv/pull/2266
* build(deps): bump hashbrown from 0.15.0 to 0.15.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2270
* build(deps): bump jsonschema from 0.24.0 to 0.24.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2234
* build(deps): bump jsonschema from 0.24.1 to 0.24.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/2238
* build(deps): bump jsonschema from 0.24.2 to 0.24.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/2240
* build(deps): bump jsonschema from 0.25.0 to 0.25.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2244
* build(deps): bump jsonschema from 0.26.0 to 0.26.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2260
* build(deps): bump regex from 1.11.0 to 1.11.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2242
* build(deps): bump reqwest from 0.12.8 to 0.12.9 by @dependabot in https://github.com/jqnatividad/qsv/pull/2258
* build(deps): bump serde from 1.0.210 to 1.0.211 by @dependabot in https://github.com/jqnatividad/qsv/pull/2232
* build(deps): bump serde from 1.0.211 to 1.0.213 by @dependabot in https://github.com/jqnatividad/qsv/pull/2236
* build(deps): bump serde from 1.0.213 to 1.0.214 by @dependabot in https://github.com/jqnatividad/qsv/pull/2259
* build(deps): bump simd-json from 0.14.1 to 0.14.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/2235
* build(deps): bump tokio from 1.40.0 to 1.41.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2237
* `deps`: updated our fork of the csv crate with more perf optimizations https://github.com/jqnatividad/qsv/commit/eae7d764bd31d717bdf123646ea85c81ed829829
* `deps`: use calamine upstream with unreleased fixes https://github.com/jqnatividad/qsv/commit/4cc7f37e9c34b712ae2c5f43c018b2d6a6655ebb
* `deps`: use our csvlens fork until PR removing unneeded arboard features is merged https://github.com/jqnatividad/qsv/commit/bb3232205b7a948848c2949bcaf3b54e54f3d49b
* `deps`: bump jsonschema from 0.25 to 0.26 https://github.com/jqnatividad/qsv/pull/2251
* `deps`: bump embedded Luau from 0.640 to 0.650 https://github.com/jqnatividad/qsv/commit/8c54b875bf8768849b128ab15d96c33b02be180b https://github.com/jqnatividad/qsv/commit/aca30b072ecb6bb22d7edbe8ddef348649a5d699
* `deps`: bump mlua from 0.9 to 0.10 https://github.com/jqnatividad/qsv/pull/2249
* `deps`: bump Polars from 0.43.1 at py-1.11.0 tag to latest 0.44.2 upstream by @jqnatividad in https://github.com/jqnatividad/qsv/pull/2255 https://github.com/jqnatividad/qsv/commit/0e40a4429b4ef219ab7a11c91767e95778470ef2
* apply select clippy lint suggestions
* updated indirect dependencies
* aligned Rust nightly to Polars nightly - 2024-10-28 - https://github.com/jqnatividad/qsv/commit/245bcb55af416960aa603c05de960289f6125c5c
### Fixed
* fix documentation typo: it's → its by @tmtmtmtm in https://github.com/jqnatividad/qsv/pull/2254
### Removed
* removed need to set RAYON_NUM_THREADS env var and just call the Rayon API directly https://github.com/jqnatividad/qsv/commit/aa6ef89eceac89c3d1ed19068e0e23a451c4402d
* removed unneeded `create_dir_all_threadsafe` helper now that `std::fs::create_dir_all` is threadsafe https://github.com/jqnatividad/qsv/commit/d0af83bfbd0430fa22f039bd00615380110f456e
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.137.0...0.138.0
## [0.137.0] - 2024-10-20
### Highlights:
* `extdedup` and `extsort` now support two modes - LINE mode and CSV mode. Previously, both commands only worked on a line-by-line basis (now called LINE mode).<br/>
With the addition of CSV mode, you can now deduplicate or sort CSV files on a column-by-column basis, with the powerful `--select` option to specify which columns to deduplicate or sort on. This is especially useful for large CSV files with many columns, where you only want to deduplicate or sort on a subset of columns.
And since both commands use the disk and are streaming, they can handle files of any size.
* `sqlp` now has a `--cache-schema` option that caches the schema of the input CSV file, which can significantly speed up subsequent queries on the same file.
* `fetch` and `fetchpost` have been updated to use the [`jaq`](https://github.com/01mf02/jaq?tab=readme-ov-file#jaq) (a [jq](https://jqlang.github.io/jq/)-like tool for parsing JSON) crate instead of the `jql` crate. This change was made to improve performance and to make the commands more consistent with the `json` command which also uses `jaq`. Furthermore, `jaq` is a clone of `jq` - which is widely used and has a large community, so it should be more familiar to users.
* `stats` is a tad faster as we keep squeezing more performance from this central command.
* `validate` is now faster and more memory efficient due to optimizations in the `jsonschema` crate and minor performance improvements in the `validate` command itself.
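The difference between the LINE and CSV modes described above can be sketched in a few lines. This is only a conceptual, in-memory illustration and not qsv's disk-backed streaming implementation; the function names and sample data are hypothetical.

```python
import csv
import io


def dedup_lines(text):
    """LINE-style: drop lines that repeat an earlier line exactly."""
    seen, out = set(), []
    for line in text.splitlines():
        if line not in seen:
            seen.add(line)
            out.append(line)
    return out


def dedup_csv(text, key_columns):
    """CSV-style: drop rows whose values in key_columns repeat an earlier row."""
    reader = csv.DictReader(io.StringIO(text))
    seen, out = set(), []
    for row in reader:
        key = tuple(row[c] for c in key_columns)
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


# Hypothetical data: rows 1 and 3 are identical; row 2 differs only in `id`.
data = "id,name,city\n1,Ann,Oslo\n2,Ann,Oslo\n1,Ann,Oslo\n"

by_line = dedup_lines(data)                       # header plus two distinct lines
by_columns = dedup_csv(data, ["name", "city"])    # one row survives
```

Line-based deduplication only removes the exact repeat, while keying on `name` and `city` also treats the row with a different `id` as a duplicate.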
---
### Added
* `extdedup`: now supports two modes - LINE mode and CSV mode https://github.com/jqnatividad/qsv/pull/2208
* `extsort`: now also has two modes - CSV mode and LINE mode https://github.com/jqnatividad/qsv/pull/2210
* `sqlp`: add `--cache-schema` option https://github.com/jqnatividad/qsv/pull/2224
* added `sqlp --cache-schema` benchmarks
### Changed
* `apply` & `applydp`: use smallvec for operations vector & other minor performance optimizations https://github.com/jqnatividad/qsv/pull/2219 & https://github.com/jqnatividad/qsv/commit/bc837ae698f3aee06ea9b846b98ea0c75820a22d
* `apply` & `applydp`: specify min_length for parallel iterators https://github.com/jqnatividad/qsv/commit/7d6ce5ec9675755abd5942a5e9e731592961700d
* `fetch` & `fetchpost`: replace jql with jaq https://github.com/jqnatividad/qsv/pull/2222
* `stats`: performance optimizations https://github.com/jqnatividad/qsv/commit/f205809549ac275078a95bc2821a583611955ad0 https://github.com/jqnatividad/qsv/commit/e26c27f58df688d7bfb2185ad54d4fe010b1fccf https://github.com/jqnatividad/qsv/commit/4579c1bfba4eca21d7480694780e39f6966a88a0
* `validate`: specify min_length for parallel iterators https://github.com/jqnatividad/qsv/commit/a5b818562d5db7d65f00e5acd2c8bf7d44bd869a
* build(deps): bump calamine from 0.26.0 to 0.26.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2204
* build(deps): bump csvs_convert from 0.8.14 to 0.9.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2215
* build(deps): bump flexi_logger from 0.29.2 to 0.29.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/2209
* build(deps): bump jsonschema from 0.23.0 to 0.24.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2223
* build(deps): bump pyo3 from 0.22.3 to 0.22.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/2207
* build(deps): bump pyo3 from 0.22.4 to 0.22.5 by @dependabot in https://github.com/jqnatividad/qsv/pull/2212
* build(deps): bump redis from 0.27.3 to 0.27.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/2202
* build(deps): bump redis from 0.27.4 to 0.27.5 by @dependabot in https://github.com/jqnatividad/qsv/pull/2217
* build(deps): bump serde_json from 1.0.129 to 1.0.130 by @dependabot in https://github.com/jqnatividad/qsv/pull/2218
* build(deps): bump serde_json from 1.0.131 to 1.0.132 by @dependabot in https://github.com/jqnatividad/qsv/pull/2220
* build(deps): bump uuid from 1.10.0 to 1.11.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2213
* apply select clippy lints
* bumped indirect dependencies
* bumped MSRV to 1.82
### Fixed
* fix performance regression in batched commands by refactoring `optimal_batch_size` to require indexed CSV files https://github.com/jqnatividad/qsv/pull/2206
### Removed
* `fetch` & `fetchpost`: removed jql options; replaced with jaq https://github.com/jqnatividad/qsv/pull/2222
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.136.0...0.137.0
## [0.136.0] - 2024-10-08
## Highlights
# :tada: qsv pro is now available in the Microsoft Store! :tada:
It's ***Data Wrangling Democratized*** on the Desktop, featuring:
- __:bar_chart: Familiar Spreadsheet Interface__<br/>tap the power of qsv to query, analyze, enrich, scrub and transform huge Excel files and multi-gigabyte CSV files in seconds, without having to deal with the command-line.
- __CKAN desktop client__<br/>designed to make data publishing easier for portal operators and data stewards using the [CKAN](https://ckan.org) platform.
- __:inbox_tray: Flow__<br/>allows you to build custom node-based flows and data pipelines using a visual interface.
- __:wrench: Toolbox__<br/>features an ever-expanding library of reusable scripts for common data-wrangling use cases.
- __:star: and more!__<br/>Natural Language Interface ([RAG](https://docs.google.com/presentation/d/10T_3MyIqS5UsKxJaOY7Ktrd-GfhJelQImlE_qYmtuis/edit#slide=id.g2e10e05624b_0_124)), [Polars](https://pola.rs) SQL query support, an API, Python/Luau support, automatic Data Dictionaries, [DCAT 3 metadata profile inferencing](https://github.com/jqnatividad/qsv/issues/1705), along with a retinue of other cloud-based services (e.g. customizable street-level geocoding, data feeds, reference data lookups, geo-ip lookups, cloud storage support, [`.qsv` file format](https://github.com/jqnatividad/qsv/issues/1982), etc.) that will be unveiled in future versions.
Like qsv, we're iterating rapidly with qsv pro, so your feedback is essential. Give it a try!
<div dir="rtl">Get it from https://qsvpro.dathere.com or<br/><a href="https://apps.microsoft.com/detail/xpffdj3f1jsztf?mode=full">
<img
src="https://get.microsoft.com/images/en-us%20light.svg"
width="200" /></a></div>
__Other highlights:__
* `excel`: new `--table` option for XLSX files; new `--header-row` option; expanded `--range` option, adding support for Named Ranges and absolute ranges (e.g. `Sheet2!$A$1:$J$10`); and expanded metadata export now including Named Ranges and Tables (for XLSX files)
* Improved performance for several commands (`apply`, `datefmt`, `tojsonl` and `validate`) through automatic batch size optimization
* `validate`: the `dynenum` custom JSON Schema keyword is now named `dynamicEnum`, plus enhanced email validation
* `schema`: automatic JSON Schema `const` inferencing for columns with just one value
* Significant dependency updates, including latest upstream versions of Polars, jsonschema, and serde_json with unreleased performance upgrades, new features and fixes
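As a rough illustration of the `const` inferencing idea mentioned in the list above, the sketch below emits a JSON Schema `const` fragment when a column holds exactly one distinct value. The `infer_const` helper is hypothetical and is not taken from qsv's `schema` command.

```python
def infer_const(values):
    """Return {"const": v} when the column has exactly one distinct value, else {}."""
    distinct = set(values)
    if len(distinct) == 1:
        # Every row holds the same value, so the schema can pin it with `const`.
        return {"const": next(iter(distinct))}
    return {}


single = infer_const(["US", "US", "US"])
mixed = infer_const(["US", "CA"])
```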
> __NOTE:__ You can also see __qsv__ & __qsv pro__ in action in our ["The Problem with Data Portals" webinar](https://us06web.zoom.us/webinar/register/5317284045017/WN_wTe4l6nlTWa6C0HDs8R2PA) Oct 23, 2024. 1-2pm EDT
---
### Added
* :tada: [__qsv pro is now in the Microsoft Store!!!__](https://apps.microsoft.com/detail/xpffdj3f1jsztf?mode=full) :tada:
* `apply`, `datefmt`, `tojsonl`, `validate`: added logic to automatically determine optimal batch size for better parallelization https://github.com/jqnatividad/qsv/pull/2178
* `enum`: added `--new-column` support for all enum modes, not just `--increment` https://github.com/jqnatividad/qsv/pull/2173
* `excel`: new `--table` option for XLSX files https://github.com/jqnatividad/qsv/pull/2194
* `excel`: new `--header-row` option https://github.com/jqnatividad/qsv/commit/458f79ad9f4da504c68d73b48e83ad53b9634027
* `excel`: expanded range and metadata options https://github.com/jqnatividad/qsv/pull/2195
* `schema`: added JSON Schema automatic `const` inferencing https://github.com/jqnatividad/qsv/pull/2180
* Add signing step to qsv MSI installer GitHub Action by @rzmk in https://github.com/jqnatividad/qsv/pull/2182
* `contrib(completions)`: add `--table` option to `qsv excel` by @rzmk in https://github.com/jqnatividad/qsv/pull/2197
* `completions`: add `--header-row` option to `qsv excel` https://github.com/jqnatividad/qsv/commit/e8794d569185245f857659cdc299ea86029dd841
* added new `apply operations sentiment` benchmark https://github.com/jqnatividad/qsv/commit/b745e6438b64686810e4d1df4fa2e6748ba93ff8
* `docs`: added indexing section to PERFORMANCE.md https://github.com/jqnatividad/qsv/commit/804145a5304091c36728a8cdde4d56f879f71c15
### Changed
* `stats`: various minor micro-optimizations https://github.com/jqnatividad/qsv/commit/62d95fc6db2c34916160db88e4235719749a5f23 https://github.com/jqnatividad/qsv/commit/2c2862a75d6c0b2651516da30a7e6207a0043670
* `validate`: renamed custom keyword `dynenum` to `dynamicEnum` to be more consistent with JSON schema naming conventions https://github.com/jqnatividad/qsv/compare/0.135.0...master#diff-9783631cdad9e1f47f60266303dc2d56a6e7a486784b61c40961601e8192f7cf
* `validate`: optimizations for increased performance; replace serde_json with simd_json https://github.com/jqnatividad/qsv/compare/0.135.0...master#diff-9783631cdad9e1f47f60266303dc2d56a6e7a486784b61c40961601e8192f7cf
* apply new `clippy::ref_option` lint to Config::new API https://github.com/jqnatividad/qsv/pull/2192
* Update debian package readme by @tino097 in https://github.com/jqnatividad/qsv/pull/2187
* `deps`: bump `calamine` from 0.25 to 0.26 https://github.com/jqnatividad/qsv/commit/b42279a66144264bde9333068c47c530e3945f8c
* `deps`: `jsonschema` use [latest 0.22.3 upstream with unreleased features/fixes](https://github.com/jqnatividad/qsv/blob/f44d4c95db034d0770a5ee7df42a472aba7f4dd5/Cargo.toml#L300)
* `deps`: `polars` use [latest 0.43.1 upstream with unreleased features/fixes](https://github.com/jqnatividad/qsv/blob/1c1174b3b8b65d9dfd9c841597366fb09d0a047c/Cargo.toml#L311-L322)
* `deps`: created our own fork of unmaintained vader_sentiment crate https://github.com/jqnatividad/qsv/commit/b4267610f39d13eb8939c86f3b5e70033aa95a0c
* `deps`: use `serde_json` upstream with unreleased perf improvement/fixes https://github.com/jqnatividad/qsv/blob/1c1174b3b8b65d9dfd9c841597366fb09d0a047c/Cargo.toml#L221
* build(deps): bump flate2 from 1.0.33 to 1.0.34 by @dependabot in https://github.com/jqnatividad/qsv/pull/2171
* build(deps): bump flexi_logger from 0.29.0 to 0.29.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2189
* build(deps): bump flexi_logger from 0.29.1 to 0.29.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/2196
* build(deps): bump hashbrown from 0.14.5 to 0.15.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2186
* build(deps): bump jsonschema from 0.20.0 to 0.21.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2177
* build(deps): bump jsonschema from 0.22.1 to 0.22.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/2191
* build(deps): bump regex from 1.10.6 to 1.11.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2176
* build(deps): bump reqwest from 0.12.7 to 0.12.8 by @dependabot in https://github.com/jqnatividad/qsv/pull/2183
* build(deps): bump simd-json from 0.14.0 to 0.14.1 https://github.com/jqnatividad/qsv/pull/2199
* build(deps): bump simple-expand-tilde from 0.4.2 to 0.4.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/2190
* build(deps): bump sysinfo from 0.31.4 to 0.32.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2193
* build(deps): bump tempfile from 3.12.0 to 3.13.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2175
* apply select clippy lints
* bumped indirect dependencies
* aligned Rust nightly to Polars nightly - 2024-09-29 https://github.com/jqnatividad/qsv/commit/7cd2de1151b2299d9b75a9c8b1a3e21dc9c992e2
### Fixed
* `schema`: fix `enum` so it only adds a list when the number of unique values > `--enum-threshold` https://github.com/jqnatividad/qsv/pull/2180
* Upload artifact fix for Debian package publishing by @tino097 in https://github.com/jqnatividad/qsv/pull/2168
* fixed typos configuration https://github.com/jqnatividad/qsv/commit/627de891d8fd358aadf8c302552e8a99c54ed959
* fixed various GitHub Actions publishing workflow issues
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.135.0...0.136.0
## [0.135.0] - 2024-09-24
### Highlights
JSON Schema validation just got a whole lot more powerful with the introduction of the `dynenum` keyword!
With `dynenum`, you can now dynamically lookup valid enum values from a CSV (on the filesystem or on a URL), allowing for more flexible and responsive data validation.
Unlike the standard [`enum` keyword](https://json-schema.org/draft/2020-12/draft-bhutton-json-schema-validation-01#name-enum), `dynenum` does not require hardcoding valid values at schema definition time, and can be used to validate data against a changing set of valid values.
In an upcoming qsv pro release, we're making `dynenum` even more powerful by allowing you to specify high-value reference data (e.g. US Census data, World Bank data, etc.) that is maintained at [data.dathere.com](https://data.dathere.com) and other CKAN instances.
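
Conceptually, a dynamic enum check amounts to loading the allowed values from a lookup CSV at validation time instead of listing them in the schema. The Python sketch below illustrates only that idea; it is not qsv's implementation or its schema syntax, and the function names, the inline sample CSV and the use of the first column as the lookup column are all assumptions made for illustration.

```python
import csv
import io


def load_allowed_values(csv_text, column=0):
    """Collect the distinct values of one column of a lookup CSV (header skipped)."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    return {row[column] for row in reader if len(row) > column}


def is_valid(value, allowed):
    """A value passes when it appears in the lookup set."""
    return value in allowed


# Hypothetical lookup table; in practice this would come from a file or a URL.
LOOKUP_CSV = "code,name\nNY,New York\nNJ,New Jersey\nCT,Connecticut\n"
allowed = load_allowed_values(LOOKUP_CSV)
print(is_valid("NJ", allowed))  # True
print(is_valid("ZZ", allowed))  # False
```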
This release also adds the custom [`currency` JSON Schema format](https://github.com/jqnatividad/qsv/blob/90257bbba6d0b1c59c7a6c104b05beae35ae97e1/src/cmd/validate.rs#L23-L31), which enables currency validation according to the [ISO 4217 standard](https://en.wikipedia.org/wiki/ISO_4217).
The Polars engine was also updated to [0.43.1](https://github.com/pola-rs/polars/releases/tag/rs-0.43.1) at the [py-1.81.1 tag](https://github.com/pola-rs/polars/releases/tag/py-1.81.1) - making for various under-the-hood improvements for the `sqlp`, `joinp` and `count` commands, as we set the stage for more [Polars-powered features in future releases](https://github.com/jqnatividad/qsv/issues?q=is%3Aissue+is%3Aopen+label%3Apolars).
---
### Added
* `foreach`: enabled `foreach` command on Windows prebuilt binaries https://github.com/jqnatividad/qsv/commit/def9c8fa98cd214f0db839b64bcd12764dcfba43
* `lens`: added support for QSV_SNIFF_DELIMITER env var and snappy auto-decompression https://github.com/jqnatividad/qsv/commit/8340e8949c4b60669bc95c432c661a8c374ca422
* `sample`: add `--max-size` option https://github.com/jqnatividad/qsv/commit/e845a3cc1dcbbceda86bb7fe132c5040d23ce78b
* `validate`: added `dynenum` custom JSON Schema keyword for dynamic validation lookups https://github.com/jqnatividad/qsv/pull/2166
* `tests`: add tests for https://100.dathere.com/lessons/2 by @rzmk in https://github.com/jqnatividad/qsv/pull/2141
* added `stats_sorted` and `frequency_sorted` benchmarks
* added `validate_dynenum` benchmarks
### Changed
* `json`: add error for empty key and update usage text by @rzmk in https://github.com/jqnatividad/qsv/pull/2167
* `prompt`: gate `prompt` command behind `prompt` feature https://github.com/jqnatividad/qsv/pull/2163
* `validate`: expanded `currency` JSON Schema custom format to support ISO 4217 currency codes and alternate formats https://github.com/jqnatividad/qsv/commit/5202508e5c3969b279c20cf80bb1e37d89afd826
* `validate`: migrate to new `jsonschema` crate api https://github.com/jqnatividad/qsv/commit/5d6505426c652e7db4bb602c1bf9d302e6a09214
* Update ubuntu version for deb package by @tino097 in https://github.com/jqnatividad/qsv/pull/2126
* move --help output from stderr to stdout https://github.com/jqnatividad/qsv/pull/2138
* `contrib(completions)`: update completions for qsv v0.134.0 and fix subcommand options by @rzmk in https://github.com/jqnatividad/qsv/pull/2135
* `contrib(completions)`: add `--max-size` completion for `sample` by @rzmk in https://github.com/jqnatividad/qsv/pull/2142
* `deps`: bump to polars 0.43.1 at py-1.81.1 https://github.com/jqnatividad/qsv/pull/2130
* `deps`: switch back to calamine upstream instead of our fork https://github.com/jqnatividad/qsv/commit/677458faa4439b1b34c8a3556687a031ed184e4e
* build(deps): bump actix-governor from 0.5.0 to 0.6.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2146
* build(deps): bump anyhow from 1.0.87 to 1.0.88 by @dependabot in https://github.com/jqnatividad/qsv/pull/2132
* build(deps): bump arboard from 3.4.0 to 3.4.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2137
* build(deps): bump bytes from 1.7.1 to 1.7.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/2148
* build(deps): bump geosuggest-core from 0.6.3 to 0.6.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/2153
* build(deps): bump geosuggest-utils from 0.6.3 to 0.6.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/2154
* build(deps): bump jql-runner from 7.1.13 to 7.2.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2165
* build(deps): bump jsonschema from 0.18.1 to 0.18.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/2127
* build(deps): bump jsonschema from 0.18.2 to 0.18.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/2134
* build(deps): bump jsonschema from 0.18.3 to 0.19.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2144
* build(deps): bump jsonschema from 0.19.1 to 0.20.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2152
* build(deps): bump pyo3 from 0.22.2 to 0.22.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/2143
* build(deps): bump rfd from 0.14.1 to 0.15.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2151
* build(deps): bump simple-expand-tilde from 0.4.0 to 0.4.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/2129
* build(deps): bump qsv_currency from 0.6.0 to 0.7.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2159
* build(deps): bump qsv_docopt from 1.7.0 to 1.8.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2136
* build(deps): bump redis from 0.26.1 to 0.27.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2133
* build(deps): bump simdutf8 from 0.1.4 to 0.1.5 by @dependabot in https://github.com/jqnatividad/qsv/pull/2164
* bump indirect dependencies
* apply select clippy lint suggestions
* several usage text/documentation improvements
* bump MSRV to 1.81.0
### Fixed
* `validate`: correct `fail_validation_error!` macro; reformat error messages to use hyphens as the JSONschema error message already starts with "error:" https://github.com/jqnatividad/qsv/commit/9a2552481a07759847efe6025b402297ecba7e19
* moved `--help` output from stderr to stdout as per [GNU CLI guidelines](https://www.gnu.org/prep/standards/standards.html#g_t_002d_002dhelp) https://github.com/jqnatividad/qsv/commit/2b7dbdc68d49b67fb80c58cc7678cd3f2c112bd9
* `lens`: fixed parsing of lens options https://github.com/jqnatividad/qsv/commit/1cdd1bcac29fd2411521ac95fa87595de74cbb1b
* `searchset`: fixed usage text for <regexset-file> https://github.com/jqnatividad/qsv/commit/9a60fb088a326ee97ed1b147c4c3686b6b8aaeeb
* [used patched forks of `arrow`, `csvlens` and `xlsxwriter` crates](https://github.com/jqnatividad/qsv/blob/90257bbba6d0b1c59c7a6c104b05beae35ae97e1/Cargo.toml#L270-L315) that replace a dependency on an old version of `lexical-core` with known soundness issues - https://rustsec.org/advisories/RUSTSEC-2023-0086. Once those crates have updated their `lexical-core` dependency, we will revert to the original crates.
### Removed
* removed `prompt` command from qsvlite https://github.com/jqnatividad/qsv/pull/2163
* publish: remove `lens` feature from i686 targets as it does not compile https://github.com/jqnatividad/qsv/commit/959ca7686f8656c98de9257d11f1f762852bdf9d
* `deps`: remove anyhow dependency https://github.com/jqnatividad/qsv/pull/2150
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.134.0...0.135.0
## [0.134.0] - 2024-09-10
## qsv pro v1 is here! 🎉
If you've been using qsv for a while, even if you're a command-line ninja, you'll find a lot of new capabilities in qsv pro that can make your data wrangling experience even better!
Apart from making qsv easier to use, qsv pro has a multitude of features including: view interactive data tables; browse stats/frequency/metadata; run recipes and tools (scripts); run Polars SQL queries; an interface using Retrieval Augmented Generation (RAG) techniques to attempt converting Natural Language queries to Polars SQL queries; regular expression search; export to multiple file formats; download/upload from/to compatible CKAN instances; design custom node-based flows and data pipelines; interact with a local API from external programs including the qsv pro command; run various qsv commands in a graphical user interface; and the list goes on!
That's just the beginning, there's more to come! You just have to try it!
Download qsv pro v1 now at [qsvpro.dathere.com](https://qsvpro.dathere.com/).
Other highlights include:
- `pro`: new command to allow qsv to interact with the qsv pro API to tap into qsv pro exclusive features.
- `lens`: new command to interactively view CSVs using the [csvlens](https://github.com/YS-L/csvlens) crate.
- The ludicrously fast `diff` command is now easier to use with its `--drop-equal-fields` option. @janriemer continues to work on his `csv-diff` crate, and there are more `diff` UX improvements coming soon!
- `stats` adds `sum_length` and `avg_length` "streaming" statistics in addition to the existing `min_length` and `max_length` metrics. These are especially useful for datasets with a lot of "free text" columns.
- `stats` also got "smarter" and "faster" by [dog-fooding](https://en.wikipedia.org/wiki/Eating_your_own_dog_food) its own statistics to make it run faster!
It's a little complicated, but the way `stats` works is that it compiles the "streaming" statistics on the fly first, and the more expensive advanced statistics are "lazily" computed at the end.
Since we now compile "sort order" in a streaming manner, we use this info when deriving cardinality at the end to see if we can skip sorting - an otherwise necessary step to get cardinality, which is done by "scanning" all the sorted values of a column. Every time two neighboring values differ in a sorted column, it increments the cardinality count.
Apart from this "sort order" optimization, we also improved the "cardinality scan" algorithm - halving its memory footprint and making it faster still for larger datasets by parallelizing the computation!
This, in turn, makes the `frequency` command faster and more memory efficient!
- we now also use our own fork of the `csv` crate, featuring SIMD-accelerated UTF-8 validation and other minor perf tweaks, making the *entire qsv suite* faster still!
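
The cardinality scan described for `stats` above can be sketched in a few lines of Python. This is a simplified, single-threaded illustration and not qsv's Rust code: the function names are hypothetical, and the parallel variant and memory optimizations are left out.

```python
def is_sorted_ascending(values):
    """Streaming check: compare each value with the one before it."""
    previous = None
    for index, value in enumerate(values):
        if index > 0 and value < previous:
            return False
        previous = value
    return True


def cardinality(values, known_sorted=False):
    """Count distinct values by scanning neighbors in sorted order."""
    # The sort is skipped when the column is already known to be sorted.
    ordered = values if known_sorted else sorted(values)
    if not ordered:
        return 0
    count = 1
    for previous, current in zip(ordered, ordered[1:]):
        if previous != current:
            count += 1
    return count


column = ["a", "a", "b", "c", "c", "c"]
print(cardinality(column, known_sorted=is_sorted_ascending(column)))  # 3
```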
---
### Added
* `pro`: add `qsv pro` command to interact with qsv pro API by @rzmk in https://github.com/jqnatividad/qsv/pull/2039
* `lens`: new command to interactively view CSVs using the [csvlens](https://github.com/YS-L/csvlens) crate https://github.com/jqnatividad/qsv/pull/2117
* `apply`: add crc32 operation https://github.com/jqnatividad/qsv/pull/2121
* `count`: add --delimiter option https://github.com/jqnatividad/qsv/pull/2120
* `diff`: add flag `--drop-equal-fields` by @janriemer in https://github.com/jqnatividad/qsv/pull/2114
* `stats`: add `sum_length` and `avg_length` columns https://github.com/jqnatividad/qsv/pull/2113
* `stats`: smarter cardinality computation - added new parallel algorithm for large datasets (10,000+ rows) and updated sequential algorithm for smaller datasets https://github.com/jqnatividad/qsv/commit/4e63fec61a394ef2ddfa499c0cdd0958e677ad17
### Changed
* `count`: added comment to justify magic number https://github.com/jqnatividad/qsv/commit/5241e3972c05f024a0791be04632d03a06b2f9ce
* `stats`: use simdjson for faster JSONL parsing; micro-optimize `compute` hot loop https://github.com/jqnatividad/qsv/commit/0e8b73451999a3e95bfd52246b1088aecd64b88f
* `stats`: standardized OVERFLOW and UNDERFLOW messages https://github.com/jqnatividad/qsv/commit/38c61285704e5064a63c9dbb1ac866f18fa130fd
* `sort`: renamed symbol to eliminate devskim lint false positive warning https://github.com/jqnatividad/qsv/commit/12db7397f68d3199e3311f402d5c7afed586b88c
* enable `lens` feature in GH workflows https://github.com/jqnatividad/qsv/pull/2122
* `deps`: bump polars 0.42.0 to latest upstream at time of release https://github.com/jqnatividad/qsv/commit/3c17ed12c3c763d644d9713afcc8442964f22de3
* `deps`: use our own optimized fork of csv crate, with simdutf8 validation and other minor perf tweaks https://github.com/jqnatividad/qsv/commit/e4bcd7123172fa8d8094c305d7780e151c120db1
* build(deps): bump serde from 1.0.209 to 1.0.210 by @dependabot in https://github.com/jqnatividad/qsv/pull/2111
* build(deps): bump serde_json from 1.0.127 to 1.0.128 by @dependabot in https://github.com/jqnatividad/qsv/pull/2106
* build(deps): bump qsv-stats from 0.19.0 to 0.22.0 https://github.com/jqnatividad/qsv/pull/2107 https://github.com/jqnatividad/qsv/pull/2112 https://github.com/jqnatividad/qsv/commit/cb1eb60a0a9fb3b9ba381183a2c29909f82efa42
* apply select clippy lint suggestions
* updated several indirect dependencies
* made various doc and usage text improvements
### Fixed
* `schema`: Print an error if the `qsv stats` invocation fails by @abrauchli in https://github.com/jqnatividad/qsv/pull/2110
## New Contributors
* @abrauchli made their first contribution in https://github.com/jqnatividad/qsv/pull/2110
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.133.1...0.134.0
## [0.133.1] - 2024-09-03
### Highlights
This release doubles down on Polars' capabilities, as we now, as a matter of policy, [track the latest Polars upstream](https://github.com/jqnatividad/qsv/blob/0801f678fd55af01ff53f80ee6b22b508e7c3dfb/Cargo.toml#L283-L294). If you think qsv has a torrid release schedule, you should [see Polars](https://github.com/pola-rs/polars/releases). They're constantly fixing bugs, adding new features and optimizations!
To keep up, we've added Polars revision info to the `--version` output, and the `--envlist` option now includes Polars relevant env vars. We've also added a new `POLARS_BACKTRACE_IN_ERR` env var to control whether Polars backtraces are included in error messages.
We also removed the `to parquet` subcommand as it's redundant with the Polars-powered `sqlp`'s ability to create parquet files. This also removes the HUGE duckdb dependency, which should make compile times markedly shorter and binaries much smaller.
Other highlights include:
- New `edit` command that allows you to edit CSV files.
- The `count` command's `--width` option now includes record width stats beyond max length (avg, median, min, variance, stddev & MAD).
- The `fixlengths` command now has `--quote` and `--escape` options.
- The `stats` command adds a `sort_order` streaming statistic.
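
To make the record width statistics listed for `count --width` concrete, here is a small Python sketch that computes the same kinds of measures over a list of record widths. It is an illustration under stated assumptions, not qsv's code: widths are measured with `len()`, the variance and standard deviation are the population versions, and MAD is taken to mean the median absolute deviation.

```python
import statistics


def record_width_stats(widths):
    """Summarize a list of record widths."""
    median = statistics.median(widths)
    # MAD is assumed here to be the median of absolute deviations from the median.
    mad = statistics.median(abs(w - median) for w in widths)
    return {
        "max": max(widths),
        "min": min(widths),
        "avg": statistics.fmean(widths),
        "median": median,
        "variance": statistics.pvariance(widths),
        "stddev": statistics.pstdev(widths),
        "mad": mad,
    }


# Hypothetical records; each width is the length of the raw line.
lines = ["id,name", "1,Ann", "2,Roberto", "3,Li"]
width_stats = record_width_stats([len(line) for line in lines])
print(width_stats)
```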
---
### Added
* `count`: expanded `--width` options, adding record width stats beyond max length (avg, median, min, variance, stddev & MAD). Also added `--json` output when using `--width` https://github.com/jqnatividad/qsv/pull/2099
* `edit`: add `qsv edit` command by @rzmk in https://github.com/jqnatividad/qsv/pull/2074
* `fixlengths`: added `--quote` and `--escape` options https://github.com/jqnatividad/qsv/pull/2104
* `stats`: add `sort_order` streaming statistic https://github.com/jqnatividad/qsv/pull/2101
* `polars`: add polars revision info to `--version` output https://github.com/jqnatividad/qsv/commit/e60e44f99061c37758bd53dfa8511c16d49ceed5
* `polars`: added Polars relevant env vars to `--envlist` option https://github.com/jqnatividad/qsv/commit/0ad68fed94f7b5059cca6cf96cec4a3b55638e60
* `polars`: add & document `POLARS_BACKTRACE_IN_ERR` env var https://github.com/jqnatividad/qsv/commit/f9cc5595664d4665f0b610fcbac93c30fa445056
### Changed
* Optimize polars optflags https://github.com/jqnatividad/qsv/pull/2089
* `deps`: bump polars 0.42.0 to latest upstream at time of release https://github.com/jqnatividad/qsv/commit/3b7af519343f08919f114c7307f0f561d04f93e8
* bump polars to latest upstream, removing smartstring https://github.com/jqnatividad/qsv/pull/2091
* build(deps): bump actions/setup-python from 5.1.1 to 5.2.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2094
* build(deps): bump flate2 from 1.0.32 to 1.0.33 by @dependabot in https://github.com/jqnatividad/qsv/pull/2085
* build(deps): bump flexi_logger from 0.28.5 to 0.29.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2086
* build(deps): bump indexmap from 2.4.0 to 2.5.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2096
* build(deps): bump jsonschema from 0.18.0 to 0.18.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2084
* build(deps): bump serde from 1.0.208 to 1.0.209 by @dependabot in https://github.com/jqnatividad/qsv/pull/2082
* build(deps): bump serde_json from 1.0.125 to 1.0.127 by @dependabot in https://github.com/jqnatividad/qsv/pull/2079
* build(deps): bump sysinfo from 0.31.2 to 0.31.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/2077
* build(deps): bump qsv-stats from 0.18.0 to 0.19.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2100
* build(deps): bump tokio from 1.39.3 to 1.40.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2095
* apply select clippy lint suggestions
* updated several indirect dependencies
* made various doc and usage text improvements
* pin Rust nightly to 2024-08-26 from 2024-07-26, aligning with Polars pinned nightly
### Fixed
* Ensure portable binaries are "added" to the publish zip archive, instead of replacing all the binaries with just the portable version. Fixes #2083. https://github.com/jqnatividad/qsv/commit/34ad2067007a86ffad6355f7244163c4105a98f2
### Removed
* removed `to parquet` subcommand as it's redundant with `sqlp`'s ability to create parquet files. This also removes the HUGE duckdb dependency, which should make compile times markedly shorter and binaries much smaller https://github.com/jqnatividad/qsv/pull/2088
* removed `smartstring` dependency now that Polars has its own compact inlined string type https://github.com/jqnatividad/qsv/commit/47f047e6ee10916b5caa19ee829471e9fb6f4bea
* remove `to parquet` benchmark
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.132.0...0.133.1
## [0.133.0] - 2024-09-03
SKIPPED because `cargo publish` was not publishing to crates.io due to a dev dependency issue with the `csvs_convert` crate.
## [0.132.0] - 2024-08-21
### Highlights
With this release, we finally finish the `stats` caching refactor started in 0.131.0, replacing the binary encoded stats cache with a simpler JSONL cache. The `stats` cache stores the necessary statistical metadata to make several key commands smarter & faster. Per the [benchmarks](https://qsv.dathere.com/benchmarks):
- `frequency` is 6x faster (`frequency_index_stats_mode_auto`).
Not only is it faster, it no longer needs to compile a hashmap for columns with ALL unique values (e.g. ID columns), which in practice lets it handle "real-world" datasets of any size (unless every column has ALL unique values, in which case the entire CSV has to fit into memory).
- `tojsonl` is 2.67x faster (`tojsonl_index`)
- `schema` is two orders of magnitude (100x) faster!!! (`schema_index`)
The stats cache also provides the foundation for even more "smart" features and commands in the future. It also has the side-benefit of adding a way to produce stats in JSONL format that can be used for other purposes beyond qsv.
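
Because JSONL is simply one JSON object per line, a cache in this format is easy to consume from other tools. The Python sketch below shows one way to read such a file into a per-column lookup. The key names (`field`, `type`, `cardinality`) and the sample records are assumptions for illustration; check an actual cache file for the real layout.

```python
import json


def load_stats_cache(jsonl_text):
    """Parse one JSON object per line into a dict keyed by column name."""
    stats = {}
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        stats[record["field"]] = record
    return stats


# Hypothetical cache contents; a real cache carries many more statistics.
SAMPLE = (
    '{"field": "id", "type": "Integer", "cardinality": 1000}\n'
    '{"field": "state", "type": "String", "cardinality": 50}\n'
)
stats = load_stats_cache(SAMPLE)
print(stats["state"]["cardinality"])  # 50
```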
The `search`, `searchset`, and `replace` commands now also have a `--literal` option that allows you to search for and replace strings with regex special/reserved characters. This makes it easier to search for and replace strings that contain special characters without having to escape them.
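
The effect of a literal search can be illustrated with Python's standard `re` module, where `re.escape` neutralizes regex metacharacters. This is an analogy for what `--literal` does, not qsv's implementation, and `build_pattern` is a hypothetical helper.

```python
import re


def build_pattern(text, literal=False):
    """Compile text as a regex, or as a plain string when literal is set."""
    return re.compile(re.escape(text) if literal else text)


needle = "$5.00 (USD)"
haystack = "Total: $5.00 (USD)"
# As a literal, the needle is found as-is.
print(bool(build_pattern(needle, literal=True).search(haystack)))  # True
# As a regex, "$" anchors to the end of the text, so nothing matches.
print(bool(build_pattern(needle).search(haystack)))  # False
```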
---
### Added
* `search`, `searchset` & `replace`: add `--literal` option https://github.com/jqnatividad/qsv/pull/2060 & https://github.com/jqnatividad/qsv/commit/7196053b36c8886092fe25fd030ccf1cf765ed6a
* `slice`: added usage text examples https://github.com/jqnatividad/qsv/commit/04afaa3d5a6e51c75f3f9041515c1d7986c45777
* `publish`: added workflow to build "portable" binaries with CPU features disabled
* `contrib(completions)`: add `--literal` for `search` and `searchset` by @rzmk in https://github.com/jqnatividad/qsv/pull/2061
* `contrib(completions)`: add `--literal` completion to `replace` by @rzmk in https://github.com/jqnatividad/qsv/pull/2062
* add more polars metadata in `--version` info https://github.com/jqnatividad/qsv/pull/2073
* `docs`: added more info to SECURITY.md https://github.com/jqnatividad/qsv/commit/609d4df61c93de6959f07e8d972009ae6cd12b78
* `docs`: expanded Goals/Non-Goals https://github.com/jqnatividad/qsv/commit/54998e36eb4608a1fba7938836e5985b699e32ff
* `docs`: added Installation "Option 0" quick start https://github.com/jqnatividad/qsv/commit/bf5bf82105397436d901de247398fce3e808b122
* added `search --literal` benchmark
### Changed
* `stats`, `schema`, `frequency` & `tojsonl`: stats caching refactor, replacing binary encoded stats cache with a simpler JSONL cache https://github.com/jqnatividad/qsv/pull/2055
* rename `stats --stats-json` option to `stats --stats-jsonl` https://github.com/jqnatividad/qsv/pull/2063
* changed "broken pipe" error to a warning https://github.com/jqnatividad/qsv/commit/73532759a8dad2d643f283296aa402950765b648
* `docs`: update multithreading and caching sections of PERFORMANCE.md https://github.com/jqnatividad/qsv/commit/5e6bc455bc544003535e18f99493cc1a20c4a2ce
* `deps`: switch to our qsv-optimized fork of csv crate https://github.com/jqnatividad/qsv/commit/3fc1e82c83b5dec23d3ba610e3d0f9bbd2924788
* `deps`: bump polars from 0.41.3 to 0.42.0 https://github.com/jqnatividad/qsv/pull/2051
* build(deps): bump actix-web from 4.8.0 to 4.9.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2041
* build(deps): bump flate2 from 1.0.31 to 1.0.32 by @dependabot in https://github.com/jqnatividad/qsv/pull/2071
* build(deps): bump indexmap from 2.3.0 to 2.4.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2049
* build(deps): bump reqwest from 0.12.6 to 0.12.7 by @dependabot in https://github.com/jqnatividad/qsv/pull/2070
* build(deps): bump rust_decimal from 1.35.0 to 1.36.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2068
* build(deps): bump serde from 1.0.205 to 1.0.206 by @dependabot in https://github.com/jqnatividad/qsv/pull/2043
* build(deps): bump serde from 1.0.206 to 1.0.207 by @dependabot in https://github.com/jqnatividad/qsv/pull/2047
* build(deps): bump serde from 1.0.207 to 1.0.208 by @dependabot in https://github.com/jqnatividad/qsv/pull/2054
* build(deps): bump serde_json from 1.0.122 to 1.0.124 by @dependabot in https://github.com/jqnatividad/qsv/pull/2045
* build(deps): bump serde_json from 1.0.124 to 1.0.125 by @dependabot in https://github.com/jqnatividad/qsv/pull/2052
* apply select clippy lint suggestions
* updated several indirect dependencies
* made various usage text improvements
### Fixed
* `stats`: fix `--output` delimiter inferencing based on file extension https://github.com/jqnatividad/qsv/pull/2065
* make process_input helper handle stdin better https://github.com/jqnatividad/qsv/pull/2058
* `docs`: fix completions for `--stats-jsonl` and qsv pro installation text update by @rzmk in https://github.com/jqnatividad/qsv/pull/2072
* `docs`: added Note about why `luau` feature is disabled in musl binaries - https://github.com/jqnatividad/qsv/commit/ffa2bc5a3f397b406347d14d0d4fb4ead49cb470 & https://github.com/jqnatividad/qsv/commit/27d0f8e1c2e43c00b99abf98dfa01a4758cf9bad
### Removed
* Removed bincode dependency now that we're using JSONL stats cache https://github.com/jqnatividad/qsv/pull/2055 https://github.com/jqnatividad/qsv/commit/babd92bbae473ed63f44f593bc1ab0ad1bc17761
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.131.1...0.132.0
## [0.131.1] - 2024-08-09
### Changed
* deps: bump polars to latest upstream post py-1.41.1 release at the time of this release
* build(deps): bump filetime from 0.2.23 to 0.2.24 by @dependabot in https://github.com/jqnatividad/qsv/pull/2038
### Fixed
* `frequency`: change `--stats-mode` default to `none` from `auto`.
This is because of a big performance regression when using `--stats-mode auto` on datasets with columns with ALL unique values.
See https://github.com/jqnatividad/qsv/issues/2040 for more info.
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.131.0...0.131.1
## [0.131.0] - 2024-08-08
### Highlights
* __Refactored `frequency` to make it smarter and faster.__
`frequency`'s core algorithm essentially compiles an in-memory hashmap to determine the frequency of each unique value for each column. It does this using multi-threaded, multi-I/O techniques to make it blazing fast.
However, for columns with ALL unique values (e.g. ID columns), this takes a comparatively long time and consumes a lot of memory as it essentially compiles a hashmap of the entire column.
Now, with the new `--stats-mode` option (enabled by default), `frequency` can compile the dataset in a more intelligent way by looking up a column's cardinality in the stats cache.
If the cardinality of a column is equal to the CSV's rowcount (indicating a column with ALL unique values), it short-circuits frequency calculations for that column - dramatically reducing the time and memory requirements for the ID column as it eliminates the need to maintain a hashmap for it.
Practically speaking, this makes `frequency` able to handle "real-world" datasets of any size.
To ensure `frequency` is as fast as possible, be sure to `index` and compute `stats` for your datasets beforehand.
* __Setting the stage for Datapusher+ v1 and...__
The "[itches we've been scratching](https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar#Lessons_for_creating_good_open_source_software)" the past few months have been informed by our work at several clients towards the release of Datapusher+ 1.0 and qsv pro 1.0 (more info below) - both targeted for release this month.
[DP+](https://github.com/dathere/datapusher-plus) is our third-gen, high-speed data ingestion/registration tool for CKAN that uses qsv as its data wrangling/analysis engine. It will enable us to reinvent the way data is ingested into CKAN - with exponentially faster data ingestion, metadata inferencing, data validation, computed metadata fields, and more!
We're particularly excited how qsv will allow us to compute and infer high-quality metadata for datasets (with a focus on inferring optional recommended [DCAT-US v3](https://doi-do.github.io/dcat-us/) metadata fields) in "near real-time", while dataset publishers are still entering metadata. This will be a game-changer for CKAN administrators and data publishers!
* __...qsv pro 1.0__
[qsv pro](https://qsvpro.dathere.com) is [datHere](https://dathere.com)'s enterprise-grade data wrangling/curation workbench that’s planned for v1.0 release this month.
Building the core functionality of qsv pro's Workflow feature is one of the primary reasons for a v1.0 release.
We feel qsv pro may be a game-changer for data wranglers and data curators who need to work with spreadsheets and large datasets to view statistical data and metadata while also performing complex data wrangling operations in a user-friendly way without having to write code.
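
The `--stats-mode` short-circuit described in the first highlight above can be sketched as follows. This is a simplified, single-threaded Python illustration rather than qsv's Rust implementation: the function name is hypothetical, the cardinalities are passed in as a plain dict standing in for the stats cache, and the `<ALL_UNIQUE>` marker is an assumed placeholder.

```python
from collections import Counter


def frequency_tables(rows, header, cardinalities):
    """Build per-column frequency tables, skipping all-unique columns."""
    row_count = len(rows)
    tables = {}
    for index, name in enumerate(header):
        if cardinalities.get(name) == row_count:
            # Every value is unique, so no hashmap is needed for this column.
            tables[name] = "<ALL_UNIQUE>"
            continue
        tables[name] = Counter(row[index] for row in rows)
    return tables


header = ["id", "state"]
rows = [["1", "NY"], ["2", "NJ"], ["3", "NY"]]
tables = frequency_tables(rows, header, {"id": 3, "state": 2})
print(tables["id"])           # <ALL_UNIQUE>
print(tables["state"]["NY"])  # 2
```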
---
### Added
* `docs`: added Shell Completion section https://github.com/jqnatividad/qsv/commit/556a2ff48660d05f8e9a865ec427e98114f49b43
* `docs`: add 🪄 emoji in legend to indicate "automagical" commands https://github.com/jqnatividad/qsv/commit/2753c90fcbd1cc1b41dae0a51d26bfe704029ee8
* Add building deb package (WIP) by @tino097 in https://github.com/jqnatividad/qsv/pull/2029
* Added GitHub workflow to test debian package (WIP) by @tino097 in https://github.com/jqnatividad/qsv/pull/2032
* `tests`: added false positive to _typos.toml configuration https://github.com/jqnatividad/qsv/commit/d576af229bf76b7d0e1f40eb37b578a6b6691ed4
* added more benchmarks
* added more tests
### Changed
* `fetch` & `fetchpost`: remove expired diskcache entries on startup https://github.com/jqnatividad/qsv/commit/9b6ab5db91416f71577b8a1fc91e2e3189a1bd4b
* `frequency`: smarter frequency compilation with new `--stats-mode` option https://github.com/jqnatividad/qsv/pull/2030
* `json`: refactored for maintainability & performance https://github.com/jqnatividad/qsv/commit/62e92162a4aa446097736ec76834cf0e06d195b8 and https://github.com/jqnatividad/qsv/commit/4e44b1878b952c455c1922a66795b8c86a1b1dba
* improved `self-update` messages https://github.com/jqnatividad/qsv/commit/5c874e09e15a274dccd8f83a322002032e65c2d0 and https://github.com/jqnatividad/qsv/commit/0aa0b13cf34103cfb75befc6480f31714d806aa2
* `contrib(completions)`: `frequency` updates & remove bashly/fish by @rzmk in https://github.com/jqnatividad/qsv/pull/2031
* Debian package update by @tino097 in https://github.com/jqnatividad/qsv/pull/2017
* `publish`: optimized enabled CPU features when building release binaries in all GitHub Actions "publishing" workflows
* `publish`: ensure latest Python patch release is used when building `qsvpy` binary variants https://github.com/jqnatividad/qsv/commit/2ab03a097645a95b0d390f546ad9735c9a7e72b2 and https://github.com/jqnatividad/qsv/commit/ec6f486ef112cf942b2263b84b97d90cba1beb12
* `tests`: also enabled CPU features in CI tests
* `docs`: wordsmith qsv "elevator pitch" https://github.com/jqnatividad/qsv/commit/cc47fe688eeeb13b4deb3f3bf48d954924eee22e
* `docs`: point to https://100.dathere.com in Whirlwind tour https://github.com/jqnatividad/qsv/commit/fc49aef826c1b1933ea1508cb687476936a147ff
* `deps`: bump polars to latest upstream post py-1.41.1 release at the time of this release
* build(deps): bump bytes from 1.6.1 to 1.7.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2018
* build(deps): bump bytes from 1.7.0 to 1.7.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2021
* build(deps): bump flate2 from 1.0.30 to 1.0.31 by @dependabot in https://github.com/jqnatividad/qsv/pull/2027
* build(deps): bump indexmap from 2.2.6 to 2.3.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2020
* build(deps): bump jaq-parse from 1.0.2 to 1.0.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/2016
* build(deps): bump redis from 0.26.0 to 0.26.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/2023
* build(deps): bump regex from 1.10.5 to 1.10.6 by @dependabot in https://github.com/jqnatividad/qsv/pull/2025
* build(deps): bump serde_json from 1.0.121 to 1.0.122 by @dependabot in https://github.com/jqnatividad/qsv/pull/2022
* build(deps): bump sysinfo from 0.30.13 to 0.31.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2019
* build(deps): bump sysinfo from 0.31.0 to 0.31.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/2024
* build(deps): bump tempfile from 3.11.0 to 3.12.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/2033
* build(deps): bump serde from 1.0.204 to 1.0.205 by @dependabot in https://github.com/jqnatividad/qsv/pull/2036
* apply select clippy suggestions
* updated several indirect dependencies
* made various usage text improvements
* bumped MSRV to 1.80.1
### Fixed
* `sqlp` & `joinp`: fixed `.ssv.sz` output auto-compression support https://github.com/jqnatividad/qsv/commit/5397f6c7a3b083872bbb97d90db3a2fd2f8521e6 & https://github.com/jqnatividad/qsv/commit/d86ba6376d5819898187d5fa88eae19373022e5b
* `docs`: fix link by @uncenter in https://github.com/jqnatividad/qsv/pull/2026
* `tests`: correct misnamed test https://github.com/jqnatividad/qsv/commit/8ae600011ddb109e7993e54dae9b933d15eccd38
* `tests`: fix flaky `reverse` property tests https://github.com/jqnatividad/qsv/commit/d86ba6376d5819898187d5fa88eae19373022e5b
### Removed
* `docs`: "Quicksilver" is the name of the logo horse, not how you pronounce "qsv" https://github.com/jqnatividad/qsv/commit/e4551ae4b62a3a635b7c351c5f28aa2a7d374958
## New Contributors
* @uncenter made their first contribution in https://github.com/jqnatividad/qsv/pull/2026
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.130.0...0.131.0
## [0.130.0] - 2024-07-29
Following the [0.129.0 release - the largest release ever](https://github.com/jqnatividad/qsv/releases/tag/0.129.0), 0.130.0 continues to polish qsv as a data-wrangling engine, packing new features, fixes, and improvements, previewing upcoming features in qsv pro 1.0. Here are a few highlights:
### Highlights
- Added `.ssv` (semicolon separated values) automatic support. Semicolon separated values are now automatically detected and supported by qsv. Though not as common as CSV, SSV is used in some regions and industries, so qsv now supports it.
- Added cargo deb compatibility. In preparation for the release of [DataPusher+ 1.0](https://github.com/dathere/datapusher-plus/tree/master), [CKAN](https://ckan.org) administrators can now install and upgrade `qsvdp` more easily, using `apt-get install qsvdp` or `apt-get upgrade qsvdp`.
DP+ is our next-gen, high-speed data ingestion tool for CKAN. It's not only a robust, fast, validating data pump that guarantees high quality data; it also does extended analysis to infer and derive high-quality metadata - what we call "[automagical metadata](https://dathere.com/2023/11/automagical-metadata/)".
- Upgraded to the latest Polars upstream at the [py-polars-1.3.0](https://github.com/pola-rs/polars/releases/tag/py-1.3.0) tag. [Polars tops the TPC-H Benchmark](https://pola.rs/posts/benchmarks/) and is several orders of magnitude faster than traditional dataframe libraries (cough - 🐼 pandas). qsv proudly rides the 🐻❄️ Polars bear to get subsecond response times even with very large datasets!
- qsv v0.130.0 shell completion files are available for download [here](https://github.com/jqnatividad/qsv/tree/master/contrib/completions/examples). With shell completions, pressing tab in a compatible shell may suggest qsv commands, subcommands, and options to choose from. Supported shells include bash, zsh, powershell, fish, nushell, fig, and elvish. You may view tips on how to install completions for the bash shell [here](https://100.dathere.com/exercises-setup.html#optional-set-up-qsv-completions).
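
To make the `.ssv` format from the first highlight concrete, here is a minimal Python sketch of what semicolon separated values look like when parsed. It only illustrates the file format and is not how qsv detects or reads SSV; the `read_ssv` helper and the sample data are made up for this example.

```python
import csv
import io


def read_ssv(text):
    """Parse semicolon-separated text into a list of rows (hypothetical helper)."""
    # The only difference from ordinary CSV parsing is the delimiter.
    return list(csv.reader(io.StringIO(text), delimiter=";"))


# Made-up sample data: a header row followed by two records.
sample = "name;city\nAna;Puebla\nBo;Oslo\n"
rows = read_ssv(sample)
print(rows)
```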
### Added
* `apply`: add base62 encode/decode operations https://github.com/jqnatividad/qsv/pull/2013
* `headers`: add `--just-count` option https://github.com/jqnatividad/qsv/pull/2004
* `json`: add `--select` option https://github.com/jqnatividad/qsv/pull/1990
* `searchset`: add `--not-one` flag by @rzmk in https://github.com/jqnatividad/qsv/pull/1994
* Added `.ssv` (semicolon separated values) automatic support https://github.com/jqnatividad/qsv/pull/1987
* Added cargo deb compatibility by @tino097 in https://github.com/jqnatividad/qsv/pull/1991
* `contrib(completions)`: add `--just-count` for `headers` by @rzmk in https://github.com/jqnatividad/qsv/pull/2006
* `contrib(completions)`: add `--select` for `json` by @rzmk in https://github.com/jqnatividad/qsv/pull/1992
* added several benchmarks
* added more tests
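
As background for the new base62 operations in `apply`, the following Python sketch shows one common way to base62-encode and decode a non-negative integer. The alphabet order (`0-9`, `A-Z`, `a-z`) and the function names are assumptions made for illustration; the alphabet and input types qsv actually uses may differ.

```python
# Assumed alphabet order: digits, then uppercase, then lowercase.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(n):
    """Encode a non-negative integer as a base62 string."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(digits))


def base62_decode(s):
    """Decode a base62 string back to an integer."""
    n = 0
    for ch in s:
        # str.index raises ValueError for characters outside the alphabet.
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Round-tripping a value through `base62_encode` and `base62_decode` should return the original integer.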
### Changed
* `diff`: allow selection of `--key` and `--sort-columns` by name, not just by index https://github.com/jqnatividad/qsv/pull/2010
* `fetch` & `fetchpost`: replace deprecated Redis execute command https://github.com/jqnatividad/qsv/commit/75cbe2b76426591e4658fdcb7d29287a40a7db36
* `stats`: more intelligent `--infer-len` option https://github.com/jqnatividad/qsv/commit/c6a0e641cd4c6ef87c070c8944f32a962a11c7e3
* `validate`: return delimiter detected upon successful CSV validation https://github.com/jqnatividad/qsv/pull/1977
* bump polars to latest upstream at py-polars-1.3.0 tag https://github.com/jqnatividad/qsv/pull/2009
* deps: bump csvs_convert from 0.8.12 to 0.8.13 https://github.com/jqnatividad/qsv/commit/d1d08009deb0579fd4d6fe305097e00e92da4191
* build(deps): bump cached from 0.52.0 to 0.53.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1983
* build(deps): bump cached from 0.53.0 to 0.53.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/1986
* build(deps): bump postgres from 0.19.7 to 0.19.8 by @dependabot in https://github.com/jqnatividad/qsv/pull/1985
* build(deps): bump pyo3 from 0.22.1 to 0.22.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/1979
* build(deps): bump redis from 0.25.4 to 0.26.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1995
* build(deps): bump serde_json from 1.0.120 to 1.0.121 by @dependabot in https://github.com/jqnatividad/qsv/pull/2011
* build(deps): bump simple-expand-tilde from 0.1.7 to 0.4.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1984
* build(deps): bump tokio from 1.38.0 to 1.38.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/1973
* build(deps): bump tokio from 1.38.1 to 1.39.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/1988
* build(deps): bump xxhash-rust from 0.8.11 to 0.8.12 by @dependabot in https://github.com/jqnatividad/qsv/pull/1997
* apply select clippy suggestions
* updated several indirect dependencies
* made various usage text improvements
* pin Rust nightly to 2024-07-26
### Fixed
* `diff`: clarify `--key` usage examples, resolves #1998 by @rzmk in https://github.com/jqnatividad/qsv/pull/2001
* `json`: refactored so it no longer needs to use threads to spawn `qsv select` to order the columns. This was necessary because intermediate output was sometimes sent to stdout before the final output was ready https://github.com/jqnatividad/qsv/commit/0f25deff98139b574dfd61c6e9bf58d36ea16618
* `py`: replace row with col in usage text by @allen-chin in https://github.com/jqnatividad/qsv/pull/2008
* `reverse`: fix indexed bug https://github.com/jqnatividad/qsv/pull/2007
* `validate`: properly auto-detect tab delimiter when file extension is TSV or TAB https://github.com/jqnatividad/qsv/pull/1975
* fix panic when process_input helper fn receives unexpected input from stdin https://github.com/jqnatividad/qsv/commit/152fec486c0e7b16242f3967930e9654ff2bdf3c
### Removed
* `docs`: remove *nix only message for `foreach` by @rzmk in https://github.com/jqnatividad/qsv/pull/1972
## New Contributors
* @tino097 made their first contribution in https://github.com/jqnatividad/qsv/pull/1991
* @allen-chin made their first contribution in https://github.com/jqnatividad/qsv/pull/2008
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.129.1...0.130.0
---
To stay updated with datHere's latest news and updates (including [qsv pro](https://qsvpro.dathere.com), [datHere's CKAN DMS](https://dathere.com/ckan-dms/), and [analyze.dathere.com](https://analyze.dathere.com)), subscribe to the newsletter here: [dathere.com/newsletter](https://dathere.com/newsletter/)
## [0.129.0] - 2024-07-14
This release is the biggest one ever!
Packed with new features, improvements, and previews of upcoming qsv pro features, here are a few highlights:
## 📌 Highlights (click each dropdown for more info)
<details><summary><strong>Meet @rzmk - qsv pro's software engineer now also co-maintains qsv!</strong></summary>
@rzmk has contributed to projects in the qsv ecosystem including qsv's [`describegpt`](https://github.com/jqnatividad/qsv/tree/master/src/main/describegpt.rs), [`prompt`](https://github.com/jqnatividad/qsv/tree/master/src/main/prompt.rs), [`json`](https://github.com/jqnatividad/qsv/tree/master/src/main/json.rs), and [`clipboard`](https://github.com/jqnatividad/qsv/tree/master/src/main/clipboard.rs) commands; qsv's tab completion support; [qsv.dathere.com](https://qsv.dathere.com) including its online configurator and benchmarks page; [100.dathere.com](https://100.dathere.com) with its qsv lessons and exercises; and [qsv pro](https://qsvpro.dathere.com) the spreadsheet data wrangling desktop app (along with its promo site). @rzmk now also co-maintains qsv!
With @rzmk now also co-maintaining qsv, our data-wrangling portfolio's roadmap may get more intriguing as @rzmk's work on qsv pro, 100.dathere.com, and other initiatives can result in contributions to qsv as we've seen in this release. Perhaps some aims may be put towards AI; "[automagical](https://dathere.com/2023/11/automagical-metadata/)" metadata inferencing; DCAT 3; and expanded recipe support with the accelerated evolution of qsv pro as an enterprise-grade Data-Wrangling/Data Curation Workbench.
</details>
<details><summary><strong>Polars v0.41.3</strong> - numerous <a href="https://github.com/jqnatividad/qsv/tree/master/src/cmd/sqlp.rs"><code>sqlp</code></a> and <a href="https://github.com/jqnatividad/qsv/tree/master/src/cmd/joinp.rs"><code>joinp</code></a> improvements</summary>
* `sqlp`: expanded SQL support
- Natural Join support
- DuckDB-like `COLUMNS` SQL function to select columns that match a pattern
- ORDER BY ALL support
- Support PostgreSQL `^@` ("starts with"), `~~`,`~~*`,`!~~`,`!~~*` ("like", "ilike") string-matching operators
- Support for SQL `SELECT * ILIKE` wildcard syntax
- Support SQL temporal functions `STRFTIME` and `STRPTIME`
* `sqlp`: added `--streaming` option
</details>
<details style="margin-bottom: 0;"><summary><strong>New command <code><a href="https://github.com/jqnatividad/qsv/tree/master/src/cmd/prompt.rs">qsv prompt</a></code></strong> - Use a file dialog for qsv file input and output</summary>
Be more interactive with qsv by using a file dialog to select a file for input and output.

Here are a few key highlights:
- Start with `qsv prompt` when piping commands to provide a file as input from an open file dialog and pipe it into another command, for example: `qsv prompt | qsv stats`.
- End with `qsv prompt -f` when piping commands to save the output to a file you choose with a save file dialog.
There are other options too, so feel free to explore more with `qsv prompt --help`.
This will allow you to create qsv pipelines that are more "user-friendly" and distribute them to non-technical users. It's not as flexible as qsv pro's full-blown GUI, but it's a start!
</details>
<details><summary><strong>New command <a href="https://github.com/jqnatividad/qsv/tree/master/src/cmd/json.rs"><code>qsv json</code></a></strong> - Convert JSON data to CSV and optionally provide a jq-like filter</summary>
The new `json` command allows you to convert non-nested JSON data to CSV. If your data is not in the expected format, try using the `--jaq` option to provide a jq-like filter. See `qsv json --help` for more information and examples.

Here are a few key highlights:
- Specify the path to a JSON file to attempt conversion to CSV with `qsv json <filepath>`.
- Attempt conversion of JSON to CSV data from `stdin`, for example: `qsv slice <filepath.csv> --json | qsv json`.
- Write the output to a file with the `--output <filepath>` (or `-o` for short) option.
- Use the `--jaq <filter>` option to try converting nested or complex JSON data into the intended format before parsing to CSV.
You may learn more by running `qsv json --help`.
Along with the `jsonl` command, we now have more options to convert JSON to CSV with qsv!
</details>
<details style="margin-bottom: 0;"><summary><strong>New command <code><a href="https://github.com/jqnatividad/qsv/tree/master/src/cmd/clipboard.rs">qsv clipboard</a></code></strong> - Provide input from your clipboard and save output to your clipboard</summary>
Provide your clipboard content using `qsv clipboard` and save output to your clipboard by piping into `qsv clipboard --save` (or `-s` for short).

</details>
<details><summary><strong><a href="https://100.dathere.com">100.dathere.com</a></strong> - Try out lessons and exercises with qsv from your browser!</summary>
You may run qsv commands from your browser without having to install it locally at [100.dathere.com](https://100.dathere.com).
Screenshots (not shown here): running qsv within the lesson (in-page) using Thebe, and in a Jupyter Lab environment.
Thanks to [Jupyter Book](https://jupyterbook.org), [datHere](https://dathere.com) has released a website available at [100.dathere.com](https://100.dathere.com) where you may explore lessons and exercises with qsv by running them within the web page, in a Jupyter Lab environment, or locally after following the provided installation instructions. There are multiple exercises planned, but feel free to try out the first few available lessons/exercises by visiting [100.dathere.com](https://100.dathere.com) and star the source code's repository [here](https://github.com/dathere/100.dathere.com).
</details>
<details><summary><strong>New <a href="https://github.com/jqnatividad/qsv/tree/master/contrib/completions">multi-shell completions draft</a></strong> (bash, zsh, powershell, fish, nushell, fig, elvish)</summary>
There's a draft of more qsv shell completion support including 7 different shells! The plan is to add the rest of the commands in this implementation since we can use one codebase to generate the 7 shell completion script files. Feel free to try out the various shell completions in the `examples` folder from [`contrib/completions`](https://github.com/jqnatividad/qsv/tree/master/contrib/completions) to verify if the examples work (as of today's release date only `qsv count` and `qsv clipboard` may be available) and also contribute to adding the rest of the completions if you know a bit of Rust.
The existing <a href="https://github.com/jqnatividad/qsv/tree/master/contrib/bashly">Bash shell completions for v0.129.0</a> and <a href="https://github.com/jqnatividad/qsv/tree/master/contrib/fish">fish shell completions draft</a> are available for now as the multi-shell completions draft is being developed.
Demos (not shown here): Bash completions and fish completions.
With shell completions enabled, you may identify qsv commands more easily when pressing the `tab` key on your keyboard in certain positions using the relevant Bash or fish shell from your terminal. You may follow the instructions from 100.dathere.com [here](https://100.dathere.com/exercises-setup.html#bash) to learn how to install the Bash completions and under the Usage section [here](https://github.com/jqnatividad/qsv/tree/master/contrib/fish#usage) for fish shell completions. Note that the fish shell completions are incomplete and both of the implementations may be replaced by the multi-shell completions implementation once complete.
</details>
<details><summary><strong><a href="https://qsvpro.dathere.com">qsvpro.dathere.com</a></strong> - Preview: Download spreadsheets from a compatible CKAN instance into the qsv pro Workflow</summary>
> This is a preview of a feature, meaning it is planned for an upcoming release but may change by the time it is released.

In addition to importing local spreadsheet files and uploading to a CKAN instance, this new feature allows users to select a locally registered CKAN instance where they have the `create_dataset` permission to download a spreadsheet file from their CKAN instance and load the new local spreadsheet file into the Workflow. qsv pro's Workflow would therefore have both upload and download capability to and from a compatible CKAN instance.
</details>
<details><summary><strong><a href="https://qsvpro.dathere.com">qsvpro.dathere.com</a></strong> - Preview: Attempt SQL query generation from natural language with a compatible LLM API instance</summary>
> This is a preview of a feature, meaning it is planned for an upcoming release but may change by the time it is released.
> Also note that this video is sped up as you may see by the notes that pop up (you may pause the video to read them).
https://github.com/jqnatividad/qsv/assets/30333942/e90893e6-3196-4fa6-bce0-f69a9f6347f2
Leveraging [`qsv describegpt`](https://github.com/jqnatividad/qsv/tree/master/src/cmd/describegpt.rs)'s AI integration capabilities along with multiple other qsv commands, qsv pro's Workflow's existing SQL query tab now has a generator that may ***attempt*** to generate a SQL query from a natural language question about your spreadsheet data. It uses an LLM API compatible with OpenAI's API specification, such as a locally running [Ollama](https://ollama.com/) (v0.2.0 or above) server. Results may vary depending on your configuration and you may need to fix the generated output. For example, in the demo we asked ***who*** has the highest salary, but the output included extra information and only the highest salary. Even so, this gives us a query we can modify and work with.
<details><summary>Note on Ask and <code>qsv describegpt</code></summary>
We say ***attempt*** because LLMs can produce incorrect output, even output that *seems* correct but is not. `qsv describegpt`'s usage text likewise warns that "inaccurate information" may be produced, and AI-generated output within qsv pro may also be incorrect, so make sure the output is fixed and verified before using it in production use cases.
</details>
</details>
<details><summary><h2>🔁 Changelog</h2></summary>
### Added
* `clipboard`: add `qsv clipboard` command for clipboard input/output by @rzmk in https://github.com/jqnatividad/qsv/pull/1953
* `describegpt`: add `--prompt` for custom prompt & update prompt file + docs by @rzmk in https://github.com/jqnatividad/qsv/pull/1862
* `describegpt`: add base_url, model, ollama, & timeout to prompt file by @rzmk in https://github.com/jqnatividad/qsv/pull/1859
* `enum`: add `--hash` option to create a platform-independent deterministic id https://github.com/jqnatividad/qsv/pull/1902
* `enum`: add `--uuid7` option to create UUID v7 identifiers https://github.com/jqnatividad/qsv/pull/1914
* `freq`: add `--no-trim` option https://github.com/jqnatividad/qsv/pull/1944
* `foreach`: add sample Windows implementation by @rzmk in https://github.com/jqnatividad/qsv/pull/1847
* `joinp`: add `--right` outer join option https://github.com/jqnatividad/qsv/pull/1945
* `json`: change jsonp to json using new implementation by @rzmk in https://github.com/jqnatividad/qsv/pull/1924
* `json`: add `--jaq` option to allow jq-like filtering & test by @rzmk in https://github.com/jqnatividad/qsv/pull/1959
* `jsonp`: add `jsonp` command allowing non-nested JSON to CSV conversion with Polars by @rzmk in https://github.com/jqnatividad/qsv/pull/1880
* `prompt`: add `qsv prompt` to pick a file with a file dialog & write to stdout by @rzmk in https://github.com/jqnatividad/qsv/pull/1860
* `prompt`: add `--fd-output` (`-f`) & `--output` (`-o`) options by @rzmk in https://github.com/jqnatividad/qsv/pull/1861
* `select`: add `--sort`, `--random` & `--seed` options; also add 9999 sentinel value to indicate last column https://github.com/jqnatividad/qsv/pull/1867
* `select`: use underscore char (_) to indicate last column, replacing 9999 sentinel value https://github.com/jqnatividad/qsv/pull/1873
* `sqlp`: add `--streaming` option https://github.com/jqnatividad/qsv/commit/e8bee9a60dccc6ec5b5a43b91cb6f558915faa0e
* `stats`: add Standard Error of the Mean (SEM) & Coefficient of Variation (CV) https://github.com/jqnatividad/qsv/pull/1857
* `validate`: added custom JSONschema format "currency" (decimal with 2 decimal places). Also, added check that only ascii characters are allowed in keys in JSONschema files.
* added `--batch` zero option to all commands with batch processing. This sentinel value is used to indicate that the entire input should be processed in one batch https://github.com/jqnatividad/qsv/commit/feedbda4a3be9f8835eba0626e5fe01147831186
* added typos check to CI https://github.com/jqnatividad/qsv/commit/9fdf0662b6dc4fa6ebfed592a177d8539f264041
* `contrib(fish)`: add fish completions prototype with `qsv.fish` and docs by @rzmk in https://github.com/jqnatividad/qsv/pull/1884
* `contrib(bashly)`: add `--hash <columns>` option to `enum` by @rzmk in https://github.com/jqnatividad/qsv/pull/1905
* `contrib(bashly)`: add `--uuid4` & `--uuid7` for `qsv enum` by @rzmk in https://github.com/jqnatividad/qsv/pull/1915
* `contrib(bashly)`: remove `--ollama` from `qsv describegpt` by @rzmk in https://github.com/jqnatividad/qsv/pull/1951
* `contrib(bashly)`: add `--no-trim` to `frequency` & `--right` to `joinp` by @rzmk in https://github.com/jqnatividad/qsv/pull/1952
* `tests`: add tests for 100.dathere.com/lessons/1 by @rzmk in https://github.com/jqnatividad/qsv/pull/1876
* `tests`: add test_100 for 100.dathere.com & tests for lesson/exercise 0 by @rzmk in https://github.com/jqnatividad/qsv/pull/1848
* `docs`: add 👆 emoji to indicate commands with column selector support https://github.com/jqnatividad/qsv/commit/40ac8a7602315857ca529f43dd4fc45bec65c703
* Incorporate typos check in CI https://github.com/jqnatividad/qsv/pull/1930
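
For readers unfamiliar with the two statistics added to `stats` above, this Python sketch computes the Standard Error of the Mean (SEM) and the Coefficient of Variation (CV) using their textbook definitions. It assumes the sample standard deviation and expresses CV as a plain ratio; whether qsv uses the sample or population standard deviation, or reports CV as a percentage, is not stated here, so treat those choices as assumptions. The `sem_and_cv` name is hypothetical.

```python
import math
import statistics


def sem_and_cv(values):
    """Return (SEM, CV) for a list of numbers, using textbook definitions."""
    n = len(values)
    mean = statistics.fmean(values)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(values)
    sem = sd / math.sqrt(n)  # standard error of the mean
    cv = sd / mean           # coefficient of variation, as a ratio
    return sem, cv


sem, cv = sem_and_cv([2, 4, 4, 4, 5, 5, 7, 9])
print(f"SEM={sem:.4f} CV={cv:.4f}")
```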
### Changed
* `stats`: made several microoptimizations to Field Data Type inferencing https://github.com/jqnatividad/qsv/commit/35004541d25eb29d564ec60824da63d9f32344dd https://github.com/jqnatividad/qsv/commit/f829e0cfbc8a390570f85371e3d661ec33211405
* `select`: `--sort` & `--random` options now work with the initial selection, not just the entire CSV https://github.com/jqnatividad/qsv/pull/1875
* `contrib(bashly)`: update `contrib/bashly/completions.bash` (prep for qsv v0.129.0) by @rzmk in https://github.com/jqnatividad/qsv/pull/1885
* `jsonp`: use `print!` instead of `println!` & add `House.csv` + tests by @rzmk in https://github.com/jqnatividad/qsv/pull/1897
* `docs`: add column selector emoji - 👆 https://github.com/jqnatividad/qsv/pull/1906
* upgrade to polars 0.41.0 https://github.com/jqnatividad/qsv/pull/1907
* `describegpt`: update `dotenv.template` variable with `QSV_LLM_APIKEY` by @rzmk in https://github.com/jqnatividad/qsv/pull/1908
* `describegpt`: change min Ollama version from 0.1.49 to 0.2.0 by @rzmk in https://github.com/jqnatividad/qsv/pull/1954
* `describegpt`: add `{headers}` replaced by `qsv slice ... --len 1 -n` by @rzmk in https://github.com/jqnatividad/qsv/pull/1941
* `validate`: validating against a JSONschema requires headers https://github.com/jqnatividad/qsv/pull/1931
* setting `--batch` to 0 loads all rows at once before parallel processing https://github.com/jqnatividad/qsv/pull/1928
* `deps`: add polars timezones support https://github.com/jqnatividad/qsv/pull/1898
* `tests`: update `test_100/exercise_0.rs` setup file data by @rzmk in https://github.com/jqnatividad/qsv/pull/1858
* build(deps): bump actions/setup-python from 5.1.0 to 5.1.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/1961
* build(deps): bump actix-web from 4.6.0 to 4.7.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1866
* build(deps): bump actix-web from 4.7.0 to 4.8.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1901
* build(deps): bump atoi_simd from 0.15.6 to 0.16.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1844
* build(deps): bump cached from 0.51.3 to 0.51.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/1874
* build(deps): bump cached from 0.51.4 to 0.52.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1938
* build(deps): bump csvs_convert from 0.8.10 to 0.8.11 by @dependabot in https://github.com/jqnatividad/qsv/pull/1891
* build(deps): bump csvs_convert from 0.8.11 to 0.8.12 by @dependabot in https://github.com/jqnatividad/qsv/pull/1948
* build(deps): bump curve25519-dalek from 4.1.2 to 4.1.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/1893
* build(deps): bump flexi_logger from 0.28.0 to 0.28.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/1853
* build(deps): bump flexi_logger from 0.28.1 to 0.28.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/1868
* build(deps): bump flexi_logger from 0.28.2 to 0.28.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/1870
* build(deps): bump flexi_logger from 0.28.3 to 0.28.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/1881
* build(deps): bump flexi_logger from 0.28.4 to 0.28.5 by @dependabot in https://github.com/jqnatividad/qsv/pull/1904
* build(deps): bump geosuggest-core from 0.6.2 to 0.6.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/1883
* build(deps): bump geosuggest-utils from 0.6.2 to 0.6.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/1882
* build(deps): bump jql-runner from 7.1.9 to 7.1.10 by @dependabot in https://github.com/jqnatividad/qsv/pull/1845
* build(deps): bump jql-runner from 7.1.10 to 7.1.11 by @dependabot in https://github.com/jqnatividad/qsv/pull/1856
* build(deps): bump jql-runner from 7.1.11 to 7.1.12 by @dependabot in https://github.com/jqnatividad/qsv/pull/1903
* build(deps): bump jql-runner from 7.1.12 to 7.1.13 by @dependabot in https://github.com/jqnatividad/qsv/pull/1960
* build(deps): bump log from 0.4.21 to 0.4.22 by @dependabot in https://github.com/jqnatividad/qsv/pull/1925
* build(deps): bump mimalloc from 0.1.42 to 0.1.43 by @dependabot in https://github.com/jqnatividad/qsv/pull/1911
* build(deps): bump mlua from 0.9.8 to 0.9.9 by @dependabot in https://github.com/jqnatividad/qsv/pull/1894
* `deps`: apply latest polars upstream with unreleased fixes https://github.com/jqnatividad/qsv/commit/261ede59058a123c4cba62c0945a1fc4e1c77861
* `deps`: we now track py-polars release, instead of rust-polars https://github.com/jqnatividad/qsv/pull/1854
* `deps`: update polars engine to use py-polars-1.0.0-beta1 https://github.com/jqnatividad/qsv/pull/1896
* build(deps): bump polars from 0.41.0 to 0.41.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/1909
* build(deps): bump polars from 0.41.1 to 0.41.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/1916
* deps: bump polars from 0.41.2 to 0.41.3 https://github.com/jqnatividad/qsv/commit/dc0492ffe2669ddf8a7ff3f82fcd2db8daad83f9
* build(deps): bump pyo3 from 0.21.2 to 0.22.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1918
* build(deps): bump pyo3 from 0.22.0 to 0.22.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/1950
* build(deps): bump regex from 1.10.4 to 1.10.5 by @dependabot in https://github.com/jqnatividad/qsv/pull/1865
* build(deps): bump redis from 0.25.3 to 0.25.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/1846
* build(deps): bump reqwest from 0.12.4 to 0.12.5 by @dependabot in https://github.com/jqnatividad/qsv/pull/1889
* build(deps): bump self_update from 0.40.0 to 0.41.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1939
* build(deps): bump serde from 1.0.203 to 1.0.204 by @dependabot in https://github.com/jqnatividad/qsv/pull/1949
* build(deps): bump serde_json from 1.0.117 to 1.0.118 by @dependabot in https://github.com/jqnatividad/qsv/pull/1920
* build(deps): bump serde_json from 1.0.118 to 1.0.119 by @dependabot in https://github.com/jqnatividad/qsv/pull/1932
* build(deps): bump serde_json from 1.0.119 to 1.0.120 by @dependabot in https://github.com/jqnatividad/qsv/pull/1935
* build(deps): bump simple-expand-tilde from 0.1.6 to 0.1.7 by @dependabot in https://github.com/jqnatividad/qsv/pull/1886
* build(deps): bump strum from 0.26.2 to 0.26.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/1913
* build(deps): bump strum_macros from 0.26.2 to 0.26.3 by @dependabot in https://github.com/jqnatividad/qsv/pull/1855
* build(deps): bump strum_macros from 0.26.3 to 0.26.4 by @dependabot in https://github.com/jqnatividad/qsv/pull/1863
* build(deps): bump sysinfo from 0.30.12 to 0.30.13 by @dependabot in https://github.com/jqnatividad/qsv/pull/1957
* build(deps): bump sysinfo from 0.30.12 to 0.30.13 by @dependabot in https://github.com/jqnatividad/qsv/pull/1965
* build(deps): bump titlecase from 3.2.0 to 3.3.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1963
* build(deps): bump tokio from 1.37.0 to 1.38.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1850
* build(deps): bump url from 2.5.0 to 2.5.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/1869
* build(deps): bump url from 2.5.1 to 2.5.2 by @dependabot in https://github.com/jqnatividad/qsv/pull/1895
* build(deps): bump uuid from 1.8.0 to 1.9.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1912
* build(deps): bump uuid from 1.9.0 to 1.9.1 by @dependabot in https://github.com/jqnatividad/qsv/pull/1919
* build(deps): bump uuid from 1.9.1 to 1.10.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1964
* build(deps): bump xxhash-rust from 0.8.10 to 0.8.11 by @dependabot in https://github.com/jqnatividad/qsv/pull/1942
* apply select clippy suggestions
* updated several indirect dependencies
* made various usage text improvements
* added several benchmarks
* pin Rust nightly to 2024-06-23
### Fixed
* `frequency`: fix unique identifiers column detection https://github.com/jqnatividad/qsv/pull/1966
* `json`: add empty single JSON object logic & empty tests by @rzmk in https://github.com/jqnatividad/qsv/pull/1958
* `json`: fix typo in error message by @rzmk in https://github.com/jqnatividad/qsv/pull/1929
* `sniff`: fix doc typo by @rzmk in https://github.com/jqnatividad/qsv/pull/1947
* `validate`: validating with a JSONSchema requires headers https://github.com/jqnatividad/qsv/commit/616438213de44e4377a98ea81a676a7900bd4ae9
* Fixed several typos https://github.com/jqnatividad/qsv/commit/9fdf0662b6dc4fa6ebfed592a177d8539f264041
### Removed
* `describegpt`: remove `--ollama` since Ollama v0.1.49 has endpoints by @rzmk in https://github.com/jqnatividad/qsv/pull/1946
* `json`: remove necessity for `polars` feature & fix `--list` formatting by @rzmk in https://github.com/jqnatividad/qsv/pull/1936
* `jsonp`: remove `jsonp` command in favor of `json` by @rzmk in https://github.com/jqnatividad/qsv/pull/1924
* `deps`: fine tune polars features and remove explicit polars-ops dependency https://github.com/jqnatividad/qsv/commit/ccfd000d129799f5a106a7d4c8edab88af37367b
**Full Changelog**: https://github.com/jqnatividad/qsv/compare/0.128.0...0.129.0
</details>
---
## [0.128.0] - 2024-05-25
# ❤️ csv,conf,v8 Edition - [_¡Ándale! ¡Ándale! ¡Arriba! ¡Arriba!_](https://www.youtube.com/watch?v=5bmiDLH5htU) 🎉 #
Yii-hah! We're Mexico bound as we head to [csv,conf,v8](https://csvconf.com) to present and share qsv with fellow data-makers and wranglers from all over!
And we've packed a lot into this release for the occasion:
* `search` got a lot of love as it now powers `qsv-pro`'s new `search` feature to get near-instant search results even on large datasets.
* `stats` - the ❤️ of qsv, now has cache fine-tuning options with the `--cache-threshold` option. It now also computes `max_precision` for floats and `is_ascii` for strings. It also has a new `--round` 9999 sentinel value to suppress rounding of statistics.
* `schema` & `tojsonl` are now faster thanks to `stats --cache-threshold` autoindex creation/deletion logic.
* We [upgraded Polars to 0.40.0](https://github.com/pola-rs/polars/releases/tag/rs-0.40.0) for even more speed and stability for the `count`, `joinp` & `sqlp` commands.
* `count` now has an additional blazing fast counting mode using Polars' `read_csv()` table function.
* `frequency` gets some micro-optimizations for even faster frequency analysis.
* `luau` now bundles Luau [0.625](https://github.com/luau-lang/luau/releases/tag/0.625), up from [0.622](https://github.com/luau-lang/luau/releases/tag/0.622). We also upgraded the bundled LuaDate library [from 2.2.0 to 2.2.1](https://github.com/Tieske/date?tab=readme-ov-file#changes).
Overall, qsv manages to keep its performance edge despite the addition of new capabilities and features, and we'll give a whirlwind tour in [our talk at csv,conf,v8](https://csvconf.com/schedule/).
We'll also preview what we've been calling the __People's APPI__ - our _"Answering People/Policymaker Interface"_ in [qsv pro](https://qsvpro.dathere.com). This is a new way to interact with qsv that's more conversational and less command-line-y using a natural language interface. It's a way to make qsv more accessible to more people, especially those who are not comfortable with the command line.
We're excited to share these with the csv,conf,v8 community and the wider world! Nos vemos en Puebla!
[_¡Ándele! ¡Ándele! ¡Epa! ¡Epa! ¡Epa!_](https://www.youtube.com/watch?v=cc-3wVQuD7k)
---
### Added
* `count`: additional Polars-powered counting mode using `read_csv()` SQL table function https://github.com/jqnatividad/qsv/commit/05c580912365356e9c5383654f351e0cc6ebaab6
* `input`: add `--quote-style` option https://github.com/jqnatividad/qsv/commit/df3c8f14a4eaa2fba7237dfe30df2fef8c98eccd
* `joinp`: add `--coalesce` option https://github.com/jqnatividad/qsv/commit/8d142e51d683ab425fc53b2dddfdeeff6a814ffa
* `search`: add `--preview-match` option https://github.com/jqnatividad/qsv/pull/1785
* `search`: add `--json` output option https://github.com/jqnatividad/qsv/pull/1790
* `search`: add "match-only" `--flag` option mode https://github.com/jqnatividad/qsv/pull/1799
* `search`: add `--not-one` flag to avoid returning exit code 1 when there is no match by @rzmk in https://github.com/jqnatividad/qsv/pull/1810
* `sqlp`: add `--decimal-comma` option https://github.com/jqnatividad/qsv/pull/1832
* `stats`: add `--cache-threshold` option https://github.com/jqnatividad/qsv/pull/1795
* `stats`: add `--cache-threshold` autoindex creation/deletion logic https://github.com/jqnatividad/qsv/pull/1809
* `stats`: add additional mode to `--cache-threshold` https://github.com/jqnatividad/qsv/commit/63fdc55828ec55bf7545c37bd56a4d537aa0cf71
* `stats`: now computes `max_precision` for floats https://github.com/jqnatividad/qsv/pull/1815
* `stats`: add `--round` 9999 sentinel value support to suppress rounding https://github.com/jqnatividad/qsv/pull/1818
* `stats`: add `is_ascii` column https://github.com/jqnatividad/qsv/pull/1824
* added new benchmarks for `search` command https://github.com/jqnatividad/qsv/commit/58d73c3beb41071d6cd8532768f0991f0554b717
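
To make the new `stats` additions above more concrete, here is a minimal Python sketch of what `max_precision`, `is_ascii` and the `--round` 9999 sentinel could mean conceptually. This is not qsv's implementation (qsv is written in Rust), and the function names, the `NO_ROUNDING` constant and the handling of edge cases are all assumptions made for illustration; the actual semantics are defined by the linked pull requests.

```python
# Illustrative sketch only: all names here are hypothetical, not qsv's API.

NO_ROUNDING = 9999  # sentinel value mentioned for `stats --round` in this release


def decimal_places(text: str) -> int:
    """Count the digits after the decimal point in a plain decimal string."""
    _, _, fraction = text.strip().partition(".")
    return len(fraction)


def max_precision(values: list[str]) -> int:
    """Largest number of decimal places seen in a column of float strings."""
    return max((decimal_places(v) for v in values), default=0)


def is_ascii(values: list[str]) -> bool:
    """True only if every value in a string column is pure ASCII."""
    return all(v.isascii() for v in values)


def round_stat(value: float, places: int) -> float:
    """Round a statistic, unless the sentinel asks for rounding to be skipped."""
    if places == NO_ROUNDING:
        return value
    return round(value, places)
```

Under these assumptions, `max_precision(["1.5", "2.125", "3"])` reports three decimal places, a column containing `"café"` is not ASCII, and `round_stat(x, 9999)` returns `x` unchanged.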
### Changed
* `count`: document three count modes https://github.com/jqnatividad/qsv/commit/3d5a333ca8aef3aeaf74ff9e153b5118eb6a605b
* `describegpt`: update `--max-tokens` type for LLMs with larger context sizes by @rzmk in https://github.com/jqnatividad/qsv/pull/1841
* `excel`: use simpler `range::headers()` to get headers https://github.com/jqnatividad/qsv/commit/069acbf5a6e86132214521324720608f4258c20f
* `frequency`: ensure `--other-sorted` works with `--other-text` https://github.com/jqnatividad/qsv/commit/7430ad76bda869be7729ea5000ad4d85a875433b
* `frequency`: microoptimize hot loop https://github.com/jqnatividad/qsv/commit/d9c01e17fa6c4f853a501fe75c6a6b8a30c269d2, https://github.com/jqnatividad/qsv/commit/7c9f925184100f89f6f3a77ae4f7b93448103f38 and
* `luau`: improve usage text https://github.com/jqnatividad/qsv/commit/cb6b4d9b7bfb60a10385057ca093453e3549e424
* `luau`: we now bundle Luau 0.625 (up from 0.622) https://github.com/jqnatividad/qsv/commit/40609751950a852f998fba41edb35aab31c74c20
* `luau`: update vendored LuaDate library from 2.2.0 to 2.2.1 https://github.com/jqnatividad/qsv/pull/1840
* `schema`: adjust to reflect `stats --cache-threshold` option https://github.com/jqnatividad/qsv/commit/92fed8696fd885d3721f07eeedcf67732febed4c
* `slice`: move json output helpers to util https://github.com/jqnatividad/qsv/commit/1f44b488784fd0c1ef22786ab7aeacbf2f8cf976
* `tojsonl`: refactor boolcheck helper https://github.com/jqnatividad/qsv/commit/74d5f5a8c934254e11ee611973cc10524a288a9e
* `docs`: cross-reference `split` & `partition` commands https://github.com/jqnatividad/qsv/pull/1828
* contrib(bashly): update completions.bash for qsv v0.127.0 by @rzmk in https://github.com/jqnatividad/qsv/pull/1776
* contrib(bashly): update completions.bash for qsv v0.128.0 by @rzmk in https://github.com/jqnatividad/qsv/pull/1838
* `deps`: upgrade to polars 0.40.0 https://github.com/jqnatividad/qsv/pull/1831
* build(deps): bump actix-web from 4.5.1 to 4.6.0 by @dependabot in https://github.com/jqnatividad/qsv/pull/1825
* build(deps): bump anyhow from 1.0.82 to 1.0.83 by @dependabot in https://github.com/jqnatividad/qsv/pull/1798