All notable changes to this project will be documented in this file. See Conventional Commits for commit guidelines.
0.42.1 (2021-07-13)
- plugin-tika: no return outside function in install script (5985d4f)
0.42.0 (2020-08-16)
- plugin-sql: ignore the database id when listing queries (c92ae8e)
- plugin-twitter: set tweet as title instead of description (e6371b7)
- plugin-sql: support query tags for units (99d53a9)
- plugin-twitter: merge the query tags into the data unit and handle Ncube data format (6189dea)
- plugin-youtube: merge the query tags into the data unit (90371cb)
0.41.0 (2020-07-30)
- plugin-media: handle youtube videos with multiple thumbnails correctly (061cd51)
- plugin-twitter: enable ncube data format for feed and tweet plugins (071f904)
0.40.0 (2020-07-20)
- plugin-sql: import and export data from postgresql and sqlite (fb32563)
0.39.0 (2020-06-28)
- plugin-http: catch content-type parsing errors (23535bb)
- plugin-sql: store query tags along with queries (1fddcff)
0.38.0 (2020-05-25)
- core: export empty features as a default (aeca70f)
- plugin-youtube: wait for results when querying youtube channels (eb43af7)
- cli: include feature flags from core (eec31ba)
- plugin-http: add optional ncube data format when importing websites (81cfd68)
- plugin-youtube: add optional ncube data format (ce40cba)
0.37.0 (2020-05-04)
- plugin-twitter: improve api error parsing (3b10e27)
- plugin-twitter: remove unneeded log statement (d48637f)
- core: provide a marker for a run rather than generating one (9a2692b)
- plugin-csv: add flag to append failure metrics to a file (eeec2b9)
- plugin-fs: add the fs_from_json plugin (f9241dd)
- plugin-sql: expose public API (766ba2d)
- plugin-tap: write data to files in chunks (eb1b098)
0.36.2 (2020-03-19)
- plugin-mongodb: remove projections when querying units (61a8c0c)
0.36.1 (2020-03-19)
- plugin-twitter: don't throw if tweet id is undefined (03cd3c5)
0.36.0 (2020-03-19)
- utils: depend on Tika 1.24 and fix download url (4298672)
- plugin-twitter: check for failing twitter tweets (599e9ee)
0.35.0 (2020-03-17)
- plugin-youtube: tracked missing youtube videos as a separate metric (e1e8181)
0.34.1 (2020-01-29)
- plugin-http: avoid ENAMETOOLONG error for temporary website downloads (95647de)
0.34.0 (2019-11-24)
- plugin-media: cleanup all stale artefacts (a8de96d)
- plugin-media: stream handling of file downloads (ed46a37)
- plugin-youtube: parse video ids from embed urls (c5e6dd1)
- plugin-elasticsearch: only instrument counts if there is data in the pipeline (c08bf89)
- plugin-http: import URLs using the hypercube model (89ddee9)
- plugin-media: guess file type of downloads when file extension is missing (1250cf5)
- plugin-sql: add plugins to import and export queries (8edfe47)
0.33.1 (2019-11-15)
- plugin-workflow: merge queries through the queries envelope not the unit field (ca386d9)
0.33.0 (2019-11-15)
- plugin-workflow: deep merge queries into unit (56f5cbf)
- test: add missing _sc_annotations field to data generator (08b816c)
- core: set _sc_annotations on every unit (1bfeda6)
- plugin-fs: extend fs_import to set _sc content fields (39f6737)
- plugin-http: extend http_import to set _sc content fields (1b504fc)
- plugin-tap: add option to exclude fields from printing (9e60fdb)
0.32.1 (2019-11-10)
- cli: only load self if the package starts with sugarcube-plugin (b532817)
0.32.0 (2019-11-09)
- cli: load the current package as plugin (637d824)
0.31.2 (2019-11-07)
- plugin-media: don't throw when mosaic generation fails (2a80f3d)
- plugin-media: handle the end event when downloading files (a14809c)
- plugin-twitter: use log counter correctly (811df5c)
0.31.1 (2019-11-07)
- plugin-mail: consistent use of the no_encrypt option (49fcf83)
- plugin-mail: describe event for mail_report instrument (046535d)
- plugin-mail: send emails sequentially (a6a8f62)
- plugin-mail: use correct recipient config field (88abd09)
0.31.0 (2019-11-07)
- core: add missing plugin for failure (d2c91a5)
- core: failing test to instrument failures (e8c361a)
- plugin-mail: update to latest date-fns api (97870db)
- cli: print help output for a single plugin or instrument (6721566)
- plugin-mail: deprecate plugins for reporting in favor of an instrument (9287690)
0.30.2 (2019-11-04)
- plugin-http: remove http_screenshot plugin to avoid a dependency mismatch for puppeteer (2350de1)
- plugin-media: disable sandbox of headless browser (41ef4ff)
0.30.1 (2019-11-04)
Note: Version bump only for package sugarcube
0.30.0 (2019-10-31)
- plugin-media: set type to image for screenshots (7481d16)
- cli: gracefully shutdown on SIGINT signal (55632ec)
- plugin-fs: replace fs_unfold with fs_import plugin and extract data with Tika (7f01656)
- plugin-fs: set OCR language for text extraction on file import (94c9726)
- plugin-http: add the http_import plugin (ae475fa)
- plugin-http: extract body and meta data when importing URLs (e4bfee4)
- plugin-media: add media_fetch plugin as replacement for http_get (fbeb98a)
- plugin-media: add plugin to archive URLs in WARC files (57723a6)
- plugin-media: archive websites in parallel (d5a8ac5)
- plugin-media: deprecate http_screenshot in favor of media_screenshot (2ebc971)
- plugin-media: safely add files and extract cleanUp function into fs plugin (c28e51f)
- plugin-media: safely import media files and allow to keep the original (a6b984c)
- plugin-media: take screenshots of websites in parallel (0b17fbe)
- utils: add a progress logging counter (89d97a7)
0.29.0 (2019-10-10)
- core: emit the plugin name on fail (59f1044)
- core: improve on state construction (f58d212)
- plugin-twitter: filter search urls from user timelines (b6d182b)
- core: collect measurements for non plugin metrics (51a7b60)
- plugin-elasticsearch: track new/existing units by source (a712b4d)
- plugin-googlesheets: add api to fetch all sheets on a spreadsheet (bb9a4d1)
- plugin-twitter: parse and normalize tweet and feed urls (a4dd412)
- plugin-youtube: export parser for video and channel urls (9cfd0ef)
- plugin-youtube: parse and normalize video and channel urls (9ab8455)
0.28.1 (2019-09-26)
- plugin-youtube: remove size limit of youtube failing filter plugin (9bd390d)
0.28.0 (2019-09-26)
- plugin-csv: end csv failures file instrument gracefully if no failures were logged (b31e4b5)
- plugin-youtube: check and filter failing videos in the pipeline (a2070ca)
0.27.2 (2019-09-25)
- plugin-elasticsearch: supply index to bulk call (27c8777)
0.27.1 (2019-09-22)
Note: Version bump only for package sugarcube
0.27.0 (2019-09-22)
- plugin-googlesheets: trim whitespace from queries to move (f5640f6)
- plugin-media: count existing videos (fcde080)
- plugin-media: handle missing locations when generating mosaics (0da1a81)
- fail gracefully if google sheet doesn't exist (abeb49f)
- introduce an instrumentation API and extract the cli logger to an instrument (c68fc9e)
- core: extract the failure logging into the stats instrument (d91a0e2)
- reworked stats instrumentation and store metrics in StatsD (ca1997b)
- plugin-csv: export failures using the csv_failures_file instrument (d60ecb4)
- plugin-elasticsearch: support ES6 and ES7 (20bf2b1), closes #3 #4
- plugin-twitter: log tweets counter (e2cd4e9)
0.26.1 (2019-07-16)
- delay between youtubedl invocations (e623469)
0.26.0 (2019-07-15)
- balance youtube-dl over one or more source ip addresses (1cf42ce)
- configure a random delay between youtubedl invocations (a0c42cc)
0.25.1 (2019-06-18)
- minor improvements to docs generation (62d2e99)
- plugin-http: accept old image locations based on filenames (ce32cb8)
0.25.0 (2019-06-17)
- plugin-fs: synced package lock file (937744b)
- plugin-csv: add a label to the exported failed stats file name (b7d3280)
- plugin-fs: populate media from a file location (21de55f)
- plugin-media: add media_import_file plugin (b9e5dee)
- plugin-media: allow to bind youtubedl to a source ip address (b30cacd)
0.24.0 (2019-04-25)
- plugin-media: respect the mosaic_parallel option (9da966f)
- plugin-media: add the mosaic plugin (2695d56)
- plugin-workflow: add omit plugin (1a6baa8)
- plugin-workflow: add pick plugin (00cba66)
- utils: use a runCmd utility for calling host commands (3c5fd8f)
0.23.0 (2019-01-28)
- plugin-elasticsearch: log query as JSON on import (e15029a)
- plugin-mail: avoid exception on missing stat.duration field (9712e01)
- plugin-twitter: parse twitter users starting with a number as screen names (c29e2b8)
- plugin-twitter: add plugin to fetch individual tweets (352eaa5)
- plugin-twitter: allow full urls as tweet query (a9f8246)
- plugin-twitter: parse twitter users from full URLs (3ae4c23)
0.22.0 (2019-01-22)
- plugin-elasticsearch: use supplement instead of complement left in the logs (c998c32)
- plugin-mongodb: remove superfluous spaces in a log message (8eb6396)
- plugin-workflow: add plugin to merge fields from queries into a unit (c54cdb2)
- plugin-youtube: merge query into the video unit (7b11a51)
0.21.0 (2019-01-20)
- plugin-googlesheets: only apply import filters when provided (a6bb047)
- plugin-googlesheets: treat empty strings as null on import (03ced54)
- plugin-mail: validate inputs to be arrays (936a69b)
- plugin-media: change counter debug log when downloading videos (bbbbbcb)
- plugin-googlesheets: create auxiliary sheets when exporting to a spreadsheet (b44eba4)
0.20.1 (2019-01-02)
- core: avoid a stack overflow when updating state a lot (f9bd78d)
- plugin-mail: log mail progress in a safe way (26d9f33)
- plugin-mail: prevent exceptions when sending mails (9d4b8d1)
0.20.0 (2018-12-21)
- plugin-media: force a video download even if it already exists (00705e3)
0.19.3 (2018-12-21)
- plugin-media: handle youtubedl exiting on failure (239c126)
0.19.2 (2018-12-19)
Note: Version bump only for package sugarcube
0.19.1 (2018-12-13)
- plugin-mail: send failed stats to more than one recipient (cad4089)
0.19.0 (2018-12-13)
- plugin-media: remove development artifact (ec6f39e)
- plugin-csv: add the csv_export_failed plugin (48e7ebe)
- plugin-mail: attach the failed stats csv file if available (216370b)
- plugin-media: add a plugin to check the availability of videos (131bc15)
0.18.0 (2018-12-11)
- plugin-mail: print the number of failures in the error report (f5e188c)
- plugin-twitter: set language on tweets if available (cf49a74)
- plugin-youtube: set language on videos if available (99e0c23)
0.17.0 (2018-12-01)
- core: temp fix for failing unit test when concatenating (1bf4580)
- plugin-youtube: don't throw on non existing location (2d3a260)
- plugin-youtube: rename location field names (d7cdeb5)
- core: make _sc_locations a fixed field (9d23eef)
- plugin-elasticsearch: added the reindex plugin (288aef6)
- plugin-elasticsearch: create an alias for a numbered index (31f3f34)
- plugin-elasticsearch: properly handle custom mappings and fixes (710f762)
- plugin-elasticsearch: set locations mapping (d201869)
- plugin-elasticsearch: use the scroll API for imports (7f8cd1f)
- plugin-twitter: store coordinates location (46b9def)
- plugin-youtube: store recording location when provided (54e5ac8)
- core: improved pipeline runner and data concatenation (0d840b2)
- plugin-elasticsearch: improved import/export (999174c)
0.16.0 (2018-11-26)
- plugin-elasticsearch: make sure to always create the index before accessing it (41727d1)
- plugin-twitter: edited log output for feeds (6dda5a0)
- plugin-youtube: treat thumbnails as images (d6e2077)
- cli: print the memory limit in debug mode (02f4337)
- plugin-facebook: catch failures for api pages and populate the failed stats (a922914)
- plugin-googlesheets: support last access fields when fetching queries (eadaf13)
- plugin-http: catch failures for http downloads and populate the failed stats (c7f541a)
- plugin-mail: include the pipeline name in the failed stats subject (58a9273)
- plugin-youtube: catch failures for videos and populate the failed stats (2c5b773)
- cleaning up failed downloads for media_youtubedl and http_get (2e8d14f)
0.15.0 (2018-11-25)
- core: chain updates of state correctly (cce3dbd)
- plugin-youtube: handle date ranges correctly when fetching channels (c7c5930)
- cli: set a human friendly name for a pipeline (9ed7ae0)
- instrument the pipeline and deliver a mail report (6018451)
- track failed channel queries and youtubedl downloads (ab2a541)
- cli: include the project name in the pipeline config (f7f1228)
- core: added instrumentation to the pipeline run (5019fe2)
- plugin-elasticsearch: added instrumentation to the complement plugins (a263c7c)
- plugin-elasticsearch: added the supplement plugin alias (cd0069d)
- plugin-mail: email failed stats (0dd699f)
- plugin-mail: mail a report of the pipeline run (a7d1e95)
- plugin-twitter: track failed twitter users when fetching timelines (f8a0d94)
- plugin-youtube: properly test for the existence of channels (59855d6)
0.14.0 (2018-11-22)
- plugin-youtube: flatten video queries when done (c40581d)
- plugin-media: run youtube-dl in parallel (6bac8e4)
- plugin-youtube: chunk video downloads in batches of 50 (82e1fe6)
0.13.2 (2018-11-15)
- plugin-youtube: added missing import (3314280)
0.13.1 (2018-11-15)
- plugin-youtube: more lenient query parsing for videos and errors (e7fa464)
0.13.0 (2018-11-14)
- core: supply stats and cache by reference if they fit the interface (06fbba6)
- plugin-googlesheets: removed unneeded log statement from queries move (c2878c1)
- plugin-workflow: treat certain query types special when multiplexing (0b258d1)
- plugin-googlesheets: only move queries that exist in the pipeline (27af8d0)
0.12.0 (2018-11-14)
- plugin-workflow: fixed a typo (445c7b6)
- plugin-youtube: avoid exception on missing channels (e9e0582)
- plugin-googlesheets: added the sheets_move_queries plugin (0a9a2d6)
- plugin-googlesheets: extract additional fields from queries (c33c240)
- plugin-googlesheets: provide a default query type to sheets_query (6ec1f1b)
- plugin-workflow: added the workflow_multiplex plugin (ca14cad)
- plugin-youtube: specify queries alternatively as full URLs (2ef004f)
0.10.0 (2018-10-05)
- Handle media urls better to avoid redownloads. (861e183), closes #33
- plugin-ddg: Return empty list when no results (f8d075a)
- plugin-elasticsearch: Don't export units if the envelope is empty. (a022378)
- plugin-elasticsearch: Strip and unstripify nested values. (ef1f65b)
- cli: Increase heap size of sugarcube command to 4GB. (2d9d9b2), closes #9
- core: Added sToA and aToS value conversions. (487f984)
- core: Concats strings and arrays into an array. (ac1e2c9)
- core: Keep original fetch date if present. (4e990b6)
- core: Store the number of missing arguments on a curried function. (f3b171a)
- plugin-ddg: Retry requests with a delay if access is forbidden. (2f80875)
- plugin-ddg: Set user agent and pick correct href. (1a43e94)
- plugin-elastic: Retrieve highlights and score when querying. (d1aa49b)
- plugin-elasticsearch: Added the queryOne operation. (2753149)
- plugin-elasticsearch: Provide custom mappings when creating an index. (cb62a41)
- plugin-elasticsearch: Update units on export. (ed066ae)
- plugin-googlesheets: Added a sheets_move plugin. (f39e586)
- plugin-googlesheets: Added deleteRows to the sheets API. (6c3a2e0)
- plugin-googlesheets: Added getAndRemoveRowsByField to API. (4e85087)
- plugin-googlesheets: Format the header when exporting or appending. (a620098)
- plugin-googlesheets: Import and move rows based on text equality match. (bdca13e)
- plugin-googlesheets: Set data validation for a field by selecting from a list of items. (5681534)
- plugin-twitter: Include the tweet url in the entity. (3aedfac)
- plugin-twitter: Limit searches by language or geocode. (227cc06)
- plugin-youtube: Fetch details for individual videos. (a493377)