Releases: ReadAlongs/Studio

v1.2.0

18 Dec 21:10
e77e478

✨ New Features

  • 36b8b0d - add run-web-api.sh to easily launch the dev-mode API server (PR #248 by @joanise)
  • 8d7be24 - tests: add silence_c_stderr to silence SoundSwallower in test suites (commit by @joanise)
  • 472672d - add --quiet option to test/run.py and refactor the runner (commit by @joanise)
  • 68d833b - add api.convert_to_readalong (commit by @joanise)
  • d47c24b - add convert_to_offline_html() to api.py (commit by @joanise)

🐛 Bug Fixes

  • 791e0b9 - deps: remove unnecessary dependency declarations (commit by @joanise)
  • 3cab240 - catch exceptions due to failure to create XML from text (PR #242 by @joanise)
  • 1738c15 - deps: relax the pydub requirement to allow the latest (PR #244 by @joanise)
  • eec5662 - repair how multiple examples are declared in web_api (PR #243 by @joanise)
  • 99da050 - tests: silence test_anchors and test_force_align suites (commit by @joanise)
  • 65149e5 - tests: silence all remaining noisy test suites (commit by @joanise)
  • 954211c - readalongs should not NFD normalize its input text (commit by @joanise)
  • 93c9cb0 - our HTML should only have the viewport meta once (commit by @joanise)
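
To illustrate why the NFD normalization fix (954211c) matters, here is a minimal sketch using only Python's standard `unicodedata` module. It is not Studio's code; it only shows how NFD normalization changes the code points of a string that looks unchanged on screen.

```python
import unicodedata

composed = "\u00e9"  # "é" as a single precomposed code point
decomposed = unicodedata.normalize("NFD", composed)

# NFD splits the character into "e" + U+0301 COMBINING ACUTE ACCENT,
# so the text renders the same but no longer compares equal.
print(len(composed), len(decomposed))
print(composed == decomposed)

# NFC recombines the pair into the original single code point.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == composed)
```

Any downstream step that matches input text against the original, character by character, can be thrown off by this kind of silent change.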

♻️ Refactors

  • cb3c5e4 - type file names as Union[str, os.PathLike], à la PEP 519 (commit by @joanise)
  • 25f2dad - give the API convert functions self-documenting names (commit by @joanise)
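
The PEP 519 typing mentioned in cb3c5e4 can be sketched as follows. The alias name `PathLike` and the helper `describe_path` are hypothetical, chosen for illustration; they are not taken from Studio's source.

```python
import os
from pathlib import Path
from typing import Union

# Hypothetical alias: a file name may be a plain string or any os.PathLike.
PathLike = Union[str, os.PathLike]


def describe_path(filename: PathLike) -> str:
    """Return the file system path as a plain string (illustrative helper)."""
    # os.fspath() accepts str and any object implementing __fspath__,
    # which is the protocol PEP 519 introduced.
    return os.fspath(filename)


print(describe_path("story.txt"))
print(describe_path(Path("story.txt")))
```

Typing parameters this way lets callers pass either `str` or `pathlib.Path` objects without converting them first.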

v1.1.0

17 Sep 22:05

💥 BREAKING CHANGES

  • due to edaca6a - drop support for EOL Python 3.7 (commit by @joanise):

    Python 3.7 is no longer supported

🐛 Bug Fixes

  • d5520ea - the CLI needs to output utf8 on Windows (commit by @joanise)
  • 5f446a0 - make Studio work with old and new g2p until the next g2p release (commit by @joanise)
  • 96136b1 - relax the pympi-link reqs that were unnecessarily strict (commit by @joanise)
  • 65158d5 - deps: make g2p^=1.0.date instead of >=1.0.date (commit by @joanise)
  • b5d1192 - deploy only 4 workers because we use too much RAM (commit by @joanise)
  • daa41a2 - issue the lexicon warning for all lexicon-based g2p's (commit by @joanise)
  • 6bf0f91 - all empty words from g2p should get treated as error situations (commit by @joanise)
  • 6dfd7fe - deps: adjust deps for latest g2p update (commit by @joanise)
  • 9a526c2 - compensate for soundswallower model breakage (commit by @dhdaines)
  • adfe51f - g2p main is about to be 2.0, but we still want 1.1 on Heroku (commit by @joanise)
  • fba7b3a - api: pin anyio to less than 4.0.0 (commit by @roedoejet)
  • eb03934 - make Studio compatible with Pydantic 2 and thus g2p 2 (commit by @joanise)
  • 022bd31 - coloredlogs: remove bold bug (commit by @joanise)
  • 53be622 - deps: lock numpy<2 because 2.0.0 is coming and has breaking changes (commit by @joanise)
  • ddbfaee - bump lxml to support Python 3.11 on Windows (commit by @joanise)
  • 81effb9 - work around missing/broken editdistance on python 3.12 (commit by @dhdaines)
  • fc5db32 - bump fastapi to minimum Pydantic v2 compatible version (commit by @joanise)
  • 8d8dfc2 - editdistance can now come from PyPI again for Py 3.12 (commit by @joanise)
  • 787ead4 - style: bump black to 24.3.0 to fix black's first CVE (commit by @joanise)
  • 06d9b17 - with pydantic 2, Field only takes examples plural (commit by @joanise)
  • 4dd0e42 - the current web-component version is 1.4.x (commit by @joanise)
  • 6ac285f - update the fallback offline bundles to web-c 1.4.0 (commit by @joanise)
  • 469f9ec - updated the exported readme to include default upload path and image-asset-folder attribute (commit by @deltork)
  • c4cac88 - added meta data to generated HTML (commit by @deltork)
  • 7548038 - docs: fix errors in sphinx docs before conversion to mkdocs (commit by @joanise)
  • d928373 - added optional id to meta tag attributes (PR #232 by @deltork)
  • b432bad - very minor typo correction in cli.py (commit by @MENGZHEGENG)
  • 6087b74 - very minor typo correction in cli.py (commit by @MENGZHEGENG)
  • f114b8d - deps: exclude panphon 0.21 not compat with Python 3.8 (PR #237 by @joanise)
  • 6fa849f - deps: gunicorn 23 has vulnerability fixes that seem worthwhile (PR #239 by @joanise)
  • 0533b4f - remind user to install ffmpeg when an audio file is not found (commit by @joanise)
  • fa7b191 - deps: we are actually compatible with numpy 2 (commit by @joanise)
  • b1db242 - deps: remove panphon declaration since g2p fixed it (commit by @joanise)
  • c4328db - ci: remove the broken sigstore code, and publish only real versions (commit by @joanise)
  • 56b5e1d - fix (or attempt to fix) the pypi publication process (commit by @joanise)
  • 51311e9 - ci: pypi publish does not like the sigstore.json files (commit by @joanise)

⚡ Performance Improvements

  • 6c2eaa6 - update Procfile to start with 5 users instead (commit by @marctessier)
  • 8bf389d - with lower memory use we can have 5 workers again (commit by @joanise)
  • baffdd4 - defer expensive imports to optimize readalongs -h (commit by @joanise)
  • 2daf991 - even more aggressively optimize readalongs -h (commit by @joanise)

♻️ Refactors

  • cd5d6df - move get_langs out to g2p (commit by @joanise)
  • 774e178 - get_langs returns its output in order of codes, no need to re-sort (commit by @joanise)
  • 256d79b - and g2p has been released, only use g2p.get_arpabet_langs (commit by @joanise)
  • 1ad311b - use g2p's lexicon-based eng mapping (commit by @joanise)
  • c34ca8c - strip now-redundant lexicon-g2p code from Studio (commit by @joanise)
  • 94d49ae - let us parse and load XML in just one place (commit by @joanise)
  • 396b2f1 - simplify parsing input_text in web_api /assemble (commit by @joanise)
  • e338f97 - docs: automatically convert from Sphinx .rst to mkdocs .md (commit by @joanise)
  • 191e1fb - **...

Release v1.0.20230228

28 Feb 23:27

1.0.20230228 (2023-02-28)

Features

  • report empty g2p for a word as a warning (b89de62)

Bug Fixes

  • make capture_logs work correctly with Python >= 3.9 (aa1ffca), closes #162
  • where there are no words to align, return 422, not 500 (61d45e0)
  • dtd: effective-g2p-lang missing from w def (dc944ad)
  • by default, output each g2p error at most twice (a1f3c5d)
  • clarify the settings and run the API by default (d5ca78e)
  • do not fail when the lang code is invalid (3b2a433)

Reverts

  • Revert "chore: specify python 3.8 runtime" (8012258)

Code Refactoring

  • we no longer need to support g2p<=0.5.20211029 imports (04eb40e)

Build Systems

  • maximally fix the dockerfile (e4ad7b4)
  • minimally fix dockerfile (f51e841)

Continuous Integration

  • exclude Python 3.11 on Windows from matrix testing, due to dependency error (d80d8ad)
  • run full matrix testing, but only on push to main and release (ea72d5f)

Release v1.0.20230224

25 Feb 00:00
ed0e985

1.0.20230224 (2023-02-25)

⚠ BREAKING CHANGES

  • smil-ectomy
  • avoid using dict for things that are lists
  • new web API version
  • use .ras not .xml
  • no more smil
  • update to .ras file extension in output

Features

  • a simple DTD for standalone readalongs (f94174b)
  • add time and dur attributes to w (5e92b91)
  • avoid using dict for things that are lists (20fc1a5)
  • basically s/tei/readalong/gi (5465d22)
  • capture the logs from /assemble endpoint (5713abc)
  • introduce better CORS environment variables (1eb5d74), closes #146
  • introduce better CORS environment variables (d8ab4cf), closes #146
  • log message to say we are in development mode (a1f53bc)
  • new web API version (bec85ee)
  • no more smil (eb9d25b)
  • output to .ras (8d4f76b)
  • refine the DTD somewhat (b7285f5)
  • set our .readalong format to version 1.0 for publication (2f0da60)
  • smil-ectomy (24dabbd)
  • update .ras to .readalong (525facf)
  • update to .ras file extension in output (163de15)
  • update to href= in readalong component (79b8a64)
  • use .ras not .xml (09b34d6)

Bug Fixes

  • accept and use dur not duration (b65a5dd)
  • add class to DTD and update version (903ddd5)
  • add xml:lang and anchors everywhere (7192fa6)
  • address XML external entity expansion vulnerabilities (d0c57f3)
  • correct main guard in test_package_urls.py (fbda869)
  • don't create blank pages (54d23d5), closes #136
  • filter ASCII langs from the Studio-Web via web_api (e63406f)
  • frantic and unsuccessful attempts to make CORS work (52b9a39)
  • handle requests.get() timeouts correctly (464a3cc)
  • make test case valid (fdd8cb0)
  • only wait 10 seconds for JS_ and FONT_BUNDLE_URLs (a67e360)
  • tell the user why their config.json is not valid (4deb32e)
  • test and fix load_xml_zip (10718dc)
  • try validating a different way to see if the CodeQL warning goes away (6e62789)
  • update bundle.css and bundle.js to @readalongs/[email protected] (a462688)
  • update package URLs (4eebbb3)
  • use .xml not .ras to avoid breaking MIME guessing (19fc2f1)
  • validate path request by /file endpoint in views.py (5ba27ad)
  • when g2p fails completely, send the log with the exception (f772eec)
  • woohps tyop (367e539)
  • words may not be aligned (e.g. do-not-align) (586154d)
  • docs: fix formatting of /langs endpoint sample output (7059978)
  • write out a version number in the .readalong format (2e28d86)

Tests

  • add a couple more tests (c68f06e)
  • add bogus alignments (959bda4)
  • add test of RAS XML validation (7dd3072)
  • appease the codecov beast (5dac6f3)
  • basically s/tei/readalong/gi (4c1ed5a)
  • not sure why "lang" not "xml:lang" (96ae876)
  • test new web API (adb6d3e)
  • there is no text + alignment, there is only readalong (5e623b7)
  • tolerate unpkg timeouts as non-failures (dbd3942)
  • update .ras to .readalong (300a622)
  • update for new component and file format (1d2ca59)

Code Refactoring

  • capture logs with a context manager (87716bd)
  • change master branch name to main (3b64837)
  • move all etree.parse calls to a single well-tested function (c2996b8)
  • reformat (1874e96)
  • webapi: change back to v1 (c80224a)
  • remove deprecated studio (692d630)
  • switch to .readalong extension (c4c6c89)
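
The idea behind "capture logs with a context manager" (87716bd) can be sketched with the standard `logging` and `contextlib` modules. This is an illustrative pattern under assumed names (`capture_logs_sketch`, the `capture_demo` logger), not Studio's actual implementation.

```python
import io
import logging
from contextlib import contextmanager


@contextmanager
def capture_logs_sketch(logger: logging.Logger):
    """Collect everything logged to `logger` while the with-block runs."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        yield stream
    finally:
        # Always detach the handler, even if the body raises.
        logger.removeHandler(handler)


logger = logging.getLogger("capture_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output off the root logger

with capture_logs_sketch(logger) as captured:
    logger.info("aligning 3 words")

print(captured.getvalue())
```

Because the handler is removed in the `finally` clause, the capture cannot leak into later requests or tests.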

Continuous Integration

  • activate CodeQL code analysis (7f8761b)
  • combine debug and non-debug web_api tests (214b081)
  • only run CodeQL on cron and push to master and release (6d44875)
  • submit PR instead of pushing version bump (6f181a7), closes #83
  • test web server in development mode too (e7d892c)

Documentation

  • setup: improve pypi (ec00b4f)
  • better document how to install and run the Web API (996a96a)
  • clean up /langs doc (43dd309)
  • document new environment variables (34b4435)
  • remove smil and add ra...

Release v0.2.20221114

14 Nov 17:27
955afe1

0.2.20221114 (2022-11-14)

Features

  • add --align-mode to readalongs align (6023367)
  • improve programmatic API to readalongs (45bfd5c)
  • silence aligner logs unless --debug-aligner is used (69ab713)
  • starting an API for readalongs commands (08b2307)
  • api: created web api with fastapi (2d11d73)
  • api: the API fns now return (status, exception, log) (639c2a9)
  • flask-app: starting to make flask app use API (ddd5071)
  • Add a (hidden) -oo / --output-orth option to control the output orthography (93d7228)
  • Add header and theme from config.json (c0c8859)
  • add heroku support (17c7d82)
  • apply header and theme to basic html page too (035aeca)
  • clean up log handling and fix debug_aligner (d0edc09)
  • convert_alignment now also supports ELAN eaf (6fcc404)
  • endpoint /convert_alignment supports srt and vtt (e035cb2)
  • error handling and testing for the -oo option (6316d29)
  • new /convert_to_TextGrid endpoint in web_api (42f724f)
  • parse_smil() with unit testing (84af6b0)
  • re-introduce the sub-word functionality in Studio (6349814)
  • support "acoustic_model" in config (30537c2)
  • update for soundswallower 0.4.0 (7c98e45)

Bug Fixes

  • adjust Docker to requirements.* changes (ad80da3)
  • align should delete its temporary files (1b09db9)
  • always explicitly declare the encoding when you open a file (afb0908)
  • api.prepare() still needs to exist, with a deprecation warning (79d04ad)
  • case-insensitive option matching done consistently (a1f0805)
  • clean up our own temp files! (a27a0c7)
  • cli.align() should not modify its arguments (b422083)
  • default for save_temps is None so check that, not truthiness (d955f16)
  • don't save SoundSwallower logs on Windows, it's buggy (ed3e4a8)
  • extract sentences correctly on page changes (e9a2a18), closes #70
  • failure to g2p should be 422, not 400 (b0da504)
  • get_langs() should only return supported langs in the dict (6f8e458)
  • ignore BOMs when reading files, though never generate them (ca8e264)
  • ignore whitespace on blank lines for paragraph and page breaks (aea222f)
  • in 2022, "python" is Python 3 (64051a2)
  • make sure final_end is defined (9e33ecb)
  • make sure the API accepts pathlib.Path objects (50972ae)
  • minor bugs and efficiency improvements (5f0080b)
  • more robust sentence extraction for srt, vtt, TextGrid (3218fe4)
  • new acoustic model requires new soundswallower (973611b)
  • no idea why set_string method does not exist on Travis-CI? (93764a9)
  • noisewords from the acoustic model to avoid misalignments/warnings (bb3c0aa)
  • on Windows, don't let _version.py be generated with CRLF (dedb065)
  • peg web-component version to ^0.1.6 (bbb7edd)
  • remove dead code (that didn't use noisewords properly) (7ab522b)
  • remove unsupported encoding argument from web_api (41d5912)
  • require soundswallower 0.4.1 to fix windows (50cbc37)
  • restore backwards compatibility for getLangs==get_langs (93a9b26)
  • restrict to 0.2.x soundswallower (cbf1ef9)
  • switch to binary mdefs (dc212a9)
  • the FSG/JSGF filters out empty ARPABET, not empty words (dc73638)
  • typo (ddd9f65)
  • undo change to pbeam, that was not refactoring! (5e44376)
  • update soundswallower to 0.2.0, fixes failure on long inputs (31606a6)
  • use --debug-g2p instead of --g2p-verbose (cd28076)
  • flask-app: update for current CLI; nicer logs (065e723)
  • update model layout for soundswallower 0.4.0 (d923543)
  • was missing get_string()! mystery solved! (80e4697)
  • work around bug in SoundSwallower on empty alignment (8cfe119)
  • api: allow origins from studio app (3e9ef8a)
  • api: make TextGrid work in studio demo and through API (0fc69a6)
  • LICENSE: state in LICENSE difference in model license (6814640)
  • sub-word: the word is all its subelements, not just word.text (1dc8527)
  • test: fix the sub-word test suite (dd52238)
  • test: make test_audio.py compatible with pydub 0.25.1 (d1a4712)
  • test: make test_indices.py compatible with g2p PR#166 (9119681)
  • test: test suite should be compatible with older g2p versi...
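
Two of the fixes above, "always explicitly declare the encoding when you open a file" (afb0908) and "ignore BOMs when reading files, though never generate them" (ca8e264), follow a common Python pattern. The sketch below shows that pattern with the standard library only; the helper names are hypothetical and this is not Studio's code.

```python
import os
import tempfile


def read_text_ignoring_bom(path: str) -> str:
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if one is present.
    with open(path, encoding="utf-8-sig") as f:
        return f.read()


def write_text_without_bom(path: str, text: str) -> None:
    # Plain "utf-8" never writes a BOM.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input.txt")
    with open(path, "wb") as f:
        f.write(b"\xef\xbb\xbfbonjour")  # file that starts with a UTF-8 BOM
    text = read_text_ignoring_bom(path)
    write_text_without_bom(path, text)
    with open(path, "rb") as f:
        raw = f.read()

print(repr(text), raw)
```

Declaring the encoding explicitly also avoids depending on the platform default, which is not always UTF-8 on Windows.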

Release v0.2.20220126

26 Jan 19:47
8d5423f

0.2.20220126 (2022-01-26)

Features

  • cli: "und" is now added by default to -l list of languages (fd6189b)
  • cli: accept comma as a separator for lists, as well as colon (4f9eafd)
  • cli: added readalongs langs command (dfaaf15)
  • silence: add fallbacks and exception handling (e61d201)

Bug Fixes

  • g2p: better error messages on invalid language codes (9e71372)
  • requirements: remove text-unidecode, no longer used directly (1458fd0)
  • requirements: studio does not actually use Flask-Cors (5128d91)
  • test: reloaded audio file should tolerate a small duration change (ceff68a)
  • avoid stack trace when no non-noise segments are found (fixes #88) (07817fd)
  • video: force audio mimetypes for video formats (f543055)
  • be less strict about failures to guess mime type (17c2492)
  • better error messages on bad utf8 plain text input. Fixes #22 (3c55e15)

Performance Improvements

  • optimize the CLI, mostly by deferring expensive imports (5c9e3fb)

Tests

  • increase test coverage for dev.cli (fdaccfe)

Documentation

  • cli: add better cli documentation for readalongs (db8f826)
  • align -o is for additional formats, on top of XML+SMIL (442e01b)
  • document how to contribute to the docs/ folder (8fa98dd)
  • cli: document the recent changes to the CLI (67f0593)
  • document installation using Anaconda on Windows (d75c7cc)
  • improvements from @roedoejet feedback on PR #93 (1d90f10)
  • mention OpenSamples in README.md (6184732)
  • polish the updated README.md (5ae1cd3)
  • README.md improve with feedback at team meeting (5b511b0)
  • recommend miniconda instead of the full anaconda (3014dae)
  • remove unstable warning (8c10a5d)
  • update TOC in README.md (663a3d8)

Continuous Integration

Code Refactoring

  • silence: change syntax for adding silence and allow output to variety of audio formats (a75dca2)
  • undo und work here since it is now done in g2p (b97e1e7)
  • cli: allow multiple -o values to join colon-joined (9f0a837)
  • cli: change formatting for align output formats help (3588f9e)
  • cli: refactor output formats to -o argument (f34f426)
  • cli: remove -i; auto-determine XML vs plain text (77a1ac8)
  • cli: remove alignment unit option from cli (3002626)
  • cli: remove epub from cli (3e53b72)
  • cli: replace --g2p-fallback option by -l with multiple languages (78ab484)
  • test: stub out SoundSwallower to speed tests that don't care about its output (3428184)
  • remove docs for epub (55d8e04)
  • simplify python version check and move it to __init__.py (ca2394f)
  • use the more meaningful exceptions from make_g2p when available (52470e9)

Styles

  • add a .pylintrc to tweak pylint output to our liking (572b51b)
  • make test_package_urls.py executable and add fn docstring (2751752)
  • pylint and improve all test suite code (d3a80ee)
  • remove unused imports from views.py (365b3ad)

Release v0.1.20211013

13 Oct 17:45

0.1.20211013 (2021-10-13)

Features

  • anchors: extract_section fn for audio files with testing (df974a2)
  • html: first commit for web component html output (bfc1038)
  • add b64 encoding of embedded images (807d913)
  • add silence insertion feature (1663779)
  • anchor times now supported in h/m/s/ms, like in Audacity (c4b4ca8)
  • non-caching-server-3.7.py, compatible with Python 3.7 (fe7df92)
  • package: add packaging of fonts and js bundles (252e605)

Bug Fixes

  • correct attribute name in readalongs (17b4dfe)
  • Docker on Windows compatibility issue (6ee07b1)
  • issue #73, fix correction for multiple DNA segments (277b530)
  • round segment times to 3 digits, i.e. 1ms precision (54c9475)
  • temp files have to be closed to get auto-cleaned on Windows (1260d75)
  • use consistent Copyright notices (d4c61bf)
  • ci: add push (ad2c50e)
  • test: fix mimetype issue with test for b64 encoding (c25502a)

Continuous Integration

  • commit and merge bumped version after release (94b547e)
  • rtd: upgrade read the docs to version 2 (d3ff623)

Styles

  • apply a bunch of pylint recommendations (3eb2a33)
  • blackify non-caching-server-3.7.py (ee480fb)
  • convert some docstrings to google (#16) (7534545)
  • don't hide what is going on running make in docs (f9e68e3)
  • have data/ej-fra.xml also match the current readalongs prepare output (c798f29)
  • nicer argument doc for create_input_tei (db70984)
  • reformat make_smil.py docstring to Google format, as per issue #16 (f0aba91)

Code Refactoring

  • move audio time adj fns to audio_utils.py (7d47de7)
  • align: make align work on a loop of sequences (WIP anchors) (c4bc7ec)
  • change pystache to chevron (d1797c0)
  • make LANGS, LANG_NAMES and parse_g2p_fallback importable (60cf1e9)
  • narrow the try/except to just the parse_g2p_fallback call (753c77e)
  • polish Toby's util.py refactoring (b04e4fb)
  • remove dead create_input_xml (1940bea)
  • remove unused jsgf file (562780c)
  • rename non-caching-server.py -> 3.9.py since only Python 3.9 compat (862469f)
  • splice dna_utils.py out of audio_utils.py (b20d072)
  • use nrc logo instead (d17aa7a)
  • docs: create advanced-use.rst and cli-guide.rst (6cf12e2)
  • docs: remove cli-user-guide*, all contents is now elsewhere (06557ec)
  • docs: rename cli.rst->cli-ref.rst in prep for cli-guide.rst (df403bf)
  • parse_time: remove unreachable else: raise statement (97b563f)
  • silence: change to using proper parse time function instead of only ms (a646574)
  • silence: change variable to be explicit about ms (dda3414)
  • test: create basic test case class (f189614)
  • test: move the align --html test to its own function (6c9dc65)
  • test: name temp dirs after the classes (212df57)
  • test: rename basic test case file (3c52ac2)

Tests

  • unit testing for issue #73 (90cd88a)
  • anchors: check that partial wav files were created (d8eb857)
  • anchors: improve test coverage (97ff810)
  • anchors: fix ej-fra-anchors2.xml so it's alignable (8ae8850)
  • anchors: test cases for aligning with anchors (7468589)
  • anchors: test data for processing anchors (5c4f6c3)
  • add basic test for single file html output (86d8f1b)
  • better message for tests that might fail when dependencies change (8147b03)
  • disable test_align_cli.test_permission_denied since it is unstable (251ae45)
  • images for RAS testing, created by EJ@NRC (2cc6ed3)
  • improve unittesting for package (231005e)
  • package: add test for bundled web component assets (c190fa5)

Documentation

  • add TL;DR to Contributing.md (552683a)
  • convert audio_utils.py docs to Google standard (8b2e8e1)
  • create CLI guide vs CLI reference sections (ac35316)
  • document run.py vs readalongs/run.py (e4e0794)
  • fix the README.md badges (6c10a5d)
  • improve docstring readability (c95e5f0)
  • improvements to the CLI documentation (9f26b98)
  • include cli-user-guide in read-the-docs output (7888222)
  • make README.md and docs/cli-user-guide.md more coherent ...

Release v0.1.20210825

25 Aug 20:44
6bbbea7