Merge pull request #163 from ReadAlongs/main
ICLDC Release
roedoejet authored Feb 24, 2023
2 parents 955afe1 + 850b20c commit ed0e985
Showing 159 changed files with 1,323 additions and 21,842 deletions.
71 changes: 71 additions & 0 deletions .github/workflows/codeql.yml
@@ -0,0 +1,71 @@
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"

on:
push:
branches: [ main, release ]
schedule:
- cron: '19 11 * * 3'

jobs:
analyze:
name: Analyze
runs-on: ubuntu-latest
permissions:
actions: read
contents: read
security-events: write

strategy:
fail-fast: false
matrix:
language: [ 'javascript', 'python' ]
# CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
# Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support

steps:
- name: Checkout repository
uses: actions/checkout@v3

# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v2
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.

# For details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
# queries: security-extended,security-and-quality


# Autobuild attempts to build any compiled languages (C/C++, C#, Go, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v2

# ℹ️ Command-line programs to run using the OS shell.
# 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun

# If the Autobuild fails above, remove it and uncomment the following three lines.
# Modify them (or add more) to build your code if your project requires it; refer to the EXAMPLE below for guidance.

# - run: |
# echo "Run, Build Application using script"
# ./location_of_script_within_repo/buildscript.sh

- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v2
with:
category: "/language:${{matrix.language}}"
18 changes: 7 additions & 11 deletions .github/workflows/pythonpublish.yml
@@ -43,14 +43,10 @@ jobs:
tag_name: ${{ steps.tag_version.outputs.new_tag }}
release_name: Release ${{ steps.tag_version.outputs.new_tag }}
body: ${{ steps.tag_version.outputs.changelog }}
- name: Commit bumped version and merge with master
run: |
git config user.name github-actions
git config user.email [email protected]
git add readalongs/_version.py
git commit -m "chore: commit version"
git fetch --unshallow
git push origin release
git checkout master
git merge release
git push origin master
- name: Submit a PR for the bumped version
uses: peter-evans/create-pull-request@v4
with:
commit-message: "chore: commit version"
delete-branch: true
base: main
add-paths: readalongs/_version.py
7 changes: 5 additions & 2 deletions .github/workflows/tests.yml
@@ -33,8 +33,11 @@ jobs:
- name: Run tests
run: |
gunicorn readalongs.app:app --bind 0.0.0.0:5000 --daemon
cd test && coverage run run.py prod && coverage xml
cd test
coverage run --parallel-mode run.py prod
DEVELOPMENT=1 coverage run --parallel-mode test_web_api.py
coverage combine
coverage xml
- name: Nitpicking
run: |
12 changes: 7 additions & 5 deletions .pre-commit-config.yaml
@@ -1,15 +1,12 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.3.0
rev: v4.4.0
hooks:
- id: check-yaml
- id: check-json
- id: end-of-file-fixer
- id: trailing-whitespace
exclude: \.svg$
- repo: https://gitlab.com/pycqa/flake8
rev: 3.8.3
hooks:
- id: flake8
- repo: local
# Using local repos because these won't work for me from remote repo -EJ
# They're also more convenient because we install them via requirements.dev.txt
@@ -33,3 +30,8 @@ repos:
language: system
types: [python]
stages: [commit]
- repo: https://github.com/pycqa/flake8
# do flake8 last to avoid duplicate reports
rev: 3.8.3
hooks:
- id: flake8
4 changes: 0 additions & 4 deletions .travis.yml
@@ -25,10 +25,6 @@ install:
- pip3 install -r requirements.dev.txt
- pip3 install coverage
- pip3 install codecov
- pip3 install gunicorn

before_script:
- gunicorn readalongs.app:app --bind 0.0.0.0:5000 --daemon

# commands to run the testing suite. if any of these fail, travis lets us know
script:
8 changes: 0 additions & 8 deletions Dockerfile
@@ -35,14 +35,6 @@ RUN git clone https://github.com/roedoejet/g2p.git \
# Install ReadAlong-Studio itself
RUN python3 -m pip install -e .

# Run the default gui (on localhost:5000, make sure you use -p 5000:5000 when
# you docker run the image)
CMD python3 ./run.py

# For a production server, comment out the default gui CMD above, and run the
# gui using gunicorn instead:
# CMD gunicorn -k gevent -w 1 readalongs.app:app --bind 0.0.0.0:$PORT

# For the web API, use this CMD instead, the same on our Heroku deployment, except
# with binding to port 5000
# CMD gunicorn -w 4 -k uvicorn.workers.UvicornWorker readalongs.web_api:web_api_app --bind 0.0.0.0:$PORT
2 changes: 1 addition & 1 deletion Procfile
@@ -1,2 +1,2 @@
# Command for launching the web API server for ReadAlongs-Studio on Heroku
web: gunicorn -w 4 -k uvicorn.workers.UvicornWorker readalongs.web_api:web_api_app
web: gunicorn -w 8 -k uvicorn.workers.UvicornWorker readalongs.web_api:web_api_app
15 changes: 8 additions & 7 deletions README.md
@@ -1,9 +1,9 @@
# ReadAlong-Studio

[![codecov](https://codecov.io/gh/ReadAlongs/Studio/branch/master/graph/badge.svg)](https://codecov.io/gh/ReadAlongs/Studio)
[![Build Status](https://github.com/readalongs/Studio/actions/workflows/tests.yml/badge.svg?branch=master)](https://github.com/ReadAlongs/Studio/actions)
[![codecov](https://codecov.io/gh/ReadAlongs/Studio/branch/main/graph/badge.svg)](https://codecov.io/gh/ReadAlongs/Studio)
[![Build Status](https://github.com/readalongs/Studio/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/ReadAlongs/Studio/actions)
[![PyPI package](https://img.shields.io/pypi/v/readalongs.svg)](https://pypi.org/project/readalongs/)
[![GitHub license](https://img.shields.io/github/license/ReadAlongs/Studio)](https://github.com/ReadAlongs/Studio/blob/master/LICENSE)
[![GitHub license](https://img.shields.io/github/license/ReadAlongs/Studio)](https://github.com/ReadAlongs/Studio/blob/main/LICENSE)
[![standard-readme compliant](https://img.shields.io/badge/readme%20style-standard-brightgreen.svg?style=flat-square)](https://github.com/ReadAlongs/Studio)
[![Documentation Status](https://readthedocs.org/projects/readalong-studio/badge/)](https://readalong-studio.readthedocs.io)

@@ -38,8 +38,7 @@ The concept is a web application with a series of stages of
processing, which ultimately leads to a time-aligned audiobook -
i.e. a package of:

- SMIL file describing time alignments
- TEI file describing text
- ReadAlong XML file describing text
- Audio file (WAV or MP3)

Which can be loaded using the read-along [web component](https://github.com/roedoejet/ReadAlong-Web-Component). See also [Studio Output Realizations](https://readalong-studio.readthedocs.io/en/latest/outputs.html).
@@ -161,7 +160,9 @@ This page lists only the most basic commands.

For more information about how the command line interface works consult the interactive [API Documentation](https://readalong-studio.herokuapp.com/api/v1/docs).

For information on spinning up your own dev Web API server locally, have a look at [web\_api.py](readalongs/web_api.py).
For information on spinning up your own dev Web API server locally, have a look at [web\_api.py](readalongs/web_api.py). In brief, to run it locally for development, use:

DEVELOPMENT=1 uvicorn readalongs.web_api:web_api_app --reload

#### /langs

@@ -173,7 +174,7 @@ This endpoint is a remote procedural call that assembles the data needed to buil

### Studio web application

ReadAlong-Studio has a web interface for creating interactive audiobooks. The web app can be served by first installing ReadAlong-Studio and then running `python3 run.py`. A web app will then be available on port 5000.
The ReadAlong-Studio web interface is available at https://readalong-studio.mothertongues.org/ and the source code is available here: https://github.com/ReadAlongs/Web-Component

### Docker

2 changes: 1 addition & 1 deletion docs/README.md
@@ -34,6 +34,6 @@ To build the documentation and review your own changes locally:

## Publish the changes

Once your changes are pushed to GitHub and merged into `master` via a Pull
Once your changes are pushed to GitHub and merged into `main` via a Pull
Request, the documentation will automatically get built and published to
https://readalong-studio.readthedocs.io/en/latest/
52 changes: 29 additions & 23 deletions docs/cli-guide.rst
@@ -9,11 +9,11 @@ This page contains guidelines on using the ReadAlongs CLI. See also
The ReadAlongs CLI has two main commands: ``readalongs make-xml`` and
``readalongs align``.

- If your data is a plain text file, you can run ``make-xml`` to turn it into
XML, which you can then align with ``align``. Doing this in two steps allows
you to modify the XML file before aligning it (e.g., to mark that some text is
in a different language, to flag some do-not-align text, or to drop anchors
in).
- If your data is a plain text file, you can run ``make-xml`` to turn
it into ReadAlongs XML, which you can then align with
``align``. Doing this in two steps allows you to modify the XML file
before aligning it (e.g., to mark that some text is in a different
language, to flag some do-not-align text, or to drop anchors in).

- Alternatively, if your plain text file does not need to be modified, you can
run ``align`` directly on it, since it also accepts plain text input. You'll
@@ -36,13 +36,13 @@ then used as input to ``align``.
Getting from TXT to XML with readalongs make-xml
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Run :ref:`cli-make-xml` to make the XML file for ``align`` from a TXT file.
Run :ref:`cli-make-xml` to make the ReadAlongs XML file for ``align`` from a TXT file.

``readalongs make-xml [options] [story.txt] [story.xml]``
``readalongs make-xml [options] [story.txt] [story.readalong]``

``[story.txt]``: path to the plain text input file (TXT)

``[story.xml]``: Path to the XML output file
``[story.readalong]``: Path to the XML output file

The plain text file must be plain text encoded in ``UTF-8`` with one
sentence per line. Paragraph breaks are marked by a blank line, and page
@@ -72,12 +72,18 @@ and they can also be found in the :ref:`cli-make-xml` reference.
So, a full command for a story in Algonquin, with an implicit g2p fallback to
Undetermined, would be something like:

``readalongs make-xml -l alq Studio/story.txt Studio/story.xml``
``readalongs make-xml -l alq Studio/story.txt Studio/story.readalong``

The generated XML will be parsed into sentences. At this stage you can
edit the XML to make any modifications, such as adding ``do-not-align``
as an attribute of any element in your XML.

The format of the generated XML is based on `TEI
Lite <https://tei-c.org/guidelines/customization/lite/>`_ but is
considerably simplified. The DTD (document type definition) can be
found in the ReadAlong Studio source code under
``readalongs/static/read-along-1.0.dtd``.

.. _dna:

Handling mismatches: do-not-align
@@ -130,12 +136,12 @@ Use cases for DNA
Aligning your text and audio with readalongs align
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Run :ref:`cli-align` to align a text file (XML or TXT) and an audio file to
Run :ref:`cli-align` to align a text file (RAS or TXT) and an audio file to
create a time-aligned audiobook.

``readalongs align [options] [story.txt/ras] [story.mp3/wav] [output_base]``

``[story.txt/xml]``: path to the text file (TXT or XML)
``[story.txt/ras]``: path to the text file (TXT or RAS)

``[story.mp3/wav]``: path to the audio file (MP3, WAV or any format
supported by ffmpeg)
@@ -173,14 +179,14 @@ See above for more information on the ``-l, --language`` argument.

A full command could be something like:

``readalongs align -f -c config.json story.xml story.mp3 story-aligned``
``readalongs align -f -c config.json story.readalong story.mp3 story-aligned``

**Is the text file plain text or XML?**

``readalongs align`` accepts its text input as a plain text file or an XML file.
``readalongs align`` accepts its text input as a plain text file or a ReadAlongs XML file.

- If the file name ends with ``.txt``, it will be read as plain text.
- If the file name ends wiht ``.xml``, it will be read as XML.
- If the file name ends with ``.xml`` or ``.readalong``, it will be read as ReadAlongs XML.
- With other extensions, the beginning of the file is examined to
automatically determine if it's XML or plain text.
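
The detection rules listed above can be sketched in a few lines of Python. This is only an illustration, not the code ``readalongs align`` actually uses: the function name ``guess_input_format`` is hypothetical, and the content check (treating a file whose first non-blank character is ``<`` as XML) is an assumption about what "examining the beginning of the file" might involve.

```python
def guess_input_format(path: str) -> str:
    """Return "txt" or "xml" for a text input file (hypothetical helper)."""
    # Rule 1: a .txt file name means plain text.
    if path.endswith(".txt"):
        return "txt"
    # Rule 2: a .xml or .readalong file name means ReadAlongs XML.
    if path.endswith(".xml") or path.endswith(".readalong"):
        return "xml"
    # Rule 3: any other extension, so look at the beginning of the file.
    # Assumption: XML starts with "<" once an optional BOM and leading
    # whitespace are skipped; anything else is treated as plain text.
    with open(path, "rb") as f:
        head = f.read(1024)
    text = head.decode("utf-8-sig", errors="replace").lstrip()
    return "xml" if text.startswith("<") else "txt"
```

Checking the extension first keeps the common cases cheap and predictable; the file is only opened when the name gives no hint.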

@@ -301,7 +307,7 @@ falling back to ``eng`` and then ``und`` (see below) when needed.

.. code-block:: bash
readalongs make-xml -l fra,eng myfile.txt myfile.xml
readalongs make-xml -l fra,eng myfile.txt myfile.readalong
readalongs align -l fra,eng myfile.txt myfile.wav output-dir
The "Undetermined" language code: und
@@ -347,10 +353,10 @@ The following series of commands:

::

readalongs make-xml -l l1,l2 file.txt file.xml
readalongs tokenize file.xml file.tokenized.xml
readalongs g2p file.tokenized.xml file.g2p.xml
readalongs align file.g2p.xml file.wav output
readalongs make-xml -l l1,l2 file.txt file.readalong
readalongs tokenize file.readalong file.tokenized.readalong
readalongs g2p file.tokenized.readalong file.g2p.readalong
readalongs align file.g2p.readalong file.wav output

is equivalent to the single command:

@@ -389,7 +395,7 @@ Example:
.. code-block:: xml
<?xml version='1.0' encoding='utf-8'?>
<TEI> <text xml:lang="eng"> <body>
<read-along version="1.0"> <text xml:lang="eng"> <body>
<anchor time="143ms"/>
<div type="page">
<p>
@@ -400,7 +406,7 @@ Example:
</p>
</div>
<anchor time="6.74s"/>
</body> </text> </TEI>
</body> </text> </read-along>
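
As a rough illustration of how the ``time`` attributes in the example above could be read programmatically, here is a minimal Python sketch using only the standard library. The helper names ``parse_time`` and ``anchor_times`` are hypothetical, the sentence inside ``SAMPLE`` is made up, and the sketch assumes the only units are ``ms`` and ``s``, as in the example; it is not taken from the ReadAlongs code base.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a time value is a decimal number followed by "ms" or "s".
_TIME_RE = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*(ms|s)\s*$")


def parse_time(value: str) -> float:
    """Convert a time string such as "143ms" or "6.74s" to seconds."""
    match = _TIME_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    number, unit = match.groups()
    seconds = float(number)
    return seconds / 1000.0 if unit == "ms" else seconds


def anchor_times(xml_text: str) -> list:
    """Return the time of every <anchor> element, in document order."""
    root = ET.fromstring(xml_text)
    return [parse_time(anchor.get("time")) for anchor in root.iter("anchor")]


# Made-up sample in the new <read-along> format.
SAMPLE = """<read-along version="1.0"> <text xml:lang="eng"> <body>
<anchor time="143ms"/>
<div type="page"><p><s>Hello world.</s></p></div>
<anchor time="6.74s"/>
</body> </text> </read-along>"""

print(anchor_times(SAMPLE))
```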
Anchor semantics
^^^^^^^^^^^^^^^^
@@ -473,7 +479,7 @@ Example:
.. code-block:: xml
<?xml version='1.0' encoding='utf-8'?>
<TEI> <text xml:lang="eng"> <body>
<read-along version="1.0"> <text xml:lang="eng"> <body>
<silence dur="1s"/>
<div type="page">
<p>
@@ -484,7 +490,7 @@ Example:
</p>
<silence dur="1s"/>
</div>
</body> </text> </TEI>
</body> </text> </read-along>
Silence use cases
^^^^^^^^^^^^^^^^^
10 changes: 4 additions & 6 deletions docs/outputs.rst
@@ -12,7 +12,7 @@ Elan/Praat files
Web Component
-------------

When you have standard output from ReadAlong-Studio, consisting of 1) a text file (XML) 2) an audio file and 3) an alignment file (SMIL)
When you have standard output from ReadAlong-Studio, consisting of 1) a ReadAlong file (XML) and 2) an audio file
you can mobilize these files to the web or hybrid mobile apps quickly and painlessly.

This is done using the ReadAlong WebComponent. Web components are re-useable, custom-defined HTML elements that you can embed in any HTML, regardless of which
@@ -32,21 +32,19 @@ Below is an example of a minimal implementation in a basic standalone html page.

<body>
<!-- Here is how you declare the Web Component -->
<read-along text="assets/sample.xml" alignment="assets/sample.smil" audio="assets/sample.wav"></read-along>
<read-along href="assets/sample.readalong" audio="assets/sample.wav"></read-along>
</body>
<!-- The last step needed is to import the package -->
<script type="module" src='https://unpkg.com/@roedoejet/readalong@^0.1.6/dist/read-along/read-along.esm.js'></script>
<script nomodule src='https://unpkg.com/@roedoejet/readalong@^0.1.6/dist/read-along/read-along.js'></script>
<script type="module" src='https://unpkg.com/@readalongs/web-component@^1.0.0/dist/web-component/web-component.esm.js'></script>
</html>


The above assumes the following structure:

| web
| ├── assets
| │ ├── sample.smil
| │ ├── sample.wav
| │ ├── sample.xml
| │ ├── sample.readalong
| ├── index.html
|
|
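
For readers who generate such pages from a script, the following Python sketch writes out a page with the new ``href``/``audio`` attributes. It is a hypothetical helper, not part of ReadAlong-Studio: the function name ``render_page`` and the bare-bones ``<head>`` are assumptions, while the ``<read-along>`` attributes and the script URL are copied from the updated example above.

```python
from html import escape

# Script URL as given in the updated documentation example.
SCRIPT_URL = (
    "https://unpkg.com/@readalongs/web-component@^1.0.0"
    "/dist/web-component/web-component.esm.js"
)


def render_page(readalong_href: str, audio_href: str) -> str:
    """Return a minimal standalone HTML page embedding the web component."""
    # Escape the paths so they are safe inside double-quoted attributes.
    href = escape(readalong_href, quote=True)
    audio = escape(audio_href, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        '  <head><meta charset="utf-8"></head>\n'
        "  <body>\n"
        f'    <read-along href="{href}" audio="{audio}"></read-along>\n'
        "  </body>\n"
        f'  <script type="module" src="{SCRIPT_URL}"></script>\n'
        "</html>\n"
    )


if __name__ == "__main__":
    # Paths relative to index.html, matching the directory layout above.
    print(render_page("assets/sample.readalong", "assets/sample.wav"))
```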