Add Spectrogram Visualization to Recordings #16

bluemellophone · 2024-01-19T12:16:54Z

This PR adds the visualization and storing of spectrogram images for audio recordings.

Updated Docker Image and Database

!!! IMPORTANT !!!

Before you begin testing this PR, please rebuild your Docker image and update your local database version with the following:

docker compose down
docker compose down  # Run twice to fully bring down all containers
docker compose build
docker compose run --rm django ./manage.py migrate
docker compose up -d

A new OpenCV dependency has been added to the Python environment, so the previous container will fail to start unless you rebuild your Django/Celery image. There is also a single database migration that adds the new Spectrogram database table.

FFT Parameters and Rendering Options

The spectrograms are computed with a window size of 1ms and are formatted to be 300 pixels high. The width of the spectrogram is determined by the duration and sampling rate of the underlying audio waveform. By default, the rendering will chunk long audio files into smaller segments and will stitch the resulting set of spectrograms together to create a final image. The spectrogram is resized such that the width is four times the number of milliseconds in the audio clip. For example, if the audio clip is 5 seconds long (5,000 milliseconds), then the resulting spectrogram will be 20,000 pixels wide by 300 pixels tall.
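As a quick check of the sizing rule above, the arithmetic can be sketched as follows. The function and constant names are illustrative only and are not taken from the PR's code.

```python
PIXELS_PER_MS = 4  # width is four times the clip duration in milliseconds
HEIGHT_PX = 300    # spectrograms are formatted to be 300 pixels high


def spectrogram_size(duration_ms: int) -> tuple[int, int]:
    """Return (width, height) in pixels for a clip of the given duration."""
    return duration_ms * PIXELS_PER_MS, HEIGHT_PX


print(spectrogram_size(5_000))  # (20000, 300)
```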

The spectrograms use a frequency range of 5 kHz to 200 kHz, with 2 kHz of padding on the top and bottom for visual clarity. The time and frequency axes are plotted with linear scaling.
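Because the frequency axis is linear, a pixel row can be converted to a frequency with simple interpolation. The sketch below is only an illustration: it assumes row 0 is the top of the image and corresponds to the highest frequency, and that the bounds passed in are the ones the image actually spans (with or without the padding). Neither assumption is confirmed by this PR, and `row_to_frequency` is a hypothetical helper.

```python
def row_to_frequency(row: int, height: int, low_freq: int, high_freq: int) -> float:
    """Map a pixel row (0 = top, assumed) to a frequency in Hz on a linear axis."""
    if height < 2:
        raise ValueError("height must be at least 2 pixels")
    if not 0 <= row < height:
        raise ValueError("row is outside the image")
    # Fraction of the way down the image, from 0.0 (top) to 1.0 (bottom).
    fraction_from_top = row / (height - 1)
    return high_freq - fraction_from_top * (high_freq - low_freq)


top = row_to_frequency(0, 300, 5_000, 200_000)
bottom = row_to_frequency(299, 300, 5_000, 200_000)
```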

Spectrogram REST Endpoints

There are two new API endpoints for the spectrograms:

Version 1 (realtime) - /api/v1/recording/1/spectrogram
Version 2 (compressed) - /api/v1/recording/1/spectrogram/compressed

Both endpoints return a Base64-encoded JPEG image inside a serialized JSON response message.

REST API Option 1 - Realtime Spectrogram

Below is a 1-second sample of a spectrogram conversion. Note the large gaps in time between the chirps.

The API returns a response in the following format:

{
    "base64_spectrogram": str,  # base64-encoded image
    "spectroInfo": {
        "width": int,  # pixels
        "height": int,  # pixels
        "start_time": int,  # always 0
        "end_time": int,  # milliseconds
        "low_freq": int,  # hz
        "high_freq": int,  # hz
    }
}
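A client could unpack a response of this shape roughly as follows. This is a minimal sketch: it assumes the `base64_spectrogram` value is plain Base64 with no data-URI prefix, and the payload is a stand-in built for illustration, not real server output. `decode_spectrogram` is a hypothetical helper name.

```python
import base64
import json


def decode_spectrogram(payload: str) -> tuple[bytes, dict]:
    """Return (image_bytes, info) from a serialized spectrogram response."""
    body = json.loads(payload)
    image_bytes = base64.b64decode(body["base64_spectrogram"])
    return image_bytes, body["spectroInfo"]


# Stand-in payload for illustration only; a real response carries a full JPEG.
fake_jpeg = b"\xff\xd8\xff\xe0 not a real image \xff\xd9"
payload = json.dumps({
    "base64_spectrogram": base64.b64encode(fake_jpeg).decode("ascii"),
    "spectroInfo": {
        "width": 4000, "height": 300,
        "start_time": 0, "end_time": 1000,
        "low_freq": 5000, "high_freq": 200000,
    },
})

image, info = decode_spectrogram(payload)
```

The decoded bytes can then be written to a `.jpg` file or handed to an image library.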

REST API Option 2 - Compressed Spectrogram

Below is a compressed version of a full 5-second audio clip, where segments around peak amplitudes are clipped out and stacked next to each other. This view makes it easier to see the overall features of the audio signal at a glance by removing long stretches of empty recording.

This image is generated on the fly and is computed relatively quickly (i.e., <0.5s) once a pre-computed spectrogram is available. Instead of returning a single start and end time, this API returns a list of the temporal segments taken from the original spectrogram:

{
    "base64_spectrogram": str,  # base64-encoded image
    "spectroInfo": {
        "width": int,  # pixels (inherited from spectrogram)
        "height": int,  # pixels (inherited from spectrogram)
        "start_times": [ int, int, int, ... ],  # milliseconds
        "end_times": [ int, int, int, ... ],  # milliseconds
        "low_freq": int,  # hz (inherited from spectrogram)
        "high_freq": int,  # hz (inherited from spectrogram)
    }
}
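One use of the segment lists is translating a position in the compressed view back to a time in the original recording. The sketch below assumes the segments are laid out left to right in list order with a uniform time scale, which this description does not state explicitly. `compressed_to_original_ms` is a hypothetical helper, not part of the API.

```python
def compressed_to_original_ms(offset_ms, start_times, end_times):
    """Map a time offset within the compressed view to the original recording time."""
    if len(start_times) != len(end_times):
        raise ValueError("start_times and end_times must have the same length")
    remaining = offset_ms
    for start, end in zip(start_times, end_times):
        length = end - start
        if remaining < length:
            # The offset falls inside this segment.
            return start + remaining
        # Skip past this segment and keep looking.
        remaining -= length
    raise ValueError("offset is beyond the end of the compressed view")


# Illustrative segment values, not taken from a real recording.
starts = [100, 900, 3000]
ends = [200, 1000, 3100]
print(compressed_to_original_ms(150, starts, ends))  # 950
```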

New Spectrogram Database Model

There is a new Django ORM model to support the storage of the spectrogram image files after they are rendered.

class Spectrogram(TimeStampedModel, models.Model):
    recording = models.ForeignKey(Recording, on_delete=models.CASCADE)
    image_file = models.FileField()
    width = models.IntegerField()  # pixels
    height = models.IntegerField()  # pixels
    duration = models.IntegerField()  # milliseconds
    frequency_min = models.IntegerField()  # hz
    frequency_max = models.IntegerField()  # hz

    @classmethod
    def generate(cls, recording):
        ...

    @property
    def compressed(self):
        ...

    @property
    def image_np(self):
        ...

    @property
    def image_pil(self):
        ...

    @property
    def image(self):
        ...

    @property
    def base64(self):
        ...

New Database Migration

Database migration 0004 has been added in bats_ai/core/migrations/0004_spectrogram.py. The content of the update can be seen below:

class Migration(migrations.Migration):
    dependencies = [
        ('core', '0003_annotations'),
    ]

    operations = [
        migrations.CreateModel(
            name='Spectrogram',
            fields=[
                (
                    'id',
                    models.BigAutoField(
                        auto_created=True, primary_key=True, serialize=False, verbose_name='ID'
                    ),
                ),
                (
                    'created',
                    django_extensions.db.fields.CreationDateTimeField(
                        auto_now_add=True, verbose_name='created'
                    ),
                ),
                (
                    'modified',
                    django_extensions.db.fields.ModificationDateTimeField(
                        auto_now=True, verbose_name='modified'
                    ),
                ),
                ('image_file', models.FileField(upload_to='')),
                ('width', models.IntegerField()),
                ('height', models.IntegerField()),
                ('duration', models.IntegerField()),
                ('frequency_min', models.IntegerField()),
                ('frequency_max', models.IntegerField()),
                (
                    'recording',
                    models.ForeignKey(
                        on_delete=django.db.models.deletion.CASCADE, to='core.recording'
                    ),
                ),
            ],
            options={
                'get_latest_by': 'modified',
                'abstract': False,
            },
        ),
    ]

Updated Recording Model

Three additional properties of the Recording ORM model have been added to access the item's spectrogram. Multiple spectrograms can be associated with a recording, but by default the system returns the most recently generated one. This feature may be expanded in the future to key on different spectrogram parameters.

class Recording(TimeStampedModel, models.Model):
    @property
    def has_spectrogram(self):
        ...

    @property
    def spectrograms(self):
        ...

    @property
    def spectrogram(self):
        ...
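The intended semantics of the three properties can be illustrated without Django. The stub classes below are hypothetical stand-ins, not the real ORM models. In particular, returning `None` when no spectrogram exists is an assumption, since the PR does not describe that case, and ordering by `created` is one plausible reading of "most recently generated".

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class SpectrogramStub:
    created: datetime  # stands in for the model's creation timestamp


@dataclass
class RecordingStub:
    spectrograms: list = field(default_factory=list)

    @property
    def has_spectrogram(self) -> bool:
        return len(self.spectrograms) > 0

    @property
    def spectrogram(self) -> Optional[SpectrogramStub]:
        # Default to the most recently generated spectrogram, if any.
        if not self.spectrograms:
            return None
        return max(self.spectrograms, key=lambda s: s.created)


older = SpectrogramStub(created=datetime(2024, 1, 18))
newer = SpectrogramStub(created=datetime(2024, 1, 19))
recording = RecordingStub(spectrograms=[newer, older])
```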

Spectrogram and Recording Admin Pages

The Django admin portal has been updated to show if a spectrogram has been computed for a Recording:

If a spectrogram has not already been pre-processed, the admin list shows "Not computed":

Lastly, there is a new Spectrogram list in the admin portal:

Bulk Compute Celery Task

From the Django admin portal, the user can quickly pre-compute a large number of spectrograms for Recordings in the background.

Global Authentication

A new global authentication mechanism was added to allow for implicit OAuth2 bearer token lookup. If a user is found for a provided token, we authenticate the session by replacing the anonymous User on the request object. If the user cannot be authenticated, Django/Ninja will automatically return a 401 Unauthorized response on the application's behalf.

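# Imports assumed here; the PR description does not show them.
from ninja import NinjaAPI
from oauth2_provider.models import AccessToken


def global_auth(request):
    # Only look up a token when the session is not already authenticated.
    if request.user.is_anonymous:
        token = request.headers.get('Authorization', '').replace('Bearer ', '')
        if len(token) > 0:
            try:
                access_token = AccessToken.objects.get(token=token)
            except AccessToken.DoesNotExist:
                access_token = None
            if access_token and access_token.user:
                if not access_token.user.is_anonymous:
                    # Replace the anonymous user with the token's owner.
                    request.user = access_token.user
    # A falsy return value makes Ninja reject the request with a 401.
    return not request.user.is_anonymous

api = NinjaAPI(auth=global_auth)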
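From the client side, this means a request only needs to carry the token in the Authorization header. The sketch below builds such a request with the standard library and does not send it. The base URL and token are placeholders, and `build_spectrogram_request` is a hypothetical helper.

```python
from urllib.request import Request


def build_spectrogram_request(base_url: str, recording_id: int, token: str) -> Request:
    """Build (but do not send) an authenticated spectrogram request."""
    url = f"{base_url}/api/v1/recording/{recording_id}/spectrogram"
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder host and token for illustration only.
request = build_spectrogram_request("http://localhost:8000", 1, "example-token")
print(request.full_url)
print(request.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would perform the call. A missing or invalid token should result in the 401 response described above.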

Changelog

Added

Added Spectrogram Model (bats_ai/core/models/spectrogram.py) to store the computed spectrogram image in the database, which allows the system to cache the rendered images.
Added SpectrogramAdmin to visualize Spectrograms in Django's admin web portal
Added links between the Recordings and Spectrograms listing pages
Added bulk compute of Spectrograms from the Recordings page using Celery
Added an example 5-second WAV file in assets/example.wav using Git LFS [4.8 MB]
Added Celery task to compute spectrograms in the background (bats_ai/core/tasks.py::recording_compute_spectrogram)
Added Flower service for Celery worker and task visualization / management
Added opencv (opencv-python-headless) Python dependency
Added the --noreload flag to Django to prevent unintended restarts during interactive embedding
Added .gitignore for macOS temp files

Updated

Updated all code references to the get_user function in bats_ai/core/views/recording.py
Updated NPM dependencies
Updated the pre-commit configuration file to include more files and added multiple new tools and checks (hadolint, beautysh, markdownlint, ripsecrets)
Linted and reformatted dev/django.Dockerfile
Linted dev/export-env.sh bash script
Linted README files, and added a new "Code Formatting" section on setting up and using pre-commit
Linted bats_ai/core/fixtures/species.json JSON file

Removed

Removed old GlobalAuth authentication class
Removed bats_ai/core/views/recording.py::get_user because the new global auth mechanism authenticates users with OAuth bearer tokens. If the token is valid, the value of request.user is replaced. Otherwise, the system returns the appropriate 401 HTTP error.

bluemellophone

Self reviewed, ready for external review

BryonLewis

👍

bluemellophone added 8 commits January 18, 2024 12:12

Updates for pre-commit

410abf2

Merge changes after rebase

36a1fd4

Preliminary integration of spectogram conversion

7bc1dbc

Preliminary spectrogram visualization

18e2449

Updated NPM packages

7ae3520

Added Flower and OpenCV

80b4db9

Added example WAV file with LFS

d65a5aa

Added spectogram visualization for Recordings

cccb6ce

bluemellophone commented Jan 19, 2024

bluemellophone requested a review from BryonLewis January 19, 2024 13:10

bluemellophone marked this pull request as ready for review January 19, 2024 13:10

bluemellophone added 4 commits January 19, 2024 14:24

Bug fixes

856b620

Linting updates

57e4f01

Revert docker-compose override

d809c13

Improve stability of compressed spectrogram generation

77f2306

BryonLewis approved these changes Jan 20, 2024

BryonLewis merged commit 81ab2e8 into main Jan 22, 2024
6 checks passed

bluemellophone deleted the dev/add-spectogram branch February 26, 2024 21:54
