Add Spectrogram Visualization to Recordings #16

Merged
BryonLewis merged 12 commits into main from dev/add-spectogram
Jan 22, 2024

bluemellophone
Collaborator

@bluemellophone bluemellophone commented Jan 19, 2024

This PR adds the visualization and storing of spectrogram images for audio recordings.

Screenshot 2024-01-19 at 03 53 12

Updated Docker Image and Database

!!! IMPORTANT !!!

Before you begin testing this PR, please rebuild your Docker image and update your local database version with the following:

docker compose down
docker compose down  # Run twice to fully bring down all containers
docker compose build
docker compose run --rm django ./manage.py migrate
docker compose up -d

A new OpenCV dependency has been added to the Python environment, so the previous container will fail to start unless you rebuild your Django/Celery image. There is also a single database migration that adds the new Spectrogram database table.

FFT Parameters and Rendering Options

The spectrograms are computed with a window size of 1 ms and are rendered 300 pixels high. The initial width of the spectrogram is determined by the duration and sampling rate of the underlying audio waveform. By default, the renderer splits long audio files into smaller segments and stitches the resulting spectrograms together into a final image. That image is then resized so that its width is four times the number of milliseconds in the audio clip. For example, a 5-second clip (5,000 milliseconds) yields a spectrogram that is 20,000 pixels wide by 300 pixels tall.
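The sizing rule above can be checked with a short calculation. This is a minimal sketch rather than the PR's rendering code: the function name `spectrogram_size` and the constant names are hypothetical, and only the 4 pixels per millisecond and 300 pixel figures come from the description.

```python
PIXELS_PER_MS = 4  # from the description: width is 4x the clip length in ms
HEIGHT_PX = 300    # from the description: fixed image height


def spectrogram_size(duration_ms: int) -> tuple[int, int]:
    """Return (width, height) in pixels for a clip of the given length."""
    if duration_ms < 0:
        raise ValueError("duration_ms must be non-negative")
    return PIXELS_PER_MS * duration_ms, HEIGHT_PX


# The 5-second example from the text.
print(spectrogram_size(5_000))  # (20000, 300)
```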

The spectrograms use a frequency range of 5 kHz to 200 kHz, with 2 kHz of padding on the top and bottom for visual clarity. The time and frequency axes are plotted with linear scaling.

Spectrogram REST Endpoints

There are two new API endpoints for the spectrograms:

  • Version 1 (realtime) - /api/v1/recording/1/spectrogram
  • Version 2 (compressed) - /api/v1/recording/1/spectrogram/compressed

Both endpoints return a Base64-encoded JPEG image inside a serialized JSON response.
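For testers, a client call might look like the sketch below. Several things here are assumptions rather than facts from this PR: that the `1` in the paths above is the recording ID, that the endpoints answer to `GET`, and that the server is reachable at whatever `base_url` you pass (for example a local development host). The helper names are hypothetical.

```python
import json
import urllib.request


def spectrogram_url(base_url: str, recording_id: int, compressed: bool = False) -> str:
    """Build the endpoint URL for a recording's spectrogram."""
    url = f"{base_url.rstrip('/')}/api/v1/recording/{recording_id}/spectrogram"
    return f"{url}/compressed" if compressed else url


def fetch_spectrogram(base_url: str, recording_id: int, token: str,
                      compressed: bool = False) -> dict:
    """GET the spectrogram JSON, sending an OAuth2 bearer token."""
    request = urllib.request.Request(
        spectrogram_url(base_url, recording_id, compressed),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `fetch_spectrogram(base_url, 1, token, compressed=True)` would then target the compressed endpoint for recording 1.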

REST API Option 1 - Realtime Spectrogram

Below is a 1-second sample of a spectrogram conversion. Note the large gaps in time between the chirps.

[Image: spectrogram of the 1-second sample]

The API returns a response in the following format:

{
    "base64_spectrogram": str,  # base64-encoded image
    "spectroInfo": {
        "width": int,  # pixels
        "height": int,  # pixels
        "start_time": int,  # always 0
        "end_time": int,  # milliseconds
        "low_freq": int,  # Hz
        "high_freq": int,  # Hz
    }
}
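A client could unpack this response roughly as follows. This is a sketch under stated assumptions: it treats `low_freq` and `high_freq` as the bottom and top edges of the image, takes row 0 to be the top of the image, and relies on the linear axis scaling described earlier. The helper names and the sample payload are hypothetical, and the sample image bytes are a stand-in, not a real JPEG.

```python
import base64


def decode_image(payload: dict) -> bytes:
    """Return the raw image bytes from a spectrogram response."""
    return base64.b64decode(payload["base64_spectrogram"])


def pixel_to_time_freq(x: int, y: int, info: dict) -> tuple[float, float]:
    """Map a pixel to (milliseconds, Hz) under linear axis scaling.

    Assumes column 0 is start_time and row 0 (the top) is high_freq.
    """
    time_span = info["end_time"] - info["start_time"]
    freq_span = info["high_freq"] - info["low_freq"]
    time_ms = info["start_time"] + x / info["width"] * time_span
    freq_hz = info["high_freq"] - y / info["height"] * freq_span
    return time_ms, freq_hz


# Hypothetical response for a 1-second clip.
payload = {
    "base64_spectrogram": base64.b64encode(b"\xff\xd8\xff\xe0").decode("ascii"),
    "spectroInfo": {
        "width": 4000, "height": 300,
        "start_time": 0, "end_time": 1000,
        "low_freq": 5000, "high_freq": 200000,
    },
}

image_bytes = decode_image(payload)
center = pixel_to_time_freq(2000, 150, payload["spectroInfo"])
```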

REST API Option 2 - Compressed Spectrogram

Below is a compressed version of a full 5-second audio clip, where segments around peak amplitudes are clipped out and stacked next to each other. This visualization makes it easier to quickly visualize the overall features of the audio signal by removing large gaps of empty recording.

[Image: compressed spectrogram of the 5-second clip]

This image is generated on the fly and is computed relatively quickly (i.e., <0.5s) once a pre-computed spectrogram is available. Instead of returning a single start and end time, this API returns a list of the temporal segments taken from the original spectrogram:

{
    "base64_spectrogram": str,  # base64-encoded image
    "spectroInfo": {
        "width": int,  # pixels (inherited from spectrogram)
        "height": int,  # pixels (inherited from spectrogram)
        "start_times": [ int, int, int, ... ],  # milliseconds
        "end_times": [ int, int, int, ... ],  # milliseconds
        "low_freq": int,  # Hz (inherited from spectrogram)
        "high_freq": int,  # Hz (inherited from spectrogram)
    }
}
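To illustrate how `start_times` and `end_times` pair up, here is a rough sketch of selecting windows around loud moments and merging the ones that overlap. It is not the algorithm used in this PR: the amplitude envelope (one value per millisecond), the threshold, the margin, and the half-open `[start, end)` convention are all assumptions made for the example.

```python
def find_segments(envelope, threshold, margin):
    """Return (start_times, end_times) for windows around loud samples.

    envelope: one amplitude value per millisecond (hypothetical input).
    margin: milliseconds kept on each side of a loud sample.
    """
    segments = []
    for t, amplitude in enumerate(envelope):
        if amplitude < threshold:
            continue
        start = max(0, t - margin)
        end = min(len(envelope), t + margin + 1)
        if segments and start <= segments[-1][1]:
            # Overlaps or touches the previous window: extend it.
            segments[-1][1] = max(segments[-1][1], end)
        else:
            segments.append([start, end])
    start_times = [s for s, _ in segments]
    end_times = [e for _, e in segments]
    return start_times, end_times


def retained_ms(start_times, end_times):
    """Total duration kept after compression, in milliseconds."""
    return sum(end - start for start, end in zip(start_times, end_times))


envelope = [0, 0, 9, 0, 0, 0, 0, 0, 8, 9, 0, 0]
starts, ends = find_segments(envelope, threshold=5, margin=1)
```

In this toy input the two adjacent loud samples near the end collapse into a single window, so the response would list two segments rather than three.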

New Spectrogram Database Model

There is a new Django ORM model to support the storage of the spectrogram image files after they are rendered.

class Spectrogram(TimeStampedModel, models.Model):
    recording = models.ForeignKey(Recording, on_delete=models.CASCADE)
    image_file = models.FileField()
    width = models.IntegerField()  # pixels
    height = models.IntegerField()  # pixels
    duration = models.IntegerField()  # milliseconds
    frequency_min = models.IntegerField()  # Hz
    frequency_max = models.IntegerField()  # Hz

    @classmethod
    def generate(cls, recording):
        ...

    @property
    def compressed(self):
        ...

    @property
    def image_np(self):
        ...

    @property
    def image_pil(self):
        ...

    @property
    def image(self):
        ...

    @property
    def base64(self):
        ...

New Database Migration

Database migration 0004 has been added in bats_ai/core/migrations/0004_spectrogram.py. The content of the update can be seen below:

class Migration(migrations.Migration):
    dependencies = [
        ('core', '0003_annotations'),
    ]

    operations = [
        migrations.CreateModel(
            name='Spectrogram',
            fields=[
                (
                    'id',
                    models.BigAutoField(
                        auto_created=True, primary_key=True, serialize=False, verbose_name='ID'
                    ),
                ),
                (
                    'created',
                    django_extensions.db.fields.CreationDateTimeField(
                        auto_now_add=True, verbose_name='created'
                    ),
                ),
                (
                    'modified',
                    django_extensions.db.fields.ModificationDateTimeField(
                        auto_now=True, verbose_name='modified'
                    ),
                ),
                ('image_file', models.FileField(upload_to='')),
                ('width', models.IntegerField()),
                ('height', models.IntegerField()),
                ('duration', models.IntegerField()),
                ('frequency_min', models.IntegerField()),
                ('frequency_max', models.IntegerField()),
                (
                    'recording',
                    models.ForeignKey(
                        on_delete=django.db.models.deletion.CASCADE, to='core.recording'
                    ),
                ),
            ],
            options={
                'get_latest_by': 'modified',
                'abstract': False,
            },
        ),
    ]

Updated Recording Model

Three properties have been added to the Recording ORM model to access a recording's spectrograms. Multiple spectrograms can be associated with a recording, but by default the system returns the most recently generated one. This feature may be expanded in the future to key on different spectrogram parameters.

class Recording(TimeStampedModel, models.Model):
    @property
    def has_spectrogram(self):
        ...

    @property
    def spectrograms(self):
        ...

    @property
    def spectrogram(self):
        ...
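The intended semantics of these properties can be illustrated without Django. The sketch below uses plain dataclasses as stand-ins for the ORM models, so it does not reproduce the elided method bodies. It assumes "most recent" means the latest `modified` timestamp (matching `get_latest_by` in the migration) and that `spectrogram` yields `None` when nothing has been computed; both are guesses, and the class names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class SpectrogramStub:
    modified: datetime


@dataclass
class RecordingStub:
    spectrograms: list = field(default_factory=list)

    @property
    def has_spectrogram(self) -> bool:
        return len(self.spectrograms) > 0

    @property
    def spectrogram(self):
        # Most recently modified one, or None if nothing has been computed.
        if not self.spectrograms:
            return None
        return max(self.spectrograms, key=lambda s: s.modified)


old = SpectrogramStub(modified=datetime(2024, 1, 18))
new = SpectrogramStub(modified=datetime(2024, 1, 19))
recording = RecordingStub(spectrograms=[new, old])
```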

Spectrogram and Recording Admin Pages

The Django admin portal has been updated to show if a spectrogram has been computed for a Recording:

Screenshot 2024-01-19 at 03 46 58

If a spectrogram has not already been pre-processed, the admin list shows "Not computed":

Screenshot 2024-01-19 at 03 49 09

Lastly, there is a new Spectrogram list in the admin portal:

Screenshot 2024-01-19 at 03 48 52

Bulk Compute Celery Task

From the Django admin portal, the user can quickly pre-compute a large number of spectrograms for Recordings in the background.

Screenshot 2024-01-19 at 03 49 20

Global Authentication

A new global authentication mechanism was added to allow implicit OAuth2 bearer token lookup. If a user is found for the provided token, we authenticate the session by replacing the anonymous User on the request object. If the user cannot be authenticated, Django Ninja automatically returns a 401 Unauthorized response on the application's behalf.

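from ninja import NinjaAPI
from oauth2_provider.models import AccessToken  # assumed: django-oauth-toolkit


def global_auth(request):
    # Only look up a token when the session is not already authenticated.
    if request.user.is_anonymous:
        token = request.headers.get('Authorization', '').replace('Bearer ', '')
        if len(token) > 0:
            try:
                access_token = AccessToken.objects.get(token=token)
            except AccessToken.DoesNotExist:
                access_token = None
            if access_token and access_token.user:
                if not access_token.user.is_anonymous:
                    # Swap the anonymous user for the token's owner.
                    request.user = access_token.user
    # A falsy return makes Django Ninja reject the request with a 401.
    return not request.user.is_anonymous


api = NinjaAPI(auth=global_auth)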
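As a possible follow-up, the header parsing could be made stricter. The function above strips the text `Bearer ` wherever it appears, so a header with another scheme (for example `Basic ...`) is passed to the token lookup unchanged. The sketch below is a hypothetical helper, not part of this PR, that only accepts the `Bearer` scheme and matches it case-insensitively.

```python
def parse_bearer_token(header: str) -> str:
    """Return the token from a 'Bearer <token>' header value, or ''."""
    scheme, _, token = header.strip().partition(" ")
    if scheme.lower() != "bearer":
        return ""
    return token.strip()
```

Inside `global_auth`, this would replace the `.replace('Bearer ', '')` call, leaving the rest of the lookup unchanged.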

Changelog

Added

  • Added Spectrogram Model (bats_ai/core/models/spectrogram.py) to store the computed spectrogram image in the database, which allows the system to cache the rendered images.
  • Added SpectrogramAdmin to visualize Spectrograms in Django's admin web portal
  • Added links between the Recordings and Spectrograms listing pages
  • Added bulk compute of Spectrograms from the Recordings page using Celery
  • Added an example 5-second WAV file in assets/example.wav using Git LFS [4.8 MB]
  • Added Celery task to compute spectrograms in the background (bats_ai/core/tasks.py::recording_compute_spectrogram)
  • Added Flower service for Celery worker and task visualization / management
  • Added opencv (opencv-python-headless) Python dependency
  • Added the --noreload flag to Django to prevent unintended restarts during interactive embedding
  • Added .gitignore for macOS temp files

Updated

  • Updated all code references to the get_user function in bats_ai/core/views/recording.py
  • Updated NPM dependencies
  • Updated the pre-commit configuration file to include more files and added multiple new tools and checks (hadolint, beautysh, markdownlint, ripsecrets)
  • Linted and reformatted dev/django.Dockerfile
  • Linted dev/export-env.sh bash script
  • Linted README files, and added a new "Code Formatting" section on setting up and using pre-commit
  • Linted bats_ai/core/fixtures/species.json JSON file

Removed

  • Removed old GlobalAuth authentication class
  • Removed bats_ai/core/views/recording.py::get_user because the new global auth mechanism authenticates users with OAuth bearer tokens. If the token is valid, the value of request.user is replaced. Otherwise, the system returns the appropriate 401 HTTP error.

Collaborator Author

@bluemellophone bluemellophone left a comment

Self reviewed, ready for external review

@bluemellophone bluemellophone marked this pull request as ready for review January 19, 2024 13:10
Collaborator

@BryonLewis BryonLewis left a comment

👍

@BryonLewis BryonLewis merged commit 81ab2e8 into main Jan 22, 2024
6 checks passed
@bluemellophone bluemellophone deleted the dev/add-spectogram branch February 26, 2024 21:54