Add Spectrogram Visualization to Recordings #16
Merged
bluemellophone
commented
Jan 19, 2024
Self reviewed, ready for external review
BryonLewis
approved these changes
Jan 20, 2024
👍
This PR adds the visualization and storing of spectrogram images for audio recordings.
Updated Docker Image and Database
!!! IMPORTANT !!!
Before you begin testing this PR, please rebuild your Docker image and update your local database version with the following:
docker compose down
docker compose down  # Run twice to fully bring down all containers
docker compose build
docker compose run --rm django ./manage.py migrate
docker compose up -d
A new OpenCV dependency has been added to the Python environment, and the previous container will fail to start unless you rebuild your Django/Celery image. There is also a single database migration that adds the new Spectrogram database table.
FFT Parameters and Rendering Options
The spectrograms are computed with a window size of 1ms and are formatted to be 300 pixels high. The width of the spectrogram is determined by the duration and sampling rate of the underlying audio waveform. By default, the rendering will chunk long audio files into smaller segments and will stitch the resulting set of spectrograms together to create a final image. The spectrogram is resized such that the width is four times the number of milliseconds in the audio clip. For example, if the audio clip is 5 seconds long (5,000 milliseconds), then the resulting spectrogram will be 20,000 pixels wide by 300 pixels tall.
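As a quick check of the sizing rule above, here is a minimal sketch of the arithmetic. The function name `spectrogram_size` and its parameters are hypothetical and are not taken from the PR's code; only the 4 pixels per millisecond and the 300 pixel height come from the description.

```python
from __future__ import annotations


def spectrogram_size(
    duration_ms: int, px_per_ms: int = 4, height_px: int = 300
) -> tuple[int, int]:
    """Return (width, height) in pixels for a clip of the given duration.

    Hypothetical helper: the width is px_per_ms times the clip length in
    milliseconds, and the height is fixed.
    """
    if duration_ms < 0:
        raise ValueError("duration_ms must be non-negative")
    return duration_ms * px_per_ms, height_px


# A 5-second clip is 5,000 ms, so the image is 20,000 x 300 pixels.
print(spectrogram_size(5_000))  # (20000, 300)
```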
The spectrograms use a frequency range of 5 kHz to 200 kHz, with 2 kHz of padding on the top and bottom for visual clarity. The time and frequency axes are plotted with linear scaling.
Spectrogram REST Endpoints
There are two new API endpoints for the spectrograms:
/api/v1/recording/1/spectrogram
/api/v1/recording/1/spectrogram/compressed
Both endpoints return a Base64 encoded JPEG image inside a serialized JSON response message.
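A client could unpack such a response roughly as follows. This is a sketch only: the JSON key name `base64_spectrogram` is an assumption made for illustration (check the actual response for the real field name), and the optional data-URI prefix handling is a defensive guess rather than documented behavior.

```python
import base64
import json

# Every JPEG file starts with these three bytes.
JPEG_MAGIC = b"\xff\xd8\xff"


def decode_spectrogram(body: str, image_key: str = "base64_spectrogram") -> bytes:
    """Extract and decode the Base64 JPEG from a JSON response body.

    image_key is a hypothetical field name, not confirmed by the PR.
    """
    payload = json.loads(body)
    encoded = payload[image_key]
    # Tolerate an optional prefix such as "data:image/jpeg;base64,".
    if encoded.startswith("data:"):
        encoded = encoded.split(",", 1)[1]
    image = base64.b64decode(encoded, validate=True)
    if not image.startswith(JPEG_MAGIC):
        raise ValueError("decoded payload does not look like a JPEG")
    return image
```

The returned bytes can be written straight to a `.jpg` file or handed to an image library.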
REST API Option 1 - Realtime Spectrogram
Below is a 1-second sample of a spectrogram conversion. Note the large gaps in time between the chirps.
The API returns a response in the following format:
REST API Option 2 - Compressed Spectrogram
Below is a compressed version of a full 5-second audio clip, where segments around peak amplitudes are clipped out and stacked next to each other. Removing the large gaps of empty recording makes the overall features of the audio signal easier to see at a glance.
This image is generated on-the-fly and is computed relatively quickly (i.e., <0.5s) once a pre-computed spectrogram is available. Instead of returning a single start and end time, this API returns a list of the temporal segments from the original spectrogram.
New Spectrogram Database Model
There is a new Django ORM model to support the storage of the spectrogram image files after they are rendered.
New Database Migration
Database migration 0004 has been added in bats_ai/core/migrations/0004_spectrogram.py. The content of the update can be seen below:
Updated Recording Model
Three additional properties of the Recording ORM model have been added to access the item's spectrogram. Multiple spectrograms can be associated with a recording, but by default the system returns the most recently generated one. This feature may be expanded in the future to key on different spectrogram parameters.
Spectrogram and Recording Admin pages
The Django admin portal has been updated to show if a spectrogram has been computed for a Recording:
If a spectrogram has not already been pre-processed, the admin list shows "Not computed":
Lastly, there is a new Spectrogram list in the admin portal:
Bulk Compute Celery Task
From the Django admin portal, the user can quickly pre-compute a large number of spectrograms for Recordings in the background.
Global Authentication
A new global authentication mechanism was added to allow for implicit OAuth2 bearer token lookup. If a user is found for a provided token, we authenticate the session by updating the anonymous User on the request object. If the user cannot be authenticated, Django/Ninja will automatically return a 401 Unauthorized response on the application's behalf.
Changelog
Added
Spectrogram Model (bats_ai/core/models/spectrogram.py) to store the computed spectrogram image in the database, which allows the system to cache the rendered images.
SpectrogramAdmin to visualize Spectrograms in Django's admin web portal
assets/example.wav using Git LFS [4.8 MB]
bats_ai/core/tasks.py::recording_compute_spectrogram
opencv-python-headless Python dependency
--noreload flag to Django to prevent unintended restarts during interactive embedding
.gitignore for macOS temp files
Updated
get_user function in bats_ai/core/views/recording.py
pre-commit configuration file to include more files and added multiple new tools and checks (hadolint, beautysh, markdownlint, ripsecrets)
dev/django.Dockerfile
dev/export-env.sh bash script
bats_ai/core/fixtures/species.json JSON file
Removed
GlobalAuth authentication class
bats_ai/core/views/recording.py::get_user because the new global auth mechanism will authenticate users with OAuth bearer tokens. If the token is valid, the value of request.user is replaced. Otherwise, the system returns the appropriate 401 HTTP error.
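The bearer-token flow described above can be sketched without any framework. This is an illustration only, not the PR's GlobalAuth class: the `Request` dataclass, the in-memory `TOKENS` table, and the `authenticate` function are hypothetical stand-ins for Django's request object, the OAuth token store, and the Django/Ninja authentication hook.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical stand-in for a Django request object."""
    headers: dict
    user: str = "anonymous"


# Hypothetical token table; a real system would query its OAuth token store.
TOKENS = {"abc123": "alice"}


def authenticate(request: Request, tokens: dict = TOKENS) -> int:
    """Return 200 and replace request.user if the bearer token is valid,
    otherwise return 401 and leave the anonymous user in place."""
    header = request.headers.get("Authorization", "")
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return 401
    user = tokens.get(token.strip())
    if user is None:
        return 401
    request.user = user
    return 200


ok = Request(headers={"Authorization": "Bearer abc123"})
print(authenticate(ok), ok.user)      # 200 alice

anon = Request(headers={})
print(authenticate(anon), anon.user)  # 401 anonymous
```

Doing the lookup once at a global level means individual views can read the request's user directly, which is why a per-view helper such as get_user is no longer needed.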