Integrate UrbanSoundDataset for Audio Data Processing #1179

Open
wants to merge 1 commit into
base: nextjs

codingwithsurya
Contributor

@codingwithsurya codingwithsurya commented May 14, 2024

adding urbansound-dataset and schemas.py

GitHub Issue Number Here: <Integrate UrbanSoundDataset for Audio Trainspace #1156>
What user problem are we solving?
We are enhancing the Deep Learning Playground's capabilities to include audio data processing by integrating the UrbanSound8K dataset. This allows users to work with audio data seamlessly within the existing pipeline, expanding the versatility and application of the platform.

What solution does this PR provide?
This PR adds a new class, UrbanSoundDataset, to the training/core/dataset.py module. This class encapsulates functionalities for data ingestion, preprocessing, and loading specifically tailored for the UrbanSound8K dataset. It includes dataCreator, train_loader, and test_loader methods to facilitate efficient data loading into the model for training and testing. Additionally, it ensures compatibility with PyTorch's DataLoader mechanism and integrates smoothly with the existing training pipeline.

It also adds a schemas.py file that defines the audio parameters (AudioParams). This file is still a WIP.

Testing Methodology

How did you test your changes and verify that existing functionality is not broken?
Manual testing.

Any other considerations
Updated schemas.py to include AudioParams for defining non-tunable parameters specific to the UrbanSound8K dataset.

We also added two new dependencies to poetry: torchaudio and soundata.

@codingwithsurya codingwithsurya requested a review from a team as a code owner May 14, 2024 02:09
@codingwithsurya codingwithsurya linked an issue May 14, 2024 that may be closed by this pull request

Quality Gate passed

Issues
10 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@codingwithsurya
Contributor Author

codingwithsurya commented May 14, 2024

looks like one of the lints is failing. weird. @karkir0003 @DSGT-DLP/project-lead

the error message is this:
Note: This error originates from the build backend, and is likely not a problem with poetry but with simpleaudio (1.0.4) not supporting PEP 517 builds. You can verify this by running 'pip wheel --no-cache-dir --use-pep517 "simpleaudio (==1.0.4)"'.
Error: Process completed with exit code 1.

We just added the torchaudio and soundata libraries to poetry in this PR, so it's likely because of that.

Member

@karkir0003 karkir0003 left a comment

good start for the most part.

please address my questions and add unit tests to ensure the data loading logic works as intended


soundData = tempData
soundFormatted = torch.zeros([32000, 1])
soundFormatted[:32000] = soundData[::5] # Take every fifth sample of soundData
Member

explain?

Contributor Author

we take every fifth sample of soundData and write it into the 32000 rows of soundFormatted to downsample the signal, in case soundData was recorded at a high sample rate.
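For illustration, the strided downsampling described here can be sketched in plain Python. This is not the code from the PR: `downsample_fixed`, `step`, and `target_len` are hypothetical names, and the truncate-or-zero-pad step is an assumption added so that clips of any length fit the 32000-sample buffer.

```python
def downsample_fixed(samples, step=5, target_len=32000):
    """Keep every `step`-th sample, then truncate or zero-pad to `target_len`.

    Hypothetical helper for illustration only; it mirrors the `[::5]` slice
    from the diff but does not assume the clip has exactly 160000 samples.
    """
    # Same slicing semantics as soundData[::5]: indices 0, step, 2*step, ...
    strided = samples[::step][:target_len]
    # Pad with silence (zeros) when the clip is shorter than expected.
    return strided + [0.0] * (target_len - len(strided))


# A synthetic 160000-sample ramp stands in for a loaded audio clip.
clip = [float(i) for i in range(160_000)]
out = downsample_fixed(clip)

# A very short clip is padded instead of raising a shape mismatch.
short = downsample_fixed([1.0] * 10, step=5, target_len=4)
```

Taking every fifth sample applies no low-pass filter, so content above the new Nyquist frequency can alias into the result. If that matters here, `torchaudio.transforms.Resample` may be a safer choice than a raw stride.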

Member

ok. might want to clarify that

Contributor Author

will do

training/training/core/dataset.py: 2 resolved review threads
@karkir0003
Member

> looks like one of the lints is failing. weird. @karkir0003 @DSGT-DLP/project-lead
>
> the error message is this:
> Note: This error originates from the build backend, and is likely not a problem with poetry but with simpleaudio (1.0.4) not supporting PEP 517 builds. You can verify this by running 'pip wheel --no-cache-dir --use-pep517 "simpleaudio (==1.0.4)"'.
> Error: Process completed with exit code 1.
>
> We just added the torchaudio and soundata libraries to poetry in this PR, so it's likely because of that.

what's simpleaudio? is it a lib that's a dependency of soundata?

@karkir0003
Member

did we install the latest stable version of torchaudio and soundata?

@karkir0003
Member

> looks like one of the lints is failing. weird. @karkir0003 @DSGT-DLP/project-lead
>
> the error message is this: Note: This error originates from the build backend, and is likely not a problem with poetry but with simpleaudio (1.0.4) not supporting PEP 517 builds. You can verify this by running 'pip wheel --no-cache-dir --use-pep517 "simpleaudio (==1.0.4)"'. Error: Process completed with exit code 1.
>
> We just added the torchaudio and soundata libraries to poetry in this PR, so it's likely because of that.

I'd try starting the debugging with the following:

  1. Try running the command that's shown in the log. See if any further logs come up
  2. We might need to find a compatible version of soundata, or add simpleaudio as a dependency, just like how you used the DLP CLI to install torchaudio
  3. If 1 and 2 don't work, try asking in the Poetry GitHub repo by filing an issue. There's also a Discord server for Poetry that can help clarify

@codingwithsurya
Contributor Author

> > looks like one of the lints is failing. weird. @karkir0003 @DSGT-DLP/project-lead
> > the error message is this:
> > Note: This error originates from the build backend, and is likely not a problem with poetry but with simpleaudio (1.0.4) not supporting PEP 517 builds. You can verify this by running 'pip wheel --no-cache-dir --use-pep517 "simpleaudio (==1.0.4)"'.
> > Error: Process completed with exit code 1.
> > We just added the torchaudio and soundata libraries to poetry in this PR, so it's likely because of that.
>
> what's simpleaudio? is it a lib that's a dependency of soundata?

still tryna figure this one out. i went through the logs and couldnt find any hints

@codingwithsurya
Contributor Author

> did we install the latest stable version of torchaudio and soundata?

yup i double checked. i just ran dlp-cli backend add ____

@codingwithsurya
Contributor Author

> > looks like one of the lints is failing. weird. @karkir0003 @DSGT-DLP/project-lead
> > the error message is this: Note: This error originates from the build backend, and is likely not a problem with poetry but with simpleaudio (1.0.4) not supporting PEP 517 builds. You can verify this by running 'pip wheel --no-cache-dir --use-pep517 "simpleaudio (==1.0.4)"'. Error: Process completed with exit code 1.
> > We just added the torchaudio and soundata libraries to poetry in this PR, so it's likely because of that.
>
> I'd try starting the debugging with the following:
>
>   1. Try running the command that's shown in the log. See if any further logs come up
>   2. We might need to find a compatible version of soundata, or add simpleaudio as a dependency, just like how you used the DLP CLI to install torchaudio
>   3. If 1 and 2 don't work, try asking in the Poetry GitHub repo by filing an issue. There's also a Discord server for Poetry that can help clarify

alright bet sounds good

@karkir0003
Member

@codingwithsurya, looks like someone from Poetry responded to the thread I created and shared with you in Discord: https://github.com/orgs/python-poetry/discussions/9418

@codingwithsurya
Contributor Author

> https://github.com/orgs/python-poetry/discussions/9418

ok. we can add it as a dependency then.

@karkir0003
Member

> > https://github.com/orgs/python-poetry/discussions/9418
>
> ok. we can add it as a dependency then.

let's give this a try and see if that works @codingwithsurya

Development

Successfully merging this pull request may close these issues.

[FEATURE]: Integrate UrbanSoundDataset for Audio Trainspace
2 participants