Commit
Audio PR - rocAL Audio decoder support (#118)
* Audio Decoder PR 1
* Change image_info to sample_info to maintain a generic name for all use cases
* Change the copyright year from 2023 to 2024
* Formatting the files
* Resolve PR comments
* Resolve PR comments
* Change decoded_img_info to decoded_video_info
* Change the file_path() function to virtual from pure virtual
* Minor change
* Minor changes
* Add the unit test file
* Revert "Add the unit test file". This reverts commit e79cc06.
* Introduce CMake for sndfile. Modify CMakeLists.txt for the same
* Resolve 1st set of PR comments
* Remove commented code for last batch policies and unused imports
* ROI related changes - change from xy to wh to use for samples and channels
* Fix seg fault with ROI
* Remove opencv usage from the unit test
* Resolve the PR comments
* Remove instances of the audio_*_time - use the existing variables from Timing struct
* Formatting changes in rocal_api_data_loader.cpp and add the opencl and hip conditions for audio loaders
* Resolve the internal PR comments
* Reformatting the file_source_reader.cpp
* Remove _input_path from audio_source_evaluator and audio_read_and_decode as it is unnecessary
* Change the header formatting
* Changes in copy_data() for audio samples
* Initialize the status at the beginning
* Cmake related changes for audio
* Resolve PR comments
* Add condition check to eliminate any other file extensions other than a wav file / other image formats and call open_folder from subfolder_reading() function
* Update audio_read_and_decode.cpp
* Revert file source reader changes
* Update master_graph.cpp
* Update tensor.cpp - Remove a commented line of code
* Introduce ROCAL_AUDIO flag. Introduce flag for audio code, to be disabled when sndfile not found
* Minor changes
* Minor changes
* Add output comparison for Audio outputs
* Minor changes
* Minor changes to unit test
* Remove max_frames and max_channels args
* Remove max_frames, max_channels and sample rate from unit test
* Minor change
* Add python script to run audio unittests
* Clean up C++ audio unit test
* Modify rocal audio unit test. Update README
* Minor change
* Minor change
* Minor variable name change
* Minor changes. Add wav extension in file reader. Add reader in unit test
* Update C++ unit test
* Name change from sample to data
* Change from decoded_data_info to DecodedDataInfo
* Remove audio_decoder_factory.cpp file
* Minor change
* Change variable name
* Update the struct variable name in audio files
* Minor changes
* Change ROCAL_DATA_PATH to exclude rocal_data
* Use Pascal case for function names in audio decoder
* Modify cmake to have SNDFILE in all capital
* Minor changes
* Add struct for audio info in AudioReadAndDecode
* Fix merge conflict
* Renaming crop_image_info to CropImageInfo
* Remove - actual_host_buffers - Unused
* Rename TimingDBG to TimingDbg
* Move the instances of DecodedDataInfo to its base class LoaderModule
* Fix a WRN msg in master_graph.cpp
* Remove a dangling comment
* Rename _circ_data_info to _circ_buff_data_info
* Add Glob to CMakeLists.txt
* Rename SndFileDecoder to GenericAudioDecoder
* Fix build issues
* Minor change
* Update audio unit test README
* Revert "Add Glob to CMakeLists.txt". This reverts commit 47263d9.
* Fix include headers for Audio files
* Fix copy data 2D
* Minor changes
* Pass decoded data info to load routine instead of separate vectors
* Update CHANGELOG.md
* Change swap_handle_time variable name in loader
* Formatting changes. Add comments
* Update doxygen comments
* Move file source reader from readers/image to readers folder
* Update README for audio test
* Minor fix
* Minor changes shard_count argument name
* Rename set and get functions of data_info to decoded_data_info

---------

Co-authored-by: root <[email protected]>
Co-authored-by: Swetha B S <[email protected]>
Co-authored-by: swetha097 <[email protected]>
Co-authored-by: swetha097 <[email protected]>
Co-authored-by: Swetha B S <>
Co-authored-by: Rajy Rawther <[email protected]>
1 parent 20b2fdc, commit 422cbe5
Showing 42 changed files with 2,177 additions and 34 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
################################################################################ | ||
# | ||
# MIT License | ||
# | ||
# Copyright (c) 2024 Advanced Micro Devices, Inc. | ||
# | ||
# Permission is hereby granted, free of charge, to any person obtaining a copy | ||
# of this software and associated documentation files (the "Software"), to deal | ||
# in the Software without restriction, including without limitation the rights | ||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
# copies of the Software, and to permit persons to whom the Software is | ||
# furnished to do so, subject to the following conditions: | ||
# | ||
# The above copyright notice and this permission notice shall be included in all | ||
# copies or substantial portions of the Software. | ||
# | ||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
# SOFTWARE. | ||
# | ||
################################################################################ | ||
find_path(SNDFILE_INCLUDE_DIRS | ||
NAMES sndfile.h | ||
HINTS | ||
$ENV{SNDFILE_PATH}/include | ||
PATHS | ||
/usr/local/include | ||
/usr/include | ||
) | ||
mark_as_advanced(SNDFILE_INCLUDE_DIRS) | ||
|
||
find_library(SNDFILE_LIBRARIES | ||
NAMES sndfile libsndfile | ||
HINTS | ||
$ENV{SNDFILE_PATH}/lib | ||
$ENV{SNDFILE_PATH}/lib64 | ||
PATHS | ||
${CMAKE_SYSTEM_PREFIX_PATH} | ||
${SNDFILE_PATH} | ||
/usr/local/ | ||
PATH_SUFFIXES lib lib64 | ||
) | ||
mark_as_advanced(SNDFILE_LIBRARIES) | ||
|
||
if(SNDFILE_LIBRARIES AND SNDFILE_INCLUDE_DIRS) | ||
set(SNDFILE_FOUND TRUE) | ||
endif() | ||
|
||
include(FindPackageHandleStandardArgs) | ||
find_package_handle_standard_args(SndFile | ||
FOUND_VAR SNDFILE_FOUND | ||
REQUIRED_VARS | ||
SNDFILE_LIBRARIES | ||
SNDFILE_INCLUDE_DIRS | ||
) | ||
|
||
set(SNDFILE_FOUND ${SNDFILE_FOUND} CACHE INTERNAL "") | ||
set(SNDFILE_LIBRARIES ${SNDFILE_LIBRARIES} CACHE INTERNAL "") | ||
set(SNDFILE_INCLUDE_DIRS ${SNDFILE_INCLUDE_DIRS} CACHE INTERNAL "") | ||
|
||
if(SNDFILE_FOUND) | ||
message("-- ${White}Using SndFile -- \n\tLibraries:${SNDFILE_LIBRARIES} \n\tIncludes:${SNDFILE_INCLUDE_DIRS}${ColourReset}") | ||
else() | ||
message( "-- ${Yellow}NOTE: FindSndFile failed to find -- SndFile${ColourReset}" ) | ||
endif() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
/* | ||
Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved. | ||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. | ||
*/ | ||
|
||
#pragma once | ||
|
||
#include <cstddef> | ||
#include <vector> | ||
|
||
#ifdef ROCAL_AUDIO | ||
#include "sndfile.h" | ||
|
||
class AudioDecoder { | ||
public: | ||
enum class Status { | ||
OK = 0, | ||
HEADER_DECODE_FAILED, | ||
CONTENT_DECODE_FAILED, | ||
UNSUPPORTED, | ||
FAILED, | ||
NO_MEMORY | ||
}; | ||
virtual AudioDecoder::Status Initialize(const char* src_filename) = 0; | ||
virtual AudioDecoder::Status Decode(float* buffer) = 0; | ||
virtual AudioDecoder::Status DecodeInfo(int* samples, int* channels, float* sample_rates) = 0; | ||
virtual void Release() = 0; | ||
virtual ~AudioDecoder() = default; | ||
|
||
protected: | ||
SF_INFO _sfinfo; | ||
SNDFILE* _sf_ptr; | ||
}; | ||
#endif |
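
The interface above can be exercised without libsndfile. The following is a minimal sketch, not the rocAL implementation: it mirrors the `AudioDecoder` virtual methods but drops the `SF_INFO` and `SNDFILE*` members so it compiles standalone. `RampDecoder` and `DecodeFile` are hypothetical names, the call order (Initialize, DecodeInfo, Decode, Release) is an assumption, and so is the buffer size of `samples * channels` floats.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the interface in the diff, without the libsndfile members.
class AudioDecoder {
   public:
    enum class Status { OK = 0, HEADER_DECODE_FAILED, CONTENT_DECODE_FAILED, UNSUPPORTED, FAILED, NO_MEMORY };
    virtual Status Initialize(const char* src_filename) = 0;
    virtual Status Decode(float* buffer) = 0;
    virtual Status DecodeInfo(int* samples, int* channels, float* sample_rates) = 0;
    virtual void Release() = 0;
    virtual ~AudioDecoder() = default;
};

constexpr int kRampSamples = 4;   // hypothetical fixed length
constexpr int kRampChannels = 2;  // hypothetical channel count

// Hypothetical decoder: never opens the file, emits the ramp 0, 1, 2, ...
class RampDecoder : public AudioDecoder {
   public:
    Status Initialize(const char* src_filename) override {
        if (src_filename == nullptr) return Status::HEADER_DECODE_FAILED;
        _open = true;
        return Status::OK;
    }
    Status DecodeInfo(int* samples, int* channels, float* sample_rates) override {
        if (!_open) return Status::FAILED;
        *samples = kRampSamples;
        *channels = kRampChannels;
        *sample_rates = 16000.0f;
        return Status::OK;
    }
    Status Decode(float* buffer) override {
        if (!_open || buffer == nullptr) return Status::CONTENT_DECODE_FAILED;
        for (int i = 0; i < kRampSamples * kRampChannels; i++) buffer[i] = static_cast<float>(i);
        return Status::OK;
    }
    void Release() override { _open = false; }

   private:
    bool _open = false;
};

// Runs the assumed call sequence; returns the decoded data, or an empty vector on failure.
std::vector<float> DecodeFile(AudioDecoder& decoder, const char* path) {
    std::vector<float> out;
    if (decoder.Initialize(path) != AudioDecoder::Status::OK) return out;
    int samples = 0, channels = 0;
    float rate = 0.0f;
    if (decoder.DecodeInfo(&samples, &channels, &rate) == AudioDecoder::Status::OK) {
        assert(samples >= 0 && channels >= 0);
        out.resize(static_cast<size_t>(samples) * channels);  // assumed buffer layout
        if (decoder.Decode(out.data()) != AudioDecoder::Status::OK) out.clear();
    }
    decoder.Release();
    return out;
}
```

Querying the sizes through `DecodeInfo` before calling `Decode` lets the caller own the buffer, which fits the raw `float*` parameter in the interface.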
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
/* | ||
Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved. | ||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. | ||
*/ | ||
|
||
#pragma once | ||
|
||
#include "decoders/audio/audio_decoder.h" | ||
#include "decoders/audio/generic_audio_decoder.h" | ||
|
||
#ifdef ROCAL_AUDIO | ||
static std::shared_ptr<AudioDecoder> create_audio_decoder(DecoderConfig config) { | ||
switch (config.type()) { | ||
case DecoderType::AUDIO_SOFTWARE_DECODE: | ||
return std::make_shared<GenericAudioDecoder>(); | ||
default: | ||
THROW("Unsupported decoder type " + TOSTR(config.type())); | ||
} | ||
} | ||
#endif |
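
The factory above relies on rocAL types (`DecoderConfig`, `DecoderType`) and macros (`THROW`, `TOSTR`) that are defined elsewhere in the repository. As a rough standalone sketch of the same switch-based pattern, the block below substitutes hypothetical stand-ins for all of them: the enum values, the `OTHER` enumerator, the minimal `DecoderConfig`, the empty decoder classes, and `std::runtime_error` in place of `THROW` are assumptions made only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the rocAL types used by the factory.
enum class DecoderType { AUDIO_SOFTWARE_DECODE = 0, OTHER = 1 };

class DecoderConfig {
   public:
    explicit DecoderConfig(DecoderType type) : _type(type) {}
    DecoderType type() const { return _type; }

   private:
    DecoderType _type;
};

struct AudioDecoder {
    virtual ~AudioDecoder() = default;
};
struct GenericAudioDecoder : AudioDecoder {};

// Same shape as the factory in the diff; throws std::runtime_error instead of using THROW.
std::shared_ptr<AudioDecoder> create_audio_decoder(DecoderConfig config) {
    switch (config.type()) {
        case DecoderType::AUDIO_SOFTWARE_DECODE:
            return std::make_shared<GenericAudioDecoder>();
        default:
            throw std::runtime_error("Unsupported decoder type " +
                                     std::to_string(static_cast<int>(config.type())));
    }
}

// Usage: callers receive the base-class pointer and never name the concrete decoder.
void example_usage() {
    std::shared_ptr<AudioDecoder> decoder =
        create_audio_decoder(DecoderConfig(DecoderType::AUDIO_SOFTWARE_DECODE));
    assert(decoder != nullptr);
}
```

Returning the base-class pointer keeps callers independent of the concrete decoder, so a second decoder type would only need another `case`.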
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
/* | ||
Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved. | ||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. | ||
*/ | ||
|
||
#pragma once | ||
|
||
#include "decoders/audio/audio_decoder.h" | ||
|
||
#ifdef ROCAL_AUDIO | ||
class GenericAudioDecoder : public AudioDecoder { | ||
public: | ||
//! Default constructor | ||
GenericAudioDecoder(); | ||
AudioDecoder::Status Initialize(const char* src_filename) override; | ||
AudioDecoder::Status Decode(float* buffer) override; | ||
AudioDecoder::Status DecodeInfo(int* samples, int* channels, float* sample_rates) override; | ||
void Release() override; | ||
~GenericAudioDecoder() override; | ||
}; | ||
#endif |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
/* | ||
Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved. | ||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. | ||
*/ | ||
|
||
#pragma once | ||
|
||
#include <string> | ||
#include <thread> | ||
#include <vector> | ||
|
||
#include "loaders/audio/audio_read_and_decode.h" | ||
#include "loaders/circular_buffer.h" | ||
#include "pipeline/commons.h" | ||
#include "meta_data/meta_data_reader.h" | ||
|
||
#ifdef ROCAL_AUDIO | ||
|
||
// AudioLoader runs an internal thread that loads and decodes audio files asynchronously. | |
// It uses a circular buffer to store the decoded audio for the user. | |
class AudioLoader : public LoaderModule { | ||
public: | ||
explicit AudioLoader(void* dev_resources); | ||
~AudioLoader() override; | ||
LoaderModuleStatus load_next() override; | ||
void initialize(ReaderConfig reader_cfg, DecoderConfig decoder_cfg, RocalMemType mem_type, unsigned batch_size, bool keep_orig_size = false) override; | ||
void set_output(Tensor* output_audio) override; | ||
size_t remaining_count() override; // returns number of remaining items to be loaded | ||
void reset() override; // Resets the loader to load from the beginning of the media | ||
Timing timing() override; | ||
void start_loading() override; | ||
LoaderModuleStatus set_cpu_affinity(cpu_set_t cpu_mask); | ||
LoaderModuleStatus set_cpu_sched_policy(struct sched_param sched_policy); | ||
std::vector<std::string> get_id() override; | ||
DecodedDataInfo get_decode_data_info() override; | ||
void set_prefetch_queue_depth(size_t prefetch_queue_depth) override; | ||
void set_gpu_device_id(int device_id); | ||
void shut_down() override; | ||
void feed_external_input(const std::vector<std::string>& input_images_names, const std::vector<unsigned char*>& input_buffer, | ||
const std::vector<ROIxywh>& roi_xywh, unsigned int max_width, unsigned int max_height, unsigned int channels, | ||
ExternalSourceFileMode mode, bool eos) override { THROW("external source feed is not supported in audio loader") } | ||
|
||
private: | ||
bool is_out_of_data(); | ||
void de_init(); | ||
void stop_internal_thread(); | ||
LoaderModuleStatus update_output_audio(); | ||
LoaderModuleStatus load_routine(); | ||
std::shared_ptr<AudioReadAndDecode> _audio_loader; | ||
Tensor* _output_tensor; | ||
std::vector<std::string> _output_names; // audio file name/ids that are stored in the _output_audio | ||
MetaDataBatch* _meta_data = nullptr; // The output of the meta_data_graph | ||
bool _internal_thread_running; | ||
size_t _output_mem_size, _batch_size, _max_decoded_samples, _max_decoded_channels; | ||
std::thread _load_thread; | ||
RocalMemType _mem_type; | ||
DecodedDataInfo _decoded_audio_info; | ||
DecodedDataInfo _output_decoded_audio_info; | ||
CircularBuffer _circ_buff; | ||
TimingDbg _swap_handle_time; | ||
bool _is_initialized; | ||
bool _stopped = false; | ||
bool _loop; // If true the reader will wrap around at the end of the media (files/audios/...) and wouldn't stop | ||
size_t _prefetch_queue_depth = 0; // Used for circular buffer's internal buffer allocation | ||
size_t _audio_counter = 0; // How many audios have been loaded already | ||
size_t _remaining_audio_count; // How many audios are there yet to be loaded | ||
int _device_id; | ||
}; | ||
#endif |
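
The header comment describes an internal thread that decodes ahead of the consumer and hands results over through a circular buffer. The sketch below illustrates that producer/consumer arrangement in general terms only. It does not use rocAL's `CircularBuffer` or `AudioReadAndDecode`: `BoundedQueue` and `LoadAll` are hypothetical names, and the string concatenation merely stands in for reading and decoding a file.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded FIFO standing in for a circular buffer of decoded items.
template <typename T>
class BoundedQueue {
   public:
    explicit BoundedQueue(size_t depth) : _depth(depth) { assert(depth > 0); }

    // Blocks while the queue is full, so the producer stays at most `depth` items ahead.
    void push(T item) {
        std::unique_lock<std::mutex> lock(_mutex);
        _not_full.wait(lock, [this] { return _items.size() < _depth; });
        _items.push_back(std::move(item));
        _not_empty.notify_one();
    }

    // Blocks while the queue is empty, then returns the oldest item.
    T pop() {
        std::unique_lock<std::mutex> lock(_mutex);
        _not_empty.wait(lock, [this] { return !_items.empty(); });
        T item = std::move(_items.front());
        _items.pop_front();
        _not_full.notify_one();
        return item;
    }

   private:
    size_t _depth;
    std::deque<T> _items;
    std::mutex _mutex;
    std::condition_variable _not_full, _not_empty;
};

// Decodes `files` on a background thread and returns them in the order consumed.
std::vector<std::string> LoadAll(const std::vector<std::string>& files, size_t prefetch_depth) {
    BoundedQueue<std::string> queue(prefetch_depth);
    std::thread loader([&] {
        for (const auto& f : files) queue.push("decoded:" + f);  // stand-in for read + decode
    });
    std::vector<std::string> out;
    for (size_t i = 0; i < files.size(); i++) out.push_back(queue.pop());
    loader.join();
    return out;
}
```

In this sketch the queue depth plays a role similar to `_prefetch_queue_depth`: it bounds how far the loading thread can run ahead of the consumer.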