I am currently working on a project built in Unity, where I modulate voices (e.g. source speech → voice modulator → target speech (elf)). I have an E2E flow working with the TFLite models, but a decent amount of noise is being added to the generated speech. It sounds almost like clipping. I'm using the TFLite models from the repo, and I have split the quantizer into a QuantizerEncoder & QuantizerDecoder. I'm not sure whether a better solution would be to build Lyra as a DLL and run that in Unity instead of using the models directly, but this is what I have so far.
E2E Flow
Load a wav file via librosa
Pad the data to meet data_length % 320 == 0
Feed the data through the 4 models: Encoder, QuantizerEncoder, QuantizerDecoder, & Decoder
Store the waveform data as I go as:
One singular array of data
A series of audio clips
Save the singular array of waveform data as a wav file
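For reference, the padding and frame-splitting steps above can be sketched roughly as follows. This is a minimal illustration, not the actual project code: it uses plain Python lists instead of the array librosa returns, and `process_frame` is a hypothetical stand-in for the Encoder → QuantizerEncoder → QuantizerDecoder → Decoder chain. The only value taken from the flow above is the frame size of 320.

```python
FRAME_SIZE = 320  # samples per frame, from the `data_length % 320 == 0` step


def pad_to_frame_multiple(samples, frame_size=FRAME_SIZE):
    """Return a copy of `samples`, zero-padded to a multiple of frame_size."""
    remainder = len(samples) % frame_size
    if remainder == 0:
        return list(samples)
    return list(samples) + [0.0] * (frame_size - remainder)


def split_into_frames(samples, frame_size=FRAME_SIZE):
    """Split already-padded samples into consecutive fixed-size frames."""
    if len(samples) % frame_size != 0:
        raise ValueError("length must be a multiple of frame_size")
    return [samples[i:i + frame_size]
            for i in range(0, len(samples), frame_size)]


def run_pipeline(samples, process_frame):
    """Run every frame through `process_frame` (hypothetical model chain).

    Returns both storage forms from the flow above: one combined
    array and the list of per-frame clips.
    """
    padded = pad_to_frame_multiple(samples)
    clips = [process_frame(frame) for frame in split_into_frames(padded)]
    combined = [sample for clip in clips for sample in clip]
    return combined, clips
```

With an identity function in place of the models, for example `run_pipeline(samples, lambda frame: frame)`, the combined output should equal the padded input, which is a quick way to rule out the framing and concatenation code as the source of the noise.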
Is the better solution to create a DLL and run that in Unity?
Do the models encompass all of the pre- and post-processing that the cpp implementation performs to produce a clean output signal (e.g. Integration Test Example)?
Have I made an error in my implementation? I haven't been able to find a Python implementation that runs the data and tests, so this is what I've come up with so far.
Is the noise possibly due to the fact that I am concatenating all of the data to test, rather than playing each clip back iteratively? I attempted to play back the output iteratively as plain waveform data in a Jupyter Notebook, rather than from stored wav files, but no sound was produced.
@shlomiez The output data was being prefixed with values that were really high or low. I don't remember the exact cause of the issue, but I do remember the fix was in how we were staging and creating the DLL. One of the methods we added to the DLL was incorrectly prefixing those out-of-range data points to the data.
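A check along these lines might help spot that kind of out-of-range prefix before the data is written out. It is only a sketch under assumptions: it expects float samples nominally in [-1.0, 1.0] (the range librosa produces when loading audio), and both function names are made up for illustration. They are not part of Lyra or the DLL.

```python
def find_out_of_range(samples, limit=1.0):
    """Return the indices of samples whose magnitude exceeds `limit`."""
    return [i for i, s in enumerate(samples) if abs(s) > limit]


def to_int16(samples):
    """Clamp float samples to [-1.0, 1.0] and scale them to 16-bit PCM.

    Clamping first avoids integer wrap-around when an out-of-range
    float is converted to a 16-bit value.
    """
    pcm = []
    for s in samples:
        clamped = max(-1.0, min(1.0, s))
        pcm.append(int(round(clamped * 32767)))
    return pcm
```

If `find_out_of_range` reports indices clustered at the very start of the output, that would match the prefix problem described above. Clamping only hides the symptom, so the stray values are still worth tracing back to their source.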
Resources
Sound samples.zip