❓ Questions

I'm curious about using EnCodec to interpolate between audio clips in latent space. However, model.encode(inputs["input_values"], inputs["padding_mask"]) returns discrete integer codes and not a continuous vector representation. Is interpolation possible?
My understanding of the code is that embedding creation and quantization happen together inside model.encode(), which makes interpolation challenging. I reimplemented encoding and decoding with embedding generation and quantization broken out as separate steps in my repository at https://github.com/jhurliman/music-interpolation. This bypasses quantization entirely, so it is effectively just the SEANet encoder-decoder, but it can still load the pre-trained "facebook/encodec_*khz" models.
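The separated pipeline described above can be sketched as follows. This assumes a `transformers`-style `EncodecModel` whose SEANet encoder and decoder are exposed as `model.encoder` and `model.decoder`; it is an illustrative outline, not the linked repository's code, and both clips must be the same length so their latent frames align.

```python
import torch

def lerp(z_a: torch.Tensor, z_b: torch.Tensor, alpha: float) -> torch.Tensor:
    """Linear interpolation in latent space: alpha=0 gives z_a, alpha=1 gives z_b."""
    return (1.0 - alpha) * z_a + alpha * z_b

def interpolate_clips(model, wav_a: torch.Tensor, wav_b: torch.Tensor,
                      alpha: float = 0.5) -> torch.Tensor:
    """Encode two equal-length clips of shape (batch, channels, samples) to
    continuous SEANet embeddings, blend them, and decode -- skipping the
    quantizer entirely."""
    with torch.no_grad():
        z_a = model.encoder(wav_a)   # continuous latents, not discrete codes
        z_b = model.encoder(wav_b)
        z = lerp(z_a, z_b, alpha)
        return model.decoder(z)      # waveform reconstructed from blended latents
```

Spherical interpolation (slerp) is a common alternative when blending latents, but the straight lerp above is the simplest starting point.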