Because if we just use the same audio as conditioning, the model overfits to that data.
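A minimal sketch of the point above, assuming a hypothetical dataset given as a list of `(speaker_id, clip_path)` pairs. The function name `pick_conditioning_clip` and the fallback for speakers with only one clip are illustrative assumptions, not something taken from the paper or the repository. The idea is just that the conditioning clip comes from the same speaker but is not the clip being trained on, so the model cannot simply copy the target audio:

```python
import random


def pick_conditioning_clip(clips, target_index, rng=random):
    """Return the index of a conditioning clip for the clip at target_index.

    clips is a list of (speaker_id, clip_path) pairs (hypothetical layout).
    The result is a different clip by the same speaker when one exists.
    """
    speaker = clips[target_index][0]
    # Every other clip spoken by the same person.
    candidates = [
        i for i, (spk, _) in enumerate(clips)
        if spk == speaker and i != target_index
    ]
    if not candidates:
        # Assumption: fall back to the target itself if the speaker
        # has no other clip. A real pipeline might skip the sample instead.
        return target_index
    return rng.choice(candidates)


clips = [("alice", "a1.wav"), ("alice", "a2.wav"), ("bob", "b1.wav")]
cond_index = pick_conditioning_clip(clips, 0)
```

With only two clips for `alice`, the conditioning clip for index 0 can only be index 1. `bob` has a single clip, so the sketch falls back to that same clip.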
In the paper:
Why use two encodings (one of the sample itself, the other of another clip of the same person speaking)? Why not use a single encoding of the current sample?