While figuring out how Bark works, I wrote down my observations in here.
I came up with a few theoretical methods for voice cloning that would work if my observations were correct. They were, and I published the code.
The source code related to "Method 2" can be found here.
The source code for "Method 3" can probably be found in the commit history of audio-webui, though you might need to go quite far back. That method's outputs are far less convincing and much lower quality than Method 2's, so it may be less interesting, but it could still help explain exactly how Bark's three-step process works.
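To make the three-step process concrete, here is a toy sketch of Bark's pipeline shape: text → semantic tokens → coarse EnCodec codebooks → fine codebooks → waveform. The function names, vocabulary sizes, and stub logic below are illustrative placeholders, not Bark's actual API; the real stages are autoregressive transformers, which are stood in for with trivial functions here.

```python
import random

# Illustrative constants (assumptions, not confirmed Bark internals):
SEMANTIC_VOCAB = 10_000   # size of the semantic token vocabulary
CODEBOOK_SIZE = 1024      # EnCodec codebooks have 1024 entries each
N_COARSE, N_FINE = 2, 8   # coarse stage predicts 2 codebooks, fine fills to 8

def text_to_semantic(text: str) -> list[int]:
    """Stage 1 (stub): text -> phoneme-like semantic tokens."""
    rng = random.Random(len(text))
    return [rng.randrange(SEMANTIC_VOCAB) for _ in text.split()]

def semantic_to_coarse(semantic: list[int]) -> list[list[int]]:
    """Stage 2 (stub): semantic tokens -> first two EnCodec codebooks."""
    return [[s % CODEBOOK_SIZE for s in semantic] for _ in range(N_COARSE)]

def coarse_to_fine(coarse: list[list[int]]) -> list[list[int]]:
    """Stage 3 (stub): fill in the remaining codebooks given the coarse ones."""
    length = len(coarse[0])
    rng = random.Random(length)
    fine = [row[:] for row in coarse]
    while len(fine) < N_FINE:
        fine.append([rng.randrange(CODEBOOK_SIZE) for _ in range(length)])
    return fine  # 8 codebooks x sequence length, ready for EnCodec decoding

codes = coarse_to_fine(semantic_to_coarse(text_to_semantic("hello world")))
print(len(codes), len(codes[0]))
```

Voice cloning in this framing amounts to supplying a history prompt (previously extracted semantic and codebook tokens from the target speaker) as context to stages 1 and 2, which is what the methods above exploit.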
IIRC, Bark is an encoder-decoder based on a T5, and uses wav2vec-BERT as the encoder.
I don't recall where I learned that, but I feel like I validated it at some point: wav2vec-BERT uses tokens that represent phonemes, and those are converted to EnCodec codebooks, which produce the final audio.
I'd love to discuss some of our current work on TTS, STT, and STS if you have the bandwidth! My email is [email protected]
More context about our org is at https://AlignmentLab.ai
Thanks!
Austin