
Commit

Merge pull request #49 from JarodMica/version_3
Update to v3
JarodMica authored Jun 10, 2024
2 parents 3672514 + cb8fba7 commit 77f435c
Showing 6 changed files with 18 additions and 274 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -12,4 +12,7 @@ __pycache__/
venv/
audiobooks/
output/
.vscode/
tortoise_api/


5 changes: 4 additions & 1 deletion README.md
@@ -33,6 +33,7 @@ There are two ways to install this, via Package or Manually. If you don't have
- [ ] Highlight sentences for generation later (will need to do some type of edit to the json structure so that even if you close out, they are still highlighted)
- [ ] Find a way to do "multiple speakers" for dialogue in the book (might involve a new tab where users can select sentences to regenerate)
- [ ] Auto sentence regeneration and comparison using whisper (https://github.com/maxbachmann/RapidFuzz/)
+- [ ] Add a toggleable option for using rvc conversion


## Prerequisites:
@@ -85,7 +86,7 @@ venv\Scripts\activate
```
4. Install pytorch using command below (recommended) or get from https://pytorch.org/get-started/locally/:

-```pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117```
+```pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121```

5. Install requirements:

@@ -97,6 +98,8 @@ venv\Scripts\activate

```pip install git+https://github.com/JarodMica/rvc-tts-pipeline.git@lightweight#egg=rvc_tts_pipe```

+```pip install git+https://github.com/JarodMica/tortoise_api.git```
+
6. Download and install ffmpeg: https://ffmpeg.org/download.html
- Place ffmpeg.exe and ffprobe.exe inside of audiobook_maker OR make sure they are in your environment path variable

9 changes: 5 additions & 4 deletions audio_book_app_2_0.py → audio_book_app.py
@@ -30,8 +30,8 @@
script_directory = os.path.dirname(os.path.realpath(__file__))
sys.path.append(script_directory)

-from tortoise_api import Tortoise_API
-from tortoise_api import load_sentences
+from tortoise_api.tortoise_api import load_sentences, load_config, call_api
+
from rvc_pipe.rvc_infer import rvc_convert

class AudioGenerationWorker(QThread):
@@ -72,7 +72,6 @@ def __init__(self):
 
         self.init_ui()
 
-        self.tortoise = Tortoise_API()
 
     def init_ui(self):
         # Main Layout
@@ -785,7 +784,9 @@ def generate_audio_for_sentence_threaded(self, directory_path, progress_callback
progress_callback(progress_percentage)\

     def generate_audio(self, sentence):
-        audio_path = self.tortoise.call_api(sentence)
+        tort_setup = os.path.join(script_dir, "tort.yaml")
+        parameters = load_config(tort_setup)
+        audio_path = call_api(sentence, **parameters)
         selected_voice = self.voice_models_combo.currentText()
         selected_index = self.voice_index_combo.currentText()
         voice_model_path = os.path.join(self.voice_folder_path, selected_voice)
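
The change to `generate_audio` replaces a method call on a long-lived `Tortoise_API` object with two module-level functions: the settings are loaded from `tort.yaml` and then unpacked into `call_api` as keyword arguments. A minimal sketch of that pattern follows. Both functions here are hypothetical stand-ins, not the real `tortoise_api` implementations, and the keys `voice` and `preset` are invented for illustration. Note that the first hunk of this file defines `script_directory`, while the added lines reference `script_dir`; the diff does not show where `script_dir` is defined, so the sketch takes the directory as an argument instead.

```python
import os


def load_config(path):
    # Hypothetical stand-in: the real loader presumably parses the YAML file
    # at `path`. The keys below are invented for illustration.
    return {"voice": "narrator", "preset": "fast"}


def call_api(sentence, **parameters):
    # Hypothetical stand-in: echoes its arguments instead of contacting a TTS
    # server. The real function returns a path to the generated audio.
    return {"text": sentence, **parameters}


def generate_audio(sentence, config_dir):
    # Same three steps as the added lines in the hunk above.
    tort_setup = os.path.join(config_dir, "tort.yaml")
    parameters = load_config(tort_setup)
    # `**parameters` turns each config key into a keyword argument, so this is
    # equivalent to call_api(sentence, voice="narrator", preset="fast").
    return call_api(sentence, **parameters)


result = generate_audio("Hello there.", config_dir=".")
```

One consequence of this design is that the config file is re-read for every sentence, so edits to `tort.yaml` take effect without restarting the app, at the cost of one file read per call.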
4 changes: 4 additions & 0 deletions changelog.md
@@ -1,5 +1,9 @@
# Changelog & thoughts

+# 6/9/2024
+Bug fix for tortoise TTS API call implemented, lots of things in the pipeline need a little refreshing
+- Package version is not done yet.
+
# 10/17/2023
Bug fixes for next patch
- Fixed hardcoded path in lightweight rvc package under configs.py for nvidia cards under 4GB
63 changes: 2 additions & 61 deletions text_test1.txt
@@ -1,61 +1,2 @@
----- Test 1 ----
-This is a simple test. It should work without any issues.
--- Expected Output --
-["This is a simple test.", "It should work without any issues."]
-
----- Test 2 ----
-Although I went to the store, I forgot to buy milk. Next time, I’ll make a list.
--- Expected Output --
-["Although I went to the store, I forgot to buy milk.", "Next time, I’ll make a list."]
-
----- Test 3 ----
-Hello World!! What's happening?? #excited.
--- Expected Output --
-["Hello World!!", "What's happening??", "#excited."]
-
----- Test 4 ----
-This is a weird case.. It happens sometimes..
--- Expected Output --
-["This is a weird case.", "It happens sometimes."]
-
----- Test 5 ----
-I went to the store, bought milk. Then, went to the park, enjoyed the day.
--- Expected Output --
-["I went to the store, bought milk.", "Then, went to the park, enjoyed the day."]
-
----- Test 6 ----
-
--- Expected Output --
-[]
-
----- Test 7 ----
-###!!!
--- Expected Output --
-[]
-
----- Test 8 ----
-This is a test.
-
-....?????##
-$$%^#$@
-!@#$!@%%
-@@@
-!!
-...
-....////\\][[]]
-
-It should return two sentences.
--- Expected Output --
-["This is a test.", "It should return two sentences."]
-
----- Test 9 ----
-Although I went to the store,
-I forgot to buy milk.
-Next time, I’ll make a list.
--- Expected Output --
-["Although I went to the store, I forgot to buy milk.", "Next time, I’ll make a list."]
-
----- Test 10 ----
-Is this real?? Or #fantasy... Caught in a landslide, no escape...
--- Expected Output --
-["Is this real??", "Or #fantasy.", "Caught in a landslide, no escape..."]
+These are the 5 BEST open source text to speech softwares that I've come across over the past year.
+This here is just a quick sample of my voice with a british accent, and this is how I actually sound.
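
The fixtures removed from `text_test1.txt` follow a regular layout: a `---- Test N ----` header, the input text, an `-- Expected Output --` marker, and a JSON-style list of sentences. If they are ever needed again, a small parser can recover them as (input, expected) pairs. The sketch below is one possible approach and is not code from this repository; `parse_cases` is a hypothetical helper name, and it assumes each expected list is valid JSON, which holds for the ten cases shown.

```python
import json
import re

CASE_HEADER = re.compile(r"^---- Test \d+ ----$", re.MULTILINE)
EXPECTED_MARKER = "-- Expected Output --"


def parse_cases(text):
    """Return a list of (input_text, expected_sentences) pairs."""
    cases = []
    # Splitting on the headers leaves one chunk per case; the first element
    # is whatever precedes the first header, so it is skipped.
    for chunk in CASE_HEADER.split(text)[1:]:
        raw_input, _, raw_expected = chunk.partition(EXPECTED_MARKER)
        # json.loads tolerates the newlines around the list; it raises if the
        # marker is missing, because raw_expected is then an empty string.
        cases.append((raw_input.strip("\n"), json.loads(raw_expected)))
    return cases


sample = (
    "---- Test 1 ----\n"
    "This is a simple test. It should work without any issues.\n"
    "-- Expected Output --\n"
    '["This is a simple test.", "It should work without any issues."]\n'
    "\n"
    "---- Test 6 ----\n"
    "\n"
    "-- Expected Output --\n"
    "[]\n"
)
cases = parse_cases(sample)
```

Each pair could then be fed to whatever sentence splitter is under test and compared against the expected list.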
208 changes: 0 additions & 208 deletions tortoise_api.py

This file was deleted.
