Merge pull request #49 from JarodMica/version_3

Update to v3
JarodMica · Jun 10, 2024 · 77f435c
2 parents 3672514 + cb8fba7
Showing 6 changed files with 18 additions and 274 deletions.
diff --git a/.gitignore b/.gitignore
@@ -12,4 +12,7 @@ __pycache__/
 venv/
 audiobooks/
 output/
+.vscode/
+tortoise_api/
+
 
diff --git a/README.md b/README.md
@@ -33,6 +33,7 @@ There are two ways to install this, via Package or Manually.  If you don't have
     - [ ] Highlight sentences for generation later (will need to do some type of edit to the json structure so that even if you close out, they are still highlighted)
     - [ ] Find a way to do "multiple speakers" for dialogue in the book (might involve a new tab where users can select sentences to regenerate)
     - [ ] Auto sentence regeneration and comparison using whisper (https://github.com/maxbachmann/RapidFuzz/) 
+    - [ ] Add a toggleable option for using rvc conversion
 
 
 ## Prerequisites:
@@ -85,7 +86,7 @@ venv\Scripts\activate
 ```
 4. Install pytorch using command below (recommended) or get from https://pytorch.org/get-started/locally/:
 
-```pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117```
+```pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121```
 
 5. Install requirements:
 
@@ -97,6 +98,8 @@ venv\Scripts\activate
 
 ```pip install git+https://github.com/JarodMica/rvc-tts-pipeline.git@lightweight#egg=rvc_tts_pipe```
 
+```pip install git+https://github.com/JarodMica/tortoise_api.git```
+
 6. Download and install ffmpeg: https://ffmpeg.org/download.html
     - Place ffmpeg.exe and ffprobe.exe inside of audiobook_maker OR make sure they are in your environment path variable
 

diff --git a/audio_book_app_2_0.py b/audio_book_app.py
@@ -30,8 +30,8 @@
 script_directory = os.path.dirname(os.path.realpath(__file__))
 sys.path.append(script_directory)
 
-from tortoise_api import Tortoise_API
-from tortoise_api import load_sentences
+from tortoise_api.tortoise_api import load_sentences, load_config, call_api
+
 from rvc_pipe.rvc_infer import rvc_convert
 
 class AudioGenerationWorker(QThread):
@@ -72,7 +72,6 @@ def __init__(self):
 
         self.init_ui()
 
-        self.tortoise = Tortoise_API()
 
     def init_ui(self):
         # Main Layout
@@ -785,7 +784,9 @@ def generate_audio_for_sentence_threaded(self, directory_path, progress_callback
             progress_callback(progress_percentage)\
 
     def generate_audio(self, sentence):
-        audio_path = self.tortoise.call_api(sentence)
+        tort_setup = os.path.join(script_directory, "tort.yaml")
+        parameters = load_config(tort_setup)
+        audio_path = call_api(sentence, **parameters)
         selected_voice = self.voice_models_combo.currentText()
         selected_index = self.voice_index_combo.currentText()
         voice_model_path = os.path.join(self.voice_folder_path, selected_voice)
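
A minimal sketch of the pattern this hunk moves to: read keyword arguments from a config file next to the script, then unpack them into a module-level call. `load_config_sketch` and `call_api_sketch` are hypothetical stand-ins, not the real `tortoise_api` functions, and the `voice` and `preset` keys used with them are illustrative assumptions. The stand-in parser handles only flat `key: value` lines rather than full YAML.

```python
import os


def load_config_sketch(path):
    """Hypothetical stand-in for load_config: reads flat 'key: value' lines only."""
    parameters = {}
    with open(path, encoding="utf-8") as config_file:
        for raw_line in config_file:
            line = raw_line.strip()
            # Skip blanks, comment lines, and anything that is not a key/value pair.
            if not line or line.startswith("#") or ":" not in line:
                continue
            key, value = line.split(":", 1)
            parameters[key.strip()] = value.strip()
    return parameters


def call_api_sketch(sentence, **parameters):
    """Hypothetical stand-in for call_api: returns the payload it would send."""
    return {"text": sentence, **parameters}


def generate_audio_sketch(sentence, script_directory):
    # Resolve the config relative to the script, as the hunk does with tort.yaml.
    tort_setup = os.path.join(script_directory, "tort.yaml")
    parameters = load_config_sketch(tort_setup)
    # Each config key becomes a keyword argument of the call.
    return call_api_sketch(sentence, **parameters)
```

Because every config key is unpacked as a keyword argument, a key named `sentence` would collide with the positional argument and raise a `TypeError`. The config is also re-read on every call, which keeps edits to the file live at the cost of one file read per sentence.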

diff --git a/changelog.md b/changelog.md
@@ -1,5 +1,9 @@
 # Changelog & thoughts
 
+# 6/9/2024
+Implemented a bug fix for the Tortoise TTS API call. Many parts of the pipeline still need a little refreshing.
+- Package version is not done yet.
+
 # 10/17/2023
 Bug fixes for next patch
 - Fixed hardcoded path in lightweight rvc package under configs.py for nvidia cards under 4GB

diff --git a/text_test1.txt b/text_test1.txt
@@ -1,61 +1,2 @@
----- Test 1 ----
-This is a simple test. It should work without any issues.
--- Expected Output --
-["This is a simple test.", "It should work without any issues."]
-
----- Test 2 ----
-Although I went to the store, I forgot to buy milk. Next time, I’ll make a list.
--- Expected Output --
-["Although I went to the store, I forgot to buy milk.", "Next time, I’ll make a list."]
-
----- Test 3 ----
-Hello World!! What's happening?? #excited.
--- Expected Output --
-["Hello World!!", "What's happening??", "#excited."]
-
----- Test 4 ----
-This is a weird case.. It happens sometimes..
--- Expected Output --
-["This is a weird case.", "It happens sometimes."]
-
----- Test 5 ----
-I went to the store, bought milk. Then, went to the park, enjoyed the day.
--- Expected Output --
-["I went to the store, bought milk.", "Then, went to the park, enjoyed the day."]
-
----- Test 6 ----
-
--- Expected Output --
-[]
-
----- Test 7 ----
-###!!!
--- Expected Output --
-[]
-
----- Test 8 ----
-    This is a test.
-
-....?????##
-$$%^#$@
-!@#$!@%%
-@@@
-!!
-...
-....////\\][[]]
-
-It should return two sentences.   
--- Expected Output --
-["This is a test.", "It should return two sentences."]
-
----- Test 9 ----
-Although I went to the store,
-I forgot to buy milk.
-Next time, I’ll make a list.
--- Expected Output --
-["Although I went to the store, I forgot to buy milk.", "Next time, I’ll make a list."]
-
----- Test 10 ----
-Is this real?? Or #fantasy... Caught in a landslide, no escape...
--- Expected Output --
-["Is this real??", "Or #fantasy.", "Caught in a landslide, no escape..."]
+These are the 5 BEST open-source text-to-speech programs that I've come across over the past year.
+This here is just a quick sample of my voice with a British accent, and this is how I actually sound.
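
The fixture removed in this hunk pairs each `--- Test N ---` input with a JSON list under `-- Expected Output --`. A minimal sketch of a parser for that layout follows, in case the cases are reused as regression tests for `load_sentences`. `parse_fixture` is a hypothetical helper, not part of this repository, and it assumes the diff's leading `-` markers have already been stripped.

```python
import json
import re

TEST_HEADER = re.compile(r"^--- Test (\d+) ---$")
EXPECTED_HEADER = "-- Expected Output --"


def parse_fixture(text):
    """Return a list of (test_id, input_text, expected_sentences) tuples."""
    cases = []
    current = None
    section = None
    for line in text.splitlines():
        header = TEST_HEADER.match(line.strip())
        if header:
            # A new test header starts a fresh case in "input" mode.
            current = {"id": int(header.group(1)), "input_lines": [], "expected": None}
            cases.append(current)
            section = "input"
        elif line.strip() == EXPECTED_HEADER and current is not None:
            section = "expected"
        elif section == "input":
            # Keep input lines verbatim, including blank ones.
            current["input_lines"].append(line)
        elif section == "expected" and line.strip() and current["expected"] is None:
            # The first non-blank line after the marker is the JSON list.
            current["expected"] = json.loads(line)
    return [(c["id"], "\n".join(c["input_lines"]), c["expected"]) for c in cases]
```

Each tuple can then be compared against whatever the sentence splitter returns for `input_text`.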
diff --git a/tortoise_api.py b/tortoise_api.py