Merge branch 'master' of https://github.com/thorstenMueller/deep-learning-german-tts

2024-11-24 17:03:11 +01:00 · 2021-09-18 16:04:59 +02:00 · 2021-09-18 16:04:59 +02:00 · 2853f111dc
commit 2853f111dc
parent 7540606247 0b9e929ce0
3 changed files with 24 additions and 19 deletions
--- a/README.md
+++ b/README.md
@@ -15,8 +15,9 @@
  - [Emotional dataset download information](#emotional-dataset-download-information)

 - [Pretrained TTS models](#pretrained-tts-models)
-  - [Coqui models](#coqui-models)
-  - [Pre-trained Silero-models](#silero-models)
+  - [Quick setup voice synthesizing with Coqui models](#quick-steps-for-synthesizing-voice)
+  - [Pre-trained Silero-models](#silero)
+  - [ZDisket TensorVox](#zdisket)

 - [Public talks](#public-talks)

@@ -162,37 +163,33 @@ If you trained a model on "Thorsten" dataset please file an issue with some info
 My personal training sessions are based on TTS repo code (originally initiated by Mozilla) and now maintained through https://www.coqui.ai (:frog:)
 ## Coqui models

-### Easy pip install
+### Quick steps for synthesizing voice
 For all "Thorsten" Coqui models I recommend setting up a virtual environment (*venv*).
+> Python 3.6 - 3.9 required
 * mkdir ThorstenVoice
 * cd ThorstenVoice
 * python3 -m venv .
 * source ./bin/activate
-* pip install -U pip
-* pip install -U tts
-* tts --list
-
-> tts-server --model_name tts_models/de/thorsten/tacotron2-DCA
-
-or
-
-> tts-server --model_name tts_models/de/thorsten/tacotron2-DCA --vocoder_name vocoder_models/universal/libri-tts/fullband-melgan 
-
+* pip install -U pip TTS
+* tts-server --model_name tts_models/de/thorsten/tacotron2-DCA
 * Open web-browser on http://localhost:5002

-Check details here: > see: https://github.com/coqui-ai/TTS/releases/tag/v0.0.11
+Details: https://github.com/coqui-ai/TTS/releases/tag/v0.0.11 or https://github.com/coqui-ai/TTS/releases/tag/v0.1.3 

+![Coqui web interface ](./img/CoquiWebThorstenVoice.png)

+Instead of the web frontend, you can call the server from the command line.
+> curl "http://localhost:5002/api/tts?text=TEXT" --output test.wav
 ### Download Coqui trained checkpoints / config
 | Model name | Coqui Repo branch / commit | Release date | Google Drive Download Link |
 |----------------------------------|---------------------------------------------------|----------------------|--------------------------------------------------------------------------------------|
 | Thorsten Tacotron2 DCA           | master / 0ee3eeefb553678d56c49534f3972a426a254649 | 2021-04-02           | [Google Drive Thorsten Taco2 DCA](https://drive.google.com/drive/folders/1m4RuffbvdOmQWnmy_Hmw0cZ_q0hj2o8B?usp=sharing) |
 | Thorsten Vocoder WaveGrad        | master / 0ee3eeefb553678d56c49534f3972a426a254649 | 2021-04-02           | [Google Drive Thorsten Vocoder WaveGrad](https://drive.google.com/drive/folders/1uOWpYH3yoDv5_3Dn_aDbAprEmyk1tDw5?usp=sharing) |
-| Thorsten Vocoder Fullband-MelGAN | master / 0ee3eeefb553678d56c49534f3972a426a254649 | training-in-progress | training-in-progress                                                                 |
+| Thorsten Vocoder Fullband-MelGAN | master / 0ee3eeefb553678d56c49534f3972a426a254649 | 2021-07-26 | [Google Drive Thorsten Vocoder Fullband-MelGAN](https://drive.google.com/drive/folders/1hsfaconm4Yd9wPVyOtrXjWQs4ZAPoouY?usp=sharing) or [Coqui v0.1.3 model download](https://github.com/coqui-ai/TTS/releases/tag/v0.1.3)                                     |
 | Thorsten Vocoder HifiGAN         |                                                   | planned              | planned                                                                              |
 | Thorsten Vocoder WaveRNN         |                                                   | planned              | planned                                                                              |

-## Silero-models
+## Silero

 You can use a free A-GPL licensed models trained on this dataset via the [silero-models](https://github.com/snakers4/silero-models) project. The full list of models including their older version is available via this [yaml file](https://github.com/snakers4/silero-models/blob/master/models.yml).

@@ -201,7 +198,8 @@ You can use a free A-GPL licensed models trained on this dataset via the [silero
 | thorsten_8khz  | m      | de       | [8000](https://drive.google.com/drive/folders/1mpQCK5E_IqhcSurnYuGePJiJWL4ZL08z?usp=sharing) / [16000](https://drive.google.com/drive/folders/1tR6w4kgRS2JJ1TWZhwoFuU04Xkgo6YAs?usp=sharing) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |
 | thorsten_16khz | m      | de       | [8000](https://drive.google.com/drive/folders/1mpQCK5E_IqhcSurnYuGePJiJWL4ZL08z?usp=sharing) / [16000](https://drive.google.com/drive/folders/1tR6w4kgRS2JJ1TWZhwoFuU04Xkgo6YAs?usp=sharing) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |

-
+## ZDisket
+[ZDisket](https://github.com/ZDisket/) made a tool called [TensorVox](https://github.com/ZDisket/TensorVox) that makes it easy to set up a TTS environment on Windows, and included the German TTS model trained by [monatis](https://github.com/monatis/german-tts). Thanks for sharing that :thumbsup:. You can find setup details [here](https://github.com/ZDisket/TensorVox) or see it live in action on [YouTube](https://youtu.be/tY6_xZnkv-A).
 # Public talks
 > I really want to bring the topic "OpenVoice" to a bigger public attention, so i am happy to be invited as a speaker on that.

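Note on the `curl` call added to the README in this diff: the same request can be made from Python. This is a minimal sketch, assuming a `tts-server` is already running on `http://localhost:5002` and that `/api/tts` answers a `text` query parameter with WAV bytes, as the README line suggests. The helper names `build_tts_url` and `synthesize` are hypothetical and not part of Coqui TTS. Percent-encoding the text matters for German input containing spaces or umlauts.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the default address from the README; adjust if the server runs elsewhere.
BASE_URL = "http://localhost:5002"


def build_tts_url(text: str, base_url: str = BASE_URL) -> str:
    """Return the /api/tts URL with the text percent-encoded as a query parameter."""
    return f"{base_url}/api/tts?{urlencode({'text': text})}"


def synthesize(text: str, out_path: str = "test.wav", base_url: str = BASE_URL) -> None:
    """Request speech for `text` and write the response body to `out_path`."""
    # Needs a reachable tts-server; raises urllib.error.URLError otherwise.
    with urlopen(build_tts_url(text, base_url)) as response:
        audio = response.read()
    with open(out_path, "wb") as wav_file:
        wav_file.write(audio)
```

With the server running, `synthesize("Guten Tag")` should write the response to `test.wav`, mirroring the `--output test.wav` option of the `curl` line.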
--- a/helperScripts/Dockerfile.Jetson-Coqui
+++ b/helperScripts/Dockerfile.Jetson-Coqui
@@ -7,7 +7,7 @@ RUN echo "deb https://repo.download.nvidia.com/jetson/common r32.4 main" >> /etc
 RUN echo "deb https://repo.download.nvidia.com/jetson/t194 r32.4 main" >> /etc/apt/sources.list.d/nvidia-l4t-apt-source.list

 RUN apt-get update -y
-RUN apt-get install vim python-mecab libmecab-dev cuda-toolkit-10-2 libcudnn8 libcudnn8-dev libsndfile1-dev -y
+RUN apt-get install vim python-mecab libmecab-dev cuda-toolkit-10-2 libcudnn8 libcudnn8-dev libsndfile1-dev locales -y

 # Setting some environment vars
 ENV LLVM_CONFIG=/usr/bin/llvm-config-9
@@ -20,6 +20,13 @@ ENV NVIDIA_VISIBLE_DEVICES all
 ENV NVIDIA_DRIVER_CAPABILITIES compute,utility
 LABEL com.nvidia.volumes.needed="nvidia_driver"

+# Adjust locale setting to your personal needs
+RUN sed -i '/de_DE.UTF-8/s/^# //g' /etc/locale.gen && \
+    locale-gen
+ENV LANG de_DE.UTF-8
+ENV LANGUAGE de_DE:de
+ENV LC_ALL de_DE.UTF-8
+
 RUN mkdir /coqui
 WORKDIR /coqui

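Note on the locale lines added to the Dockerfile in this diff: the `sed` call removes the leading `# ` from every line of `/etc/locale.gen` that mentions `de_DE.UTF-8`, so that `locale-gen` builds that locale. The sketch below reproduces that text transformation in Python to make it easier to adapt to another locale. The function name `enable_locale` is hypothetical, and the sample content assumes the usual commented-out `locale.gen` layout. Unlike the `sed` pattern, where `.` matches any character, this version matches the locale name literally.

```python
def enable_locale(locale_gen_text: str, locale: str = "de_DE.UTF-8") -> str:
    """Uncomment every line that mentions `locale`, similar to the sed call."""
    result = []
    for line in locale_gen_text.splitlines(keepends=True):
        # Only touch lines naming the requested locale, and only strip a leading "# ".
        if locale in line and line.startswith("# "):
            line = line[2:]
        result.append(line)
    return "".join(result)


# Assumed excerpt of a locale.gen file where all locales are commented out.
sample = "# de_AT.UTF-8 UTF-8\n# de_DE.UTF-8 UTF-8\n# en_US.UTF-8 UTF-8\n"
enabled = enable_locale(sample)
```

To build a different locale in the image, the locale name in the `sed` pattern and the `LANG`, `LANGUAGE`, and `LC_ALL` values would need to change together.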
--- a/img/CoquiWebThorstenVoice.png
+++ b/img/CoquiWebThorstenVoice.png