
![Thorsten - Open German Voice Dataset](./img/ThorstenVoice_Logo_Small.png "Thorsten - Open German Voice Dataset")

- [Introduction to "Thorsten-Voice" :speaking_head: :speech_balloon: :sloth:](#introduction-to-thorsten-voice-speaking_head-speech_balloon-sloth)
  
- [A personal note](#please-read-some-personal-words-before-using-dataset--tts-models)

- [Voice "Thorsten" (neutral)](#dataset-thorsten-neutral)
  - [Samples of my neutral voice](#samples-of-my-neutral-voice)
  - [Dataset information :microphone:](#dataset-information-microphone)
  - [Dataset evolution / changelog](#dataset-evolution)
  - [Download information](#neutral-dataset-download-information)

- [Voice "Thorsten" (emotional)](#dataset-thorsten-emotional)
  - [Samples of my emotional voice](#samples-of-my-emotional-voice)
  - [Emotional dataset information :microphone:](#emotional-dataset-information-microphone)
  - [Emotional dataset download information](#emotional-dataset-download-information)

- [Pretrained TTS models](#pretrained-tts-models)
  - [Coqui models](#coqui-models)
  - [Pre-trained Silero-models](#silero-models)

- [Feel free to file an issue if you ...](#feel-free-to-file-an-issue-if-you-)
- [Recommended projects / communities](#recommended-projects)
- [Special thanks](#special-thanks)
- [Additional links](#additional-links)


# Introduction to "Thorsten-Voice" :speaking_head: :speech_balloon: :sloth:
## **A free-to-use, offline-capable, high-quality German TTS voice should be available for every project without any license hassle.**

Speaking tech devices and voice-based smart assistants are very popular nowadays. But to provide nice-sounding TTS, a lot of projects depend on big tech cloud services for synthesizing voice. While the quality is quite good, critical aspects remain, such as **privacy concerns** and **missing offline availability**.

## True, but what is this all about?
> I want to (*hopefully*) fill that German TTS gap and make the most personal contribution I can.<br>
**I contribute my personal voice!** :green_heart:

## This contribution is split into three parts:
* "Thorsten" **neutral** dataset
* "Thorsten" **emotional** dataset
* Pretrained TTS models based on "Thorsten" dataset

# Please read some personal words before using dataset / TTS models
> I contribute my voice as a person who believes in a world where all people are equal, regardless of gender, sexual orientation, religion, skin color, or the geocoordinates of their birthplace. A global world where everybody is warmly welcome anywhere on this planet, and where open and free knowledge and education are available to everyone. :earth_africa:

**So hopefully my voice is used in this manner to make this world a better place for all of us :smiley:.**

**tl;dr** Please don't use it for evil!

# Datasets


> For both datasets, please keep in mind that **I am not a professional voice talent**. I'm just a normal guy sharing his voice with you.

## Dataset "Thorsten" neutral
### Samples of my neutral voice
To give you an impression of how my voice sounds, and to help you decide whether it fits your project, I have published some sample recordings, so there is no need to download the complete dataset first.

* [Das Teilen eines Benutzerkontos ist strengstens untersagt.](./samples/original_recording/recorded_sample_01.wav )
* [Der Prophet spricht stets in Gleichnissen.](./samples/original_recording/recorded_sample_02.wav )
* [Bitte schmeißt euren Müll nicht einfach in die Walachei.](./samples/original_recording/recorded_sample_03.wav )
* [So etwas würde mir nie in den Sinn kommen.](./samples/original_recording/recorded_sample_04.wav )
* [Sie klettert auf einen Stein und nimmt eine Denkerpose ein.](./samples/original_recording/recorded_sample_05.wav )
* [Jede gute Küchenwaage hat eine Tara-Funktion.](./samples/original_recording/recorded_sample_06.wav )
* [Jeden Gedanken kannst du hier loswerden.](./samples/original_recording/recorded_sample_07.wav )

### Dataset information :microphone:

* LJSpeech-1.1 structure
* 22,668 recorded phrases (wav files)
* more than 23 hours of pure audio
* sample rate: 22,050 Hz
* mono
* normalized to -24 dB
* phrase length (min/avg/max): 2 / 52 / 180 chars
* no silence at beginning/end
* avg spoken chars per second: 14
* sentences with question mark: 2,780
* sentences with exclamation mark: 1,840

![text length vs. mean audio duration](./img/thorsten-de---datasetAnalysis1.png)
![text length vs. median audio duration](./img/thorsten-de---datasetAnalysis2.png)
![text length vs. STD](./img/thorsten-de---datasetAnalysis3.png)
![text length vs. number instances](./img/thorsten-de---datasetAnalysis4.png)
![signal noise ratio](./img/thorsten-de---datasetAnalysis5.png)
![bokeh](./img/thorsten-de---datasetAnalysis6.png)
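The phrase count and the min/avg/max phrase length listed above can be recomputed from the metadata file. The following is a minimal sketch, assuming the LJSpeech convention of a pipe-delimited `metadata.csv` without a header, with the file ID in the first column and the spoken text in the second. The function name and the file IDs in the sample are made up for illustration.

```python
import csv
import io


def phrase_length_stats(metadata_file):
    """Return (count, min, avg, max) of phrase lengths in characters.

    Assumes LJSpeech-style rows: file_id|text[|normalized text]
    """
    reader = csv.reader(metadata_file, delimiter="|", quoting=csv.QUOTE_NONE)
    lengths = [len(row[1]) for row in reader if len(row) >= 2]
    if not lengths:
        raise ValueError("no phrases found in metadata")
    return len(lengths), min(lengths), sum(lengths) / len(lengths), max(lengths)


# Tiny in-memory stand-in for metadata.csv (the file IDs are hypothetical).
sample = io.StringIO(
    "sample_0001|Der Prophet spricht stets in Gleichnissen.\n"
    "sample_0002|Jeden Gedanken kannst du hier loswerden.\n"
)
count, shortest, average, longest = phrase_length_stats(sample)
print(count, shortest, round(average, 1), longest)
```

For the real dataset, pass an open file instead of the in-memory sample, for example `open("metadata.csv", encoding="utf-8", newline="")`.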

### Dataset evolution
As described in the PDF document ([evolution of thorsten dataset](./EvolutionOfThorstenDataset.pdf)), this dataset consists of three recording phases.

* **phase1**: Recorded with a cheap USB microphone
* **phase2**: Recorded with a good microphone
* **phase3**: Recorded with the same good microphone but longer phrases (> 100 chars)

If you want to use just a subset of the dataset (phase1 and/or phase2 and/or phase3), you can see which files belong to which recording phase in the [recording quality](./RecordingQuality.csv) CSV file.
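Selecting a subset by recording phase could look roughly like the sketch below. The column names `file` and `phase`, the comma delimiter, and the phase labels are assumptions made for illustration only; please check the actual header of `RecordingQuality.csv` and adjust the parameters accordingly.

```python
import csv
import io


def files_for_phases(quality_csv, wanted_phases, file_col="file", phase_col="phase"):
    """Return the file names whose recording phase is in wanted_phases.

    file_col and phase_col are hypothetical column names; adapt them
    to the real header of RecordingQuality.csv.
    """
    reader = csv.DictReader(quality_csv)
    return [row[file_col] for row in reader if row[phase_col] in wanted_phases]


# Made-up sample data standing in for RecordingQuality.csv.
sample = io.StringIO(
    "file,phase\n"
    "a.wav,phase1\n"
    "b.wav,phase2\n"
    "c.wav,phase3\n"
)
print(files_for_phases(sample, {"phase2", "phase3"}))
```

The resulting list can then be used to filter the rows of the metadata file before training.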


### Neutral dataset download information
> Download size: 2.7 GB

| Version         | Description                                                                                       | Date       | Link                                                                                                            |
| --------------- | ------------------------------------------------------------------------------------------------- | ---------- | --------------------------------------------------------------------------------------------------------------- |
| thorsten-de-v01 | Initial version                                                                                   | 2020-06-28 | [Google Drive Download v01](https://drive.google.com/file/d/1yKJM1LAOQpRVojKunD9r8WN_p5KzBxjc/view?usp=sharing) |
| thorsten-de-v02 | Normalized to -24dB and split metadata.csv into shuffled metadata_train.csv and metadata_val.csv  | 2020-08-22 | [Google Drive Download v02](https://drive.google.com/file/d/1mGWfG0s2V2TEg-AI2m85tze1m4pyeM7b/view?usp=sharing) |
| thorsten-de-v03 | Based on v02 dataset, but with increased speed by 10% (using ffmpeg atempo=1.1).                  | 2021-02-10 | [Google Drive Download v03](https://drive.google.com/file/d/134_UramfCRoAxRrOnhbPJ2YHHTwxRtr-/view?usp=sharing) |
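The v03 row mentions ffmpeg's `atempo=1.1` filter, which changes the tempo without shifting the pitch. The sketch below builds one plausible ffmpeg invocation for a single file. It is not necessarily the exact command used to create v03, and the helper name and the paths are hypothetical.

```python
import shlex


def atempo_command(src, dst, tempo=1.1):
    """Build an ffmpeg command that writes src to dst at the given tempo."""
    return ["ffmpeg", "-i", src, "-filter:a", f"atempo={tempo}", dst]


# Print the command instead of running it, so this works without ffmpeg installed.
command = atempo_command("wavs/in.wav", "wavs_fast/in.wav")
print(shlex.join(command))
```

To actually convert a file, pass the list to `subprocess.run(command, check=True)` on a machine where ffmpeg is installed.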


## Dataset "Thorsten" emotional
### Samples of my emotional voice
*By the way, I did mention that I'm no professional voice talent, didn't I?*
> "Mist, wieder nichts geschafft."
* [neutral](./samples/emotional_recording/neutral.wav)
* [disgusted](./samples/emotional_recording/disgusted.wav)
* [angry](./samples/emotional_recording/angry.wav)
* [amused](./samples/emotional_recording/amused.wav)
* [surprised](./samples/emotional_recording/surprised.wav)
* [sleepy](./samples/emotional_recording/sleepy.wav)

### Emotional dataset information :microphone:
* 300 sentences * 6 emotions = 1,800 recordings
* recorded by Thorsten Müller (optimized by Dominik Kreutz)
* mono
* sample rate: 22,050 Hz
* normalized to -24 dB
* no silence at beginning/end
* sentence length: 59 - 148 chars

| Emotion   | Minutes |
|-----------|---------|
| Normal :slightly_smiling_face:    | 19 min. |
| Disgusted :nauseated_face: | 23 min. |
| Angry :angry:    | 20 min. |
| Amused :grinning:    | 18 min. |
| Surprised :astonished: | 18 min. |
| Sleepy :pensive:    | 30 min. |
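The table can be turned into a few derived numbers with simple arithmetic. This is only a rough sketch: the per-emotion minutes are the rounded values from the table, so the averages are approximations, and the variable names are chosen for illustration.

```python
# Rounded minutes per emotion, copied from the table above.
minutes_per_emotion = {
    "Normal": 19,
    "Disgusted": 23,
    "Angry": 20,
    "Amused": 18,
    "Surprised": 18,
    "Sleepy": 30,
}
RECORDINGS_PER_EMOTION = 300  # 300 sentences recorded once per emotion

total_minutes = sum(minutes_per_emotion.values())
total_recordings = RECORDINGS_PER_EMOTION * len(minutes_per_emotion)
print(f"total: {total_minutes} min over {total_recordings} recordings")

# Approximate average clip length per emotion, in seconds.
for emotion, minutes in minutes_per_emotion.items():
    seconds = minutes * 60 / RECORDINGS_PER_EMOTION
    print(f"{emotion}: about {seconds:.1f} s per recording")
```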

### Emotional dataset download information
> Download size: 300MB

| Version         | Description                                                                                       | Date       | Link                                                                                                            |
| --------------- | ------------------------------------------------------------------------------------------------- | ---------- | --------------------------------------------------------------------------------------------------------------- |
| thorsten-de-emotional-v01 | Initial version                                                                                   | 2021-04-03 | [Google Drive Download v01](https://drive.google.com/file/d/1fm2IqXMLr6jaZCgG_Mt4vq_O3ZubiIQ6/view?usp=sharing) |


# Pretrained TTS models
If you have trained a model on the "Thorsten" dataset, please file an issue with some information about it. Sharing a trained model is highly appreciated.

My own training runs are based on the TTS repo code, originally initiated by Mozilla and now maintained by https://www.coqui.ai (:frog:).

## Coqui models
For all "Thorsten" Coqui models I recommend setting up a virtual environment (*venv*):
* `mkdir ThorstenVoice`
* `cd ThorstenVoice`
* `python3 -m venv .`
* `source ./bin/activate`
* `pip install -U pip`
* `pip install -U tts`
* *start the Coqui server with one of the following model combinations*
* open a web browser at http://localhost:5002
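
Besides the browser, a running server can also be queried from a script. The sketch below assumes that the server answers `GET /api/tts?text=...` with WAV data, which is how the Coqui demo server behaved at the time of writing as far as I know; please verify this against your installed version. The function names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def tts_url(text, base_url="http://localhost:5002"):
    """Build the request URL (the /api/tts endpoint is an assumption)."""
    return f"{base_url}/api/tts?{urlencode({'text': text})}"


def synthesize(text, out_path, base_url="http://localhost:5002"):
    """Fetch synthesized audio from a running tts-server and save it to out_path."""
    with urlopen(tts_url(text, base_url)) as response, open(out_path, "wb") as out_file:
        out_file.write(response.read())


# Only build and show the URL here; no server is contacted.
print(tts_url("Hallo Welt"))
```

With a server running, `synthesize("Hallo Welt", "hallo.wav")` would write the returned audio to `hallo.wav`.
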
### Tacotron2 + DCA (DynamicConvolution Attention) & WaveGrad vocoder
> Using the option "use_cuda=true" is recommended for a better real-time factor (RTF).<br>
> RTF (CPU): around 25x realtime<br>
> RTF (GPU): around 4x realtime

* `tts-server --model_name tts_models/de/thorsten/tacotron2-DCA`

See: https://github.com/coqui-ai/TTS/releases/tag/v0.0.11

### Tacotron2 + DCA (DynamicConvolution Attention) & Fullband-MelGAN (universal) vocoder
> RTF is less than 0.5 realtime

* `tts-server --model_name tts_models/de/thorsten/tacotron2-DCA --vocoder_name vocoder_models/universal/libri-tts/fullband-melgan`
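
The RTF values quoted for the two model combinations can be read with the usual definition of the real-time factor: synthesis time divided by the duration of the generated audio. That this is the definition behind the numbers above is an assumption. A minimal sketch:

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """Return synthesis time divided by the duration of the generated audio.

    Values below 1.0 mean the audio is generated faster than it plays back.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


# A 4-second clip that takes 100 seconds to synthesize has an RTF of 25.
print(real_time_factor(100.0, 4.0))
# The same clip synthesized in 2 seconds has an RTF of 0.5.
print(real_time_factor(2.0, 4.0))
```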

## Silero-models

You can use free AGPL-licensed models trained on this dataset via the [silero-models](https://github.com/snakers4/silero-models) project. The full list of models, including older versions, is available in this [yaml file](https://github.com/snakers4/silero-models/blob/master/models.yml).

| Speaker        | Gender | Language | Examples                                                                                                                                                                                     | Colab                                                                                                                                                                        |
| -------------- | ------ | -------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| thorsten_8khz  | m      | de       | [8000](https://drive.google.com/drive/folders/1mpQCK5E_IqhcSurnYuGePJiJWL4ZL08z?usp=sharing) / [16000](https://drive.google.com/drive/folders/1tR6w4kgRS2JJ1TWZhwoFuU04Xkgo6YAs?usp=sharing) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |
| thorsten_16khz | m      | de       | [8000](https://drive.google.com/drive/folders/1mpQCK5E_IqhcSurnYuGePJiJWL4ZL08z?usp=sharing) / [16000](https://drive.google.com/drive/folders/1tR6w4kgRS2JJ1TWZhwoFuU04Xkgo6YAs?usp=sharing) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_tts.ipynb) |


# Feel free to file an issue if you ...
* have improvements for the dataset
* use my TTS voice in your project(s)
* want to share your trained "Thorsten" model
* become aware of any abuse of my voice

# Recommended projects
* https://mycroft.ai/ (*for building an open-source, privacy-friendly voice assistant*)
* https://www.mozilla.org (*for initiating voice projects for STT and TTS*)
* https://coqui.ai/ (*for keeping voice projects running*)
* https://github.com/coqui-ai/TTS
* https://github.com/TensorSpeech/TensorFlowTTS
* https://github.com/rhasspy/de_larynx-thorsten

# Special thanks
I want to thank all open source communities for providing great projects.

Just to name some nice guys who joined me on this TTS roadtrip:

* eltocino (https://github.com/el-tocino/)
* erogol (https://github.com/erogol/)
* gras64 (https://github.com/gras64/)
* krisgesling (https://github.com/krisgesling/)
* nmstoker (https://github.com/nmstoker)
* othiele (https://discourse.mozilla.org/u/othiele/summary)
* repodiac (https://github.com/repodiac)
* SanjaESC (https://github.com/SanjaESC)

Additionally, a big thank you to my dear colleague, Sebastian Kraus, for supporting me with audio recording equipment and for being the creative mastermind behind the logo design.

And last but not least, I want to say a huge thank you to a special guy who supported me on this journey right from the beginning. Not just with nice words, but with his time, audio optimization know-how and finally his GPU computing power.

Without his amazing support this dataset (in its current form) would not exist.

Thank you Dominik (@domcross / https://github.com/domcross/)

# Additional links
* https://medium.com/@thorsten_Mueller/why-ive-chosen-to-donate-my-german-voice-for-mankind-177beeb91675
* https://discourse.mozilla.org/t/contributing-my-german-voice-for-tts/48150
* https://community.mycroft.ai/
* https://github.com/MycroftAI/mimic-recording-studio

We'll hear from each other in the future :speaking_head:

Thorsten
(https://twitter.com/ThorstenVoice)