0 Model v01 Tacotron2, DDC PWGAN
Thorsten Müller edited this page 2020-09-15 06:22:29 +02:00

Based on "thorsten" voice dataset v02 (normalized)

Release Candidate 1 - "early bird"

This early-bird RC was published on September 14, 2020, and is meant for first experiments. All files (configs and checkpoints) are available on Google Drive.

Content of this RC

  • trained on dataset v02 (normalized)
  • trained with the Mozilla TTS repo (commit d4319fe)
  • Tacotron2 + DDC training for 360k steps (thanks to Olaf)
  • PWGAN vocoder model (2.75 million steps), thanks to erogol
  • provide audio samples
  • Jupyter notebook for testing
  • usage of German phoneme cleaner (thanks to repodiac)
  • usage of English loan words
  • use compute_statistics output in vocoder training
  • tbd
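
The compute_statistics item above presumably means that feature statistics are computed once over the dataset and then reused to normalize the vocoder's training inputs. The following is only a minimal sketch of that general idea (mean-variance normalization) on toy data. The helper names `compute_stats` and `normalize` are hypothetical and are not the Mozilla TTS API, and the toy frames stand in for real mel-spectrogram frames.

```python
import statistics


def compute_stats(frames):
    """Return per-bin mean and standard deviation over all frames.

    `frames` is a list of equally long feature vectors
    (a stand-in for mel-spectrogram frames).
    """
    bins = list(zip(*frames))  # transpose: one tuple per feature bin
    mean = [statistics.fmean(b) for b in bins]
    std = [statistics.pstdev(b) for b in bins]
    return mean, std


def normalize(frame, mean, std, eps=1e-8):
    """Apply mean-variance normalization to a single frame."""
    return [(x - m) / (s + eps) for x, m, s in zip(frame, mean, std)]


# Toy data: three frames with two feature bins each (assumption, not real audio).
frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
mean, std = compute_stats(frames)
normalized = [normalize(f, mean, std) for f in frames]
```

Using the same statistics for the Tacotron2 output features and the vocoder input is what keeps the two models on a common scale; if they differ, the vocoder sees inputs outside the range it was trained on.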

Audio samples

  • SoundCloud link

How to use

  • Link to repodiac's repos (phoneme cleaner and Dockerfile)
  • Link to ojthiele's notebook

Known issues

  • Some problems with English words in German sentences
  • Sometimes the last character of a word is dropped
  • Output sounds a little metallic

Release Candidate 2 - to come