Because some people have privacy concerns about using these services, several open source projects are trying to build offline and/or privacy-aware alternatives.
However, speech recognition and speech synthesis still rely on cloud services to deliver decent quality.
# Mycroft AI
> https://mycroft.ai/
Mycroft is a company developing an open source voice assistant with a friendly and active community. The STT/TTS parts are still cloud-based (e.g. Google services), although requests are anonymized by a Mycroft proxy in between. Integration with locally hosted services such as DeepSpeech (STT) or Mimic/Tacotron (TTS) is possible, however.
# Mozilla
Mozilla is working on these important building blocks of free and open human-machine voice interaction.
## STT - speech to text
> https://commonvoice.mozilla.org/
STT needs large amounts of audio training data from many speakers (women, men, kids) of all ages and dialects, recorded at various audio quality levels. Any voice contribution to the Common Voice project is therefore highly welcome.
## TTS - text to speech
> https://github.com/mozilla/tts
TTS needs many clean recordings from a single speaker to train a model. Mozilla is developing a software stack for model training based on the Tacotron 2 papers.
I want to make the most personal contribution I can and donate my own voice (**German**) to the community for TTS training, free to use.
To give you an impression of what my voice sounds like, so you can decide whether it fits your project, I have published some sample recordings. There is no need to download the complete dataset first.
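
Since TTS training depends on clean, consistent recordings from one speaker, it can help to check the audio format of a set of clips before training. The following is only a minimal sketch using Python's standard `wave` module. The function names and the folder layout (a flat directory of `.wav` files) are assumptions made for illustration and are not part of this dataset or of Mozilla TTS.

```python
import wave
from collections import Counter
from pathlib import Path


def summarize_recordings(folder):
    """Return (total_seconds, formats) for all .wav files in `folder`.

    `formats` counts how many clips share each
    (sample_rate, channels, sample_width_bytes) combination.
    """
    formats = Counter()
    total_seconds = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
            formats[(rate, wav.getnchannels(), wav.getsampwidth())] += 1
            total_seconds += wav.getnframes() / rate
    return total_seconds, formats


def is_consistent(formats):
    """True when at least one clip exists and all clips share one format."""
    return len(formats) == 1
```

Calling `summarize_recordings("samples")` on a hypothetical `samples` folder returns the total duration in seconds and the format counts. More than one entry in the counter suggests that some clips should be resampled or converted before training.
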
## Please read some personal words before downloading the dataset
I contribute my voice as a person who believes in a world where all people are equal, regardless of gender, sexual orientation, religion, skin color or the geocoordinates of their birthplace. A global world where everybody is warmly welcome anywhere on this planet, and where open and free knowledge and education are available to everyone.
And last but not least, I want to say a huge thank you to a special guy who supported me on this journey right from the beginning: not just with kind words, but with his time, his audio optimization know-how and finally his GPU computing power.