whisper.cpp/examples/talk
Georgi Gerganov 3a5302108d
sync : ggml (ggml_scale, ggml_row_size, etc.) (#1677)
* sync : ggml

* sync : llama.cpp

* talk-llama : fix obsolete param

* ggml-alloc : fix ggml_tallocr_is_own

* talk.wasm : update to new ggml

* ggml : fix type punning in ggml_scale

* ggml : cuda jetson + arm quants warnings
2023-12-22 17:53:39 +02:00
..
.gitignore talk, talk-llama : add basic example script for eleven-labs tts (#728) 2023-04-14 19:53:58 +03:00
CMakeLists.txt whisper : add integer quantization support (#540) 2023-04-30 18:51:57 +03:00
eleven-labs.py examples : update elevenlabs scripts to use official python API (#837) 2023-05-24 21:11:01 +03:00
gpt-2.cpp sync : ggml (ggml_scale, ggml_row_size, etc.) (#1677) 2023-12-22 17:53:39 +02:00
gpt-2.h whisper : add integer quantization support (#540) 2023-04-30 18:51:57 +03:00
README.md examples: Update the README for Talk - fixing the gpt2 URL (#1334) 2023-10-01 04:21:32 +08:00
speak speak scripts for Windows 2023-06-01 22:45:00 +10:00
speak.bat speak scripts for Windows 2023-06-01 22:45:00 +10:00
speak.ps1 speak scripts for Windows 2023-06-01 22:45:00 +10:00
talk.cpp whisper : add context param to disable gpu (#1293) 2023-11-06 11:04:24 +02:00

talk

Talk with an Artificial Intelligence in your terminal

Demo Talk

Web version: examples/talk.wasm

Building

The talk tool depends on SDL2 library to capture audio from the microphone. You can build it like this:

# Install SDL2 on Linux
sudo apt-get install libsdl2-dev

# Install SDL2 on Mac OS
brew install sdl2

# Build the "talk" executable
make talk

# Run it
./talk -p Santa

GPT-2

To run this, you will need a ggml GPT-2 model: instructions

Alternatively, you can simply download the smallest ggml GPT-2 117M model (240 MB) like this:

wget --quiet --show-progress -O models/ggml-gpt-2-117M.bin https://huggingface.co/ggerganov/ggml/resolve/main/ggml-model-gpt-2-117M.bin

TTS

For best experience, this example needs a TTS tool to convert the generated text responses to voice. You can use any TTS engine that you would like - simply edit the speak script to your needs. By default, it is configured to use MacOS's say or espeak or Windows SpeechSynthesizer, but you can use whatever you wish.