mirror of
https://github.com/ggerganov/whisper.cpp.git
synced 2024-12-30 18:49:01 +01:00
fb8d77f760
Used to override the audio context size of the encoder. For example, setting "audio_ctx = 512" makes it run about 3 times faster, processing about 10 s of audio instead of 30 s. Transcription quality drops, but this can be useful for real-time streaming, where performance is important.
stream
This is a naive example of performing real-time inference on audio from your microphone.
The stream tool samples the audio every half a second and runs the transcription continuously.
More info is available in issue #10.
./stream -m ./models/ggml-base.en.bin -t 8 --step 500 --length 5000
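To make the `--step 500 --length 5000` values above concrete, here is a minimal sketch of how millisecond durations map to sample counts, and of a sliding window that keeps only the most recent audio. It assumes the 16 kHz mono input that Whisper models expect. `ms_to_samples` and `SlidingWindow` are hypothetical names used for illustration only; this is a simplified model and not the buffering logic in stream.cpp.

```cpp
#include <cstddef>
#include <vector>

// Assumption: 16 kHz mono audio, the rate Whisper models expect.
constexpr int kSampleRate = 16000;

// Convert a duration in milliseconds to a number of samples.
constexpr std::size_t ms_to_samples(int ms) {
    return static_cast<std::size_t>(ms) * kSampleRate / 1000;
}

// Hypothetical helper (not part of stream.cpp): keeps at most
// `length_ms` of the most recent audio, so the window can be
// transcribed again after every new step.
class SlidingWindow {
public:
    explicit SlidingWindow(int length_ms)
        : max_samples_(ms_to_samples(length_ms)) {}

    // Append one step of captured samples, then drop the oldest
    // samples so the buffer never exceeds the window length.
    void push(const std::vector<float>& step) {
        buf_.insert(buf_.end(), step.begin(), step.end());
        if (buf_.size() > max_samples_) {
            const auto excess =
                static_cast<std::ptrdiff_t>(buf_.size() - max_samples_);
            buf_.erase(buf_.begin(), buf_.begin() + excess);
        }
    }

    const std::vector<float>& samples() const { return buf_; }

private:
    std::size_t max_samples_;
    std::vector<float> buf_;
};
```

With these settings each step adds `ms_to_samples(500)` = 8000 samples, and a `SlidingWindow(5000)` holds at most 80000 samples, that is, the ten most recent steps.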
https://user-images.githubusercontent.com/1991296/194935793-76afede7-cfa8-48d8-a80f-28ba83be7d09.mp4
The stream tool depends on the SDL2 library to capture audio from the microphone. You can build it like this:
# Install SDL2 on Linux
sudo apt-get install libsdl2-dev
# Install SDL2 on macOS
brew install sdl2
make stream