mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-08-16 01:07:57 +02:00
whisper.cpp/examples/command

This is a basic Voice Assistant example that accepts voice commands from the microphone. More info is available in issue #171.

# Run with default arguments and small model
./whisper-command -m ./models/ggml-small.en.bin -t 8

# On Raspberry Pi, use tiny or base models + "-ac 768" for better performance
./whisper-command -m ./models/ggml-tiny.en.bin -ac 768 -t 3 -c 0
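The invocations above can be wrapped in a small launcher that fails early when the model file is missing. This is only a sketch: `pick_threads` and `run_command_example` are hypothetical helper names, and capping the thread count at 8 is an assumption taken from the `-t 8` example above, not a recommendation from the project.

```shell
#!/bin/sh
# Hypothetical launcher around whisper-command (not part of whisper.cpp).

# pick_threads N_CPUS MAX: print the smaller of the two integers.
pick_threads() {
    if [ "$1" -lt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# run_command_example MODEL: start whisper-command if MODEL exists.
run_command_example() {
    model="$1"
    if [ ! -f "$model" ]; then
        echo "model not found: $model" >&2
        return 1
    fi
    # getconf reports the number of online CPUs on Linux and macOS.
    ncpu="$(getconf _NPROCESSORS_ONLN)"
    ./whisper-command -m "$model" -t "$(pick_threads "$ncpu" 8)"
}

# Usage:
#   run_command_example ./models/ggml-small.en.bin
```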

https://user-images.githubusercontent.com/1991296/204038393-2f846eae-c255-4099-a76d-5735c25c49da.mp4

Web version: examples/command.wasm

Guided mode

"Guided mode" lets you specify a list of commands (i.e. strings); the transcription is then guided to classify each spoken command as one of the entries in that list. This is useful when a device only needs to listen for a small set of commands.

Initial tests suggest that this approach can be very efficient, since it combines well with the "partial Encoder" idea from #137.

# Run in guided mode, the list of allowed commands is in commands.txt
./whisper-command -m ./models/ggml-base.en.bin -cmd ./examples/command/commands.txt

# On Raspberry Pi, in guided mode you can use "-ac 128" for extra performance
./whisper-command -m ./models/ggml-tiny.en.bin -cmd ./examples/command/commands.txt -ac 128 -t 3 -c 0
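To try guided mode with your own vocabulary, you can generate a command list and pass it with `-cmd`. The sketch below assumes the file holds one command per line, as `commands.txt` appears to; `write_commands` and `my-commands.txt` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# write_commands FILE CMD...: write one command per line to FILE,
# skipping empty entries. Hypothetical helper, not part of whisper.cpp.
write_commands() {
    out="$1"
    shift
    : > "$out"                      # create or truncate the file
    for cmd in "$@"; do
        if [ -n "$cmd" ]; then
            printf '%s\n' "$cmd" >> "$out"
        fi
    done
    return 0
}

# Usage:
#   write_commands ./my-commands.txt start stop pause
#   ./whisper-command -m ./models/ggml-base.en.bin -cmd ./my-commands.txt
```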

https://user-images.githubusercontent.com/1991296/207435352-8fc4ed3f-bde5-4555-9b8b-aeeb76bee969.mp4

Building

The whisper-command tool depends on the SDL2 library to capture audio from the microphone. You can build it like this:

# Install SDL2
# On Debian-based Linux distributions:
sudo apt-get install libsdl2-dev

# On Fedora Linux:
sudo dnf install SDL2 SDL2-devel

# On macOS:
brew install sdl2

cmake -B build -DWHISPER_SDL2=ON
cmake --build build --config Release
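If the configure step cannot find SDL2, a quick check before running CMake can help narrow down the problem. The helper below is a sketch under assumptions: `sdl2_version` is a hypothetical name, and it only probes `pkg-config` and `sdl2-config`, which may not match how the CMake build locates SDL2 on your system.

```shell
#!/bin/sh
# sdl2_version: print the installed SDL2 version, or return 1 if neither
# pkg-config nor sdl2-config can report one. Hypothetical sanity check only.
sdl2_version() {
    if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists sdl2; then
        pkg-config --modversion sdl2
    elif command -v sdl2-config >/dev/null 2>&1; then
        sdl2-config --version
    else
        return 1
    fi
}

# Usage:
#   if v="$(sdl2_version)"; then
#       echo "found SDL2 $v"
#   else
#       echo "SDL2 development files not found" >&2
#   fi
```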