whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-08-06 06:55:53 +02:00

Author	SHA1	Message	Date
Daniel Bevenius	32cf4e2aba	whisper : add version function (#3289 ) * whisper : add version function This commit adds a version function to the whisper API. The motivation for this is that it might be convenient to have a way to programmatically check the version. Example usage: ```c++ printf("Using whisper version: %s\n", whisper_version()); ``` Will output: ```console Using whisper version: 1.7.6 ``` * examples : add version to android example CMakeLists.txt	2025-06-26 18:09:42 +02:00
Georgi Gerganov	dc8dda60ee	bench : print system info before ctx check	2025-06-25 16:01:32 +03:00
Daniel Bevenius	1ad258ca31	stream : add nullptr check of whisper_context (#3283 ) * stream : add nullptr check of whisper_context This commit adds a check to ensure that the `whisper_context` is not null after initialization. The motivation for this is that currently, if the initialization fails, the program continues to run, leading to a segmentation fault. This sort of check is performed by other examples like whisper-cli. Refs: https://github.com/ggml-org/whisper.cpp/issues/3280#issuecomment-3003778035 * examples : add nullptr check for whisper_context	2025-06-25 14:16:31 +02:00
Aaron Ang	4d6ae52ed3	command: output commands to text file (#3273 ) This commit implements code for the command line argument `-f --file FNAME` which is currently missing.	2025-06-24 06:41:21 +02:00
Georgi Gerganov	e6c10cf3d5	talk-llama : sync llama.cpp ggml-ci	2025-06-21 07:34:17 +03:00
Daniel Bevenius	3e65f518dd	android : update CMakeLists.txt to use FetchContent for ggml (#3268 ) * android : update CMakeLists.txt to use FetchContent for ggml This commit updates the CMakeLists.txt file for the Android Whisper example to use FetchContent for managing the ggml library. The motivation for this change is to avoid having to make manual changes to the CMakeLists.txt file after syncing the ggml library. I've built and run the example locally to verify that it works as expected. Refs: https://github.com/ggml-org/whisper.cpp/pull/3265#issuecomment-2986715717 * android.java : update cmake to use FetchContent for ggml This commit updates the CMake configuration for the Android Java example to use `FetchContent` for including the `ggml` library. To be able to use FetchContent we also update the `compileSdkVersion` and `targetSdkVersion` to 31, and the `buildToolsVersion` to '30.0.3'. This also required an update to the Gradle plugin version to 7.4.0. The motivation for this change is to avoid having to make manual changes to the CMakeLists.txt file after syncing the ggml library.	2025-06-19 16:06:42 +02:00
Georgi Gerganov	17bece1885	cmake : fix android build (#3265 ) * cmake : fix android build --------- Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>	2025-06-19 08:24:41 +02:00
Daniel Bevenius	ecb8f3c2b4	examples : add stereo to mono conversion in read_audio_data (#3266 ) This commit adds a conversion from stereo to mono in the `read_audio_data` function of `common-whisper.cpp`. The motivation for this change is that prior to commit `7d3da68f79` ("examples : use miniaudio for direct decoding flac, mp3, ogg and wav (#2759)"), there was a step that read stereo int16 data -> pcm16 (448512 samples), and then converted to mono (224256 samples), and then also converted to stereo in `pcmf32s`. The middle step here seems to have been missed when rewriting the code to use Miniaudio, which caused issues when transcribing stereo audio files. For example, currently using the audio sample in the linked issue the output is: ```console [00:00:00.000 --> 00:00:03.000] (speaker 1) Sous-titres réalisés para la communauté d'Amara.org ``` And with the change in this commit the output is: ``` [00:00:00.000 --> 00:00:01.500] (speaker 1) sonnerie de téléphone [00:00:01.500 --> 00:00:07.000] (speaker 1) Salut jeune homme ! [00:00:07.000 --> 00:00:08.500] (speaker 0) C'est vrai que je te dérange ? [00:00:08.500 --> 00:00:10.500] (speaker 1) Ah pas du tout, pas du tout, pas du tout ! [00:00:10.500 --> 00:00:12.500] (speaker 1) J'étais en train de... [00:00:12.500 --> 00:00:14.500] (speaker 1) de préparer un courrier ``` Resolves: https://github.com/ggml-org/whisper.cpp/issues/3092	2025-06-18 17:41:43 +02:00
Georgi Gerganov	2f60ebc3c2	talk-llama : sync llama.cpp ggml-ci	2025-06-18 12:40:34 +03:00
Daniel Bevenius	f3ff80ea8d	examples : set the C++ standard to C++17 for server (#3261 ) This commit updates the server example to use C++17 as the standard. The motivation for this change is that currently the ci-run `ggml-100-mac-m4` is failing when compiling the server example on macOS. The `talk-llama` example also has this setting so it looks like an alright change to make. ggml-ci Refs: https://github.com/ggml-org/ci/tree/results/whisper.cpp/2a/4d6db7d90899aff3d58d70996916968e4e0d27/ggml-100-mac-m4	2025-06-17 11:29:48 +02:00
w1redch4d	2a4d6db7d9	examples : update usage/help in yt-wsp.sh (#3251 ) This commit updates the usage/help message to be more readable and include the environment variables available to set options.	2025-06-16 12:21:16 +02:00
Sacha Arbonel	107c303e69	server : graceful shutdown, atomic server state, and health endpoint Improvements (#3243 ) * feat(server): implement graceful shutdown and server state management * refactor(server): use lambda capture by reference in server.cpp	2025-06-16 10:14:26 +02:00
Daniel Bevenius	0a4d85cf8a	server : add Voice Activity Detection (VAD) support (#3246 ) * server : add Voice Activity Detection (VAD) support This commit adds support for Voice Activity Detection (VAD) in the server example. The motivation for this is to enable VAD processing when using whisper-server. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3089 * server : add VAD parameters to usage in README.md [no ci] This commit also adds a few missing parameters. * server : fix conflicting short options [no ci]	2025-06-13 13:24:03 +02:00
Daniel Bevenius	9df8d54bcb	cli : fix short name conflict for vad options [no ci] (#3247 ) This commit fixes a short name conflict in whisper-cli for `--vad-min-speech-duration-ms` and `--vad-min-silence-duration-ms` which currently have the same short name `-vsd`. Refs: https://github.com/ggml-org/whisper.cpp/pull/3246#pullrequestreview-2923800114	2025-06-13 10:25:25 +02:00
Georgi Gerganov	962361bd79	android : fix builds (#0 ) ggml-ci	2025-06-10 12:40:33 +03:00
Georgi Gerganov	db264d6220	talk-llama : sync llama.cpp ggml-ci	2025-06-10 12:40:33 +03:00
Daniel Bevenius	b505539670	node : add language detection support (#3190 ) This commit adds support for language detection in the Whisper Node.js addon example. It also updates the node addon to return an object instead of an array as the results. The motivation for this change is to enable the inclusion of the detected language in the result, in addition to the transcription segments. For example, when using the `detect_language` option, the result will now be: ```console { language: 'en' } ``` And if the `language` option is set to "auto", it will also return: ```console { language: 'en', transcription: [ [ '00:00:00.000', '00:00:07.600', ' And so my fellow Americans, ask not what your country can do for you,' ], [ '00:00:07.600', '00:00:10.600', ' ask what you can do for your country.' ] ] } ```	2025-06-02 14:58:05 +02:00
Georgi Gerganov	7fd6fa8097	talk-llama : sync llama.cpp ggml-ci	2025-06-01 15:14:44 +03:00
Daniel Bevenius	73a8c5fb94	whisper : remove whisper_load_backends function (#3196 ) * whisper : remove whisper_load_backends function This commit removes the `whisper_load_backends` function, which was used to load all GGML backends. The motivation for this change is to push the responsibility of loading backends to user applications to give them more control over which backends to load and when. See the references below for more context. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3182 Refs: https://github.com/ggml-org/whisper.cpp/pull/3042#issuecomment-2801778733 Refs: https://github.com/ggml-org/whisper.cpp/pull/3042#issuecomment-2801928990 * ruby : add check for rwc is NULL This commit adds a check to ensure that the `rwc` pointer is not NULL before attempting to mark its members in the garbage collector. The motivation for this is an attempt to see if this fixes the CI build as I'm not able to reproduce the issue locally. Refs: https://github.com/ggml-org/whisper.cpp/actions/runs/15299612277/job/43036694928?pr=3196	2025-05-29 08:03:17 +02:00
Georgi Gerganov	26eb48cb08	talk-llama : sync llama.cpp ggml-ci	2025-05-27 18:03:00 +03:00
Daniel Bevenius	450de0787e	node : enable no_prints to suppress all output (#3189 ) This commit enables the node addon to suppress all output, even the result of the transcription, if the no_prints parameter is set to true. The motivation for this is that for the node addon there is a fulfillment handler/success callback to process the transcription result. And it might be useful to be able to disable the printing of the transcription result to the console, so that the user can handle the result in their own way. Refs: https://github.com/ggml-org/whisper.cpp/issues/3176	2025-05-27 05:51:47 +02:00
matteng1	ea9f206f18	talk-llama : fix for swedish umlauts + expose model inference settings in talk-llama.cpp (#3187 ) Quick fix for not removing swedish umlauts. * Update talk-llama.cpp Expose model inference settings to user instead of hard coding them. Same defaults as previous defaults. * Update examples/talk-llama/talk-llama.cpp Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-05-26 07:57:39 +02:00
Sacha Arbonel	78b31ca782	server : Add k6 Load Testing Script (#3175 ) * add load testing script and update README for k6 integration	2025-05-22 10:03:04 +02:00
Georgi Gerganov	6b6cf19c65	talk-llama : sync llama.cpp ggml-ci	2025-05-19 14:58:39 +03:00
Daniel Bevenius	f389d7e3e5	examples : add --print-confidence option to cli (#3150 ) * examples : add --print-confidence option to cli This commit adds a new command-line option `--print-confidence` to the whisper-cli. When enabled, this option prints the confidence level of each token in the transcribed text using ANSI formatting codes. The confidence levels are represented using different styles: ```console main: confidence: highlighted (low confidence), underlined (medium), dim (high confidence) ``` Refs: https://github.com/ggml-org/whisper.cpp/issues/3135	2025-05-14 19:21:48 +02:00
Daniel Bevenius	3882a099e1	server : add --flash-attn usage output (#3152 ) This commit adds the `--flash-attn` option to the usage output of the server example. The motivation for this change is that while it is possible to set this option it is not printed in the usage output.	2025-05-14 15:22:05 +02:00
Georgi Gerganov	f890560575	talk-llama : sync llama.cpp ggml-ci	2025-05-13 13:59:21 +03:00
Daniel Bevenius	fbad8058c4	examples : add VAD speech segments example (#3147 ) This commit adds an example that demonstrates how to use a VAD (Voice Activity Detection) model to segment an audio file into speech segments. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3144	2025-05-13 12:31:00 +02:00
Daniel Bevenius	b2513a6208	vad : remove shortform for --vad option in cli.cpp (#3145 ) This commit removes the shortform for the --vad option in cli.cpp. The motivation for this is that `-v` is often used for verbose or version in many tools and this might cause confusion. Refs: https://github.com/ggml-org/whisper.cpp/pull/3065#issuecomment-2873243334	2025-05-13 06:04:05 +02:00
Tomer Schlesinger	587ea01f55	docs : update README.md for whisper.objc app (#2569 )	2025-05-13 06:03:50 +02:00
Daniel Bevenius	e41bc5c61a	vad : add initial Voice Activity Detection (VAD) support (#3065 ) * vad : add initial Voice Activity Detection (VAD) support This commit adds support for Voice Activity Detection (VAD). When enabled, this feature will process the audio input and detect speech segments. This information is then used to reduce the number of samples that need to be processed by whisper_full. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3003 --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-05-12 16:10:11 +02:00
Daniel Bevenius	186855e38b	cli : print color scheme info for --print-colors (#3141 ) This commit adds a description of the color scheme used in the CLI when the --print-colors option is enabled. The motivation for this is that it is not immediately clear what the color scheme is when using the CLI with the --print-colors option. Example output: ```console $ ./build/bin/whisper-cli -f samples/jfk.wav --print-colors ... main: color scheme: red (low confidence), yellow (medium), green (high confidence) [00:00:00.000 --> 00:00:11.000] And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country. ``` The description will not be displayed if the `--no-prints` option is set. Refs: https://github.com/ggml-org/whisper.cpp/issues/3135	2025-05-12 10:43:04 +02:00
Daniel Bevenius	4730950492	examples : update link to Paul Tol's color scheme [no ci] (#3140 ) This commit updates the link to Paul Tol's color scheme in the `examples/common.h` file. The previous link was outdated and pointed to a non-existent page.	2025-05-12 09:02:06 +02:00
Enes Grahovac	5d4390d281	examples : add HEAPU8 to all of the exported runtime methods (#3134 ) This commit adds HEAPU8 to the list of exported methods. The motivation for this commit is that currently this is causing an error on Windows systems where HEAPU8 is undefined, which results in the following error message in the web console: main.js:1 Uncaught TypeError: Cannot read properties of undefined (reading 'buffer') at __emval_get_property (main.js:1:1363125) at 003a453a:0xc4a47 at 003a453a:0xc51cd at Object.full_default (eval at craftInvokerFunction (main.js:1:1347011), <anonymous>:9:10) at whisper.cpp/:647:42 danbev originally fixed this for whisper.wasm, stream.wasm, and command.stream, but the issue still exists in the other examples, which I patch in this commit. Resolves: #3059	2025-05-10 06:44:13 +02:00
Daniel Bevenius	9791647653	wasm : add note about worker.js file generation [no ci] (#3133 ) This commit updates the documentation for the WASM examples to include a note about the generation of the `worker.js` file. As of Emscripten 3.1.58 (April 2024), separate worker.js files are no longer generated and the worker is embedded in the main JS file. The motivation for this change is to inform users about the new behavior of Emscripten and why the `worker.js` file may not be present. Refs: https://github.com/ggml-org/whisper.cpp/issues/3123	2025-05-09 15:42:45 +02:00
Daniel Bevenius	b6f3fa4059	stream.wasm : add HEAPU8 to exported runtime methods (#3130 ) * stream.wasm : add HEAPU8 to exported runtime methods This commit adds HEAPU8 to the list of exported methods for stream.wasm. The motivation for this is that without it HEAPU8 will be undefined, and accessing its 'buffer' attribute will cause the error reported in the referenced issue. Note that to test this, make sure that the web browser's cache is cleared first. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3123 * command.wasm : add HEAPU8 to exported runtime methods	2025-05-08 16:58:34 +02:00
Georgi Gerganov	4a512cb153	cli : avoid std::exchange ggml-ci	2025-05-07 15:39:32 +03:00
Daniel Bevenius	09846f4e12	whisper: remove MSVC warnings pragmas (#3090 ) * ggml : remove MSVC warnings pragmas This commit removes the MSVC-specific pragmas as these are now handled in CMakeLists.txt. * whisper : remove MSVC warning pragmas This commit removes the MSVC-specific pragmas. These are now handled in the CMakeLists.txt file.	2025-05-05 13:09:35 +02:00
Sacha Arbonel	bcf1ed0163	server: update abort mechanism to handle HTTP connection closure (#3112 )	2025-05-05 07:16:54 +02:00
Daniel Tang	934d4b3083	cli : support "-" for stdout like stdin (#3050 ) This changes examples/cli/cli.cpp to be like examples/common-whisper.cpp. "-of -" can be specified (or this can be inferred from "-" as the input file) to output to stdout. This is useful for piping to other applications. Log fname_out consistently when not stdout - Terminals have stdout=stderr, so remove the message before successful output to ease copying - Don't affect actual error messages - Move opening the ofstream into the factory, fixing missing open and/or error messages in output_score/output_wts - Fix struct naming convention Closes #3048	2025-05-05 07:15:39 +02:00
Arpit Jain	988dcd4b5b	docs : Update cli documentation (#3102 ) * docs : Update cli documentation This updates the documentation of cli based on the actual output. In the long term this should ideally be auto-generated to prevent mismatch.	2025-05-02 14:18:33 +02:00
Sacha Arbonel	1fa17bc752	server : update httplib.h to version 0.20.0 (#3101 )	2025-05-02 06:09:41 +02:00
Georgi Gerganov	0778b6ff5f	talk-llama : sync llama.cpp ggml-ci	2025-05-01 13:29:02 +03:00
Daniel Bevenius	25efcfe3ed	server : add --no-gpu option to print usage output (#3098 ) This commit adds the command line option `--no-gpu` to the server example's print usage function. The motivation for this is that this option is available and can be set but it is not displayed in the usage message. Refs: https://github.com/ggml-org/whisper.cpp/issues/3095	2025-05-01 09:15:12 +03:00
Sacha Arbonel	f0171f0616	examples : expose language detection probabilities to server example (#3044 ) * feat: expose language detection probabilities to server.cpp * feat: enhance language detection output in server.cpp * Remove empty spaces.	2025-04-28 18:25:45 +02:00
Georgi Gerganov	f3c42399a3	talk-llama : sync llama.cpp (#3084 ) ggml-ci	2025-04-28 16:40:23 +03:00
Pedro	f9b2dfdd8c	examples : fix deprecated FFmpeg functions (#3073 ) * Fix deprecated FFmpeg functions and free packet * avcodec_free_context	2025-04-28 06:16:50 +02:00
Daniel Bevenius	3a88f1e504	examples : add HEAPU8 to exported runtime methods (#3062 ) This commit adds `HEAPU8` to the list of exported methods. The motivation for this commit is that currently this is causing an error on Windows systems where HEAPU8 is undefined, which results in the following error message in the web console: ```console main.js:1 Uncaught TypeError: Cannot read properties of undefined (reading 'buffer') at __emval_get_property (main.js:1:1363125) at 003a453a:0xc4a47 at 003a453a:0xc51cd at Object.full_default (eval at craftInvokerFunction (main.js:1:1347011), <anonymous>:9:10) at whisper.cpp/:647:42 ``` Resolves: https://github.com/ggml-org/whisper.cpp/issues/3059	2025-04-20 19:40:25 +02:00
Sacha Arbonel	170b2faf75	whisper : add no_context parameter to whisper_params (#3045 )	2025-04-16 06:24:38 +02:00
Fujimoto Seiji	f8a3509b6d	examples : add FFmpeg v7.0 support to ffmpeg-transcode.cpp (#3038 ) FFmpeg introduced a new channel layout API that uses the `AVChannelLayout` interface in v6.0. It subsequently dropped the old bitmask-based API in v7.0. This updates decode_audio() to support the new channel layout API, so that we can compile `whisper-cli` and `whisper-server` with FFmpeg v7.0 or later. Tested on Ubuntu 24.10 with FFmpeg v7.0.2. Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>	2025-04-15 06:09:00 +02:00


535 Commits
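
Commit ecb8f3c2b4 in the log above restores a stereo-to-mono step in `read_audio_data`. The following is only a minimal sketch of what such a downmix can look like, not the code from `common-whisper.cpp`. It assumes interleaved float samples (L, R, L, R, ...) and plain averaging of the two channels; the function name `downmix_to_mono` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of whisper.cpp): average interleaved
// stereo frames (L, R, L, R, ...) into one mono sample per frame.
// A trailing unpaired sample, if any, is ignored.
std::vector<float> downmix_to_mono(const std::vector<float> & interleaved) {
    const std::size_t n_frames = interleaved.size() / 2;
    std::vector<float> mono(n_frames);
    for (std::size_t i = 0; i < n_frames; ++i) {
        // Equal-weight mix of the left and right channel.
        mono[i] = 0.5f * (interleaved[2 * i] + interleaved[2 * i + 1]);
    }
    return mono;
}
```

Under these assumptions, an input of 448512 interleaved samples yields 224256 mono samples, matching the counts quoted in the commit message.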