whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-08-03 17:28:21 +02:00

Author	SHA1	Message	Date
Daniel Bevenius	2e30e6df59	whisper : fix grammar advance stack warning (#3087 ) This commit addresses a warning that is present for Release builds: ```console [ 30%] Building CXX object src/CMakeFiles/whisper.dir/whisper.cpp.o In file included from /usr/include/c++/13/bits/stl_tree.h:63, from /usr/include/c++/13/map:62, from /home/danbev/work/ai/whisper.cpp/src/whisper-arch.h:5, from /home/danbev/work/ai/whisper.cpp/src/whisper.cpp:2: In static member function ‘static void std::__copy_move<false, false, std::random_access_iterator_tag>::__assign_one(_Tp, _Up) [with _Tp = const whisper_grammar_element; _Up = const whisper_grammar_element const]’, inlined from ‘static _Up* std::__copy_move<_IsMove, true, std::random_access_iterator_tag>::__copy_m(_Tp, _Tp, _Up) [with _Tp = const whisper_grammar_element const; _Up = const whisper_grammar_element; bool _IsMove = false]’ at /usr/include/c++/13/bits/stl_algobase.h:440:20, inlined from ‘_OI std::__copy_move_a2(_II, _II, _OI) [with bool _IsMove = false; _II = const whisper_grammar_element const; _OI = const whisper_grammar_element]’ at /usr/include/c++/13/bits/stl_algobase.h:506:30, inlined from ‘_OI std::__copy_move_a1(_II, _II, _OI) [with bool _IsMove = false; _II = const whisper_grammar_element const; _OI = const whisper_grammar_element*]’ at /usr/include/c++/13/bits/stl_algobase.h:533:42, ... ``` This warning is caused by the fact that the `stack` vector is empty when it is passed to `new_stacks.push_back(stack);`. The suggested fix is to use `new_stacks.emplace_back();` instead of `new_stacks.push_back(stack);`.	2025-04-28 19:11:38 +02:00
Georgi Gerganov	549db9376f	whisper : reduce delta_min from 1000ms to 100ms (#3028 ) ggml-ci	2025-04-11 06:23:02 +02:00
Fujimoto Seiji	e6234cd435	whisper : fix "bench-all outputs an invalid result on larger models" (#3002 ) The benchmark script 'scripts/bench-all.sh' assumes that the 11th field of the output line is a timestamp. This assumption does not hold when the target model takes a bit longer to process. Fix this issue by introducing an explicit whitespace to the output lines of `whisper_print_timings()`. Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>	2025-04-04 18:36:19 +03:00
Georgi Gerganov	2b6d0d2200	rename : ggerganov -> ggml-org (#3005 )	2025-04-04 16:11:52 +03:00
Daniel Bevenius	11688b262f	coreml: fix Whisper to CoreML conversion by disabling SDPA [no ci] (#2979 ) * coreml: fix Whisper to CoreML conversion by disabling SDPA This commit disables the use of PyTorch's `scaled_dot_product_attention` in the Whisper model to avoid compatibility issues during CoreML conversion. The issue occurs because coremltools requires PyTorch 2.5.0, but the Whisper implementation may expect behavior from newer PyTorch versions. By setting `MultiHeadAttention.use_sdpa = False`, we force Whisper to use its fallback manual attention implementation, which works correctly with PyTorch 2.5.0 during the tracing process. Refs: https://github.com/ggerganov/whisper.cpp/issues/2783 * coreml: fix audio shape in whisper decoder conversion This commit fixes the audio shape in the whisper decoder conversion script. The motivation for this is that the audio shape was incorrect and was causing the conversion to fail. * coreml : set -e in generate-coreml-interface.sh The commit sets the -e flag in the generate-coreml-interface.sh script to make sure the script fails if any command fails. * coreml : update generated encoder/decoder interfaces This commit updates the generated encoder/decoder interfaces for the whisper model which is the result of running the generate-coreml-interface.sh script.	2025-04-01 18:01:23 +02:00
Daniel Bevenius	f92bd59951	whisper : remove unnecessary GGML_UNUSED macro (#2960 )	2025-03-30 05:56:10 +02:00
Dan Johansson	21d890d534	whisper : add support for backends with multiple ggml_backend_buffer_type (#2863 ) * whisper : add support for ggml_backend_buffer_type Signed-off-by: Dan Johansson <dan.johansson@arm.com> * fix compile error when building on Ubuntu Signed-off-by: Dan Johansson <dan.johansson@arm.com> * remove copyright header from include file Signed-off-by: Dan Johansson <dan.johansson@arm.com> --------- Signed-off-by: Dan Johansson <dan.johansson@arm.com>	2025-03-26 16:54:02 +02:00
Daniel Bevenius	cf5ddb8c21	whisper : initialize decoder's rng with unique seed (#2932 ) This change initializes each decoder's random number generator with a unique seed. The motivation for this is that currently all decoders are initialized with the same seed value, 0. The result of this is that for the same state (logits, probs, and logprobs) they will produce the same output.	2025-03-24 09:36:07 +01:00
Daniel Bevenius	be9de81171	whisper : add check for CPU backend initialization (#2918 ) This commit adds a check for the CPU backend initialization in the whisper library. If the initialization fails, an exception is thrown. The motivation for this change is to make the library more robust and handle the case when the CPU backend initialization fails. Resolves: https://github.com/ggerganov/whisper.cpp/issues/2917	2025-03-21 09:53:26 +01:00
Daniel Bevenius	215990abde	whisper : fix compiler warnings in whisper.cpp (#2895 ) This commit fixes compiler warnings in whisper.cpp by changing the type of the loop index variable from int64_t to size_t. Currently the following warnings are generated by the compiler: ```console /whisper.cpp/src/whisper.cpp:209:27: warning: comparison of integers of different signs: 'int64_t' (aka 'long long') and 'size_t' (aka 'unsigned long') [-Wsign-compare] 209 | for (int64_t i = 0; i < nels; ++i) { | ~ ^ ~~~~ /whisper.cpp/src/whisper.cpp:219:27: warning: comparison of integers of different signs: 'int64_t' (aka 'long long') and 'size_t' (aka 'unsigned long') [-Wsign-compare] 219 | for (int64_t i = 0; i < nels; ++i) { | ~ ^ ~~~~ ```	2025-03-18 13:38:41 +01:00
Daniel Bevenius	740bf7f6a1	whisper : enable compiler warnings for src (#2891 ) * whisper : enable compiler warnings for src This commit enables compiler warnings for the src directory. Currently, when the WHISPER_ALL_WARNINGS flag is set to ON, it only enables warnings in ggml, by setting GGML_ALL_WARNINGS to ON. This commit adds the same compiler flags for whisper's src directory. The motivation for this is to catch potential bugs and issues early on in the development process. * squash! whisper : enable compiler warnings for src Remove GF_C_FLAGS and GF_CXX_FLAGS from add_compile_options.	2025-03-18 05:19:18 +01:00
Diego Devesa	339a1cba5d	whisper : support GGML_BACKEND_DL (#2843 ) * whisper : support GGML_BACKEND_DL * fix DTW crash * whisper.objc : fix build - add ggml-cpp.h --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-02-27 13:35:07 +01:00
Thomas Fitzsimmons	47e14c0529	whisper : restore big endian support (#2816 ) * whisper : fix BYTESWAP whitespace * whisper : make byteswap useable with C++17 * cmake : define WHISPER_BIG_ENDIAN for big-endian targets * ci : fix (again) arm64 build fails * docker : attempt fixing arm64 build on ci * qemu v7.0.0-28 [imported from https://github.com/ggml-org/llama.cpp/commit/818a340ea8be55b3706e1772527cb8738e90a8c7 (#11895)] --------- Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>	2025-02-25 11:38:13 +02:00
Georgi Gerganov	589b40810a	ci : dummy commit to trigger CI	2025-02-03 16:32:48 +02:00
Georgi Gerganov	eb68324c86	whisper : fix gpu device selection (#2728 )	2025-01-13 13:11:37 +02:00
Sandro Hanea	2ab2eb5110	whisper : add whisper_full_get_segment_no_speech_prob_from_state (#2716 )	2025-01-09 16:21:07 +02:00
Sacha Arbonel	4183517076	server : add no-speech threshold parameter and functionality (#2654 )	2024-12-21 17:00:08 +02:00
Georgi Gerganov	f4668169a0	whisper : rename suppress_non_speech_tokens to suppress_nst (#2653 )	2024-12-21 12:54:35 +02:00
Karthick	f897eb7670	whisper : support no_speech_thold (#2625 ) * Implement no_speech_thold no_speech_thold functionality is on par with OpenAI's whisper * Addressed review comments	2024-12-17 19:15:47 +02:00
Karthick	2f2841bfce	whisper : add single-timestamp logic (#2629 ) * Fix hallucinations during silence When the predicted tokens end with a single timestamp, the entire 30-second segment should be considered done, to avoid hallucinations for the remaining part of the segment. This behaviour is on par with OpenAI's whisper. Refer to the logic related to `single_timestamp_ending` in https://github.com/openai/whisper/blob/main/whisper/transcribe.py * Accept review comments related to formatting. Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-12-17 19:07:08 +02:00
Georgi Gerganov	37c88027e1	whisper : use backend registry (#0 )	2024-11-20 21:00:08 +02:00
Georgi Gerganov	7fd8d9c220	whisper : adapt to new ggml (wip)	2024-11-20 21:00:08 +02:00
Georgi Gerganov	5089ab2d6a	whisper : fix build (#0 )	2024-11-15 15:21:04 +02:00
Jhen-Jie Hong	5f8a086e22	whisper.swiftui : add model download list & bench methods (#2546 ) * swift : fix resources & exclude build * whisper : impl whisper_timings struct & api * whisper.swiftui : model list & bench methods * whisper : return ptr for whisper_get_timings * revert unnecessary change * whisper : avoid designated initializer * whisper.swiftui: code style changes * whisper.swiftui : get device name / os from UIDevice * whisper.swiftui : fix UIDevice usage * whisper.swiftui : add memcpy and ggml_mul_mat (commented)	2024-11-13 21:51:34 +02:00
thewh1teagle	5ccca19f0c	ggml : vulkan logs (#2547 )	2024-11-13 21:47:15 +02:00
Vin Misra	31aea563a8	whisper : fix extra memory usage (#2534 ) * passing samples_padded by ref to the threads. * passing samples_padded by ref to the threads. --------- Co-authored-by: Vinith Misra <physicsdemon@gmail.com>	2024-11-06 23:02:11 +02:00
Georgi Gerganov	0377596b77	whisper : backend registry init before model load	2024-11-01 10:19:05 +02:00
Georgi Gerganov	aa037a60f3	ggml : alloc ggml_contexts on the heap (#2525 ) * whisper : reduce ggml_context usage * ggml : allocate contexts on the heap (v2) * ggml : aligned malloc -> malloc	2024-10-31 22:00:09 +02:00
Georgi Gerganov	3f020fac9d	whisper : minor compile warning	2024-10-29 19:30:26 +02:00
jettoblack	1626b73b03	whisper : move new-segment callback after DTW step (#2515 )	2024-10-29 08:47:21 +02:00
Josscii	0fbaac9c89	whisper : fix index overflow in token-level timestamp logic (#2505 )	2024-10-23 15:14:03 +03:00
Rotem Dan	b6049060dd	whisper : add dtw preset for large-v3-turbo (#2481 )	2024-10-15 21:00:21 +03:00
Sandro Hanea	fdbfb460ed	whisper : add OpenVINO init with state (#2464 ) * Fixed OpenVino init on state * Removed an empty line * Fixed typo * Replaced tabs with spaces --------- Co-authored-by: Sandro Hanea <sandrohanea@users.noreply.github.com>	2024-10-08 20:08:00 +03:00
Georgi Gerganov	847f94fdeb	whisper : zero-out the KV cache upon clear (#2445 )	2024-10-05 15:23:51 +03:00
Georgi Gerganov	396089f3cf	whisper : revert mel-related changes (#0 ) too much extra logic and complexity for small benefit	2024-10-05 15:23:51 +03:00
Georgi Gerganov	941912467d	whisper : adapt to latest ggml (skip) (#0 )	2024-10-05 15:23:51 +03:00
Georgi Gerganov	f62a546e03	whisper : fix excessive memory usage (#2443 ) * whisper : fix KV cache allocation * whisper : reduce memory overhead from unused input tensors	2024-10-05 12:36:40 +03:00
Georgi Gerganov	ccc2547210	talk-llama : sync llama.cpp	2024-10-03 12:22:17 +03:00
Georgi Gerganov	fe18c29ab8	talk-llama : sync llama.cpp	2024-09-24 19:45:08 +03:00
Georgi Gerganov	34291099fb	ggml : refactoring (llama/#0) - d6a04f87 - 23e0d70b	2024-09-24 19:45:08 +03:00
Georgi Gerganov	9d754a56cf	whisper : update FA call	2024-08-28 13:22:20 +03:00
Georgi Gerganov	6e9596f6de	whisper : fix compile warning for unused params	2024-08-28 11:40:11 +03:00
Mengqing Cao	81c999fe0a	cann : add Ascend NPU support (#2336 ) * enable Ascend NPU in src/whisper.cpp * sync test-backend-ops with llama.cpp	2024-08-09 15:21:56 +03:00
Georgi Gerganov	4b7de08bfd	whisper : fix compile warning (#0 )	2024-08-09 09:58:16 +03:00
Daven Sanassy	fe36c90971	cmake : fix compile in xcode (#2311 )	2024-08-05 09:48:26 +03:00
Georgi Gerganov	6739eb83c3	whisper : handle empty mel (#2324 )	2024-07-27 20:35:04 +03:00
Matt Stephenson	f68298ce06	whisper : use vulkan as gpu backend when available (#2302 ) * ggml: use vulkan as gpu backend when available Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com> * whisper: enable using vk as default buffer type Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com> --------- Signed-off-by: Matt Stephenson <mstephenson6@users.noreply.github.com>	2024-07-16 10:21:09 +03:00
arizhih	7ae885c1ef	whisper : fix DTW assert (#2299 )	2024-07-15 15:50:36 +03:00
Georgi Gerganov	d207c68822	cmake : use WHISPER_EXTRA_FLAGS (#2294 )	2024-07-09 18:54:18 +03:00
Georgi Gerganov	1c31f9d4a8	cmake : try to fix openvino build (#2281 )	2024-07-08 15:36:51 +03:00
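
Commit cf5ddb8c21 above describes giving each decoder's random number generator its own seed, because decoders that all start from seed 0 produce the same samples for the same state. The following is a minimal sketch of that idea, not the code from the commit: the `decoder` struct, `make_decoders`, `sample_token`, and the `base_seed + i` scheme are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for a decoder; only the RNG matters for this sketch.
struct decoder {
    std::mt19937 rng;
};

// Seed decoder i with base_seed + i (an assumed scheme) so that no two
// decoders start from the same RNG state.
std::vector<decoder> make_decoders(std::size_t n, std::uint32_t base_seed) {
    std::vector<decoder> decoders(n);
    for (std::size_t i = 0; i < n; ++i) {
        decoders[i].rng.seed(base_seed + static_cast<std::uint32_t>(i));
    }
    return decoders;
}

// Draw a token index from non-negative weights using the decoder's own RNG.
int sample_token(decoder & d, const std::vector<double> & probs) {
    std::discrete_distribution<int> dist(probs.begin(), probs.end());
    return dist(d.rng);
}
```

With identical seeds, two decoders given the same probabilities always draw the same token sequence; with per-decoder seeds their draws can diverge, which is the behaviour the commit message is after.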

55 Commits
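
Commit 2e30e6df59 in the list above replaces `new_stacks.push_back(stack);` with `new_stacks.emplace_back();` for the case where `stack` is empty. A small sketch of why the two forms are interchangeable in that case follows. The `element` struct, the `stack_t` alias, and both helper functions are hypothetical names used only for this illustration and are not taken from the whisper.cpp sources.

```cpp
#include <vector>

// Hypothetical stand-in for a grammar element.
struct element {
    int      type;
    unsigned value;
};

using stack_t = std::vector<const element *>;

// Original shape: copy a stack that is known to be empty at this point.
void add_empty_by_copy(std::vector<stack_t> & new_stacks, const stack_t & stack) {
    new_stacks.push_back(stack);
}

// Suggested shape: construct the empty stack in place, with no copy involved.
void add_empty_in_place(std::vector<stack_t> & new_stacks) {
    new_stacks.emplace_back();
}
```

Both helpers leave `new_stacks` with one additional empty stack. The in-place form never instantiates the element-copy path, which is the path named in the compiler warning quoted in the commit message.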