whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-08-18 18:38:42 +02:00

Author	SHA1	Message	Date
Jeff Bolz	e46df4850f	vulkan: Allow up to 4096 elements for mul_mat_id row_ids (llama/13326) This assert fired running Qwen_Qwen3-30B-A3B-Q2_K.gguf: GGML_ASSERT(nei0 * nei1 <= 3072); The tensor is 8 x 512. Increase this array size to accommodate.	2025-05-13 13:59:21 +03:00
Alberto Cabrera Pérez	e8a7f1b7bb	sycl: addressing non-contiguous src1 mul_mats (nc and batched) (llama/13343) * sycl: fixed non-contiguous src1 mul_mats (nc and batched) * Fixed wrong static_cast inside kernel	2025-05-13 13:59:21 +03:00
Daniel Bevenius	fbad8058c4	examples : add VAD speech segments example (#3147 ) This commit adds an example that demonstrates how to use a VAD (Voice Activity Detection) model to segment an audio file into speech segments. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3144	2025-05-13 12:31:00 +02:00
Daniel Bevenius	b2513a6208	vad : remove shortform for --vad option in cli.cpp (#3145 ) This commit removes the shortform for the --vad option in cli.cpp. The motivation for this is that `-v` is often used for verbose or version in many tools and this might cause confusion. Refs: https://github.com/ggml-org/whisper.cpp/pull/3065#issuecomment-2873243334	2025-05-13 06:04:05 +02:00
Tomer Schlesinger	587ea01f55	docs : update README.md for whisper.objc app (#2569 )	2025-05-13 06:03:50 +02:00
Daniel Bevenius	e41bc5c61a	vad : add initial Voice Activity Detection (VAD) support (#3065 ) * vad : add initial Voice Activity Detection (VAD) support This commit adds support for Voice Activity Detection (VAD). When enabled, this feature will process the audio input and detect speech segments. This information is then used to reduce the number of samples that need to be processed by whisper_full. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3003 --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-05-12 16:10:11 +02:00
Daniel Bevenius	e39ba750cd	whisper : remove dummy commit comment [no ci] (#3143 ) This commit removes a dummy comment that was added by commit `589b408` ("ci : dummy commit to trigger CI").	2025-05-12 14:40:17 +02:00
Daniel Bevenius	db0fc9edc6	docs : fix -owts flag typo karaoke section [no ci] (#3142 )	2025-05-12 10:56:39 +02:00
Daniel Bevenius	186855e38b	cli : print color scheme info for --print-colors (#3141 ) This commit adds a description of the color scheme used in the CLI when the --print-colors option is enabled. The motivation for this is that it is not immediately clear what the color scheme is when using the CLI with the --print-colors option. Example output: ```console $ ./build/bin/whisper-cli -f samples/jfk.wav --print-colors ... main: color scheme: red (low confidence), yellow (medium), green (high confidence) [00:00:00.000 --> 00:00:11.000] And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country. ``` The description will not be displayed if the `--no-prints` option is set. Refs: https://github.com/ggml-org/whisper.cpp/issues/3135	2025-05-12 10:43:04 +02:00
Simon Booth	a513146102	docs : update Readme to recommend same Openvino as Python tools (#3138 )	2025-05-12 09:06:51 +02:00
Daniel Bevenius	4730950492	examples : update link to Paul Tol's color scheme [no ci] (#3140 ) This commit updates the link to Paul Tol's color scheme in the `examples/common.h` file. The previous link was outdated and pointed to a non-existent page.	2025-05-12 09:02:06 +02:00
KITAITI Makoto	9dd9685c79	ruby : test extra build options only when env var specified (#3136 ) * Test Ruby bindings' extra options only when commanded * ruby : test extra build options only when env var specified * Fix extra_options * Update gem date	2025-05-12 06:49:46 +02:00
Daniel Bevenius	2e310b841e	ruby : omit test_build_options locally (#3132 ) This commit omits the test for `test_build_options` when run locally as it currently fails on Linux and macOS platforms. The motivation for this change is that currently when running the tests locally on a non-macOS platform the test fails with the following error: ```console .F ======================================================================== Failure: test_build_options(TestPackage): <["ACCELERATE_FRAMEWORK", "CMAKE_OSX_ARCHITECTURES", "CMAKE_OSX_SYSROOT", "FOUNDATION_LIBRARY", "METALKIT_FRAMEWORK", "METAL_FRAMEWORK"]> was expected to be empty. /home/danbev/work/ai/whisper.cpp/bindings/ruby/tests/test_package.rb:43:in `test_build_options' 40: options = BuildOptions::Options.new 41: assert_empty options.missing_options 42: unless ENV["CI"] => 43: assert_empty options.extra_options 44: end 45: end 46: end ======================================================================== ```	2025-05-10 08:18:08 +02:00
Enes Grahovac	5d4390d281	examples : add HEAPU8 to all of the exported runtime methods (#3134 ) This commit adds HEAPU8 to the list of exported methods. The motivation for this commit is that currently this is causing an error on Windows systems where HEAPU8 is undefined, which results in the following error message in the web console: main.js:1 Uncaught TypeError: Cannot read properties of undefined (reading 'buffer') at __emval_get_property (main.js:1:1363125) at 003a453a:0xc4a47 at 003a453a:0xc51cd at Object.full_default (eval at craftInvokerFunction (main.js:1:1347011), <anonymous>:9:10) at whisper.cpp/:647:42 danbev originally fixed this for whisper.wasm, stream.wasm, and command.stream, but the issue still exists on the other examples which I patch in this code. Resolves: #3059	2025-05-10 06:44:13 +02:00
Daniel Bevenius	9791647653	wasm : add note about worker.js file generation [no ci] (#3133 ) This commit updates the documentation for the WASM examples to include a note about the generation of the `worker.js` file. As of Emscripten 3.1.58 (April 2024), separate worker.js files are no longer generated and the worker is embedded in the main JS file. The motivation for this change is to inform users about the new behavior of Emscripten and why the `worker.js` file may not be present. Refs: https://github.com/ggml-org/whisper.cpp/issues/3123	2025-05-09 15:42:45 +02:00
Daniel Bevenius	288304ee64	whisper : deprecate WHISPER_CCACHE CMake option (#3131 ) * whisper : deprecate WHISPER_CCACHE CMake option This commit deprecates the WHISPER_CCACHE CMake option in favor of the GGML_CCACHE option. The motivation for this change is that currently when setting, or not setting WHISPER_CCACHE, the output message from ggml will be that to enable ccache you need to set GGML_CCACHE which can be confusing. This also seems to be in line with what llama.cpp does which does not have a LLAMA_CCACHE option as far as I know. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3063 * ruby : change "WHISPER_CCACHE" to "GGML_CCACHE" * ruby : move GGML_CCACHE to sorted position	2025-05-09 14:13:41 +02:00
Daniel Bevenius	b6f3fa4059	stream.wasm : add HEAPU8 to exported runtime methods (#3130 ) * stream.wasm : add HEAPU8 to exported runtime methods This commit adds HEAPU8 to the list of exported methods for stream.wasm. The motivation for this is that without it HEAPU8 will be undefined and when its 'buffer' attribute is accessed this will cause an error as reported in the referenced issue. Note that to test this make sure that the web browser's cache is cleared first. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3123 * command.wasm : add HEAPU8 to exported runtime methods	2025-05-08 16:58:34 +02:00
Georgi Gerganov	cb2bd11ee8	sync : ggml ggml-ci	2025-05-07 21:00:32 +03:00
R0CKSTAR	09e6b66025	cuda : remove nrows_x in mul_mat_q_process_tile (llama/13325) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2025-05-07 21:00:32 +03:00
Johannes Gäßler	d41cf26a0f	CUDA: mix virt/real CUDA archs for GGML_NATIVE=OFF (llama/13135)	2025-05-07 21:00:32 +03:00
Akarshan Biswas	3c67195be9	SYCL: Disable reorder optimize by default and stop setting tensor extras when optimize is disabled (llama/13254) * SYCL: Do not set tensor extras when reorder optimize is disabled * SYCL: Disable reorder optimize by default	2025-05-07 21:00:32 +03:00
Johannes Gäßler	f9f78a773f	CUDA: fix bad asserts for partial offload (llama/13337)	2025-05-07 21:00:32 +03:00
Johannes Gäßler	be55e25cac	CUDA: fix --split-mode row for MMQ (llama/13323)	2025-05-07 21:00:32 +03:00
Johannes Gäßler	2ffdda99e8	CUDA: fix logic for clearing padding with -ngl 0 (llama/13320)	2025-05-07 21:00:32 +03:00
Akarshan Biswas	9bbedc51cc	SYCL: Disable mul_mat kernels for noncontiguous tensor b (llama/13308) ggml-ci	2025-05-07 21:00:32 +03:00
Diego Devesa	1e1fa27add	rpc : use backend registry, support dl backends (llama/13304)	2025-05-07 21:00:32 +03:00
Aaron Teo	e1bdd148c5	ggml : activate s390x simd for Q3_K (llama/13301) Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>	2025-05-07 21:00:32 +03:00
Johannes Gäßler	7fa8bb303f	CUDA: fix race condition in MMQ stream-k fixup (llama/13299)	2025-05-07 21:00:32 +03:00
Johannes Gäßler	7564f5e6f1	CUDA: fix race condition in MMQ ids_dst (llama/13294)	2025-05-07 21:00:32 +03:00
Jeff Bolz	22ba2e27ce	vulkan: Additional type support for unary, binary, and copy (llama/13266) Support f16->f32 copy. Support f16->f16 and f32->f32 unary ops. Support all combinations of f16/f32 for src0/src1/dst for add/sub/mul/div.	2025-05-07 21:00:32 +03:00
Daniel Bevenius	0676b2dab2	ci : add bindings-java jar artifact to release (#3126 ) This commit adds the jar artifact from bindings java to the release process.	2025-05-07 16:26:54 +02:00
Georgi Gerganov	4a512cb153	cli : avoid std::exchange ggml-ci	2025-05-07 15:39:32 +03:00
Georgi Gerganov	76171ce199	sync : ggml ggml-ci	2025-05-07 15:39:32 +03:00
Georgi Gerganov	5eac2a3fbb	vulkan : fix lint (llama/0)	2025-05-07 15:39:32 +03:00
shalinib-ibm	42938398f9	ggml : Enable MMA for BF16 in llamafile_sgemm (llama/13148) This patch upstreams llamafile's cpu matrix multiplication kernels for ppc64le using MMA builtins for BF16 data type. This change results in 9x - 40x gains in total speed S t/s (i.e. all tokens/total time), across various batch sizes tested using llama-batched-bench benchmark. The patch is tested with Meta-Llama-3-8B, and Mistral-7B models (BF16 models generated by using llama-quantize from corresponding FP32 models) on an IBM POWER10 machine. Signed-off-by: Shalini Salomi Bodapati <Shalini.Salomi.Bodapati@ibm.com>	2025-05-07 15:39:32 +03:00
Justin Santa Barbara	a8fe90ae15	rpc : avoid uninitialized memory in serialize_tensor (llama/13210) Zero out the name and padding buffers.	2025-05-07 15:39:32 +03:00
Jesse Gross	c5a5a2da5b	ggml: Don't assert fail when tensor data changes (llama/13222) The following scenario will cause an assertion failure in the graph allocator: - Build and allocate a graph containing a tensor with a non-NULL data pointer - Build and allocate a new graph where that data is NULL Result: ggml-alloc.c:819: GGML_ASSERT(talloc->buffer_id >= 0) failed This happens during revalidation because we think that memory should have been previously allocated based on the current graph but in reality the previous graph was different. In this situation, we should do a full reallocation pass.	2025-05-07 15:39:32 +03:00
Diego Devesa	8316bfd82b	build : fix build info on windows (llama/13239) * build : fix build info on windows * fix cuda host compiler msg	2025-05-07 15:39:32 +03:00
Jeff Bolz	fd1cb9fc12	vulkan: Add bfloat16 support (llama/12554) * vulkan: Add bfloat16 support This adds bfloat16 matrix multiply support based on VK_KHR_shader_bfloat16. The extension is required for coopmat multiply support, but matrix-vector multiply trivially promotes bf16 to fp32 and doesn't require the extension. The copy/get_rows shaders also don't require the extension. It's probably possible to fall back to non-coopmat and promote to fp32 when the extension isn't supported, but this change doesn't do that. The coopmat support also requires a glslc that supports the extension, which currently requires a custom build. * vulkan: Support bf16 tensors without the bf16 extension or coopmat support Compile a variant of the scalar mul_mm shader that will promote the bf16 values to float, and use that when either the bf16 extension or the coopmat extensions aren't available. * vulkan: bfloat16 fixes (really works without bfloat16 support now) * vulkan: fix spirv-val failure and reenable -O	2025-05-07 15:39:32 +03:00
Jeff Bolz	17f6b8225e	vulkan: Handle src1 batch dimension in non-contiguous mat-vec-mul shader (llama/13191) * vulkan: Handle src1 batch dimension in non-contiguous mat-vec-mul shader	2025-05-07 15:39:32 +03:00
Acly	6374ea32ca	vulkan : kernels for depthwise 2D convolution (CONV_2D_DW) (ggml/1204) * vulkan : add kernels for depthwise 2d convolution (OP_CONV_2D_DW) * review: remove src_x/y < 0 checks; add performance tests	2025-05-07 15:39:32 +03:00
Daniel Bevenius	3a66f9f248	ci : zip windows artifacts for release uploading (#3124 ) This commit adds steps to the windows jobs to zip and upload artifacts produced. The motivation for this is that currently the artifacts are not zipped which means that they will not be picked up by the release job and hence not be included in github releases. Resolves: https://github.com/ggml-org/whisper.cpp/issues/3119	2025-05-07 13:12:08 +02:00
Daniel Bevenius	9b584b0cc0	ci : add zip extension to xcframework artifact name (#3120 ) This commit adds the .zip extension to the xcframework artifact name in the GitHub Actions workflow. The motivation for this is that the release job will look for .zip files and will not find the xcframework artifact without the extension, and hence will not upload it to the release.	2025-05-07 12:02:29 +02:00
Daniel Bevenius	09846f4e12	whisper: remove MSVC warnings pragmas (#3090 ) * ggml : remove MSVC warnings pragmas This commit removes the MSVC-specific pragmas as these are now handled in CMakeLists.txt. * whisper : remove MSVC warning pragmas This commit removes the MSVC-specific pragmas. These are now handled in the CMakeLists.txt file.	2025-05-05 13:09:35 +02:00
Sacha Arbonel	bcf1ed0163	server: update abort mechanism to handle HTTP connection closure (#3112 )	2025-05-05 07:16:54 +02:00
Daniel Tang	934d4b3083	cli : support "-" for stdout like stdin (#3050 ) This changes examples/cli/cli.cpp to be like examples/common-whisper.cpp. "-of -" can be specified (or this can be inferred from "-" as the input file) to output to stdout. This is useful for piping to other applications. Log fname_out consistently when not stdout - Terminals have stdout=stderr, so remove the message before successful output to ease copying - Don't affect actual error messages - Move opening the ofstream into the factory, fixing missing open and/or error messages in output_score/output_wts - Fix struct naming convention Closes #3048	2025-05-05 07:15:39 +02:00
Arpit Jain	988dcd4b5b	docs : Update cli documentation (#3102 ) * docs : Update cli documentation This updates the documentation of cli based on the actual output. In the long term this should ideally be auto generated to prevent mismatch.	2025-05-02 14:18:33 +02:00
Jared Tweed	9f540ad8cb	cmake : removed stdc++fs (#3097 ) * removed stdc++fs * kept line, but removed stdc++fs	2025-05-02 12:41:35 +03:00
Sacha Arbonel	1fa17bc752	server : update httplib.h to version 0.20.0 (#3101 )	2025-05-02 06:09:41 +02:00
KITAITI Makoto	366082d072	ruby : refine HTTP cache feature (#3109 ) * Use cache file when model host doesn't support if-modified-since * Update gem date * Revert "ruby : ignore "Downloading" output in test_log_suppress (#3106)" This reverts commit `edbd4cb7f5`.	2025-05-01 23:04:53 +09:00
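
Note on commit fd1cb9fc12 (vulkan: Add bfloat16 support): the message says that matrix-vector multiply "trivially promotes bf16 to fp32". The sketch below illustrates why that promotion is cheap, assuming the standard bfloat16 layout in which a bf16 value is the upper 16 bits of an IEEE-754 binary32 value. It is written in Python for illustration only and is not the shader code from that commit; the function names `bf16_bits_to_float` and `float_to_bf16_bits` are hypothetical.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Promote a bfloat16 bit pattern (0..0xFFFF) to a float via fp32."""
    if not 0 <= bits <= 0xFFFF:
        raise ValueError("expected a 16-bit pattern")
    # bf16 keeps the sign, the 8 exponent bits and the top 7 mantissa bits
    # of fp32, so promotion is a 16-bit left shift with zero-filled low bits.
    return struct.unpack("<f", struct.pack("<I", bits << 16))[0]


def float_to_bf16_bits(value: float) -> int:
    """Truncate an fp32 value to bf16 by dropping the low 16 bits.

    Assumption: plain truncation, no rounding. Real implementations
    often round to nearest even instead.
    """
    (u32,) = struct.unpack("<I", struct.pack("<f", value))
    return u32 >> 16
```

For example, the bit pattern 0x3F80 promotes to 1.0, because 0x3F800000 is the fp32 encoding of 1.0. Since promotion only appends zero bits, it is exact, which is consistent with the commit message's remark that this path does not need the bfloat16 extension.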

2572 Commits