Commit Graph

2896 Commits

Author SHA1 Message Date
e29e36aee7 ggml-cpu: sycl: Re-enable exp f16 (llama/14462) 2025-07-01 17:54:53 +03:00
6bb1234a56 cmake : Remove redundant include path in CMakeLists.txt (llama/14452)
* Update docker.yml

Modify docker.yml so that this workflow no longer runs on a schedule; if you want to run the workflow, you can trigger it manually.

* Remove redundant include path in CMakeLists.txt

The parent directory '..' was removed from the include directories for the ggml-cpu-feats target, to avoid unnecessary include paths.

* Enable scheduled Docker image builds

Uncomments the workflow schedule to trigger daily Docker image rebuilds at 04:12 UTC, improving automation and keeping images up to date.
2025-07-01 17:54:53 +03:00
3239359bd1 scripts : make the shell scripts cross-platform (llama/14341) 2025-07-01 17:54:53 +03:00
e81be92931 SYCL: disable faulty fp16 exp kernel (llama/14395)
* SYCL: disable faulty fp16 CPU exponent for now

* Revert "SYCL: disable faulty fp16 CPU exponent for now"

This reverts commit ed0aab1ec31b4eb4b0f275dd7acd41d96a375202.

* SYCL: disable faulty fp16 CPU exponent for now

* Fix logic of disabling exponent kernel
2025-07-01 17:54:53 +03:00
130044f228 ggml : fix unmerged GGML_FPxx_TO_FPxx refactoring (llama/14443) 2025-07-01 17:54:53 +03:00
8bc638ee56 ggml : implement REGLU/GEGLU/SWIGLU ops (llama/14158)
* implement unary REGLU/GEGLU/SWIGLU cpu ops

* relax constraints

* duplicate shape of source

* fix ggml_vec_geglu_f16

* special case gated ops

* implement unary REGLU/GEGLU/SWIGLU cuda ops

* tighten constraints again

* refactor into GGML_GLU_OP

* metal : add glu kernels

ggml-ci

* add CUDA_GLU_BLOCK_SIZE [no ci]

* more constraints and use 64bit ints

ggml-ci

* 64bit multiplication [no ci]

* implement swapped variants (cpu/cuda)

* update comment [no ci]

ggml-ci

* Vulkan: Add GLU ops and shaders

* SYCL: Implement fused kernel GEGLU, SWIGLU and REGLU for single up+gate

* ggml : implement GLU for split up/gate (llama/14181)

* implement GLU for split up/gate

* add tests for ggml_glu_split

* Vulkan: Implement glu_split logic and shader support

* add split to logging [no ci]

* SYCL: refactor element_size ops and add split up and gate support to gated kernels

* SYCL: switch GEGLU to use tanh approximation

---------

Co-authored-by: 0cc4m <picard12@live.de>
Co-authored-by: Akarshan <akarshan@menlo.ai>

* GGML: increase OP count in assertion

* Refactor: Optimize SYCL element-wise operations with unary function inlining

This commit refactors the SYCL element-wise operations to improve performance by:

- Inlining unary operations (sgn, abs, elu, gelu, silu, etc.) to reduce kernel launch overhead.
- Introducing helper functions `op_xxx` for each unary operation to encapsulate the logic.
- Replacing direct kernel calls with calls to these inlined functions.
- Using `__dpct_inline__` to encourage compiler inlining.
- Minor code cleanup and consistency improvements.

The changes aim to reduce kernel launch overhead and improve the overall efficiency of element-wise operations on SYCL devices.

* vulkan: Increase workgroup size for GLU, for performance (llama/14345)

* vulkan: Increase workgroup size for GLU, for performance

* vulkan: change GLU shaders to do one element per invocation rather than one row per workgroup

* merge fix

* metal : add support for split and swap

ggml-ci

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: 0cc4m <picard12@live.de>
Co-authored-by: Akarshan <akarshan@menlo.ai>
Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
2025-07-01 17:54:53 +03:00
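For reference, the gated linear unit variants named above are usually defined as below (a hedged note: these are the standard textbook forms; which of the two branches carries the activation, and whether they come from one split tensor or separate up/gate tensors, is a convention detail of the implementation):

```latex
\mathrm{REGLU}(u, g) = u \odot \max(0, g), \qquad
\mathrm{GEGLU}(u, g) = u \odot \mathrm{GELU}(g), \qquad
\mathrm{SwiGLU}(u, g) = u \odot \mathrm{SiLU}(g), \quad \mathrm{SiLU}(g) = g\,\sigma(g)
```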
00b36237ba vulkan: Add fusion support for RMS_NORM+MUL (llama/14366)
* vulkan: Add fusion support for RMS_NORM+MUL

- Add a use_count to ggml_tensor, so we can detect if an output is used more than once.
- Change the ggml-vulkan rms_norm shader to optionally multiply by another tensor.
- Add detection logic and basic fusion logic in ggml-vulkan.
- Add some testing support for fusion. Rather than computing one node at a time, allow
for computing the whole graph and just testing one node's results. Add rms_norm_mul tests
and enable a llama test.

* extract some common fusion logic

* fix -Winconsistent-missing-override

* move ggml_can_fuse to a common function

* build fix

* C and C++ versions of can_fuse

* move use count to the graph to avoid data races and double increments when used in multiple threads

* use hash table lookup to find node index

* change use_counts to be indexed by hash table slot

* minimize hash lookups

style fixes

* last node doesn't need single use.
fix type.
handle mul operands being swapped.

* remove redundant parameter

---------

Co-authored-by: slaren <slarengh@gmail.com>
2025-07-01 17:54:53 +03:00
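A conceptual sketch of the fusion check described above, assuming illustrative struct fields and a hypothetical helper name (this is not the actual ggml API; the real logic lives in the common `ggml_can_fuse` helper and the ggml-vulkan backend):

```cpp
#include <array>

// Conceptual sketch only: field and function names are illustrative.
enum op_t { OP_NONE, OP_RMS_NORM, OP_MUL };

struct node_t {
    op_t op = OP_NONE;
    std::array<const node_t *, 2> src{};   // input operands
    int use_count = 0;                     // how many graph nodes read this output
};

// RMS_NORM can be folded into the MUL that consumes it only if the MUL actually
// reads it (operands may be swapped) and nothing else needs the intermediate result.
static bool can_fuse_rms_norm_mul(const node_t * norm, const node_t * mul) {
    if (norm->op != OP_RMS_NORM || mul->op != OP_MUL) {
        return false;
    }
    const bool consumes = (mul->src[0] == norm) || (mul->src[1] == norm);
    return consumes && norm->use_count == 1;
}
```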
b900ee424c CUDA: add bf16 and f32 support to cublas_mul_mat_batched (llama/14361)
* CUDA: add bf16 and f32 support to cublas_mul_mat_batched

* Review: add type traits and make function more generic

* Review: make check more explicit, add back comments, and fix formatting

* Review: fix formatting, remove useless type conversion, fix naming for bools
2025-07-01 17:54:53 +03:00
f641a4c410 vulkan: handle noncontig in the final case of ggml_vk_get_cpy_pipeline (llama/14378) 2025-07-01 17:54:53 +03:00
9e48afba2f vulkan: lock accesses of pinned_memory vector (llama/14333) 2025-07-01 17:54:53 +03:00
f31ed384f4 fix async_mode bug (llama/14432) 2025-07-01 17:54:53 +03:00
0b09f5bbad vulkan: Fix GGML_VULKAN_SHADER_DEBUG_INFO (llama/14427)
This setting needs to be passed through to vulkan-shaders-gen
2025-07-01 17:54:53 +03:00
48fb51f314 ggml : add ggml_set_rows (llama/14274)
* ggml : add ggml_set_rows

Add ggml_set_rows(a, b, c) which copies rows from 'b' into 'a' using
indices from 'c'.

ref: #8366

* use I64 for indices

* ggml : add repeat impl for i64

* ggml : add ggml_is_contiguous_rows

* ggml : ggml_set_rows support broadcast

* ggml : ggml_set_rows support quantized dst

ggml-ci

* ggml : support GGML_TYPE_F32 ".from_float" trait

* ggml : ggml_set_rows update comment + better index name

* tests : add ggml_set_rows

* metal : add ggml_set_rows implementation

ggml-ci

* ggml : simplify forward_dup_f32

* ggml : fix supports_op

* tests : add comment to set_rows

* ggml : leave the repeat_i64 for a separate PR

ggml-ci

* ggml : set_rows use std::min instead of MIN

* ggml : better error message for set_rows unsupported type

* metal : perform op->type check only once

* tests : more consistent implementation + more tests

ggml-ci

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-07-01 17:54:53 +03:00
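The operation described above has simple scatter semantics; a minimal sketch on plain arrays follows (this is not the actual ggml kernel, which additionally handles broadcasting and quantized destination types):

```cpp
#include <cstdint>
#include <vector>

// Minimal illustration: for every source row i, dst[idx[i]] = src[i],
// i.e. rows of 'b' are copied into 'a' at the positions given by 'c'.
// Indices are 64-bit, matching the "use I64 for indices" change above.
static void set_rows(std::vector<std::vector<float>> & dst,
                     const std::vector<std::vector<float>> & src,
                     const std::vector<int64_t> & idx) {
    for (size_t i = 0; i < src.size(); ++i) {
        dst[idx[i]] = src[i];   // copy row i of src into row idx[i] of dst
    }
}
```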
566462a5c0 cmake: regen vulkan shaders when shaders-gen sources change (llama/14398)
* Add shaders-gen sources as target deps
2025-07-01 17:54:53 +03:00
c300f1e32d metal : add special-case mat-vec mul for ne00 == 4 (llama/14385)
ggml-ci
2025-07-01 17:54:53 +03:00
c848b9fbef metal : batch rows copy in a single threadgroup (llama/14384)
* metal : batch rows copy in a single threadgroup

ggml-ci

* metal : handle some edge cases when threadgroup size is not a power of 2

ggml-ci
2025-07-01 17:54:53 +03:00
a5e6a3c953 musa: enable fp16 mma (all) and cublas on qy2 (llama/13842)
* musa: enable fp16 mma (all) and cublas on qy2

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* Update ggml/src/ggml-cuda/ggml-cuda.cu

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* Address review comments

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* Address review comments

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

* musa: disable MUL_MAT_ID (q2_k × f32) due to precision issues

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

---------

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-07-01 17:54:53 +03:00
16aa7d151d ggml-cpu: enable IBM NNPA Vector Intrinsics (llama/14317)
* ggml-cpu: add nnpa compile flag

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 4a9f60c201573128f73a65999b3e5cc497fae5c1)

* ggml-cpu: add fp16->fp32 nnpa first

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 8d4a7987f9c1887f716be96250f2caeee0253929)

* ggml-cpu: add fp32->fp16

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 0ff0d6516247a41d2ade42b42cf0d676a4dd1627)

* ggml-cpu: better variable names

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 2f58bbcbb89c183340e252362b2a40651f573f1f)

* docs: update s390x docs

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 01b929491b50071a5d0572235dcf5a449da70aa7)

* ggml-cpu: add debugging prints to see if dlf16 is correct

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix print vs printf

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix float placeholder

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: ensure fp16 and fp32 load and stores are called

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fp16 load ensured to hit

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: remove sigint from fp16 store

For some reason, the function is not getting a hit when debugged with
gdb. We will need to investigate further.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: activate nnpa for ggml_cpu_fp16_to_fp32

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: nnpa activate ggml_cpu_fp16_to_fp32 for 8 elements

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: nnpa switch to vec_xst test

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: switch to vec_xst for 4 element loops also

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: rework noop

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: remove noop, general code cleanup

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: clarify variable naming

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: activate nnpa for ggml_cpu_fp32_to_fp16

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add breakpoint for debugging

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: test fix for conversion failure

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: disable fp32->fp16 nnpa conversions for now

There are some conversion failures in NNPA that require the eyes of an
IBM STSM. Will create a separate PR to introduce the fp32->fp16 change.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: switch to elif macro

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: reattempt fp32->fp16

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix typo

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: reattempt fp32->fp16

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix compiler types

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: change to typedef vector types

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add 4 element loops for fp32->fp16

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: clarified vector naming

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: bring back fp32->fp16 store nnpa

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: activate nnpa fp32->fp16 or fp16->fp32 compute

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add nnpa macro check in ggml-impl

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add missing __func__

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: diagnose why __NNPA__ macro is not being defined

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: import vecintrin.h to fix compiler errors

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: update macro tests

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: move s390x typedef to own header file

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "ggml-cpu: move s390x typedef to own header file"

This reverts commit 157f856c34589566151630e294563a420702db39.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: switch to importing ggml-cpu-impl instead

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix macro declaration

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: test more macros

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add debug prints

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: bruteforce macro definitions

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: move macro definitions

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add ggml-impl.h to cmakelists

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: switch to private macros

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: move s390x typedef to own header file

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 157f856c34589566151630e294563a420702db39)

* ggml-cpu: move things around

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: bring back compile macros

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: switch to quotes for import

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add compiler error macro

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add s390x detection in ggml-src

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: bring back compile definitions

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: undo cmakelists work

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "ggml-cpu: move s390x typedef to own header file"

This reverts commit 18d79e1a30b39d9aaa0bd58400c5cf2c32135c9a.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: remove typedefs.h

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: remove typedef from cmakelists

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add ggml-impl.h future notes

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: add todo comment for future reference

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: clarify naming of dlf16

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: remove unnecessary target compile definitions

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: move nnpa fp16->fp32 and fp32->fp16 to simd-mappings

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml: refactor fp32->fp16 and fp16->fp32 simd to ggml-cpu

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* docs: update broken huggingface link for s390x

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix duplicate func names during compile

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "ggml-cpu: fix duplicate func names during compile"

This reverts commit fbb733451f27677063b914d4f6c9a9841d45b38d.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "ggml: refactor fp32->fp16 and fp16->fp32 simd to ggml-cpu"

This reverts commit bd288e8fa52b5244f65cee21cb61062f1a9e0ca5.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml: refactor fp16<->fp32 simd to ggml-cpu

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix missing simd-mappings.h import in quants.c

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix missing simd-mappings.h within repack

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix amx mmq missing simd-mappings.h

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: attempt at fixing loongarch failing build

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: move nnpa together with other fp16<->fp32 simd

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: fix wrong refactor of ggml-base

ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164176555

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml: remove dependency on ggml-cpu from ggml-base

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: rename all fp16<->fp32 macros to prefix with ggml_cpu

ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164449406

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: remove mistaken fallback macro

Fallback logic was already implemented, but I was too sleepy to realise.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml: move ggml_table_f32_f16 to ggml-cpu

ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164775006

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: move ggml_table_f32_f16 back to ggml-base due to ci failures

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "ggml-cpu: move ggml_table_f32_f16 back to ggml-base due to ci failures"

This reverts commit 32a3533564bdb7902cefb9c89b1c9e956a81ce29.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "ggml: move ggml_table_f32_f16 to ggml-cpu"

This reverts commit 9e40d984ad27d7b60392fb2b7548885201864fe4.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml: move ggml_table_f32_f16 to ggml-cpu

ref: https://github.com/ggml-org/llama.cpp/pull/14317#discussion_r2164775006

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
(cherry picked from commit 9e40d984ad27d7b60392fb2b7548885201864fe4)

* ggml: move ggml_table_f32_f16 to ggml-cpu.c

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: extern c ggml_table_f32_f16 + chore docs

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: dedup ggml_table_f32_f16 from simd-mappings.h

we rely on the variable declaration in ggml-cpu.c instead

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "ggml-cpu: dedup ggml_table_f32_f16 from simd-mappings.h"

This reverts commit f71b21d2f74f5e03ec0c2b4fefd3cbf395aecf16.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* ggml-cpu: bring back ggml_table_f32_f16

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* Revert "ggml-cpu: bring back ggml_table_f32_f16"

This reverts commit 2dce119178bed5ef5c8398c4230ddd14fef80e49.

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>

* fix ggml time initialization

* fix f32_f16 table init

* remove extra line

---------

Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Co-authored-by: slaren <slarengh@gmail.com>
2025-07-01 17:54:53 +03:00
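Most of the back-and-forth above is about where the fp16<->fp32 helpers live and how the NNPA fast path is gated behind the `__NNPA__` macro with a portable fallback (the macros themselves end up prefixed `GGML_CPU_`, as noted above). A rough sketch of that gating pattern, for orientation only; the NNPA intrinsics are deliberately omitted and this is not the actual ggml-cpu implementation:

```cpp
#include <cstdint>
#include <cstring>

// Portable scalar fallback: expand IEEE-754 binary16 to binary32.
static float fp16_to_fp32_scalar(uint16_t h) {
    const uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    const uint32_t exp  = (h >> 10) & 0x1fu;
    const uint32_t man  =  h & 0x3ffu;
    uint32_t bits;
    if (exp == 0) {
        bits = sign;                                      // zeros/subnormals flushed in this sketch
    } else if (exp == 31) {
        bits = sign | 0x7f800000u | (man << 13);          // inf / NaN
    } else {
        bits = sign | ((exp + 112u) << 23) | (man << 13); // rebias exponent 15 -> 127
    }
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

static float fp16_to_fp32(uint16_t h) {
#if defined(__NNPA__)
    // on s390x with NNPA, this is where the vectorized conversion path would be
    // selected (intrinsics omitted in this sketch)
    return fp16_to_fp32_scalar(h);
#else
    return fp16_to_fp32_scalar(h);
#endif
}
```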
99764f5767 ggml : do not output unprintable characters on GGUF load failure (llama/14381) 2025-07-01 17:54:53 +03:00
fc28594112 sycl: GGML_SYCL_DISABLE_OPT on by default for all Intel Devices (llama/13973) 2025-07-01 17:54:53 +03:00
acfbf2921b opencl: ref count ggml_backend_opencl_context and refactor profiling (llama/14254)
* Move profiling info into `ggml_backend_opencl_context`
* Add `enqueue_ndrange_kernel` to launch kernel
2025-07-01 17:54:53 +03:00
6a1d12a8ea CUDA/HIP: optimize mmv paths taken for HIP devices (llama/14324)
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2025-07-01 17:54:53 +03:00
06b01ba87b CUDA: mul_mat_v support for batch sizes > 1 (llama/14262)
* CUDA: mul_mat_v support for batch sizes > 1

* use 64 bit math for initial offset calculation
2025-07-01 17:54:53 +03:00
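The "use 64 bit math for initial offset calculation" point is about avoiding 32-bit integer overflow when batch and matrix dimensions are large; an illustrative example (not the actual CUDA kernel code):

```cpp
#include <cstdint>

// With batched matrices, a flat element offset such as batch * rows * cols can
// exceed INT32_MAX, so the product must be formed in 64-bit arithmetic before
// any pointer arithmetic.
static const float * batch_base(const float * data, int64_t rows, int64_t cols, int64_t batch) {
    const int64_t offset = batch * rows * cols;   // would overflow if computed in 32-bit int
    return data + offset;
}
```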
791201a974 HIP: enable vec fattn on RDNA4 (llama/14323) 2025-07-01 17:54:53 +03:00
abb650c0ec CUDA: add mean operation (llama/14313)
* CUDA: add mean operation

* add back sum_rows_f32_cuda

* Review: early exit if col!=0
2025-07-01 17:54:53 +03:00
e036676795 Add support for VK_EXT_debug_utils to add labels to Vulkan objects. (llama/13792)
* Add support for VK_EXT_debug_utils to add labels to Vulkan objects. In step 1 compute pipelines are getting labeled.

* remove #ifdef for debug utils and add queue marker.
2025-07-01 17:54:53 +03:00
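A hedged sketch of how a compute pipeline can be labeled through VK_EXT_debug_utils; ggml-vulkan's details differ, but the mechanism is the same (the extension entry point must be resolved at runtime):

```cpp
#include <vulkan/vulkan.h>

// Give a Vulkan pipeline a human-readable name so it shows up in validation
// messages and debugging tools. Requires the VK_EXT_debug_utils extension.
static void label_pipeline(VkInstance instance, VkDevice device, VkPipeline pipeline, const char * name) {
    // extension functions are not exported by the loader, so resolve at runtime
    auto pfn = (PFN_vkSetDebugUtilsObjectNameEXT)
        vkGetInstanceProcAddr(instance, "vkSetDebugUtilsObjectNameEXT");
    if (!pfn) {
        return; // VK_EXT_debug_utils not enabled or not available
    }
    VkDebugUtilsObjectNameInfoEXT info = {};
    info.sType        = VK_STRUCTURE_TYPE_DEBUG_UTILS_OBJECT_NAME_INFO_EXT;
    info.objectType   = VK_OBJECT_TYPE_PIPELINE;
    info.objectHandle = (uint64_t)pipeline;
    info.pObjectName  = name;
    pfn(device, &info);
}
```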
c1418b9906 metal : fix thread-safety (llama/14300)
ggml-ci
2025-07-01 17:54:53 +03:00
9d7cb80f04 ggml-cpu : "align corners" for bilinear upscale/downscale (ggml/1285)
* add "align corners" mode for bilinear upscale, and allow downscaling
* add ggml_interpolate, deprecate ggml_upscale_ext, pass in align-corners as bit-flag
* test-backend-ops: replace ggml_upscale_ext with ggml_interpolate, add test cases for downscale and align-corners
2025-07-01 17:54:53 +03:00
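For reference, the two common bilinear coordinate mappings look like this (standard conventions; the align-corners bit-flag mentioned above selects between them):

```latex
\text{half-pixel:}\quad x_{\text{src}} = \left(x_{\text{dst}} + \tfrac{1}{2}\right)\frac{W_{\text{src}}}{W_{\text{dst}}} - \tfrac{1}{2}
\qquad\qquad
\text{align corners:}\quad x_{\text{src}} = x_{\text{dst}}\,\frac{W_{\text{src}} - 1}{W_{\text{dst}} - 1}
```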
515df20351 ggml-quants : rename best_mad to best_error (ggml/1283)
This commit renames the variable `best_mad` to `best_error` in the
`make_qkx2_quants` function.

The motivation for this is that the name `best_mad` can be somewhat
confusing if mean absolute deviation (MAD) is not in use.
2025-07-01 17:54:53 +03:00
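For context, mean absolute deviation (the quantity the old name suggested) is conventionally defined as:

```latex
\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n} \left| x_i - \bar{x} \right|
```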
c88ffbf9ba ci : use selective copy for musa image (#3296)
This commit modified the musa Dockerfile to selectively copy the
directories needed for the container image.
This commit also added a step to the docker workflow to free up disk
space in an attempt to make enough room for the large musa build
containers.

The motivation for this change is to reduce the size of the container
image and try to avoid disk usage issues in CI.
2025-06-27 15:43:56 +02:00
7069394447 ci: set fail-fast to false in docker.yml (#3294)
* ci: set fail-fast to false in docker.yml

This commit modifies the GitHub Actions workflow for Docker builds to
disable the fail-fast behavior.

The motivation for this is that currently, if one of the strategy jobs
fails, any other job that is in progress will be cancelled. There is no
need for this, as the jobs are independent.

* ci : update docker.yml to use a single build

This commit updates the docker job to only build the image once instead
of twice (the double build only happens when pushing to the master branch).
Instead, the image is tagged with the commit SHA when pushing to master.

The motivation for this change is to reduce the time it takes to run
this job; it might also help with the disk space issues we are
experiencing for this job when it runs on pushes to master.
2025-06-27 09:55:56 +02:00
f8abbeb234 ruby : add Whisper::VERSION (#3292)
* Add a test for segment

* Check option existence

* Use more proper variable to define build option

* Assert Core ML enabled

* Define Whisper::VERSION

* Add test for Whisper::VERSION

* Add signature of Whisper::VERSION
2025-06-27 04:41:26 +02:00
32cf4e2aba whisper : add version function (#3289)
* whisper : add version function

This commit adds a version function to the whisper API.

The motivation for this is that it might be convenient to have a way to
programmatically check the version.

Example usage:
```c++
printf("Using whisper version: %s\n", whisper_version());
```
Will output:
```console
Using whisper version: 1.7.6
```

* examples : add version to android example CMakeLists.txt
2025-06-26 18:09:42 +02:00
35034c5aea ci : add should_release variable (#3288)
* ci : add should_release variable

This commit adds a `should_release` variable to the GitHub Actions
workflow to determine if a release should be created based on the tag or
branch conditions.

The motivation for this is that it simplifies the logic for deciding
whether to upload artifacts or not, making it easier to maintain if we
need to change the conditions in the future.

* ci : set release draft to true
2025-06-26 16:29:29 +02:00
897b071dc6 docs : add cmake "-j" flag in README.md (#3284)
Add the "-j" flag to the cmake build commands in the README.md file to enable multithreaded builds.
2025-06-26 13:23:19 +02:00
4daf7050ca ci : add support for tag-based releases (#3287)
This commit modifies the GitHub Actions workflow to support
tag-based releases. When a tag is pushed that starts with 'v', the
workflow will use that tag name for the release process.

I think this was once the behavior, but it was lost in updates that
I've made to the workflow. This commit restores that functionality.
2025-06-25 21:43:58 +02:00
a8d002cfd8 release : v1.7.6 v1.7.6 2025-06-25 16:47:03 +03:00
06bdaa6c0c bench : update benches 2025-06-25 16:45:19 +03:00
dc8dda60ee bench : print system info before ctx check 2025-06-25 16:01:32 +03:00
1ad258ca31 stream : add nullptr check of whisper_context (#3283)
* stream : add nullptr check of whisper_context

This commit adds a check to ensure that the `whisper_context` is not
null after initialization.

The motivation for this is that currently, if the initialization fails,
the program continues to run, leading to a segmentation fault. This sort
of check is performed by other examples, like whisper-cli.

Refs: https://github.com/ggml-org/whisper.cpp/issues/3280#issuecomment-3003778035

* examples : add nullptr check for whisper_context
2025-06-25 14:16:31 +02:00
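A minimal sketch of the kind of check this commit describes, assuming the current whisper.h initialization API (paths and parameters are placeholders):

```cpp
#include <cstdio>
#include "whisper.h"

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <model.bin>\n", argv[0]);
        return 1;
    }
    // if initialization fails (bad path, unsupported model, ...), the returned
    // context is null and must not be used - this is the check the commit adds
    struct whisper_context * ctx =
        whisper_init_from_file_with_params(argv[1], whisper_context_default_params());
    if (ctx == nullptr) {
        fprintf(stderr, "error: failed to initialize whisper context\n");
        return 2;
    }
    // ... run inference ...
    whisper_free(ctx);
    return 0;
}
```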
7dd2997a01 ci : enable main-cuda build (#3282)
This commit re-enables the main-cuda Docker build in the CI workflow.
The main-cuda Dockerfile has been updated to remove build artifacts
and also print the size of the /app directory after the build. A similar
change was recently made to the musa Dockerfile, and perhaps this job
was also having similar disk space issues.

The motivation for this change is that this configuration has been
disabled for a while due to persistent build failures. However, the
actual logs are no longer available.

Resolves: https://github.com/ggml-org/whisper.cpp/issues/3040
2025-06-25 12:12:36 +02:00
c85b1ae84e bindings.java : update java example (#3281)
This commit updates the example in the README.md file as the current Java example code is not working.

Resolves: https://github.com/ggml-org/whisper.cpp/issues/2860
2025-06-25 06:35:38 +02:00
0083335ba0 coreml : backport CoreML features to macos < 14 (#3255) 2025-06-24 09:24:27 +02:00
9c47902308 ci : reduce musa image size (#3277)
* ci : reduce musa image size

This commit contains an attempt to reduce the size of the musa Docker
image by copying only the necessary files from the build stage.

The motivation for this is that the CI runs sometimes fail with out of
memory errors. These runs seem to pass for PRs, at least
sometimes, but they fail upon push to the master branch.

* ci : remove build time files instead of selective copying
2025-06-24 08:20:28 +02:00
a0d2c632e4 whisper : add .gitignore entries for OpenVINO support (#3276) 2025-06-24 07:50:16 +02:00
4d6ae52ed3 command: output commands to text file (#3273)
This commit implements handling for the command line argument `-f --file FNAME`, which was previously missing.
2025-06-24 06:41:21 +02:00
a422176937 ci : add apt-get clean to musa Dockerfile (#3275)
* ci : add apt-get clean to musa Dockerfile

This commit adds `apt-get clean` to the musa Dockerfile to reduce the
image size by removing cached package files after installation.

The motivation for this is to try to reduce the size of the Docker image
and see if this can avoid the "no space left on device" error during
the CI build process.

Refs: https://github.com/ggml-org/whisper.cpp/actions/runs/15815324254
2025-06-23 12:34:44 +02:00
cead8f5357 ruby : specify Apple frameworks explicitly on build (#3270)
* Add Apple frameworks to $LDFLAGS when needed

* Add utility method to Options

* Remove unnecessary property `date` from gemspec

* Add Apple frameworks for CoreML build

* Add Accelerate framework only for Apple platform

* Fix ZipURI#cache signature

* Download test fixtures if needed
2025-06-23 06:34:05 +02:00
e6c10cf3d5 talk-llama : sync llama.cpp
ggml-ci
2025-06-21 07:34:17 +03:00
d65a579a0a sync : ggml
ggml-ci
2025-06-21 07:34:17 +03:00