whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-08-03 10:38:16 +02:00

Files

Jeff Bolz 21b01a21b6 vulkan: Optimize contiguous copies (llama/10254)

* tests: Fix memory bandwidth calculation for perf tests

Add a flops calculation for flash attention.

Add one GGML_OP_CPY perf test.

* vulkan: Optimize contiguous copies

Add a variant of the copy shader for when the tensors are contiguous. Avoid
the complex addressing calculations, and do four elements per invocation
to hide some other overhead.
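
The difference between the two copy paths can be sketched on the CPU. This is only an illustration of the idea described above, not the shader from this commit: the function names, the use of element strides (rather than byte strides), the assumption that dimension 0 varies fastest, and the `per_invocation` parameter are all hypothetical.

```python
def copy_strided(src, dst, shape, src_strides, dst_strides):
    """Generic path: turn each flat index into coordinates, then apply strides.

    Strides are in elements; dimension 0 is assumed to vary fastest.
    """
    n = 1
    for dim in shape:
        n *= dim
    for idx in range(n):
        rem = idx
        s_off = 0
        d_off = 0
        # Per-element div/mod work: the "complex addressing calculations".
        for dim, s_st, d_st in zip(shape, src_strides, dst_strides):
            coord = rem % dim
            rem //= dim
            s_off += coord * s_st
            d_off += coord * d_st
        dst[d_off] = src[s_off]


def copy_contiguous(src, dst, n, per_invocation=4):
    """Fast path: both buffers are dense, so the flat index is the offset."""
    num_invocations = (n + per_invocation - 1) // per_invocation
    for inv in range(num_invocations):  # one iteration stands in for one invocation
        base = inv * per_invocation
        for k in range(per_invocation):
            i = base + k
            if i < n:  # guard the tail when n is not a multiple of per_invocation
                dst[i] = src[i]


# Usage: a dense 3 x 5 tensor copied by both paths.
shape = (3, 5)
dense = (1, 3)  # element strides of a contiguous tensor with dimension 0 fastest
src = list(range(15))
out_generic = [None] * 15
out_fast = [None] * 15
copy_strided(src, out_generic, shape, dense, dense)
copy_contiguous(src, out_fast, 15)
```

In the contiguous case the two functions produce the same result, but the fast path skips the per-element coordinate decomposition and amortizes fixed per-invocation cost over several elements.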

Apply similar changes to the scale shader, since scale is always contiguous.

Add a "progress bar" for shader compiles.

2024-11-15 15:21:04 +02:00

cmake

whisper : reorganize source code + improve CMake (#2256)

2024-06-26 19:34:09 +03:00

include

metal : optimize FA kernels (llama/10171)

2024-11-15 15:21:04 +02:00

src

vulkan: Optimize contiguous copies (llama/10254)

2024-11-15 15:21:04 +02:00

.gitignore

whisper : reorganize source code + improve CMake (#2256)

2024-06-26 19:34:09 +03:00

CMakeLists.txt

metal : opt-in compile flag for BF16 (llama/10218)

2024-11-15 15:21:04 +02:00

ggml_vk_generate_shaders.py

whisper : reorganize source code + improve CMake (#2256)

2024-06-26 19:34:09 +03:00