whisper.cpp/ggml
Latest commit: 2fc1d20f9e by Ivan (2024-09-24 19:45:08 +03:00)
cuda: add q8_0->f32 cpy operation (llama/9571)
llama: enable K-shift for the quantized KV cache. The K-shift will fail on unsupported backends or quant types.
Name                         Last commit                                                    Date
cmake                        whisper : reorganize source code + improve CMake (#2256)       2024-06-26 19:34:09 +03:00
include                      examples : adapt to ggml.h changes (ggml/0)                    2024-09-24 19:45:08 +03:00
src                          cuda: add q8_0->f32 cpy operation (llama/9571)                 2024-09-24 19:45:08 +03:00
.gitignore                   whisper : reorganize source code + improve CMake (#2256)       2024-06-26 19:34:09 +03:00
CMakeLists.txt               cmake : do not hide GGML options + rename option (llama/9465)  2024-09-24 19:45:08 +03:00
ggml_vk_generate_shaders.py  whisper : reorganize source code + improve CMake (#2256)       2024-06-26 19:34:09 +03:00
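
For context on the q8_0->f32 cpy operation in the latest commit: q8_0 stores values in blocks of 32 int8 quants that share one fp16 scale, so the f32 copy is a plain per-block dequantize, y[i] = d * qs[i]. The sketch below is a minimal CPU reference for that math, not the actual CUDA kernel under src/; the block_q8_0 layout and ggml_fp16_to_fp32 follow ggml's public definitions, while dequantize_row_q8_0 here is just an illustrative local helper.

```c
#include <stdint.h>
#include "ggml.h"   // ggml_fp16_t, ggml_fp16_to_fp32

#define QK8_0 32    // number of int8 quants per q8_0 block

// Matches ggml's q8_0 block layout: one fp16 scale shared by 32 quants.
typedef struct {
    ggml_fp16_t d;          // per-block scale
    int8_t      qs[QK8_0];  // quantized values
} block_q8_0;

// Dequantize n values (n a multiple of QK8_0): y[i] = d * qs[i].
static void dequantize_row_q8_0(const block_q8_0 * x, float * y, int64_t n) {
    const int64_t nb = n / QK8_0;
    for (int64_t i = 0; i < nb; ++i) {
        const float d = ggml_fp16_to_fp32(x[i].d);
        for (int j = 0; j < QK8_0; ++j) {
            y[i*QK8_0 + j] = d * x[i].qs[j];
        }
    }
}
```

As the commit body notes, this copy is what enables the K-shift for a quantized KV cache: the shift operates on an f32 view of the cached K tensor, so a q8_0 cache must first be copied out to f32, and backends or quant types without this copy path will fail.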