whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2024-11-07 16:44:13 +01:00

Dibakar Gope 5498b0e6c0 ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (llama/5780)	2024-08-08 22:48:46 +03:00

* Arm AArch64: optimized GEMV and GEMM kernels for q4_0_q8_0, and q8_0_q8_0 quantization
* Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions
* Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions
* Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions
* Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions
* Arm AArch64: add copyright claim only to ggml-aarch64.cpp and ggml-aarch64.h files
* Arm AArch64: minor code refactoring for rebase
* Arm AArch64: minor code refactoring for resolving a build issue with cmake
* Arm AArch64: minor code refactoring to split the Q4_0_AARC64 type into three separate types: Q4_0_4_4, Q4_0_4_8, and Q4_0_8_8
* Arm AArch64: minor code change for resolving a build issue with server-windows
* retrigger checks
* Arm AArch64: minor code changes for rebase
* Arm AArch64: minor changes to skip the pr#7433 vec_dot code for arm cpus with SVE VL not equal to 256 bits
* Arm AArch64: remove stale LLAMA_QKK_64 from CMakeLists.txt and delete build.zig
* Arm AArch64: add reference scalar gemm and gemv, and avoid dynamic memory allocations during quantization for Q4_0_4_4, Q4_0_4_8, and Q4_0_8_8
* Arm AArch64: add multithreaded quantization support for the new types: Q4_0_4_4, Q4_0_4_8, and Q4_0_8_8
* Arm AArch64: minor code refactoring
* Arm AArch64: simplify logic for calling gemm and gemv functions in ggml_compute_forward_mul_mat
* Arm AArch64: minimize changes in ggml_compute_forward_mul_mat
* Arm AArch64: minor code refactoring, and add reference scalar code to quantize routines for new quant types
* Arm AArch64: minor code refactoring
* Arm AArch64: minor code refactoring
* Arm AArch64: minor code refactoring
* rebase on the latest master commit 3fd62a6 and adapt to the new directory structure
* Arm AArch64: remove a redundant comment
* Arm AArch64: add pragma in ggml-aarch64.c to turn -Woverlength-strings warning off
* Arm AArch64: use __aarch64__ check to guard 64-bit neon kernels
* Arm AArch64: update docs/build.md README to include compile time flags for building the Q4_0_4_4 quant type

..
ggml-alloc.h	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
ggml-backend.h	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
ggml-blas.h	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
ggml-cuda.h	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
ggml-kompute.h	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
ggml-metal.h	Removes multiple newlines at the end of files that is breaking the editorconfig step of CI. (llama/8258)	2024-07-08 14:53:55 +03:00
ggml-rpc.h	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
ggml-sycl.h	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
ggml-vulkan.h	whisper : reorganize source code + improve CMake (#2256)	2024-06-26 19:34:09 +03:00
ggml.h	ggml : add AArch64 optimized GEMV and GEMM Q4 kernels (llama/5780)	2024-08-08 22:48:46 +03:00