sync : ggml (#2001)

* sync : update scripts

* sync : ggml

* talk-llama : sync llama.cpp

* make : WHISPER_CUBLAS -> WHISPER_CUDA

* ci : try to fix sycl build

* talk-llama : fix make build
Georgi Gerganov
2024-03-27 18:55:10 +02:00
committed by GitHub
parent 1558ec5a16
commit 2948c740a2
90 changed files with 15702 additions and 12476 deletions


@@ -414,11 +414,11 @@ For more information about the Core ML implementation please refer to PR [#1037]
 With NVIDIA cards the processing of the models is done efficiently on the GPU via cuBLAS and custom CUDA kernels.
 First, make sure you have installed `cuda`: https://developer.nvidia.com/cuda-downloads
-Now build `whisper.cpp` with cuBLAS support:
+Now build `whisper.cpp` with CUDA support:
 ```
 make clean
-WHISPER_CUBLAS=1 make -j
+WHISPER_CUDA=1 make -j
 ```
 ## OpenCL GPU support via CLBlast
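
This hunk covers only the Makefile path of the rename. For CMake-based builds, a minimal sketch of the equivalent invocation, assuming the CMake option was renamed from `WHISPER_CUBLAS` to `WHISPER_CUDA` in the same sync (those changes are not shown in this hunk):

```
# sketch only: assumes the CMake option was renamed to WHISPER_CUDA
# alongside the Makefile variable (not shown in this hunk)
cmake -B build -DWHISPER_CUDA=ON
cmake --build build -j --config Release
```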