* build: add dockerfile for ci
* ci: add action to build/push docker image
* fix: lowercase repository to fix ci
* ci: update cuBLAS flag
* build: install curl and ffmpeg in image
* docs: add docker section
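For local testing, the new Dockerfile can be built and run along these lines (a minimal sketch; the `whisper.cpp` tag and the mounted paths are illustrative, not the image published by CI):
```
# build the image from the repository root
docker build -t whisper.cpp .

# run a container, mounting a local models directory;
# what the entrypoint accepts depends on the Dockerfile
docker run -it --rm -v "$(pwd)/models:/models" whisper.cpp
```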
* fix: improve args check when downloading model
* Create bench.py
* Various benchmark results
* Update benchmark script with hardware name and file checks
* Remove old benchmark results
* Add git shorthash
* Round to 2 digits on calculated floats
* Fix the header reference when sorting results
* Fix order of models
* Parse file name
* Simplify filecheck
* Improve run print statement
* Use simplified model name
* Update benchmark_results.csv
* Process single or lists of processors and threads
* Ignore benchmark results, don't check them in
* Move bench.py to extra folder
* Readme section on how to use
* Move command to correct location
* Use separate list for models that exist
* Handle subprocess error in git short hash check
* Fix filtered models list initialization
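The short hash recorded with each benchmark result can be reproduced with a plain git invocation; the script wraps something like this in `subprocess` and now handles the failure case:
```
# abbreviated hash of the current commit; exits non-zero
# outside a git checkout, which the script now handles
git rev-parse --short HEAD
```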
* metal : init
* whisper : factor out graph builds
* whisper : allocate encoder and decoder using ggml-alloc
* whisper : ggml-alloc is now supported
* whisper : CoreML support ggml-alloc
* build : fix ggml-alloc
* ios : update submodule
* extra : update sync-ggml.sh script to also sync ggml-alloc
* ci : see if this is causing the crash
* whisper : refactor ggml-alloc init
* whisper.android : try to fix build
* whisper : initial Metal version
* ci : try to debug vmem issue
* metal : decoder works on GPU!
* metal : add multi-decoder support
* ggml : fix ggml_nbytes (probably temp solution)
* metal : run "cross" step on the GPU
* whisper : remove ggml_repeat in the encoder
* whisper : offload the Encoder to Metal
* ggml : use simpler ggml_nbytes() implementation
* ggml-alloc : try to make CI happy by reducing vram to 128GB
* whisper : add whisper_allocr to wrap ggml_allocr
* whisper : factor out alloc init in a function
* cmake : update to support Metal build
* whisper : add <functional> header
* objc : fix build (no Metal yet)
* ios : add Metal support
* swiftui : fix build
* metal : speed-up KQ multiplication
* metal : sync latest llama.cpp kernels
* readme : add Metal info
* ios : update submodule
* coreml : add code to toggle Core ML config (CPU, ANE, GPU)
* bench : fix timings by running a pre-heat
* bench : start benching the decoder
* whisper : add ggml_mul_mat_pad
* bench : fix uninitialized vars
* whisper : add comment for disabling mul-mat padding
* whisper : add description of ggml_mul_mat_pad
* whisper : clean-up ggml_mul_mat_pad
* metal : remove the "concurrent" flag
* bench : variable n_past
* ios : update SPM package
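For reference, a Metal-enabled build would look roughly like this (a sketch: the `WHISPER_METAL` flag name is an assumption, mirroring the `WHISPER_CLBLAST` pattern shown below):
```
# Makefile build with Metal offloading (flag name assumed)
WHISPER_METAL=1 make

# CMake equivalent (option name assumed)
cmake -DWHISPER_METAL=ON .. && make
```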
* ggml : CLBlast support as in llama.cpp
* CMake/Makefile : CLBlast support as in llama.cpp
Building with CLBlast speeds up whisper.cpp ~2x on low-end / older AMD APUs (CPUs with an integrated GPU) such as the A9.
Usage:
```
# Makefile:
cd whisper.cpp
WHISPER_CLBLAST=1 make

# CMake:
cd whisper.cpp
mkdir build && cd build
cmake -DWHISPER_CLBLAST=ON ..
make
```
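After either build, transcription runs the same as with the CPU-only binary, e.g.:
```
# transcribe the bundled sample with a downloaded model
./main -m models/ggml-base.en.bin -f samples/jfk.wav
```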
* Update README.md
Added OpenCL build instructions.
* Instruction: Partial OpenCL GPU support via CLBlast
Added build instructions and examples for Make and CMake to support OpenCL-enabled GPUs.
Users with no prior C/C++ knowledge may download the Whisper model but then fail to use the "make" command as specified, because they forgot, or did not know, that they needed to clone the repository first. Hopefully this modification clears things up.
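The sequence the new instructions spell out is roughly:
```
# clone the repository first -- the Makefile lives inside it
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp

# download a model, then build
bash ./models/download-ggml-model.sh base.en
make
```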
The section of the README explaining `--print-colors` includes only a screenshot, whose directories are inconsistent with the other examples. This commit adds an example shell command, consistent with the remaining examples.
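The added example is along these lines (paths consistent with the other README examples):
```
# print words colorized by transcription confidence
./main -m models/ggml-base.en.bin -f samples/jfk.wav --print-colors
```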