- search for ffmpeg libs/headers at cmake time
- added ffmpeg-transcode.cpp into libcommon if ffmpeg on
- hooked ffmpeg trancoding in common read_wav(...)
- passed test:
./main -m ggml-base.en.bin -f samples/jfk.mp3
* cmake : make WHISPER_NO_AVX512=ON disable all subsets of AVX-512
Previously it happened only for MSVC, but it makes sense to have the
same behavior for other compilers too.
* make : reorder x86 ISA extensions in chronological order
And update compiler flags at the end to ease modifying conditions.
* make : support WHISPER_NO_AVX512=1 for disabling all AVX-512 subsets.
That way you do not have to override each AVX-512 subset setting
individually if it has been turned on during autodetection.
* make : add AVX512 detection to Makefile and CMakeLists.txt
* make : autodetect more AVX512 instruction subsets
* cmake : do not default to AVX512, must be enabled explicitly
* cmake : enable a set of AVX512 subsets, when AVX512 is turned on
* make : consolidate AVX512 subsets, add AVX512 VBMI
* cmake : revert to NO AVX512 setting, add settings for AVX512 VNNI and VBMI
* make : re-introduce AVX512VNNI back
* cmake : remove superfluous comment line
* make : use pkg-config for finding CFLAGS & LDFLAGS needed by OpenBLAS
That way building on *nix like environments (including MSYS2 on Windows)
with WHISPER_OPENBLAS=1 works out of the box.
Fix handling of WHISPER_OPENBLAS, so that empty value or 0 won't be
misinterpreted by make as enabled. Mind that it's not intended to
detect CMake false constants (OFF NO FALSE N). make is not CMake.
By default OpenBLAS with 64-bit interface is used, but that can be
changed with `WHISPER_OPENBLAS_INTERFACE64=0` if 32-bit one is desired.
If OpenBLAS headers and library are respectively in include/ and lib/
subdirectories of given path, then you can specify it, e.g.
`OPENBLAS_PATH=/usr/local/openblas`, and this will take precedence over
any pkg-config file.
If there is no pkg-config file (.pc) for OpenBLAS and OPENBLAS_PATH is
empty, then headers are assumed to be in /usr/include/openblas and
library as assumed to be called 'openblas64' (or 'openblas' if
`WHISPER_OPENBLAS_INTERFACE64=0`). If different headers location should
be used, then it can be done, e.g.
`WHISPER_BLAS_CFLAGS=-I/usr/local/include/openblas`.
If different library should be used, it can be specified, e.g.
`WHISPER_BLAS_LIB=openblasp64` (pthreads version as seen on Fedora), or
you can provide LDFLAGS needed to link with OpenBLAS directly:
`WHISPER_BLAS_LDFLAGS="-L/usr/local/lib/openblas -lopenblas64"`.
Current solution is flexible enough to handle most cases out there
without needlessly hardcoding possible OpenBLAS installation details.
* cmake : fix how pkg-config is used for finding include dirs and libraries needed by OpenBLAS
That way building on *nix like environments (including MSYS2 on Windows)
with -DWHISPER_OPENBLAS=ON should work out of the box as long as you
have CMake 3.25 or newer.
Make OPENBLAS_PATH environment variable supported not only on Windows.
It sets OpenBLAS include dir to ${OPENBLAS_PATH}/include and library to
${WHISPER_BLAS_LIB} (name without prefixes and suffixes) in
${OPENBLAS_PATH}/lib and avoids further package finding.
By default OpenBLAS with 64-bit interface is used (equivalent to setting
`-DWHISPER_BLAS_LIB=openblas64`), but that can be changed with
`-DWHISPER_OPENBLAS_INTERFACE64=OFF` (equivalent to setting
`-DWHISPER_BLAS_LIB=openblas`) if 32-bit one is desired.
Turn on BLA_STATIC for FindBLAS only when WHISPER_STATIC is enabled.
BLA_STATIC may not work as expected for pkg-config based operation.
Get rid of supporting BLAS_HOME environment variable. If OPENBLAS_PATH
is insufficient in your case, there is no pkg-config file to rely on,
then you can manually specify include dir, e.g.
`-DBLAS_INCLUDE_DIRS=/usr/local/include/openblas`, and library, e.g.
`-DBLAS_LIBRARIES=/usr/local/lib/libopenblas.so`.
* make / cmake : use OpenBLAS with 32-bit interface by default.
OpenBLAS w/o INTERFACE64=1 vel USE_64BITINT=1 seems to be more common.
* cmake : hardcode "lib" prefix for OpenBLAS lib filename (even on Windows)
* cmake : hardcode OpenBLAS library name when building in MSVC (Windows)
Most *nix like environments (including MSYS2 on Windows) have OpenBLAS
packages that allow coexistence of OpenBLAS builds with 32-bit and
64-bit interface (w/o and w/ OPENBLAS_USE64BITINT defined) and they
differ by not having or having "64" suffix in their library filenames.
That's not the case for OpenBLAS prebuilt libraries for Windows.
* ggml : embed Metal library source (ggml-metal.metal) into binary
enable by setting WHISPER_EMBED_METAL_LIBRARY
* rename the build option
* rename the preprocessor directive
* generate Metal library embedding assembly on-fly during build process
Fix for problem:
"""
RuntimeError: Aborted(Stack overflow! Stack cookie has been overwritten at 0x12be2b10, expected hex dwords 0x89BACDFE and 0x2135467, but received 0x00000000 0x00000000)
"""
That appears when executing the WASM example with the newer versions.
* sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.)
* metal : allow env metal variable to override resource path (#1415)
* Allow env variable to override resource path
* Update ggml-metal.m
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* sync : restore common / main from `master`
* sync : restore whisper from `master`
* talk-llama : update to latest llama.cpp
* ruby : fix build
* ggml : fix 32-bit ARM build
* ggml : fix MIN / MAX macro collisions + update ios bindings
* ggml : fix ifdefs and MIN / MAX again
* exampels : fix Obj-C and Swift examples
* ggml : fix 32-bit ARM compatibility
* ggml : one more attempt to fix 32-bit ARM compat
* whisper : fix support for larger graphs
---------
Co-authored-by: Chris Raethke <codesoda@users.noreply.github.com>
* metal : init
* whisper : factor out graph builds
* whisper : allocate encoder and decoder using ggml-alloc
* whisper : ggml-alloc is now supported
* whisper : CoreML support ggml-alloc
* build : fix ggml-alloc
* ios : update submodule
* extra : update sync-ggml.sh script to also sync ggml-alloc
* ci : see if this is causing the crash
* whisper : refactor ggml-alloc init
* whisper.android : try to fix build
* whisper : initial Metal version
* ci : try to debug vmem issue
* metal : decoder works on GPU!
* metal : add multi-decoder support
* ggml : fix ggml_nbytes (probably temp solution)
* metal : run "cross" step on the GPU
* whisper : remove ggml_repeat in the encoder
* whisper : offload the Encoder to Metal
* ggml : use simpler ggml_bytes() implementation
* ggml-alloc : try to make CI happy by reducing vram to 128GB
* whisper : add whisper_allocr to wrap ggml_allocr
* whisper : factor out alloc init in a function
* cmake : update to support Metal build
* whisper : add <functional> header
* objc : fix build (no Metal yet)
* ios : add Metal support
* swiftui : fix build
* metal : speed-up KQ multiplication
* metal : sync latest llama.cpp kernels
* readme : add Metal info
* ios : update submodule
* coreml : add code to toggle Core ML config (CPU, ANE, GPU)
* bench : fix timings by running a pre-heat
* bench : start benching the decoder
* whisper : add ggml_mul_mat_pad
* bench : fix uninitialized vars
* whisper : add comment for disabling mul-mat padding
* whisper : add description of ggml_mul_mat_pad
* whisper : clean-up ggml_mul_mat_pad
* metal : remove the "concurrent" flag
* bench : variable n_past
* ios : update SPM package