This change makes it possible to build ggml-cuda.cu and ggml-metal.m as
independent dynamic shared objects, that may be conditionally linked at
runtime in a multiplatform binary. It introduces a GGML_CALL annotation
that documents which functions have a cyclic call relationship, between
the application code and GPU modules.
This change does nothing, unless the build defines -DGGML_MULTIPLATFORM
which causes back-references and function pointers to conform to MS ABI
which is supported by NVCC, ROCm, XCode, GCC and Clang across platforms
* whisper : migrate to ggml-backend
* whisper : fix logit reading
* whisper : fix tensor allocation during load
* whisper : fix beam-search with CUDA
* whisper : free backends + fix compile warning
* whisper : print when CUDA is enabled
* whisper : fix CoreML
* make : clean-up
* talk : fix compile warning
* whisper : support ggml_conv with CUDA and Metal (#1473)
* ggml : add CUDA support for ggml_conv
* whisper : remove ggml_repeat for conv bias + single backend
* cuda : fix im2col kernel
* metal : add im2col support + mul mat-vec f16 x f16
* bench-all : add q4 models
* whisper : clean-up
* quantize-all : fix
* ggml : im2col opts
* whisper : avoid whisper_model_data wrapper
* whisper : add note that ggml_mul_mat_pad does not work with CUDA
* whisper : factor out graph compute in common function
* whisper : fixes
* whisper : fix UB with measure buffers
* whisper : try to fix the parallel whisper_state functionality (#1479)
* whisper : try to fix the parallel whisper_state functionality
* whisper : fix multi-state Metal
* whisper : free backend instances in whisper_state
* sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.)
* metal : allow env metal variable to override resource path (#1415)
* Allow env variable to override resource path
* Update ggml-metal.m
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* sync : restore common / main from `master`
* sync : restore whisper from `master`
* talk-llama : update to latest llama.cpp
* ruby : fix build
* ggml : fix 32-bit ARM build
* ggml : fix MIN / MAX macro collisions + update ios bindings
* ggml : fix ifdefs and MIN / MAX again
* exampels : fix Obj-C and Swift examples
* ggml : fix 32-bit ARM compatibility
* ggml : one more attempt to fix 32-bit ARM compat
* whisper : fix support for larger graphs
---------
Co-authored-by: Chris Raethke <codesoda@users.noreply.github.com>