Commit Graph

125 Commits

Author SHA1 Message Date
Georgi Gerganov
1d5752fa42
make : fix GGML_VULKAN=1 build (#2485) 2024-10-16 18:42:47 +03:00
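A minimal sketch (not from the commit) of the Makefile invocation this fix targets, assuming a working Vulkan SDK on the host:
```
# build the Vulkan backend via make (sketch; assumes the Vulkan SDK is installed)
make clean
make GGML_VULKAN=1 -j
```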
Georgi Gerganov
396089f3cf whisper : revert mel-related changes (#0)
too much extra logic and complexity for small benefit
2024-10-05 15:23:51 +03:00
Georgi Gerganov
941912467d whisper : adapt to latest ggml (skip) (#0) 2024-10-05 15:23:51 +03:00
Georgi Gerganov
2ef717b293
whisper : add large-v3-turbo (#2440) 2024-10-01 15:57:06 +03:00
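A hedged sketch (not from the commit) of fetching and running the new model with the existing download script and `main` example; the model name comes from the commit title, the paths are the assumed repository layout at the time:
```
# assumed workflow; model name taken from the commit title above
./models/download-ggml-model.sh large-v3-turbo
./main -m models/ggml-large-v3-turbo.bin -f samples/jfk.wav
```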
Georgi Gerganov
8feb375fbd
tests : remove test-backend-ops (#2434) 2024-09-27 11:49:01 +03:00
Georgi Gerganov
451e9ee92c make : remove "talk" target until updated 2024-09-24 19:45:08 +03:00
Georgi Gerganov
fe18c29ab8 talk-llama : sync llama.cpp 2024-09-24 19:45:08 +03:00
hsinhoyeh
c4e1861d2c
go : add beamsize/entropythold/maxcontext to context interface (#2350)
* feat(go binding): add beamsize/entropythold/maxcontext to context interface

fixes: #2349

* fix go binding build

* fix dynamic link .so and header.h

* remove LD_LIBRARY_PATH

* remove ggml obj from whisper dynamic lib

* drop LIB_GGML
2024-08-28 17:09:01 +03:00
Georgi Gerganov
22058f2dbc talk-llama : sync llama.cpp 2024-08-08 22:48:46 +03:00
Georgi Gerganov
5226c3d45c make : remove llama prints [no ci] (#2265) 2024-07-08 14:53:55 +03:00
Georgi Gerganov
bb3dd45524
make : disable CUDA graphs 2024-06-26 23:20:13 +03:00
Georgi Gerganov
9f7f36d4c9
make : disable CUDA mel build 2024-06-26 22:25:25 +03:00
Georgi Gerganov
0a55a70b9b
make : fix missing -O3
same as https://github.com/ggerganov/llama.cpp/pull/8143
2024-06-26 21:21:12 +03:00
Georgi Gerganov
e30c679928
whisper : reorganize source code + improve CMake (#2256)
* scripts : update sync [no ci]

* files : reorganize [no ci]

* sync : llama.cpp

* cmake : link math library

* cmake : build normal ggml library

* files : move headers to include

* objc : fix path to ggml-metal.h

* ci : fix WHISPER_CUDA -> GGML_CUDA

* scripts : sync LICENSE [no ci]
2024-06-26 19:34:09 +03:00
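A minimal sketch of the CMake flow after this reorganization; GGML_CUDA is the option name referenced in the commit body (formerly WHISPER_CUDA), the rest is a standard out-of-tree CMake build:
```
# sketch of the post-reorganization CMake build with CUDA enabled
cmake -B build -DGGML_CUDA=ON
cmake --build build -j --config Release
```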
Georgi Gerganov
5181494e9f build : update make / cmake 2024-06-18 09:39:40 +03:00
Georgi Gerganov
3b1ac03828 ggml : remove OpenCL (#0) 2024-06-16 18:19:48 +03:00
Georgi Gerganov
6975600b4b cuda : enable CUDA graphs (#0) 2024-06-16 18:19:48 +03:00
Georgi Gerganov
4942b1b428 cmake : fix CUDA build (#0) 2024-06-16 18:19:48 +03:00
Georgi Gerganov
420b6abc54
cuda : fix HIPBLAS build (#2234) 2024-06-11 19:14:38 +03:00
Borislav Stanimirov
ffef323c4c
whisper : add CUDA-specific computation of mel spectrograms (#2206)
* whisper : use polymorphic class to calculate mel spectrogram

* whisper : add cuda-specific mel spectrogram calculation

* whisper : conditionally compile cufftGetErrorString to avoid warnings

* build : add new files to makefile

* ruby : add new files to conf script

* build : fix typo in makefile

* whisper : suppress cub warning for deprecated C++ std in whisper-mel-cuda
2024-06-04 09:32:23 +03:00
Przemysław Pawełczyk
b6680fab50
build : improve disabling AVX-512 (#2129)
* cmake : make WHISPER_NO_AVX512=ON disable all subsets of AVX-512

Previously it happened only for MSVC, but it makes sense to have the
same behavior for other compilers too.

* make : reorder x86 ISA extensions in chronological order

And update compiler flags at the end to ease modifying conditions.

* make : support WHISPER_NO_AVX512=1 for disabling all AVX-512 subsets.

That way you do not have to override each AVX-512 subset setting
individually if it has been turned on during autodetection.
2024-05-08 18:32:43 +03:00
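A short sketch of the two invocations described above, using only the option names quoted in the commit body:
```
# disable every AVX-512 subset in one go (make)
make WHISPER_NO_AVX512=1
# CMake equivalent
cmake -B build -DWHISPER_NO_AVX512=ON
cmake --build build -j
```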
Przemysław Pawełczyk
8fac6455ff
make : change GNU make default CXX from g++ to c++ (#2100) 2024-04-28 22:54:21 +01:00
Didzis Gosko
08d3eef97d
build : fix embedded Metal library generation (#2045) 2024-04-15 20:23:05 +03:00
Didzis Gosko
c7f95b7ca2
build : detect AVX512 in Makefile, add AVX512 option in CMake (#2043)
* make : add AVX512 detection to Makefile and CMakeLists.txt

* make : autodetect more AVX512 instruction subsets

* cmake : do not default to AVX512, must be enabled explicitly

* cmake : enable a set of AVX512 subsets, when AVX512 is turned on

* make : consolidate AVX512 subsets, add AVX512 VBMI

* cmake : revert to NO AVX512 setting, add settings for AVX512 VNNI and VBMI

* make : re-introduce AVX512VNNI back

* cmake : remove superfluous comment line
2024-04-15 20:02:09 +03:00
Przemysław Pawełczyk
1e8f28c42a
build : use pkg-config for OpenBLAS (#1778)
* make : use pkg-config for finding CFLAGS & LDFLAGS needed by OpenBLAS

That way, building in *nix-like environments (including MSYS2 on Windows)
with WHISPER_OPENBLAS=1 works out of the box.

Fix handling of WHISPER_OPENBLAS, so that empty value or 0 won't be
misinterpreted by make as enabled.  Mind that it's not intended to
detect CMake false constants (OFF NO FALSE N).  make is not CMake.

By default OpenBLAS with 64-bit interface is used, but that can be
changed with `WHISPER_OPENBLAS_INTERFACE64=0` if 32-bit one is desired.

If the OpenBLAS headers and library are respectively in the include/ and lib/
subdirectories of a given path, then you can specify it, e.g.
`OPENBLAS_PATH=/usr/local/openblas`, and this will take precedence over
any pkg-config file.

If there is no pkg-config file (.pc) for OpenBLAS and OPENBLAS_PATH is
empty, then the headers are assumed to be in /usr/include/openblas and
the library is assumed to be called 'openblas64' (or 'openblas' if
`WHISPER_OPENBLAS_INTERFACE64=0`).  If a different header location should
be used, it can be set, e.g.
`WHISPER_BLAS_CFLAGS=-I/usr/local/include/openblas`.
If a different library should be used, it can be specified, e.g.
`WHISPER_BLAS_LIB=openblasp64` (pthreads version as seen on Fedora), or
you can provide LDFLAGS needed to link with OpenBLAS directly:
`WHISPER_BLAS_LDFLAGS="-L/usr/local/lib/openblas -lopenblas64"`.

Current solution is flexible enough to handle most cases out there
without needlessly hardcoding possible OpenBLAS installation details.

* cmake : fix how pkg-config is used for finding include dirs and libraries needed by OpenBLAS

That way, building in *nix-like environments (including MSYS2 on Windows)
with -DWHISPER_OPENBLAS=ON should work out of the box as long as you
have CMake 3.25 or newer.

Support the OPENBLAS_PATH environment variable not only on Windows.
It sets OpenBLAS include dir to ${OPENBLAS_PATH}/include and library to
${WHISPER_BLAS_LIB} (name without prefixes and suffixes) in
${OPENBLAS_PATH}/lib and avoids further package finding.

By default OpenBLAS with 64-bit interface is used (equivalent to setting
`-DWHISPER_BLAS_LIB=openblas64`), but that can be changed with
`-DWHISPER_OPENBLAS_INTERFACE64=OFF` (equivalent to setting
`-DWHISPER_BLAS_LIB=openblas`) if 32-bit one is desired.

Turn on BLA_STATIC for FindBLAS only when WHISPER_STATIC is enabled.
BLA_STATIC may not work as expected for pkg-config based operation.

Get rid of support for the BLAS_HOME environment variable.  If OPENBLAS_PATH
is insufficient in your case and there is no pkg-config file to rely on,
then you can manually specify the include dir, e.g.
`-DBLAS_INCLUDE_DIRS=/usr/local/include/openblas`, and library, e.g.
`-DBLAS_LIBRARIES=/usr/local/lib/libopenblas.so`.

* make / cmake : use OpenBLAS with 32-bit interface by default.

OpenBLAS built w/o INTERFACE64=1 or USE_64BITINT=1 seems to be more common.

* cmake : hardcode "lib" prefix for OpenBLAS lib filename (even on Windows)

* cmake : hardcode OpenBLAS library name when building in MSVC (Windows)

Most *nix-like environments (including MSYS2 on Windows) have OpenBLAS
packages that allow coexistence of OpenBLAS builds with 32-bit and
64-bit interfaces (w/o and w/ OPENBLAS_USE64BITINT defined), and they
differ by whether their library filenames carry the "64" suffix.
That's not the case for OpenBLAS prebuilt libraries for Windows.
2024-03-29 15:53:26 +02:00
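A condensed sketch of the invocation styles described in the commit above; every option name is quoted from the commit body, and note that the final revision switches the default to the 32-bit interface:
```
# make: find OpenBLAS via pkg-config
make WHISPER_OPENBLAS=1
# make: local install, explicit 64-bit interface
make WHISPER_OPENBLAS=1 OPENBLAS_PATH=/usr/local/openblas WHISPER_OPENBLAS_INTERFACE64=1
# CMake (3.25+): pkg-config based lookup, 32-bit interface
cmake -B build -DWHISPER_OPENBLAS=ON -DWHISPER_OPENBLAS_INTERFACE64=OFF
# CMake: no pkg-config file available, point at headers and library directly
cmake -B build -DWHISPER_OPENBLAS=ON \
      -DBLAS_INCLUDE_DIRS=/usr/local/include/openblas \
      -DBLAS_LIBRARIES=/usr/local/lib/libopenblas.so
```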
Georgi Gerganov
9fb308d90f
make : add grammar parser to common objects 2024-03-28 11:59:48 +02:00
Georgi Gerganov
2948c740a2
sync : ggml (#2001)
* sync : update scripts

* sync : ggml

* talk-llama : sync llama.cpp

* make : WHISPER_CUBLAS -> WHISPER_CUDA

* ci : try to fix sycl build

* talk-llama : fix make build
2024-03-27 18:55:10 +02:00
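For reference, the renamed make option from the sync above (old name kept as a comment):
```
# was: make WHISPER_CUBLAS=1
make WHISPER_CUDA=1
```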
Georgi Gerganov
de4d067f1e
talk-llama : sync llama.cpp 2024-03-15 14:21:59 +02:00
LBlue
276615d708
make : fix CUBLAS link with WSL (#1878) 2024-02-20 12:05:38 +02:00
Georgi Gerganov
65faae0b6a
build : update CBLAS flags + fix unused var warning (#0) 2024-02-19 14:44:46 +02:00
Didzis Gosko
163e74b6c3
metal : option to embed MSL source into compiled binary (#1842)
* ggml : embed Metal library source (ggml-metal.metal) into binary

enable by setting WHISPER_EMBED_METAL_LIBRARY

* rename the build option

* rename the preprocessor directive

* generate the Metal library embedding assembly on the fly during the build process
2024-02-11 16:41:41 +02:00
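A hedged sketch using the option name quoted in the first bullet; the option was renamed later in the same PR, so the shipped name may differ from what is shown here — check the current build files:
```
# option name as quoted in the commit body; it was renamed within this PR
cmake -B build -DWHISPER_EMBED_METAL_LIBRARY=ON
cmake --build build -j
```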
Didzis Gosko
0f80e5a80a
whisper : expose CUDA device setting in public API (#1840)
* Makefile : allow overriding CUDA_ARCH_FLAG

* whisper : allow selecting the GPU (CUDA) device from the public API
2024-02-09 17:27:47 +02:00
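A sketch of overriding the CUDA architecture at build time; CUDA_ARCH_FLAG is the variable named in the commit, WHISPER_CUBLAS was the CUDA switch in the Makefile at the time, and the value shown is only illustrative — check the Makefile for the exact form it expects:
```
# value is illustrative; pick the arch matching your GPU
make WHISPER_CUBLAS=1 CUDA_ARCH_FLAG="sm_86"
```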
Didzis Gosko
b6559333ff
make : add macOS deployment target option (#1839) 2024-02-09 17:26:29 +02:00
jwijffels
3e6fad07aa
make : update MSYS_NT (#1813)
I just upgraded the R wrapper at https://github.com/bnosac/audio.whisper to use whisper.cpp 1.5.4.
I'm working on Windows and noticed while doing so that it did not pick up the relevant CFLAGS/CXXFLAGS, as my system showed:

```
I whisper.cpp build info: 
I UNAME_S:  MSYS_NT-10.0-19045
I UNAME_P:  unknown
I UNAME_M:  x86_64
```

Many thanks for all the tremendous hard work on maintaining whisper.cpp!
2024-01-30 14:13:49 +02:00
Przemysław Pawełczyk
f5f159c320
server : fix building and simplify lib deps on Windows (#1772)
* make : fix server example building on MSYS2 environments (Windows)

It had not been working since commit eff3570f78,
when the server was introduced.

* cmake : simplify server example lib deps on Windows

server uses httplib::Server, not httplib::SSLServer, so there is no need
to mention cryptographic libraries in target_link_libraries.
Winsock (ws2_32) suffices here.

Also use plain library names like we use in other places.
2024-01-15 15:48:13 +02:00
Georgi Gerganov
e77b27c331
sync : ggml (VMM, sync-ggml-am, dotprod ARM fixes, CUDA fixes) (#1691)
* scripts : add sync-ggml-am.sh

* sync : ggml (VMM, ARM dot prod fix, etc.)

* build : fix CUDA build

* ggml : fix some mul mat cases + add tests for src1 F16

dbd02958fa
2023-12-29 11:30:47 +02:00
Felix
eff3570f78
server : add a REST Whisper server example with OAI-like API (#1380)
* Add first draft of server

* Added json support and base funcs for server.cpp

* Add more user input via api-request

also some clean up

* Add request params and load post function

Also some general clean up

* Remove unused function

* Add readme

* Add exception handlers

* Update examples/server/server.cpp

* make : add server target

* Add magic curl syntax

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-20 21:40:24 +02:00
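A sketch (not from the commit) of the kind of request the new server accepts; the endpoint, default port, and form fields are recalled from the server example's documentation and should be treated as assumptions:
```
# start the server with a model (assumed defaults: 127.0.0.1:8080)
./server -m models/ggml-base.en.bin
# then, from another shell, transcribe a file over the OAI-like REST API
curl 127.0.0.1:8080/inference \
  -F file="@samples/jfk.wav" \
  -F response_format="json"
```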
Georgi Gerganov
bfbaa4dce5
whisper : make large version explicit + fix data size units (#1493) 2023-11-15 19:42:25 +02:00
Evan Jones
3e5c7feeff
whisper : add grammar-based sampling (#1229)
* whisper : add grammar-based sampling

* build : fix after master merge

* command : fix exception when recognizing the command

* whisper : fine-tuning grammar functionality

* command : grammar-related improvements

- option to read grammar from file
- add sample grammars for colors and chess moves
- fine-tune the performance further

* grammars : add assistant + update comments

* command : enable beam-search, add "no_timestamps", add "context", add p

* whisper : remove comment

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-13 10:51:34 +02:00
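A hedged sketch of grammar-constrained recognition with the `command` example; the `--grammar` flag and the grammar file path are assumptions based on the sample grammars mentioned in the commit (colors, chess, assistant):
```
# flag name and grammar path are assumptions, not quoted from the commit
# (requires the command example, which needs SDL2, to be built)
./command -m models/ggml-base.en.bin --grammar ./grammars/chess.gbnf
```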
Georgi Gerganov
b0502836b8
whisper : add full CUDA and Metal offloading (#1472)
* whisper : migrate to ggml-backend

* whisper : fix logit reading

* whisper : fix tensor allocation during load

* whisper : fix beam-search with CUDA

* whisper : free backends + fix compile warning

* whisper : print when CUDA is enabled

* whisper : fix CoreML

* make : clean-up

* talk : fix compile warning

* whisper : support ggml_conv with CUDA and Metal (#1473)

* ggml : add CUDA support for ggml_conv

* whisper : remove ggml_repeat for conv bias + single backend

* cuda : fix im2col kernel

* metal : add im2col support + mul mat-vec f16 x f16

* bench-all : add q4 models

* whisper : clean-up

* quantize-all : fix

* ggml : im2col opts

* whisper : avoid whisper_model_data wrapper

* whisper : add note that ggml_mul_mat_pad does not work with CUDA

* whisper : factor out graph compute in common function

* whisper : fixes

* whisper : fix UB with measure buffers

* whisper : try to fix the parallel whisper_state functionality (#1479)

* whisper : try to fix the parallel whisper_state functionality

* whisper : fix multi-state Metal

* whisper : free backend instances in whisper_state
2023-11-12 15:31:08 +02:00
Georgi Gerganov
2cdfc4e025
whisper : add support for large v3 (#1444)
* whisper : add support for large v3

* bench : fix build + fix go bindings

* bench : fix n_mels

* models : update readme
2023-11-07 15:30:18 +02:00
Georgi Gerganov
f96e1c5b78
sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422)
* sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.)

* metal : allow env metal variable to override resource path (#1415)

* Allow env variable to override resource path

* Update ggml-metal.m

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* sync : restore common / main from `master`

* sync : restore whisper from `master`

* talk-llama : update to latest llama.cpp

* ruby : fix build

* ggml : fix 32-bit ARM build

* ggml : fix MIN / MAX macro collisions + update ios bindings

* ggml : fix ifdefs and MIN / MAX again

* examples : fix Obj-C and Swift examples

* ggml : fix 32-bit ARM compatibility

* ggml : one more attempt to fix 32-bit ARM compat

* whisper : fix support for larger graphs

---------

Co-authored-by: Chris Raethke <codesoda@users.noreply.github.com>
2023-11-03 21:35:05 +02:00
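The environment variable mentioned in the `#1415` bullet is not named in the commit body; the name used below is an assumption based on the ggml Metal backend:
```
# variable name is an assumption (not stated in the commit above)
GGML_METAL_PATH_RESOURCES=/path/to/resources ./main -m models/ggml-base.en.bin -f samples/jfk.wav
```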
Georgi Gerganov
b8432f28f4
metal : add F32 support + update bench output 2023-09-15 13:56:08 +03:00
Georgi Gerganov
93935980f8
whisper : Metal and ggml-alloc support (#1270)
* metal : init

* whisper : factor out graph builds

* whisper : allocate encoder and decoder using ggml-alloc

* whisper : ggml-alloc is now supported

* whisper : CoreML support ggml-alloc

* build : fix ggml-alloc

* ios : update submodule

* extra : update sync-ggml.sh script to also sync ggml-alloc

* ci : see if this is causing the crash

* whisper : refactor ggml-alloc init

* whisper.android : try to fix build

* whisper : initial Metal version

* ci : try to debug vmem issue

* metal : decoder works on GPU!

* metal : add multi-decoder support

* ggml : fix ggml_nbytes (probably temp solution)

* metal : run "cross" step on the GPU

* whisper : remove ggml_repeat in the encoder

* whisper : offload the Encoder to Metal

* ggml : use simpler ggml_bytes() implementation

* ggml-alloc : try to make CI happy by reducing vram to 128GB

* whisper : add whisper_allocr to wrap ggml_allocr

* whisper : factor out alloc init in a function

* cmake : update to support Metal build

* whisper : add <functional> header

* objc : fix build (no Metal yet)

* ios : add Metal support

* swiftui : fix build

* metal : speed-up KQ multiplication

* metal : sync latest llama.cpp kernels

* readme : add Metal info

* ios : update submodule

* coreml : add code to toggle Core ML config (CPU, ANE, GPU)

* bench : fix timings by running a pre-heat

* bench : start benching the decoder

* whisper : add ggml_mul_mat_pad

* bench : fix uninitialized vars

* whisper : add comment for disabling mul-mat padding

* whisper : add description of ggml_mul_mat_pad

* whisper : clean-up ggml_mul_mat_pad

* metal : remove the "concurrent" flag

* bench : variable n_past

* ios : update SPM package
2023-09-15 12:18:18 +03:00
Przemysław Pawełczyk
b55b505690
build : do not use _GNU_SOURCE gratuitously (#1129)
* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions,
plus some stuff from BSD that is not specified in POSIX.1.

Well, that was true until NUMA support was added recently in ggml,
so enable GNU libc extensions for Linux builds to cover that.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

Avoid macOS build error when _DARWIN_C_SOURCE is not defined, brought by
SDL2 relying on Darwin extension memset_pattern4/8/16 (from string.h).

* make : enable BSD extensions for DragonFlyBSD to expose RLIMIT_MEMLOCK

* make : use BSD-specific FTMs to enable alloca on BSDs

* make : fix OpenBSD build by exposing newer POSIX definitions

* cmake : follow recent FTM improvements from Makefile
2023-09-07 12:36:14 +03:00
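A minimal sketch of the third-party reuse scenario described above: compiling the core translation unit with only a minimal feature test macro. The file name reflects the 2023 source layout and is otherwise an assumption:
```
# compile-only check with a minimal FTM, as described in the commit body
c++ -O3 -std=c++11 -D_XOPEN_SOURCE=600 -c whisper.cpp -o whisper.o
```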
Didzis Gosko
64cb45fd79
make : fix detection of AVX2 on macOS (#1250) 2023-09-06 18:22:21 +03:00
Przemysław Pawełczyk
ba3c333611
make : improve cpuinfo handling on x86 hosts (#1238)
* make : simplify and correct x86 ISA extensions detection on the host

It got broken in commit c5f9acf4b7 for Haiku and Mac OS (Intel),
which report CPU features in upper case.

Now we're finding the names in case-insensitive manner and as words.
SSE3 detection has been corrected for Linux, which uses PNI for that
(Prescott New Instructions).

* make : use dmesg.boot in FreeBSD/DragonFlyBSD to detect x86 ISA extensions on the host

* make : enable x86 ISA extensions on the host both in CFLAGS and CXXFLAGS

* make : correct AVX x86 ISA extension detection on macOS (Intel) host

It got broken in commit c5f9acf4b7.  macOS calls it AVX1.0.
2023-09-05 14:58:47 +03:00
Przemysław Pawełczyk
8e46ba80d3
make : use cpuinfo in MSYS2 to enable x86 ISA extensions on the host (#1216) 2023-08-28 13:28:26 +03:00
Przemysław Pawełczyk
b0d35995c4
make : add support for building on DragonFlyBSD/NetBSD/OpenBSD (#1212) 2023-08-27 21:38:46 +03:00
Przemysław Pawełczyk
601c2d2181
ggml : detect SSSE3 (#1211)
* ggml : add ggml_cpu_has_ssse3

* whisper : show SSSE3 in system info

* make : detect SSSE3 via cpuinfo
2023-08-27 21:36:41 +03:00