Georgi Gerganov
794b162a46
whisper : add integer quantization support ( #540 )
...
* whisper : add integer quantization support
* examples : add common-ggml + prepare to add "quantize" tool
* whisper : quantization tool ready
* whisper : fix F32 support
* whisper : try to fix shared lib linkage
* wasm : update quantized models to Q5
* bench.wasm : remove "medium" button
* bench.wasm : fix custom model button
* ggml : add Q5_0 and Q5_1 WASM SIMD
* wasm : add quantized models to all WASM examples
* wasm : bump DB version number to 2
* talk-llama : update example to latest llama.cpp
* node : increase test timeout to 10s
* readme : add information for model quantization
* wasm : add links to other examples
2023-04-30 18:51:57 +03:00
Georgi Gerganov
5fd1bdd7fc
whisper : add GPU support via cuBLAS ( #834 )
...
* make : add WHISPER_CUBLAS
* make : fix CUBLAS build
* whisper : disable Flash Attention + adjust memory buffers
* whisper : remove old commented code
* readme : add cuBLAS instructions
* cmake : add WHISPER_CUBLAS option
* gitignore : ignore build-cublas
2023-04-30 12:14:33 +03:00
Georgi Gerganov
3efb81dec6
build : add WHISPER_COREML_ALLOW_FALLBACK to make / CMake ( #812 )
2023-04-29 10:55:24 +03:00
Georgi Gerganov
5e47e223bd
whisper : add Core ML support ( #566 )
...
* coreml : use Core ML encoder inference
* coreml : simplify whisper_encode + log messages
* whisper : resolve rebase conflicts
* coreml : add scripts for CoreML model generation
* bench-all : recognize COREML flag
2023-04-15 13:21:27 +03:00
duthils
5f16420333
make : disable avx in case f16c is not available ( #706 )
...
Why:
* ggml.c does not support AVX without F16C
2023-04-14 19:31:51 +03:00
Georgi Gerganov
2f889132c6
ggml : sync latest changes from ggml and llama.cpp
2023-04-13 18:53:44 +03:00
clach04
aac1710afb
make : 32-bit ARM flags ( #486 )
...
* issue #470 - working 32-bit ARM
* Update Makefile
* Update Makefile
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-29 23:11:35 +03:00
Georgi Gerganov
4a0deb8b1e
talk-llama : add new example + sync ggml from llama.cpp ( #664 )
...
* talk-llama : talk with LLaMA AI
* talk.llama : disable EOS token
* talk-llama : add README instructions
* ggml : fix build in debug
2023-03-27 21:00:32 +03:00
Georgi Gerganov
4aa3bcf8a4
make : fix MUSL Linux build ( #576 )
2023-03-22 20:51:42 +02:00
Georgi Gerganov
af005d573f
make : add -DNDEBUG compile flag
2023-02-28 23:27:54 +02:00
FlippFuzz
f420de1322
make : add "-mcpu=native" when building for aarch64 ( #532 )
2023-02-27 21:04:16 +02:00
Georgi Gerganov
09d7d2b68e
examples : refactor in order to reuse code and reduce duplication ( #482 )
...
* examples : refactor common code into a library
* examples : refactor common SDL code into a library
* make : update Makefile to use common libs
* common : fix MSVC M_PI ..
* addon.node : link common lib
2023-02-15 19:28:10 +02:00
fitzsim
ae16c21e9c
whisper : PPC64 big-endian support ( #398 )
...
* ggml : set cache line size to 128 on POWER9
* whisper : add PPC64 big endian support
2023-01-23 20:48:10 +02:00
Georgi Gerganov
1290fc6457
bench : add memcpy and ggml_mul_mat benchmarks
2023-01-18 20:31:46 +02:00
David Thorpe
322f4e6c4e
go : bindings updated so they can be used in third party packages. ( #379 )
...
* Updated bindings so they can be used in third party packages.
* Updated makefiles to optionally set the FMA flag, for Xeon E5 on Darwin
2023-01-06 19:32:28 +02:00
Abitofevrything
a62170c656
ggml : add SSE3 and fp16 conversion lookup table ( #368 )
...
* Improves WASM performance:
On MacBook M1 Pro, I observe 25% faster using Firefox and 35% faster using Chrome
* Add support for SSE3 SIMD
* Add SSE3 to system information
* Add Imath support for fp16-fp32 conversions
* Add Imath to system information
* Wrap Imath calls to avoid static function warnings
* Drop Imath; Add lookup table for f16 -> f32 conversions
* Remove TODO comments
* Update SSE3 to new macro arguments
* Correct updated macro definitions
* Prefer static inline where possible
* ggml : static inlines + add public f16 <-> f32 conversions
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-01-06 18:45:59 +02:00
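The f16 -> f32 lookup table named in the commit above can be sketched as follows. This is a minimal Python illustration, not the ggml C code: the names `F16_TO_F32` and `f16_to_f32` are hypothetical, and the table is built with the half-precision format of the standard `struct` module. The idea is that a 16-bit float has only 65536 possible bit patterns, so every conversion can be computed once up front and served afterwards by an array index.

```python
import struct

# Precompute the value of every possible 16-bit pattern once.
# Names are hypothetical and chosen for illustration only.
# Note: Python floats are doubles, but every half-precision value is
# exactly representable in f32 (and in a double), so the values match.
F16_TO_F32 = [
    struct.unpack("<e", struct.pack("<H", bits))[0]
    for bits in range(1 << 16)
]


def f16_to_f32(bits: int) -> float:
    """Convert a raw IEEE 754 half-precision bit pattern by table lookup."""
    return F16_TO_F32[bits & 0xFFFF]
```

The trade-off is memory for speed: the table costs one entry per bit pattern, and in exchange each conversion becomes a single indexed load instead of bit manipulation.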
Georgi Gerganov
196d738974
minor : close #370 + Makefile build info print change
2023-01-05 21:35:45 +02:00
Georgi Gerganov
0be6a1afd9
make : print build information
2023-01-02 13:35:26 +02:00
Georgi Gerganov
9a8ad3db69
make : add i686 arch ( close #329 )
2022-12-29 13:58:55 +02:00
Thomas Fitzsimmons
466ceebb78
ggml : add f16 acceleration for POWER9 ppc64le
2022-12-23 13:23:58 +02:00
Georgi Gerganov
1eb81f863f
make : revert accidental change of optimization flags
2022-12-17 18:57:42 +02:00
Georgi Gerganov
32fbc8cd04
main : add option to print the progress ( #276 )
2022-12-16 20:20:43 +02:00
Georgi Gerganov
3b1aacbe6d
talk : talk with AI in the terminal
2022-12-10 16:51:58 +02:00
Georgi Gerganov
832b4f34c9
make : indentation + .gitignore
2022-12-08 19:42:06 +02:00
Reinis Muiznieks
0f98755fc5
Flag for Position Independent Code
2022-12-08 19:41:01 +02:00
Al Hoang
04a16bbf11
fix compilation on haiku
2022-12-08 09:20:57 +02:00
Georgi Gerganov
9fe7306f4b
models : add the new "large" model release by OpenAI
...
The old "large" model is now renamed "large-v1".
If you have been using it, make sure to rename it and download the new
"large" model for best results.
2022-12-06 18:48:57 +02:00
Georgi Gerganov
9b7df68753
tests : adding transcription tests
2022-12-02 21:40:02 +02:00
Tienshiao Ma
e7f09a0a61
Fix Darwin flags - was incorrectly always using the Linux else clause
2022-12-01 19:17:04 +02:00
Georgi Gerganov
bc88eb13c6
examples : add "command" tool ( #171 )
2022-11-25 19:36:57 +02:00
vicalloy
fd113687aa
correct model name display on running samples
2022-11-25 07:17:02 +02:00
katsu560
4b2f51b479
add gprof option
2022-11-23 22:16:33 +02:00
katsu560
800ae5b808
fix AVX,AVX2,FMA,F16C detection on Linux and add flags for OpenBLAS
2022-11-23 22:16:33 +02:00
Georgi Gerganov
41b48ab7f1
make : add libwhisper.so target ( #144 )
2022-11-13 09:09:48 +02:00
Chidi Williams
9e700e1821
Check for AVX and AVX2 on Darwin
2022-11-09 18:49:55 +02:00
Syed Jafri
c63ce24834
Allow building with Accelerate for x86_64 Macs ( #123 )
...
* Cross compile windows
* set env properly
* rm log
* fix review
* Add back space
* Don't force architecture
* Allow building x86_64 with accelerate
2022-11-02 18:00:19 +02:00
Syed Jafri
24cd12f647
Cross compilation ( #121 )
...
* Cross compile windows
* set env properly
* rm log
* fix review
* Add back space
2022-11-02 08:46:49 +02:00
Georgi Gerganov
c6710efde2
refactoring : move main + stream in examples + other stuff
2022-10-25 20:53:48 +03:00
Georgi Gerganov
6b45e37b2b
Update README.md and finalize the whisper.wasm example
2022-10-22 18:54:01 +03:00
undef
19a780afe5
added handling for ARM Macs falsely announced as x86_64
2022-10-19 01:01:53 +02:00
Georgi Gerganov
b81a81d543
Link Accelerate framework to "stream" example
2022-10-18 00:12:51 +03:00
Georgi Gerganov
72d967bce4
Use Accelerate framework on Apple silicon
...
Huge performance improvement in the Encode (almost x2 on MacBook M1 Pro)
Also various extra optimizations:
- Multi-threaded NORM operator
- Faster GELU via F16 cast
2022-10-18 00:12:51 +03:00
Georgi Gerganov
0e858f080d
close #56 : build on FreeBSD
...
Thanks to @abelbabel for the contribution
2022-10-17 18:10:16 +03:00
0/0
64752acd27
add static library make target
2022-10-09 19:16:42 -06:00
Georgi Gerganov
5e563ef635
Fix Makefile for MacBook Intel
2022-10-08 17:35:55 +03:00
Georgi Gerganov
167324584b
wip : rpi4 support
2022-10-05 23:03:46 +03:00
Georgi Gerganov
ce1fe95902
wip : improve makefile
2022-10-05 23:03:46 +03:00
Georgi Gerganov
6b77124e01
Initial C-style interface for whisper.cpp
2022-10-04 23:18:15 +03:00
Georgi Gerganov
b6bf906730
ref #10 : quick-and-dirty attempt for real-time audio transcription
...
- Processes input in chunks of 3 seconds.
- Pads audio with silence
- Uses 1 second audio from previous pass
- No text context
2022-10-02 17:55:45 +03:00
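The chunking scheme listed in the commit above can be sketched roughly as below. This is a Python illustration under stated assumptions, not the code from the commit: the function name `make_chunks`, the 16 kHz default sample rate, and the exact placement of the carried-over audio and of the silence padding are all assumptions. Each pass takes 3 seconds of new audio, prepends the last 1 second of the previous pass, and pads a short final chunk with silence.

```python
def make_chunks(samples, sample_rate=16000, chunk_sec=3, keep_sec=1):
    """Yield one buffer per pass: carried-over audio followed by new audio.

    Hypothetical helper for illustration; parameters are assumptions.
    """
    chunk_len = chunk_sec * sample_rate
    keep_len = keep_sec * sample_rate
    previous = []  # new audio consumed by the previous pass
    for start in range(0, len(samples), chunk_len):
        new = list(samples[start:start + chunk_len])
        # Assumption: a short final chunk is padded with silence (zeros).
        new += [0.0] * (chunk_len - len(new))
        # Carry over the tail of the previous pass (empty on the first pass).
        carry = previous[-keep_len:] if keep_len else []
        yield carry + new
        previous = new
```

With no text context passed between passes, the carried-over second of audio is the only thing linking one pass to the next, which is why words cut at a chunk boundary still have a chance of being recognized.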
Georgi Gerganov
3bcdbdfc32
Reduce memory usage even more + better sampling
...
- The encode/decode memory buffers are now reused
- If the 30-sec segment goes for too long without a timestamp token, we
force one. Improves transcription for large model
- Stereo support
- Add "micro-machines.wav" sample
2022-09-30 19:35:27 +03:00