Commit Graph

888 Commits

Author SHA1 Message Date
Georgi Gerganov
72deb41eb2 whisper : split_on_word no longer trims (#1046) 2023-06-25 23:51:01 +03:00
Przemysław Pawełczyk
3f7a03ebe3 ggml : do not use _GNU_SOURCE gratuitously (#1027)
* Do not use _GNU_SOURCE gratuitously.

What is needed to build whisper.cpp and examples is availability of
stuff defined in The Open Group Base Specifications Issue 6
(https://pubs.opengroup.org/onlinepubs/009695399/) known also as
Single Unix Specification v3 (SUSv3) or POSIX.1-2001 + XSI extensions.

There is no need to penalize musl libc which simply follows standards.

Not having feature test macros in source code gives greater flexibility
to those wanting to reuse it in 3rd party app, as they can build it with
minimal FTM (_XOPEN_SOURCE=600) or other FTM depending on their needs.

It builds without issues in Alpine (musl libc), Ubuntu (glibc), MSYS2.

* examples : include SDL headers before other headers

This is an attempt at fixing macOS build error coming from SDL2 relying
on Darwin extension memset_pattern4/8/16 coming from Apple's string.h.
2023-06-25 16:34:30 +03:00
Przemysław Pawełczyk
62642bb61c talk-llama : fix build after ggml sync (#1049)
sed -i 's,GGML_BACKEND_CUDA,GGML_BACKEND_GPU,g' examples/talk-llama/llama.cpp
2023-06-25 16:13:50 +03:00
Georgi Gerganov
f1c9df5806 metal : sync ggml-metal (ref #1047) 2023-06-25 15:40:39 +03:00
Georgi Gerganov
6c25fae1c4 opencl : sync latest ggml-opencl 2023-06-25 15:38:30 +03:00
Philippe Normand
44cb044e66 whisper : fix build with -Werror=undef (#1045) 2023-06-25 15:30:39 +03:00
Simon Moisselin
6c68218e3c models : add ggml_to_pt script (#1042)
* adding ggml_to_pt

* typo sys too many args

* fixing swap errors dimensions

---------

Co-authored-by: simonMoisselin <simon.moisselin@gmail.com>
2023-06-25 15:29:54 +03:00
Roddur Dasgupta
f11f33f1c0 models : cd statements are quoted to allow spaces in path (#1041) 2023-06-25 15:27:28 +03:00
Georgi Gerganov
8ac23c9f77 models : handle paths with spaces in download script (close #1038) 2023-06-25 15:23:23 +03:00
Colin
14baf2e7f3 main : add diarization support for all current output types (#1031)
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-06-25 15:07:57 +03:00
GiviMAD
bc2dcf85fe readme : add java alternative binding (#1029)
Signed-off-by: Miguel Álvarez <miguelwork92@gmail.com>
2023-06-25 14:46:07 +03:00
Jay Binks
1e45911f1a go : add support for whisper_full_lang_id() (#1010)
* * Add support for whisper_full_lang_id() to go bindings

* Expose token.id so we can test beg, eot etc

---------

Co-authored-by: Jay Binks <jay.binks@overthewire.com.au>
2023-06-25 14:45:33 +03:00
Georgi Gerganov
67564201ec go : fix "cb" -> "callNewSegment" 2023-06-25 14:34:10 +03:00
Georgi Gerganov
5feb0dffba ggml : sync latest ggml lib 2023-06-25 14:30:44 +03:00
Bo-Yi Wu
7dfc11843c go : improve progress reporting and callback handling (#1024)
- Rename `cb` to `callNewSegment` in the `Process` function
- Add `callProgress` as a new parameter to the `Process` function
- Introduce `ProgressCallback` type for reporting progress during processing
- Update `Whisper_full` function to include `progressCallback` parameter
- Add `registerProgressCallback` function and `cbProgress` map for handling progress callbacks

Signed-off-by: appleboy <appleboy.tw@gmail.com>
2023-06-25 14:07:55 +03:00
byte-6174
6a7f3b8db2 make : update cuBLAS build both x86 and aarch64 (#1015)
make cuBLAS compilation compatible with x86 as well as aarch64.
2023-06-25 13:59:48 +03:00
KP Kaiser
207a12f5bc make : fix for CUDA native not working as an option on Ubuntu (#1012) 2023-06-25 13:57:18 +03:00
faker
26b70395ff main : exit gracefully when invalid params are passed
* Refactor whisper_params_parse to return false on failure

* Updated help flag behavior
2023-06-25 13:52:29 +03:00
faker
598f607e28 main : gracefully exit when invalid params are passed (#1002)
* Refactor whisper_params_parse to return false on failure

* Updated help flag behavior
2023-06-25 13:51:59 +03:00
Akash Mahajan
3ec7bfffe0 py : make convert-pt-to-ggml.py backwards compatible with older vocab.json tokenizer files (#1001)
* patch checkpoint convert script to keep compatibility with older hf_transformers whisper tokenizer

* typo fix
2023-06-25 13:50:14 +03:00
Larry Battle
a7f822ef59 readme : corrected syntax for markdown link (#995) 2023-06-25 13:46:44 +03:00
Nicholas Albion
57543c169e updated java README 2023-06-06 10:27:26 +10:00
Nicholas Albion
5b9e59bc07 speak scripts for Windows 2023-06-01 22:45:00 +10:00
Nicholas Albion
3f7436e8a0 updated README for java 2023-06-01 16:55:48 +10:00
geniusnut
ce6f747064 whisper.android : support decode wav file has 2 channels (#972) 2023-05-31 10:13:14 +03:00
Nicholas Albion
d7c936b44a Feature/java bindings2 (#944)
* Java needs to call `whisper_full_default_params_by_ref()`, returning struct by val does not seem to work.
* added convenience methods to WhisperFullParams
* Remove unused WhisperJavaParams
2023-05-29 09:38:58 +10:00
genevera (she/her)
9b926844e3 models : fix README.md (#964)
Fixes typo on line 76 of models/README.md
2023-05-27 10:40:28 +03:00
DGdev91
5e2b3407ef examples : update elevenlabs scripts to use official python API (#837)
* Update elevenlabs example to use ufficial python API

* Update elevenlabs example to use official python API
2023-05-24 21:11:01 +03:00
0xsourcecode
4e16a8fb63 readme : highlight OpenBLAS support (#956)
* highlight openblas support

* Update README.md
2023-05-24 11:23:51 +03:00
Georgi Gerganov
77eab3fbfe talk-llama : sync latest llama.cpp (close #922, close #954) 2023-05-23 14:04:39 +03:00
Alexey Kharlamov
041be06d58 cmake : build with any BLAS compatible library (#927)
* Build with any BLAS library

* ci: Removed explicit CUDA nvcc path
2023-05-20 21:23:45 +03:00
Georgi Gerganov
429b9785c0 ggml : update WASM SIMD 2023-05-20 20:00:06 +03:00
Georgi Gerganov
e410cfc3ce ggml : sync latest ggml repo
- new Q4 and Q8 quantization
- updated CUDA
2023-05-20 18:56:30 +03:00
Nicholas Albion
bc89f285d8 bindings : add java bindings (#931)
* WIP - java bindings

* updated README

* failed attempt at JNI

* fullTranscribe() test passes

* tested on Ubuntu 20

* link to Java bindings
2023-05-20 18:25:02 +03:00
Elkana Bardugo
56a87ba45d whisper : fix hebrew language code (#935) 2023-05-20 18:17:54 +03:00
Ahmad Bilal
95b02d76b0 coreml : add support of large-v1 model (#926) 2023-05-15 18:36:06 +03:00
Georgi Gerganov
a5defbc1b9 release : v1.4.2 v1.4.2 2023-05-14 19:06:45 +03:00
Georgi Gerganov
aaf0d41c7c ggml : add AVX dot products 2023-05-14 18:56:46 +03:00
Georgi Gerganov
0cb820e0f9 talk-llama : fix build + sync latest llama.cpp 2023-05-14 18:46:42 +03:00
Jhen-Jie Hong
16564f554f readme : improve Core ML model conversion guidance (#915) 2023-05-14 18:11:08 +03:00
Georgi Gerganov
fd01209d09 coreml : support quantized model files 2023-05-14 18:09:44 +03:00
Georgi Gerganov
e693074aa6 ggml : sync latest ggml
- New Q4 and Q5 formats
- Various improvements
2023-05-14 18:04:23 +03:00
Rich Jones
d652cf12ec main : fix help for --no-timestamps arg (#908) 2023-05-14 17:54:57 +03:00
Georgi Gerganov
2b6a074305 extra : update ggml sync script 2023-05-14 10:01:52 +03:00
Jhen-Jie Hong
5300117471 whisper.objc : enable Core ML in example & fix segmentation fault (#910)
* coreml : update endcoder header import path

* coreml : force objc_arc in whisper-encoder.mm

* whisper.objc : create coreml/ group link

* whisper.objc : add coreml model link

* whisper.objc : update readme

* coreml : use -fobjc-arc for coreml/whisper-encoder.mm

* ci: create dummy .mlmodelc for pass ios build

* whisper.objc : update readme

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-05-14 09:47:02 +03:00
Georgi Gerganov
70af52a316 coreml : fix seg fault, double free (#919, #917, #899) 2023-05-14 09:42:19 +03:00
Georgi Gerganov
1d17cd5bb3 coreml : fix memory leak (#899) 2023-05-09 18:38:12 +03:00
Jonathan Soo
bf2449dfae cmake : fix define used for COREML_ALLOW_FALLBACK (#893) 2023-05-08 21:08:09 +03:00
Luis Herrera
4e4d00c67a talk-llama : only copy used KV cache in get / set state (#890)
---------

Co-authored-by: ejones <evan.q.jones@gmail.com>
2023-05-08 20:59:21 +03:00
Clifford Heath
9931d66400 readme : add instructions on converting to GGML + "--no-config" to wget (#874) 2023-05-08 20:58:36 +03:00