Commit Graph

2328 Commits

Author SHA1 Message Date
Georgi Gerganov
d375d73b2e bench : improve benchmarks 2023-05-01 14:44:39 +03:00
Georgi Gerganov
7765770f89 whisper : add memory sizes for Q8_0 (close #846) 2023-05-01 10:03:56 +03:00
Baffin Lee
872a85ae94 whisper.wasm : fix typo in readme (#832) 2023-05-01 09:28:05 +03:00
Georgi Gerganov
9c61f5f585 release : v1.4.1 v1.4.1 2023-04-30 22:57:42 +03:00
Georgi Gerganov
c94c469592 whisper : fix quantize bug (#842)
* whisper : debug

* whisper : fix bug during quantization
2023-04-30 22:50:04 +03:00
Georgi Gerganov
feac80dd3f ggml : fix UB (int << 31) 2023-04-30 22:27:30 +03:00
Georgi Gerganov
fa8dbdc888 release : v1.4.0 v1.4.0 2023-04-30 19:23:37 +03:00
Georgi Gerganov
4a7d49af95 examples : fix + refactor Levenshtein distance 2023-04-30 19:12:49 +03:00
Georgi Gerganov
794b162a46 whisper : add integer quantization support (#540)
* whisper : add integer quantization support

* examples : add common-ggml + prepare to add "quantize" tool

* whisper : quantization tool ready

* whisper : fix F32 support

* whisper : try to fix shared lib linkage

* wasm : update quantized models to Q5

* bench.wasm : remove "medium" button

* bench.wasm : fix custom model button

* ggml : add Q5_0 and Q5_1 WASM SIMD

* wasm : add quantized models to all WASM examples

* wasm : bump DB version number to 2

* talk-llama : update example to latest llama.cpp

* node : increase test timeout to 10s

* readme : add information for model quantization

* wasm : add links to other examples
2023-04-30 18:51:57 +03:00
Georgi Gerganov
5fd1bdd7fc whisper : add GPU support via cuBLAS (#834)
* make : add WHISPER_CUBLAS

* make : fix CUBLAS build

* whisper : disable Flash Attention + adjust memory buffers

* whisper : remove old commented code

* readme : add cuBLAS instructions

* cmake : add WHISPER_CUBLAS option

* gitignore : ignore build-cublas
2023-04-30 12:14:33 +03:00
Georgi Gerganov
0ccd6746c9 ggml : fix WASM build 2023-04-29 21:37:23 +03:00
Georgi Gerganov
d9b550c0a1 ggml : fix 32-bit ARM NEON (#836)
* ggml : add support for 32-bit ARM

* ggml : fix

* ggml : fix
2023-04-29 21:33:33 +03:00
Georgi Gerganov
e9b091c92a ggml : use vzip instead of vuzp for consistency 2023-04-29 21:14:09 +03:00
Georgi Gerganov
1f30b99208 ggml : fix WASM build 2023-04-29 20:21:25 +03:00
Georgi Gerganov
05c3ea3bc8 ggml : sync with ggml repo (warning fixes + asserts) 2023-04-29 19:33:28 +03:00
Thijs Raymakers
6108d3cc58 whisper : use correct seek_end when offset is used (#833)
Whenever an `offset_ms` is provided, the value of `seek_end` is
calculated incorrectly. This causes Whisper to keep transcribing
after the end of the file.

The current behavior looks like
```
[00:34:40.000 --> 00:34:47.000]   This is an example audio file.
[00:34:47.000 --> 00:34:49.000]   The text has been redacted
[00:34:49.000 --> 00:34:51.000]   This is the end of the audio.
[00:34:51.000 --> 00:34:52.000]   ***
[00:34:52.000 --> 00:34:53.000]   ***
[00:34:53.000 --> 00:34:54.000]   ***
[00:34:55.000 --> 00:34:56.000]   ***
...
```

The expected behavior should be
```
[00:34:40.000 --> 00:34:47.000]   This is an example audio file.
[00:34:47.000 --> 00:34:49.000]   The text has been redacted
[00:34:49.000 --> 00:34:51.000]   This is the end of the audio.
- end of program -
```

This commit changes the calculation of the `seek_end` variable to
only add `seek_start` if a custom `duration_ms` is provided.
Otherwise, it defaults to the end of the file.

Signed-off-by: Thijs Raymakers <thijs@raymakers.nl>
2023-04-29 18:55:37 +03:00
Georgi Gerganov
bab97c83d0 tests : add "threads" to run-tests.sh 2023-04-29 12:32:28 +03:00
Georgi Gerganov
3eaeb030ff extra : add sync-ggml.sh script 2023-04-29 12:32:28 +03:00
Georgi Gerganov
acec73ab6e ggml : sync latest ggml + llama.cpp updates (quantization) 2023-04-29 12:32:28 +03:00
Zollner
5cc17418c7 whisper.android : add some tips (#816) 2023-04-29 11:00:20 +03:00
Georgi Gerganov
3efb81dec6 build : add WHISPER_COREML_ALLOW_FALLBACK to make / CMake (#812) 2023-04-29 10:55:24 +03:00
Canis Lupus
94a7cd2a07 whisper : allow non-CoreML fallback when Core ML cannot be loaded (#812)
If the Core ML model cannot be loaded, continue without Core ML instead of
returning. This allows a single build to transcribe using Core ML models
where available, and regular models when not.
2023-04-29 10:49:02 +03:00
Georgi Gerganov
3e82ff4747 whisper : fix bug from previous commit 2023-04-29 10:42:14 +03:00
Georgi Gerganov
b5bd2f43c5 whisper : avoid designated initializers 2023-04-29 10:36:50 +03:00
AsukaMinato
94aa56f19e minor : improve C++ and Python style (#768)
* use some STL functions

* use self.field rather than setattr, use pathlib.Path

* recover some format

* const some iter

* Keep the original

* 2 space
2023-04-29 10:06:25 +03:00
Georgi Gerganov
4d89ee2e59 readme : add logo 2023-04-28 22:41:29 +03:00
Laytan Laats
70567eff23 main : escape quotes in csv output (#815) 2023-04-23 19:01:59 +03:00
Taras Glek
02ec83c5d5 stream : flush upon finishing inference (#811) 2023-04-23 17:00:30 +03:00
Philipp Zabel
2bd4b8d577 examples : add missing #include <cstdint> (#798)
common.cpp uses uint8_t and uint64_t, which are defined in <cstdint>.
2023-04-23 16:52:52 +03:00
Tauseef Mohiuddin
eecf2c3d41 main : update escape_double_quotes() function (#776)
Updated the escape_double_quotes() function so that it now escapes both double quotes and backslashes in the input string.

Changes Made:

- Renamed the function to escape_quotes_and_backslashes

- Modified the condition in the first loop to increment 'escaped_length' for both double quotes and backslashes.

- Modified the condition in the second loop to add a backslash before the current character if it is a double quote or a backslash.

Resolves: #769
2023-04-23 16:47:30 +03:00
Georgi Gerganov
c23588cc4b release : v1.3.0 v1.3.0 2023-04-15 17:30:44 +03:00
Georgi Gerganov
5108b30e6d whisper : pad audio instead of spectrogram (#579)
Also, fall back only if more temperatures are available and we are
at least 3 seconds before the end of the audio
2023-04-15 17:19:19 +03:00
Georgi Gerganov
f19e23fbd1 whisper : restore decoder temperature fallbacks
I disabled this because there were many complaints about slow decoding.
The current implementation does not allow batching the decoders when
using the "best of" or "beam size" parameters, so the decoding time is
proportional to the number of decoders, which is obviously not great.

However, now there are even more complaints about wrong decodings and
repetition.

So, as a compromise, the fallbacks are re-enabled, but the default is
just 2 "best of" / "beam size" decoders. Also, the temperature step is
increased from 0.2 to 0.4 - i.e. from a maximum of 5 fallbacks to a maximum
of 2.

Also, the stream example now has fallbacks enabled by default.

close #471 #477 #508 #612 #719 #731
2023-04-15 16:12:55 +03:00
Jhen-Jie Hong
ea1f8a50d4 ggml, ci : fix build on whisper.android (ARM_NEON) + add CI (#764)
* ggml : fix undefined symbol by removing inline handling

* ggml : make own ggml_aligned_malloc function

* ci: add ios/android build
2023-04-15 14:21:58 +03:00
Georgi Gerganov
3dead611bb whisper : slightly faster Log Mel computation + n-1 FFT threads (#568) 2023-04-15 14:18:46 +03:00
Georgi Gerganov
355da83690 readme : fix link 2023-04-15 13:30:36 +03:00
Georgi Gerganov
3e5c49e59a readme : add usage instructions for Core ML 2023-04-15 13:30:07 +03:00
Georgi Gerganov
5e47e223bd whisper : add Core ML support (#566)
* coreml : use Core ML encoder inference

* coreml : simplify whisper_encode + log messages

* whisper : resolve rebase conflicts

* coreml : add scripts for CoreML model generation

* bench-all : recognize COREML flag
2023-04-15 13:21:27 +03:00
Maximiliano Levi
794ff3074a whisper : do not launch log_mel threads when n_thread is 1 (#763) 2023-04-14 22:35:34 +03:00
AfryMask
7e2afa4384 whisper : fix the bug related to word splitting errors in the "tokenize" function. (#760)
Co-authored-by: AfryMask <afrymask@gmail.com>
2023-04-14 20:35:03 +03:00
Aaron Taylor
1c5edc3cb3 readme : add SwiftWhisper to listed bindings (#755) 2023-04-14 20:24:00 +03:00
Georgi Gerganov
34b772727d gitignore : add .test 2023-04-14 20:13:47 +03:00
Bader-eddine Ouaich
2c856fb9e5 whisper : fix potential memory leaks (#740)
* fix potential memory leak if whisper_init_state failed

* fix potential memory leak if gpt2_init failed
2023-04-14 20:05:56 +03:00
Anton Kostin
7727a40dc9 license : update year (#739) 2023-04-14 20:04:42 +03:00
GitAritron
b5639ed313 whisper : fix typos in whisper.h (#737)
Fixed a couple of typos (in comments, so nothing major). Keep up the great work 😄
2023-04-14 20:03:16 +03:00
Ali Alameh
2c4ac2627d stream : support language auto-detect (#501)
Fixes #445: the language auto-detect "auto" flag does not work with the stream tool
2023-04-14 20:02:18 +03:00
Alex Evgrashin
674a8e579b readme : add unity bindings (#733) 2023-04-14 19:59:44 +03:00
DGdev91
001083a769 talk, talk-llama : add basic example script for eleven-labs tts (#728) 2023-04-14 19:53:58 +03:00
Ivan Gorin
62b51c3070 models : change convert-pt-to-ggml to use .tiktoken tokenizer files (#725) 2023-04-14 19:50:39 +03:00
LittleLoli
61128870b8 cmake : add msvc compiler args /utf-8 fix error C3688 (#721)
* force the MSVC compiler to use UTF-8 encoding

* only enable on msvc
2023-04-14 19:36:38 +03:00