Commit Graph

955 Commits

Author SHA1 Message Date
Qianhe Chen
ab1916fc59 ci : add node addon test and optimize compilation configuration (#468)
* addon: implement node addon call whisper through cpp

* addon: modify the license to MIT

* addon: remove iostream

* addon: rename dir

* addon: fix typo

* addon: configure cmake to build when cmake-js is used

* ci: add addon.node test ci

* addon: remove build WHISPER_BUILD_TESTS

* addon: update build command

* addon: add test

* addon: add test file

* addon: adapt to compile on Windows

* addon: fix typo

* addon: reuse jfk.wav

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* addon: reuse jfk.wav

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-02-05 15:02:08 +02:00
kamranjon
a1c1583cc7 whisper : add whisper_full_lang_id() for getting the context lang (#461) 2023-02-05 14:46:26 +02:00
Matija Pevec
d012b5c7e4 whisper : add "split_on_word" flag when using using "max_len" option (#455)
* Update whisper.cpp

* fix: trim function

* feat: added flag to split on word

* fix: arguments for main
2023-02-05 14:44:23 +02:00
Georgi Gerganov
b2083c5d02 release : v1.2.0 v1.2.0 2023-02-04 09:49:49 +02:00
Georgi Gerganov
f3ee4a9673 whisper : reduce memory usage during inference (#431)
* ggml : add "scratch" buffer support

* ggml : support for scratch ring-buffer

* ggml : bug fix in ggml_repeat()

* ggml : error on scratch buffer overflow

* whisper : use scratch buffers during inference (base model only)

* whisper : update memory usage for all models

* whisper : fix encoder memory usage

* whisper : use whisper_context functions instead of macros

* whisper : fix FF + remove it from README

* ggml : reuse ggml_new_i32

* ggml : refactor the scratch buffer storage

* whisper : reorder scratch buffers in the decoder

* main : add option to disable temp fallback

* Update README.md
2023-02-04 09:45:52 +02:00
Qianhe Chen
c306a7fd89 addon.node : using whisper as a Node.js addon (#443)
* addon: implement node addon call whisper through cpp

* addon: modify the license to MIT

* addon: remove iostream

* addon: rename dir

* addon: fix typo

* addon: configure cmake to build when cmake-js is used
2023-02-04 09:10:25 +02:00
polarmoon
b2fc4c7010 go : support "auto" as an option when set language (#462)
Co-authored-by: Ming <ming@localhost>
2023-02-04 09:09:27 +02:00
Georgi Gerganov
291980369c whisper : suppress task tokens (#442) 2023-02-04 09:03:14 +02:00
Taisei Mima
86ef64a855 wasm : fix typo in helper.js (#459) 2023-02-04 08:49:15 +02:00
Alex Bacart
3b1960520a main : CSV format export trimmed spaces fix (#444)
* Update main.cpp

Removed string trimming

* Update main.cpp

* Update main.cpp

* Revert "Update main.cpp"

This reverts commit d8924fdcfe.

* Revert "Update main.cpp"

This reverts commit 252e508d85.
2023-02-04 08:48:35 +02:00
Lukas Rist
2bee2650c6 go : add wrapper for system info (#456) 2023-01-28 18:44:56 +02:00
Robin
beb9512be3 go : add WhisperLangAutoDetect method to go binding (#451) 2023-01-27 01:14:20 +02:00
Eric Tendian
47737b2e82 livestream.sh : run main with model arg instead of default (#453)
Actually utilizes the $model var when calling ./main.
2023-01-27 01:13:31 +02:00
Georgi Gerganov
b992f3709e whisper : do not provide past prompt when n_max_text_ctx == 0 2023-01-25 20:01:00 +02:00
Georgi Gerganov
60337f5306 wasm : check if navigator.storage.estimate() is available
Safari does not support it
2023-01-25 20:00:59 +02:00
Lukas Rist
02c7516c57 go : added wrappers to reset and print timings (#436) 2023-01-25 18:57:30 +02:00
Georgi Gerganov
411ea9b833 ci : run workflows on pull requests + bindings depend on .h (#446) 2023-01-25 18:50:50 +02:00
Ondrej Kokes
11f61cecd6 whisper.wasm : add labels for easier radio selection (#435) 2023-01-23 20:49:00 +02:00
Georgi Gerganov
b5ddb16ec7 whisper : condition timestamps to be monotonically increasing (#425) 2023-01-23 20:48:26 +02:00
fitzsim
ae16c21e9c whisper : PPC64 big-endian support (#398)
* ggml : set cache line size to 128 on POWER9

* whisper : add PPC64 big endian support
2023-01-23 20:48:10 +02:00
Georgi Gerganov
2c3f50a021 release : v1.1.1 v1.1.1 2023-01-23 20:23:44 +02:00
Georgi Gerganov
9a65269a20 .gitignore : add arm_neon.h 2023-01-23 20:19:04 +02:00
Georgi Gerganov
78f166174f whisper : fix condition for providing past prompt (critical)
This bug has been present since v1.1.0.

Effectively, the past transcribed text wasn't being used for following
transcriptions, which likely significantly reduces the transcription
quality.

Likely related to #419
2023-01-22 10:47:01 +02:00
Georgi Gerganov
21c569ba4a whisper : extend information in whisper_print_timings() 2023-01-19 18:50:33 +02:00
Georgi Gerganov
1a91c19af9 whisper : perform entropy check only when we have at least 32 tokens (#412) 2023-01-18 22:52:18 +02:00
Georgi Gerganov
f583e2d2f5 main : we had accidentally disabled the temperature fallback .. (#291) 2023-01-18 22:51:41 +02:00
Georgi Gerganov
206fc93396 whisper.wasm : add small and small.en models 2023-01-18 21:58:55 +02:00
Georgi Gerganov
a6cf6f4c4a bench : minor fixes 2023-01-18 21:40:10 +02:00
Chia-Hsiang Cheng
472a473fd1 main : add an option to accept optional output filenames (#424)
* Add an option to accept optional output filenames

* Format the file

Co-authored-by: Chia-Hsiang Cheng <gary.chiahsiang.cheng@gmail.com>
2023-01-18 21:26:31 +02:00
Georgi Gerganov
9ba66c2fad stream : fix handling of --step == --length (#416) 2023-01-18 21:22:52 +02:00
Georgi Gerganov
1ccb8a46a5 bench : fix Windows linkage by moving ggml benches in whisper lib .. 2023-01-18 21:19:50 +02:00
Georgi Gerganov
1290fc6457 bench : add memcpy and ggml_mul_mat benchmarks 2023-01-18 20:31:46 +02:00
Digipom
49b529ba74 whisper.android : add support for loading directly from asset in C (#415) 2023-01-16 21:57:35 +02:00
Georgi Gerganov
8088a977af whisper : fix possible uninitialized variables (#291) 2023-01-16 21:44:40 +02:00
Georgi Gerganov
c9aeb33676 stream : fix --keep_context argument to be used correctly (#354) 2023-01-16 19:37:40 +02:00
Damian Czaja
4a3f0d3fe9 go : remove sample_best and sample_timestamp bindings (#409) 2023-01-16 19:18:10 +02:00
Georgi Gerganov
874bde887e Update README.md 2023-01-16 18:47:31 +02:00
Georgi Gerganov
8738427dd6 cmake : bump version to 1.1.0 v1.1.0 2023-01-15 14:33:13 +02:00
Georgi Gerganov
c3991bbb24 Update README.md 2023-01-15 14:08:12 +02:00
Georgi Gerganov
00ea21668b whisper : account speed_up flag for short audio (close #405) 1.1.0 2023-01-15 12:42:15 +02:00
Georgi Gerganov
0b85e8c401 Update README.md 2023-01-15 11:36:20 +02:00
Georgi Gerganov
fafd78945d bench.wasm : print system info 2023-01-15 11:34:03 +02:00
Georgi Gerganov
8de452c18b Improve decoding (#291)
* whisper : prepare infra for new decoding strategies

* whisper : apply logit filters and compute logprobs

* whisper : add whisper_get_logits()

* whisper : separate self and cross attention memory

Initial step needed for supporting parallel decoders

* whisper : move probs_id buffer to whisper_context

* whisper : refactor kv cache into separate struct

* whisper : move self-attention kv cache to whisper_decoder

* whisper : wip decoding parameters + strategies

* whisper : wip decoding parameters + strategies (part 2)

* whisper : wip decoding parameters + strategies (part 3)

* whisper : wip decoding parameters + strategies (part 4)

* whisper : fix prompt_past update to not include prompt_init

* whisper : temperature + best_of support

* whisper : support for compression_ration_threshold

We actually use entropy, but it is similar

* command : fix example to use logits instead of obsolete probs

* whisper : handle empty sequence ranking

* whisper : add WHISPER_DEBUG + diagnostic prints + new main args

* whisper : minor fixes

* whisper : add beam-search support

* whisper : bug fix when there no previous context

* whisper : add comments

* stream : disable temperature fallback

For real-time processing, we always want a single decoder running at T=0

* whisper.swiftui : update example - fix paths + add empty folders
2023-01-15 11:29:57 +02:00
Georgi Gerganov
a6dbd9188b stream : fix a bug that inserted a lot of empty audio at the start
The quality was terrible due to this
2023-01-14 19:20:47 +02:00
Georgi Gerganov
4ef3398e8f ggml : remove obsolete zeroing + comment fixes (#390) 2023-01-08 20:21:03 +02:00
Ian Bicking
5e9f33596f readme : clarify main and stream usage (#391)
Give an example of ./main that uses a sample file that's already there, and make the stream example clarify you need `make stream`
2023-01-08 20:18:41 +02:00
Abitofevrything
8d7b29cedd ggml : correct behaviour of ggml_vec_sum_f32 (#390) 2023-01-08 20:06:09 +02:00
boolemancer
08dc705a69 whisper : fix sample_to_timestamp calculation with 64 bit precision to avoid overflow (#388)
* Do calculation with 64 bit precision to avoid overflow

* Update whisper.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-01-08 15:08:45 +02:00
Syahmi Azhar
1512545149 whisper : add loader class to allow loading from buffer and others (#353)
* whisper : add loader to allow loading from other than file

* whisper : rename whisper_init to whisper_init_from_file

* whisper : add whisper_init_from_buffer

* android : Delete local.properties

* android : load models directly from assets

* whisper : adding <stddef.h> needed for size_t + code style

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-01-08 13:03:33 +02:00
Georgi Gerganov
52a3e0c92a ggml : improve vec_dot_f16 unrolling in flash_attn_f16 2023-01-08 11:41:18 +02:00