Commit Graph

613 Commits

Author SHA1 Message Date
Georgi Gerganov
8de452c18b
Improve decoding (#291)
* whisper : prepare infra for new decoding strategies

* whisper : apply logit filters and compute logprobs

* whisper : add whisper_get_logits()

* whisper : separate self and cross attention memory

Initial step needed for supporting parallel decoders

* whisper : move probs_id buffer to whisper_context

* whisper : refactor kv cache into separate struct

* whisper : move self-attention kv cache to whisper_decoder

* whisper : wip decoding parameters + strategies

* whisper : wip decoding parameters + strategies (part 2)

* whisper : wip decoding parameters + strategies (part 3)

* whisper : wip decoding parameters + strategies (part 4)

* whisper : fix prompt_past update to not include prompt_init

* whisper : temperature + best_of support

* whisper : support for compression_ratio_threshold

We actually use entropy, but it is similar

* command : fix example to use logits instead of obsolete probs

* whisper : handle empty sequence ranking

* whisper : add WHISPER_DEBUG + diagnostic prints + new main args

* whisper : minor fixes

* whisper : add beam-search support

* whisper : bug fix when there is no previous context

* whisper : add comments

* stream : disable temperature fallback

For real-time processing, we always want a single decoder running at T=0

* whisper.swiftui : update example - fix paths + add empty folders
2023-01-15 11:29:57 +02:00
Georgi Gerganov
a6dbd9188b
stream : fix a bug that inserted a lot of empty audio at the start
The quality was terrible due to this
2023-01-14 19:20:47 +02:00
Georgi Gerganov
4ef3398e8f
ggml : remove obsolete zeroing + comment fixes (#390) 2023-01-08 20:21:03 +02:00
Ian Bicking
5e9f33596f
readme : clarify main and stream usage (#391)
Give an example of ./main that uses a sample file that's already there, and make the stream example clarify you need `make stream`
2023-01-08 20:18:41 +02:00
Abitofevrything
8d7b29cedd
ggml : correct behaviour of ggml_vec_sum_f32 (#390) 2023-01-08 20:06:09 +02:00
boolemancer
08dc705a69
whisper : fix sample_to_timestamp calculation with 64 bit precision to avoid overflow (#388)
* Do calculation with 64 bit precision to avoid overflow

* Update whisper.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-01-08 15:08:45 +02:00
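The overflow fixed in #388 can be illustrated with a minimal sketch. It assumes timestamps are counted in 10 ms units and audio is sampled at 16 kHz; the constant names and the exact expression are illustrative and are not copied from whisper.cpp. Under those assumptions, multiplying the sample index by 100 in 32-bit arithmetic overflows after about 21.4 million samples (roughly 22 minutes of audio), so the product is widened to 64 bits before the division.

```cpp
#include <cstdint>

// Assumption: 16 kHz audio; the name is illustrative.
constexpr int kSampleRate = 16000;

// Largest sample index for which 100 * i_sample still fits in a 32-bit int.
constexpr int kMaxSafeSample32 = INT32_MAX / 100;

// Convert a sample index to a timestamp in 10 ms units.
// The 100LL literal promotes the multiplication to 64 bits, so the
// intermediate product cannot overflow for any 32-bit sample index.
constexpr int64_t sample_to_timestamp(int i_sample) {
    return (100LL * i_sample) / kSampleRate;
}

// 30,000,000 samples would need 3,000,000,000 as an intermediate value,
// which exceeds INT32_MAX but is handled correctly in 64 bits.
static_assert(sample_to_timestamp(30000000) == 187500, "64-bit path is exact");
```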
Syahmi Azhar
1512545149
whisper : add loader class to allow loading from buffer and others (#353)
* whisper : add loader to allow loading from other than file

* whisper : rename whisper_init to whisper_init_from_file

* whisper : add whisper_init_from_buffer

* android : Delete local.properties

* android : load models directly from assets

* whisper : adding <stddef.h> needed for size_t + code style

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-01-08 13:03:33 +02:00
Georgi Gerganov
52a3e0c92a
ggml : improve vec_dot_f16 unrolling in flash_attn_f16 2023-01-08 11:41:18 +02:00
Georgi Gerganov
d1ea1220ff
command : clean-up / refactoring / formatting (#383) 2023-01-07 21:43:24 +02:00
David
9c4a1522f6
command : always-prompt mode (#383) 2023-01-07 21:41:11 +02:00
David Thorpe
f078a6f20e
go : adding features to the go-whisper example, go ci, etc (#384)
* Updated bindings so they can be used in third party packages.

* Updated makefiles to optionally set the FMA flag, for Xeon E5 on Darwin

* Added test script

* Changes for examples

* Reverted

* Made the NewContext method private
2023-01-07 21:21:43 +02:00
Georgi Gerganov
f30b5d322c
ggml : fix bug in new soft max computation 2023-01-07 21:00:07 +02:00
Georgi Gerganov
44efbf7ff1
cmake : add -Wno-unused-function + update whisper.js 2023-01-07 20:18:34 +02:00
Georgi Gerganov
d347a59a5f
ggml : when using BLAS start only 1 CPU thread 2023-01-07 19:48:56 +02:00
Georgi Gerganov
6394c906af
ggml : fix running tasks with variable number of threads 2023-01-07 19:20:18 +02:00
Georgi Gerganov
74ffa14e1d
ggml : unroll ggml_vec_dot_f16 in ggml_compute_forward_flash_attn_f16 2023-01-07 19:19:40 +02:00
Georgi Gerganov
65fdcbbbbb
whisper : revert accidental MB change 2023-01-07 16:18:21 +02:00
Georgi Gerganov
d61d55cd4b
ggml : speed-up soft max via Accelerate + unroll 2023-01-07 16:16:42 +02:00
Georgi Gerganov
d51fc3ee0a
ggml : use vDSP_sve and vDSP_maxv from Accelerate 2023-01-07 16:10:16 +02:00
Georgi Gerganov
f82a7dd019
ggml : make gcc happy (minor) 2023-01-07 09:34:39 +02:00
Georgi Gerganov
87dd4a3081
talk.wasm : bump memory usage + update whisper.js 2023-01-06 21:13:44 +02:00
m.bell
41e05c6b1b
cmake : support AVX2 in Windows better (#381) 2023-01-06 19:36:33 +02:00
Georgi Gerganov
fa379cb22a
Revert "tmp"
This reverts commit 1652965529.
2023-01-06 19:33:09 +02:00
David Thorpe
322f4e6c4e
go : bindings updated so they can be used in third party packages. (#379)
* Updated bindings so they can be used in third party packages.

* Updated makefiles to optionally set the FMA flag, for Xeon E5 on Darwin
2023-01-06 19:32:28 +02:00
Georgi Gerganov
1652965529
tmp 2023-01-06 19:32:12 +02:00
Georgi Gerganov
6042c7a3be
cmake : change min required version to 3.0 (#351)
We increase the min version only when we want to use particular
functionality that is available only in the newer version
2023-01-06 19:25:28 +02:00
Georgi Gerganov
6b351bb669
command : add "guided-mode" video demo in the README.md 2023-01-06 18:59:26 +02:00
Abitofevrything
a62170c656
ggml : add SSE3 and fp16 conversion lookup table (#368)
* Improves WASM performance:
  On a MacBook M1 Pro, I observe it running 25% faster in Firefox and 35% faster in Chrome

* Add support for SSE3 SIMD

* Add SSE3 to system information

* Add Imath support for fp16-fp32 conversions

* Add Imath to system information

* Wrap Imath calls to avoid static function warnings

* Drop Imath; Add lookup table for f16 -> f32 conversions

* Remove TODO comments

* Update SSE3 to new macro arguments

* Correct updated macro definitions

* Prefer static inline where possible

* ggml : static inlines + add public f16 <-> f32 conversions

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-01-06 18:45:59 +02:00
Thomas Fitzsimmons
1944e7c33e whisper : document POWER VSX support 2023-01-05 23:53:00 +02:00
Thomas Fitzsimmons
49a8dd6732 ggml : reorganize POWER9 ppc64le SIMD code 2023-01-05 23:53:00 +02:00
Thomas Fitzsimmons
8c7f642286 ggml : change f16 load and store macro arguments 2023-01-05 23:53:00 +02:00
Georgi Gerganov
ad2a4ffa03
whisper : do not use F16 tensors when in F32 mode (#369) 2023-01-05 22:56:25 +02:00
Georgi Gerganov
b3c865083e
ci : add emscripten build 2023-01-05 22:10:20 +02:00
Georgi Gerganov
a0d4f8e65c
main : make whisper_print_segment_callback() more readable (close #371) 2023-01-05 21:45:05 +02:00
Georgi Gerganov
4a214d2f07
cmake : add CMAKE_RUNTIME_OUTPUT_DIRECTORY
Currently needed by the wasm examples
2023-01-05 21:40:59 +02:00
Georgi Gerganov
0a0cfa7985
ggml : add void to argument-less functions 2023-01-05 21:40:38 +02:00
Georgi Gerganov
196d738974
minor : close #370 + Makefile build info print change 2023-01-05 21:35:45 +02:00
Andy Maloney
84c6b42e65
cmake : update to 3.19 (#351)
- update from 3.0 (from 2014) to 3.19 (from 2020)
- move some global settings onto the targets (through a cmake include)
2023-01-05 21:22:48 +02:00
Andy Maloney
dd6d582977 whisper : use range-based for loops for readability 2023-01-05 21:20:44 +02:00
Georgi Gerganov
d51c5eb906
ggml : define MIN / MAX only if not defined (minor) 2023-01-05 21:16:52 +02:00
Georgi Gerganov
0be6a1afd9
make : print build information 2023-01-02 13:35:26 +02:00
Georgi Gerganov
a466c3404d
stream : fix data race on bool + avoid division-by-zero 2023-01-02 10:20:50 +02:00
Georgi Gerganov
d629c034a4
models : fix HF model URL (close #356) 2023-01-02 09:54:43 +02:00
Andy Maloney
f00509d57c
command : refactor to split command list & general transcription modes (#331)
This makes it easier to understand if you're looking for only one of the capabilities.
2022-12-31 14:08:57 +02:00
Thomas Fitzsimmons
424c410c42 ggml : improve f16 acceleration for POWER9 ppc64le 2022-12-31 10:02:19 +02:00
Georgi Gerganov
d97e6005e9
whisper : add whisper_n_audio_ctx and check for invalid audio_ctx
closes #344
2022-12-31 09:57:19 +02:00
Ikko Ashimine
3467230a77 models : fix typo in convert-h5-to-ggml.py
signficant -> significant
2022-12-31 09:49:01 +02:00
Avik Sengupta
a091581eb3
cmake : add runtime destination install (#345)
needed for mingw32 build to successfully install the dlls in the correct location
2022-12-31 09:48:00 +02:00
Georgi Gerganov
68daf6e487
whisper : avoid some memory allocations 2022-12-30 13:43:48 +02:00
Niels Mayer
a593b932e4
main : add -ocsv, aka --output-csv to output a CSV file
Adds -ocsv, aka --output-csv feature to examples/main, which outputs a CSV file containing lines formatted as follows <startTime-in-integer-milliseconds>, <endTime-in-integer-milliseconds>, "<transcript-line-including-commas>".
2022-12-29 14:04:00 +02:00
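The CSV line layout described in the --output-csv commit can be sketched as follows. The helper name `format_csv_line` is hypothetical and is not taken from examples/main. Doubling embedded quotes is an assumption borrowed from common CSV practice, since the commit message only says that the transcript field is quoted and may contain commas.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: builds one line in the layout
//   <start-ms>, <end-ms>, "<transcript>"
// Quoting the transcript keeps commas inside the text from being read
// as field separators.
std::string format_csv_line(int64_t start_ms, int64_t end_ms, const std::string & text) {
    std::string quoted = "\"";
    for (char c : text) {
        if (c == '"') {
            quoted += '"'; // assumption: escape an embedded quote by doubling it
        }
        quoted += c;
    }
    quoted += '"';
    return std::to_string(start_ms) + ", " + std::to_string(end_ms) + ", " + quoted;
}
```

For example, a segment from 0 ms to 2500 ms with the text `Hello, world` would be written as `0, 2500, "Hello, world"`.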