Georgi Gerganov | 7094ea5e75 | whisper : use flash attention (#2152) | 2024-05-15 09:38:19 +03:00 |
  * whisper : use flash attention in the encoder
  * whisper : add kv_pad
  * whisper : remove extra backend instance (huh?)
  * whisper : use FA for cross-attention
  * whisper : use FA for self-attention
  * whisper : simplify encoder FA
  * whisper : add flash_attn runtime parameter
  * scripts : add bench log
  * scripts : add M1 Pro bench log
Georgi Gerganov | f56b8305c4 | sync : ggml | 2024-05-14 19:16:32 +03:00 |
Georgi Gerganov | 130f43e4b8 | scripts : sync ggml-rpc | 2024-05-14 19:15:35 +03:00 |
Georgi Gerganov | fe179ae0cc | sync : ggml | 2024-05-13 11:02:26 +03:00 |
Georgi Gerganov | 8f253ef3af | sync : ggml | 2024-04-09 20:27:55 +03:00 |
Georgi Gerganov | c7dc37f97c | license : update copyright notice + add AUTHORS | 2024-04-09 20:27:44 +03:00 |
Georgi Gerganov | 3b8aade3c2 | scripts : update sync | 2024-04-09 20:25:50 +03:00 |
Georgi Gerganov | 52ccd4a3a8 | files : rename ./extra to ./scripts | 2024-04-09 20:13:41 +03:00 |