Commit Graph

4 Commits

Author           SHA1        Message                                    Date
Georgi Gerganov  55e422109b  scripts : add turbo-q8_0 to the benchmark  2024-10-29 19:37:24 +02:00
Georgi Gerganov  8a35b58c4f  scripts : bench v3-turbo                   2024-10-05 16:22:53 +03:00
Georgi Gerganov  7094ea5e75  whisper : use flash attention (#2152)      2024-05-15 09:38:19 +03:00
    * whisper : use flash attention in the encoder
    * whisper : add kv_pad
    * whisper : remove extra backend instance (huh?)
    * whisper : use FA for cross-attention
    * whisper : use FA for self-attention
    * whisper : simplify encoder FA
    * whisper : add flash_attn runtime parameter
    * scripts : add bench log
    * scripts : add M1 Pro bench log
Georgi Gerganov  52ccd4a3a8  files : rename ./extra to ./scripts        2024-04-09 20:13:41 +03:00