whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git, synced 2025-08-04 14:32:52 +02:00

Files

Georgi Gerganov 7094ea5e75 whisper : use flash attention (#2152)

* whisper : use flash attention in the encoder

* whisper : add kv_pad

* whisper : remove extra backend instance (huh?)

* whisper : use FA for cross-attention

* whisper : use FA for self-attention

* whisper : simplify encoder FA

* whisper : add flash_attn runtime parameter

* scripts : add bench log

* scripts : add M1 Pro bench log

2024-05-15 09:38:19 +03:00

bench-all-gg.txt

whisper : use flash attention (#2152)

2024-05-15 09:38:19 +03:00

bench-all.sh

whisper : use flash attention (#2152)

2024-05-15 09:38:19 +03:00

bench-wts.sh

files : rename ./extra to ./scripts

2024-04-09 20:13:41 +03:00

bench.py

files : rename ./extra to ./scripts